Cycling in Munich: An Analysis of Bike Activity in the City

Txomin Basterra Chang
6 min read · Feb 27, 2024


Created with deepai.org

The Open Data Portal of the City of Munich offers a diverse array of datasets for exploration. Notably, it provides access to data collected from bicycle counting stations, offering valuable insights into cycling activity within the city.

Observations are made at various counting stations distributed throughout the city center. Counts are taken at 15-minute intervals to determine how many cyclists pass by during each time period.

The dataset presented in this analysis consists of daily values: the 15-minute measurements are summed for each day. In addition to daily bicycle activity, it includes weather data such as temperature and sunshine hours. The time series begins in June 2008 and extends to December 2022.
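To make the daily aggregation concrete, here is a minimal sketch in R. The column names `timestamp` and `cyclists` and all values are made up for illustration; the raw files on the portal may be structured differently.

```r
# Toy 15-minute counts for one hypothetical station (values are made up).
counts <- data.frame(
  timestamp = as.POSIXct(
    c("2022-06-01 08:00", "2022-06-01 08:15", "2022-06-01 08:30",
      "2022-06-02 08:00", "2022-06-02 08:15"),
    tz = "UTC"
  ),
  cyclists = c(12, 20, 17, 9, 14)
)

# Derive the calendar day, then sum all intervals that fall on that day.
counts$date <- as.Date(counts$timestamp, tz = "UTC")
daily <- aggregate(cyclists ~ date, data = counts, FUN = sum)
print(daily)
```

Each row of `daily` then holds one total per day, which is the granularity used in the rest of this analysis.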

First couple of entries of the dataset

Cycling Activity

Let’s first get an overview of the total activity.

We can clearly see that activity increases over the years and follows a seasonal cycle: it rises through spring and summer and declines towards the end of the year. As expected, activity is lower in winter than in summer, while counts for autumn and spring appear to be similar.

The graph above shows the distribution of daily activity by season. For the warmer seasons, the distribution is fairly spread out, meaning that days with high and low activity occur in similar numbers. In winter, the distribution is right-skewed: there are many days on which little or no activity was measured.
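A seasonal comparison like this needs a season label for each day and, optionally, a number that summarizes skew. The sketch below assumes meteorological seasons (December to February as winter, and so on), which may not be the definition used for the plot, and uses a simple moment-based skewness. Both helper functions are hypothetical and the values are made up.

```r
# Map a date to a meteorological season (an assumption about the definition).
season_of <- function(date) {
  m <- as.integer(format(date, "%m"))
  ifelse(m %in% c(12, 1, 2), "winter",
  ifelse(m %in% 3:5, "spring",
  ifelse(m %in% 6:8, "summer", "autumn")))
}

# Moment-based sample skewness; positive values indicate a long right tail.
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Made-up winter-like counts: many low days and one busy day.
winter_counts <- c(0, 0, 50, 100, 150, 3000)
season_of(as.Date(c("2022-01-15", "2022-07-01")))
skewness(winter_counts)
```

A clearly positive skewness value corresponds to the winter profile described above: most days cluster at low counts, with a few busy days forming the right tail.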

Bicycle activity by counting stations

Munich has six counting stations, which are distributed throughout the city. We can see on the map below that the stations are all more or less centrally located. Not all stations started measuring in the same year.

The graph below illustrates how activity at the different stations has evolved over the years. Like the total activity, most counting stations show a cyclical growth pattern. The differences in volume between the stations are noticeable: at some, activity is consistently higher than at others.

Also of interest are outliers such as those at the Olympia counting station between 2012 and 2015. These are likely caused by the 24-hour mountain bike race in the Olympic Park (an analysis can be found on SOMTOMS Blog).

The activity profile can also be represented well with boxplots. The dashed horizontal line represents the overall median. Erhardt and Margareten are the stations with the highest activity, notably surpassing the overall median.
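The comparison behind such a boxplot can be sketched in a few lines of base R. The data frame `toy`, its `station` column, and all counts are invented for illustration; only the station names come from the dataset.

```r
# Made-up daily totals for three of the stations.
toy <- data.frame(
  station = rep(c("Erhardt", "Kreuther", "Olympia"), each = 3),
  total = c(4000, 5000, 6000, 500, 700, 900, 1500, 2000, 2500)
)

# Median across all observations, and one median per station.
overall_median <- median(toy$total)
station_median <- tapply(toy$total, toy$station, median)

# Stations whose median lies above the overall median.
above <- names(station_median)[station_median > overall_median]
print(above)

# Draw the boxplots with a dashed reference line (interactive sessions only).
if (interactive()) {
  boxplot(total ~ station, data = toy)
  abline(h = overall_median, lty = 2)
}
```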

We can also see how activity varies over the week (from Sunday (1) to Saturday (7)). Less activity is measured on weekends.
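If the weekday number has to be derived from a date column, one possible approach in base R is shown below. The helper `wday_num` and the toy counts are assumptions for illustration; the dataset may already ship with such a column.

```r
# Number weekdays so that Sunday = 1 and Saturday = 7, as in the plot.
wday_num <- function(date) as.POSIXlt(date)$wday + 1L

# Two full weeks of made-up daily totals, starting on a Sunday.
toy <- data.frame(
  date = as.Date("2024-02-18") + 0:13,
  total = c(900, 2500, 2600, 2700, 2600, 2400, 1000,
            800, 2400, 2700, 2600, 2500, 2300, 1100)
)
toy$wday <- wday_num(toy$date)
toy$weekend <- toy$wday %in% c(1, 7)

# Average activity per weekday, and for weekdays versus weekends.
mean_by_wday <- tapply(toy$total, toy$wday, mean)
mean_by_type <- tapply(toy$total, toy$weekend, mean)
print(mean_by_type)
```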

Impairments

The data contains information about disruptions and impairments to operations. On roughly 93% of all days, no disruptions were reported.

Among the impairments, there are various classifications, with ‘Counting station not yet in operation’ and ‘Construction site’ being the most common reasons for impairment. For 10% of the impairments, there is no classification.
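Shares like these can be computed with `table()` and `prop.table()`. In the sketch below, the encoding is an assumption: `NA` stands for a day without a reported disruption and "Unclassified" for an impairment without a stated reason. The counts are made up and do not reproduce the real proportions.

```r
# Made-up status column: NA means no disruption was reported that day.
impairment <- c(
  rep(NA, 93),
  rep("Construction site", 4),
  rep("Counting station not yet in operation", 2),
  "Unclassified"
)

# Share of days without any reported disruption.
share_ok <- mean(is.na(impairment))

# Relative frequency of each reason among impaired days (table() drops NA).
reasons <- sort(prop.table(table(impairment)), decreasing = TRUE)
print(share_ok)
print(round(reasons, 2))
```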

We can also have a look at the operational impairments over time.

Weather Data

The seasons appear to be strongly associated with daily activity. To examine this more closely, it makes sense to take a closer look at the distribution of weather variables.

In Munich, there seem to be many days with little rain, little sunshine, and heavy cloud cover. The temperature typically ranges between 0 and 20 degrees Celsius. Now let’s examine the relationship between the weather and bicycle activity.

The minimum and maximum temperatures are positively associated with activity. The relationship appears to be linear, meaning that each additional degree is associated with a roughly constant increase in the number of cyclists. Similar trends are observed for sun hours. With cloud cover, however, we observe a negative association.
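The idea of a linear relationship can be illustrated with a one-variable regression. The data below are invented and exactly linear, so the slope of 150 cyclists per degree is a property of the toy example, not an estimate for Munich.

```r
# Made-up, exactly linear toy data: 150 extra cyclists per degree Celsius.
toy <- data.frame(max.temp = c(0, 5, 10, 15, 20, 25))
toy$total <- 1000 + 150 * toy$max.temp

# The fitted slope is the expected change in cyclists per additional degree.
fit <- lm(total ~ max.temp, data = toy)
slope <- unname(coef(fit)["max.temp"])

# Under a linear model, a 5-degree increase changes activity by 5 * slope.
increase_5_degrees <- slope * 5
print(increase_5_degrees)
```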

At first glance, the amount of precipitation does not seem to have a notable effect on bicycle activity. This may be because Munich receives little rainfall on most days, so the trend is distorted by a few large rainfall outliers. Let’s restrict the observations to rainfall values up to the 95th percentile.
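Restricting the data to the 95th percentile can be done with `quantile()`. This is a sketch on made-up values; whether the original analysis used `<=` or `<` at the cutoff is not stated.

```r
# Made-up daily data: mostly dry days and one extreme rainfall outlier.
toy <- data.frame(
  precipitation = c(rep(0, 15), 1, 2, 3, 5, 80),
  total = c(rep(2500, 15), 2400, 2300, 2200, 2000, 2600)
)

# Keep only days whose rainfall does not exceed the 95th percentile.
cutoff <- quantile(toy$precipitation, 0.95)
trimmed <- toy[toy$precipitation <= cutoff, ]

nrow(trimmed)                # one extreme day is dropped
max(trimmed$precipitation)   # largest remaining rainfall value
```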

At lower values, there appears to be a slight negative trend. However, overall, the trend is weak.

Correlations

To conclude our overview of the dataset, let’s examine the correlations between the variables. This step can be particularly important for later modeling purposes.

We can observe that min.temp and max.temp are strongly correlated. Since we don’t want overly strong correlations between our features, we should consider removing one of these variables during modeling.
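One way to spot such pairs programmatically is to scan the upper triangle of the correlation matrix. The weather values below are made up, and the 0.9 threshold is an arbitrary choice for illustration.

```r
# Made-up weather values using the column names from the dataset.
weather <- data.frame(
  min.temp = c(-2, 0, 3, 7, 11, 14, 15, 10),
  max.temp = c(4, 7, 10, 15, 20, 24, 26, 18),
  sun_hours = c(1, 5, 2, 8, 6, 11, 9, 3)
)

# Pairwise Pearson correlations between all columns.
cm <- cor(weather)

# Each pair appears once in the upper triangle; flag those with |r| > 0.9.
high <- which(abs(cm) > 0.9 & upper.tri(cm), arr.ind = TRUE)
pairs <- data.frame(
  var1 = rownames(cm)[high[, "row"]],
  var2 = colnames(cm)[high[, "col"]],
  r = cm[high]
)
print(pairs)
```

In this toy example only the two temperature columns are flagged, so one of them would be a candidate for removal.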

Of particular interest is how strongly the different variables are associated with the volume of activity. The graph above shows that the Erhardt counting station has the strongest positive association with activity. Variables such as max.temp and sun_hours are also clearly positively associated. Conversely, measurements from the Kreuther counting station tend to show lower activity levels. As previously mentioned, precipitation appears to have no notable association with measured bicycle activity.
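To correlate a categorical variable such as the station with activity, it first has to be turned into indicator (dummy) columns. The following sketch uses `model.matrix()` on invented data; the long-format layout with a `station` column is an assumption.

```r
# Made-up daily totals in long format, one row per station and day.
toy <- data.frame(
  station = c("Erhardt", "Erhardt", "Kreuther", "Kreuther", "Olympia", "Olympia"),
  total = c(5000, 5500, 600, 800, 2000, 2400)
)

# One 0/1 indicator column per station (no intercept column).
dummies <- model.matrix(~ station - 1, data = toy)

# Correlation of each indicator with the daily total.
r <- cor(dummies, toy$total)[, 1]
print(round(r, 2))
```

A positive value means that rows from that station tend to have above-average counts, and a negative value means the opposite.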

Regression

A simple model for understanding the influencing factors is linear regression:

model <- lm(total ~ min.temp + precipitation + cloudiness + sun_hours + wday, data = data)

We can see that sun_hours has the greatest impact on daily activity, even greater than temperature. The sunnier it is in Munich, the more activity is measured at the stations.
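Raw coefficients are expressed in the units of each predictor, so comparing them directly can be misleading. One common option is to refit the model on standardized predictors. The sketch below does this on simulated data (not the Munich dataset); the effect sizes are invented, and treating `wday` as a factor is an assumption about how the weekday was encoded.

```r
set.seed(1)  # simulated data only, for illustration
n <- 200
data <- data.frame(
  min.temp = rnorm(n, 8, 7),
  precipitation = rexp(n, 0.5),
  cloudiness = runif(n, 0, 100),
  sun_hours = runif(n, 0, 14),
  wday = factor(sample(1:7, n, replace = TRUE))
)
# Invented effects used to generate the response.
data$total <- 2000 + 80 * data$min.temp - 30 * data$precipitation +
  150 * data$sun_hours + rnorm(n, 0, 300)

model <- lm(total ~ min.temp + precipitation + cloudiness + sun_hours + wday,
            data = data)

# Refit with z-scored numeric predictors: slopes are now per standard deviation.
num <- c("min.temp", "precipitation", "cloudiness", "sun_hours")
scaled <- data
scaled[num] <- lapply(scaled[num], function(x) as.numeric(scale(x)))
model_std <- lm(formula(model), data = scaled)

print(round(coef(model_std)[num], 1))
```

On the real data, `summary(model)` would additionally show standard errors and p-values for each coefficient.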

Interestingly, precipitation now appears to have a significant negative impact on activity. One possible reason this relationship was not apparent before is that precipitation is positively correlated with temperature, which itself increases activity, so the negative effect of rain was masked.
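This kind of masking can be reproduced on simulated data. In the sketch below, all numbers are invented and stylized (the simulated precipitation can even be negative); it only illustrates the mechanism and says nothing about the size of the effect in Munich.

```r
set.seed(42)  # simulated, stylized data; not the Munich dataset
n <- 2000
temp <- rnorm(n, 10, 5)
# Precipitation is constructed to be positively correlated with temperature.
precipitation <- 0.5 * temp + rnorm(n, 0, 2)
# Temperature raises activity; precipitation lowers it.
total <- 3000 + 100 * temp - 120 * precipitation + rnorm(n, 0, 100)

# Without temperature, the precipitation slope is close to zero ...
marginal <- unname(coef(lm(total ~ precipitation))["precipitation"])
# ... but once temperature is held fixed, the negative effect appears.
partial <- unname(coef(lm(total ~ precipitation + temp))["precipitation"])

print(round(c(marginal = marginal, partial = partial), 1))
```

The marginal slope mixes the direct effect of rain with the indirect effect of warmer days, while the multiple regression separates the two.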
