How Does a Bike-Share Navigate Speedy Success?

Google Data Analytics Professional Certificate Case Study

Thayalan Sundralingam
Aug 4, 2021

Hi there! I recently completed the Google Data Analytics Professional Certificate program on Coursera. As part of the course, students are required to complete a data analysis case study to showcase the skills they have learned, and this is my take on the project. For this case study, I used Microsoft SQL Server for data preparation and Tableau for data visualization and analysis.

Background

For this case study, I am a data analyst working for Cyclistic, a fictional bike-share company based in Chicago. Since its inception in 2016, Cyclistic has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system at any time.

Cyclistic has 3 flexible pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase single-ride or full-day passes are referred to as casual riders. Customers who purchase annual memberships are Cyclistic members.

To increase revenue, the company wants to develop a new marketing strategy to convert casual riders into annual members. I am therefore tasked with analysing bike usage data to understand how casual riders and annual members use Cyclistic bikes differently, to help create a data-driven marketing strategy.

Data Source

The dataset used in this case study is actual public data (view license) made available by Motivate International Inc., which operates the City of Chicago’s Divvy bicycle sharing service. The analysis uses data from April 2020 to May 2021, which is stored in a separate file for each month. The data contains the following columns:

Data Preparation

First, I imported all the data into a Microsoft SQL Server database using SSIS, to ensure that all the imported files have the correct data structure with no truncation errors. I also used a conditional split to filter out any rows in which the “started_at” or “ended_at” value is not correctly formatted as a datetime, as these two columns are crucial for the analysis. There were no errors in these two columns and all the rows were imported successfully.
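I did this check with an SSIS conditional split, but the same idea can be sketched in a few lines of Python. This is only an illustration, not the package I ran: the timestamp layout in `TIMESTAMP_FORMAT` and the sample rows are assumptions made for the example.

```python
from datetime import datetime

# Assumed timestamp layout; the real files may use a different one.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def is_valid_datetime(value):
    """Return True if value parses with the assumed timestamp format."""
    try:
        datetime.strptime(value, TIMESTAMP_FORMAT)
        return True
    except (TypeError, ValueError):
        # TypeError covers missing (None) values, ValueError covers bad text.
        return False


def split_rows(rows):
    """Split rows into (valid, rejected) based on started_at and ended_at."""
    valid, rejected = [], []
    for row in rows:
        ok = (is_valid_datetime(row.get("started_at"))
              and is_valid_datetime(row.get("ended_at")))
        (valid if ok else rejected).append(row)
    return valid, rejected


# Made-up sample rows, for illustration only.
sample = [
    {"started_at": "2020-04-26 17:45:14", "ended_at": "2020-04-26 18:12:03"},
    {"started_at": "not a date", "ended_at": "2020-04-26 18:12:03"},
]
valid, rejected = split_rows(sample)
```

Keeping the rejected rows in their own list, rather than silently dropping them, makes it easy to confirm afterwards that nothing was filtered out.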

Next, I combined the data for all 14 months into one table and created the following new columns:

Some notes about the data:

  1. The dataset contains a total of 4,358,611 rows.
  2. 208,637 rows have a trip duration of less than 3 minutes. The analysis only includes trips of 3 minutes or longer, to exclude rides that were probably started in error by the user.
  3. 201,975 rows have null values in the start_station_name column. These rows will still be used in the overall analysis but will be filtered out when analysing the most popular stations.
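The two filtering rules in the notes above can be expressed as a pair of small predicates. The sketch below is in Python rather than the SQL I actually used, and it assumes a timestamp layout and uses made-up sample rows; only the column names `started_at`, `ended_at` and `start_station_name` come from the dataset.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"           # assumed timestamp layout
MIN_DURATION = timedelta(minutes=3)  # threshold from note 2


def trip_duration(row):
    """Trip length as a timedelta, computed from started_at and ended_at."""
    start = datetime.strptime(row["started_at"], FMT)
    end = datetime.strptime(row["ended_at"], FMT)
    return end - start


def keep_for_analysis(row):
    """Note 2: keep only trips of 3 minutes or longer."""
    return trip_duration(row) >= MIN_DURATION


def keep_for_station_ranking(row):
    """Note 3: the station ranking also requires a start station name."""
    return keep_for_analysis(row) and bool(row.get("start_station_name"))


# Made-up sample rows, for illustration only.
sample = [
    {"started_at": "2020-07-04 10:00:00", "ended_at": "2020-07-04 10:01:30",
     "start_station_name": "Millennium Park"},   # too short
    {"started_at": "2020-07-04 10:00:00", "ended_at": "2020-07-04 10:25:00",
     "start_station_name": None},                # no station name
    {"started_at": "2020-07-04 10:00:00", "ended_at": "2020-07-04 10:03:00",
     "start_station_name": "Millennium Park"},   # exactly 3 minutes
]
analysis_rows = [r for r in sample if keep_for_analysis(r)]
station_rows = [r for r in sample if keep_for_station_ranking(r)]
```

A trip of exactly 3 minutes is kept, which matches the "3 minutes or longer" wording.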

Analysis

First, we can see an interesting trend: members made slightly more than half of all trips in the 14 months under review. However, the average trip duration of casual riders (45 minutes) is more than double that of members (17 minutes). This is possibly because members use the bikes just to get from point A to point B, while casual riders use them for leisure.
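I produced these figures in Tableau, but the underlying calculation is a simple group-by. Here is a minimal Python sketch of it; the field names `rider_type` and `duration_min` are hypothetical stand-ins for the columns in my table, and the sample trips are made up rather than taken from the dataset.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_rider_type(trips):
    """Return trip count, share of all trips and mean duration per rider type."""
    durations = defaultdict(list)
    for trip in trips:
        durations[trip["rider_type"]].append(trip["duration_min"])
    total = sum(len(values) for values in durations.values())
    return {
        rider: {
            "trips": len(values),
            "share": len(values) / total,
            "avg_duration_min": mean(values),
        }
        for rider, values in durations.items()
    }


# Made-up sample trips, for illustration only.
sample = [
    {"rider_type": "member", "duration_min": 10},
    {"rider_type": "member", "duration_min": 20},
    {"rider_type": "member", "duration_min": 15},
    {"rider_type": "casual", "duration_min": 40},
]
summary = summarize_by_rider_type(sample)
```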

Next, let’s look at the hourly usage trends. Here we can see that members’ usage has two peaks, the first around 8 a.m. and the second around 5 p.m., corresponding to the start and end of the workday. Casual riders, on the other hand, use the bikes increasingly from midday until the evening.
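For readers who want to reproduce the hourly view outside Tableau, the counting behind it can be sketched as below. It assumes a timestamp layout and a hypothetical `rider_type` field, and the sample trips are made up to mimic a two-peak pattern.

```python
from collections import Counter
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout


def hourly_counts(trips):
    """Count trips per (rider_type, start hour)."""
    counts = Counter()
    for trip in trips:
        hour = datetime.strptime(trip["started_at"], FMT).hour
        counts[(trip["rider_type"], hour)] += 1
    return counts


def peak_hours(counts, rider_type, n=2):
    """Return the n busiest start hours for one rider type."""
    per_hour = Counter({
        hour: count
        for (rider, hour), count in counts.items()
        if rider == rider_type
    })
    return [hour for hour, _ in per_hour.most_common(n)]


# Made-up sample trips, for illustration only.
sample = [
    {"rider_type": "member", "started_at": "2021-05-05 08:05:00"},
    {"rider_type": "member", "started_at": "2021-05-05 08:40:00"},
    {"rider_type": "member", "started_at": "2021-05-05 12:15:00"},
    {"rider_type": "member", "started_at": "2021-05-05 17:10:00"},
    {"rider_type": "member", "started_at": "2021-05-05 17:55:00"},
    {"rider_type": "casual", "started_at": "2021-05-05 14:30:00"},
]
counts = hourly_counts(sample)
member_peaks = sorted(peak_hours(counts, "member"))
```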

In terms of daily usage, the data shows that members’ usage remains fairly consistent throughout the week. Casual riders, however, use the bikes more during the weekend, with the number of trips on Saturday and Sunday even surpassing that of members. Their average trip duration does not change much throughout the week, though, with only a 10-minute difference between the peak on Sunday and the low on Wednesday.
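The day-of-week comparison combines a trip count with an average duration per day. A rough Python equivalent is sketched here, again with an assumed timestamp layout, hypothetical `rider_type` and `duration_min` fields, and made-up sample trips.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

FMT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def weekday_profile(trips, rider_type):
    """Return trip count and mean duration per day of week for one rider type."""
    durations = defaultdict(list)
    for trip in trips:
        if trip["rider_type"] != rider_type:
            continue
        started = datetime.strptime(trip["started_at"], FMT)
        # weekday() returns 0 for Monday through 6 for Sunday.
        durations[DAY_NAMES[started.weekday()]].append(trip["duration_min"])
    return {
        day: {"trips": len(values), "avg_duration_min": mean(values)}
        for day, values in durations.items()
    }


# Made-up sample trips, for illustration only.
sample = [
    {"rider_type": "casual", "started_at": "2021-05-01 11:00:00", "duration_min": 50},
    {"rider_type": "casual", "started_at": "2021-05-01 15:00:00", "duration_min": 40},
    {"rider_type": "casual", "started_at": "2021-05-05 18:00:00", "duration_min": 35},
    {"rider_type": "member", "started_at": "2021-05-05 08:00:00", "duration_min": 15},
]
casual_profile = weekday_profile(sample, "casual")
```

Indexing into `DAY_NAMES` instead of formatting the date with `%A` keeps the day labels independent of the machine's locale.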

In this monthly usage chart, we can see that both members and casual riders show a similar trend, with more trips made in the warmer months, peaking in August, and fewer trips during winter, with the fewest in February.

Finally, the following three charts show the most popular starting stations in Cyclistic’s network. It is interesting to note that Streeter Dr & Grand Ave, Lake Shore Dr & Monroe St and Millennium Park are significantly more popular among casual riders than they are among members. The fact that these stations are located in a tourist area overlooking Lake Michigan could be a reason behind this observation.
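The station ranking itself is just a frequency count of start stations, skipping the rows where the station name is missing. The Python sketch below shows the idea; `rider_type` is a hypothetical field name and the sample trips are made up, so the counts say nothing about the real data.

```python
from collections import Counter


def top_start_stations(trips, rider_type=None, n=10):
    """Return the n most common start stations, optionally for one rider type.

    Rows without a start_station_name are skipped.
    """
    counts = Counter(
        trip["start_station_name"]
        for trip in trips
        if trip.get("start_station_name")
        and (rider_type is None or trip["rider_type"] == rider_type)
    )
    return counts.most_common(n)


# Made-up sample trips, for illustration only.
sample = [
    {"rider_type": "casual", "start_station_name": "Streeter Dr & Grand Ave"},
    {"rider_type": "casual", "start_station_name": "Streeter Dr & Grand Ave"},
    {"rider_type": "casual", "start_station_name": "Millennium Park"},
    {"rider_type": "casual", "start_station_name": None},
    {"rider_type": "member", "start_station_name": "Millennium Park"},
]
top_casual = top_start_stations(sample, rider_type="casual", n=1)
top_overall = top_start_stations(sample)
```

Calling the function once per rider type and once with no filter gives the three rankings needed for a side-by-side comparison.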

View the dashboard for this case study at my Tableau Public profile here.

Conclusion

To guide the marketing campaign to convert casual riders into annual members, we now have some data-driven insights into how casual riders and annual members use Cyclistic bikes differently. The key findings and my recommendations for the marketing campaign are as follows:

1. Casual riders take longer trips, averaging 45 minutes per trip compared with only 17 minutes for members.

  • Use this statistic to show casual riders how they could save more money in the long run by becoming a member instead of paying for rides based on trip duration.
  • Introduce a member-only rewards program based on trip duration to incentivize casual riders to sign up as members and become eligible for the rewards.

2. Casual riders prefer to use Cyclistic bikes on weekends, when the number of users is almost twice that in the middle of the week.

  • Develop a weekend membership plan whereby rides on the weekends are included in the base price while members have the option to book weekday rides at a lower rate.

3. Bicycles in tourist areas are more likely to be used by casual riders than members.

  • Develop partnerships with the Chicago tourism department or businesses at these locations to offer promotions for Cyclistic members.
  • Produce advertisements targeting users who frequent stations located in tourist areas.
