Cyclistic: How does a bike-share navigate speedy success?

A detailed analysis on Cyclistic, a bike-share program with the aim to generate insights and make informed decisions in order to maximize profitability using Microsoft Excel, SQL and Tableau for the analysis.

Lanre Akinbo
6 min readOct 23, 2022

Introduction

The framework for this project was designed and given as a case study for the Google Data Analytics Professional Certificate. I was introduced to Cyclistic, a bike-share program based in Chicago where I was assigned the role of a junior data analyst in the marketing analytics team. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, the marketing team needs to understand how casual riders and annual members use Cyclistic bikes differently in order to generate insights and design a new marketing strategy to convert casual riders into annual members.

The Company

Cyclistic is a successful bike-share offering that was launched in 2016. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime. The company has a versatile customer fleet on various pricing plans. Customers who purchase single-ride or full-day passes are referred to as casual riders and customers who purchase annual memberships are Cyclistic members.

iStock.com/SERCAN ERTÜRK

The Dilemma

Cyclistic’s finance analysts have concluded that annual memberships are much more profitable than that of casual riders and therefore arises the need to maximize the number of annual members. The goal is to design marketing strategies aimed at converting casual riders into annual members. In order to do that, the marketing analytics team needs to better understand how annual members and casual riders differ, why casual riders would buy a membership, and how digital media could affect their marketing tactics.

My Role

I am a junior data analyst in the marketing analytics team and was assigned a question to answer: How do annual members and casual riders use Cyclistic bikes differently? In order to answer this question, I had to analyze and identify trends in the previous 12 months of Cyclistic bike trip data and draw inferences from them. The above data spanned from September 2021 to August 2022. The dataset was made available by Motivate International Inc. under this license.

Data Preparation

I downloaded the 12 datasets for September 2021 to August 2022 and stored them with appropriate name conventions in a folder on my Local Disk. The datasets were zipped files in Microsoft Excel Comma Separated Values (.csv) formats. I unzipped the folders, opened the individual datasets and had a glance at each in order to get familiar with the data I’ll be interacting with for the project.

Data Cleaning and Manipulation

I used Microsoft Excel to carry out some preliminary cleaning such as removing duplicate records but I noticed the run time was a tad slow due to the large number of rows I was processing. Therefore, I had to import the data into MySQL server using the load infile function for better performance. I made use of both tools in order to complete the data cleaning process; a number of processes which involved:

· changing some column names and formats to better suit my clarity and comprehension.

· calculating ride length into a new column by subtracting start time from end time and deleting records that returned zero or negative values.

· creating separate columns to show start day, start hour of day, month and year.

Analysis

I used MySQL Workbench to carry out my analysis on the project. I applied the GROUP BY, COUNT, DISTINCT, SUM and AVG functions on various columns to summarize the data for each month and used UNIONS to combine them into one table since there were no unique identifiers after my summaries. These processes reduced about 6 million records to 10,029 rows. I exported the table into an Excel Workbook and used pivot tables to summarize the data. It was done alongside using Tableau to summarize the data even further. The analysis was carried out based on:

Number of Rides

There were 5.9 million rides in total of which casual riders had 2.5 million rides which amounted to 41.96% while annual members had 3.4 million rides which is equivalent to 58.04% of total rides.

Ride Length

Cyclistic users rode for an average of 20 minutes per ride. Casual riders rode for an average of 29 minutes while annual members rode for an average of 13 minutes. Therefore, casual riders rode for about 16 minutes longer than annual members.

Ride Type

Casual riders used electric bikes as their preferred ride type although they were used for the shortest average trip length. Docked bikes were used for longer trips by casual riders and were also the least preferred ride type. On the other hand, annual members used classic bikes as their preferred ride type and also used them for longer duration than other ride types. Docked bikes were not used by members at all.

Rides By Hour of Day

There was a relatively steady increase in the number of rides for both casual riders and annual members as the hours of the day went by and it began to fall more exponentially after it peaked. Peak demand for casual riders and annual members was between 4pm and 6pm.

Rides By Weekday

Casual riders demand was relatively low from Monday to Friday but there was an increased demand on weekends and had its peak on Saturdays. Meanwhile, members rides were higher and relatively stable from Monday to Friday. There was a slight decrease in demand on weekends especially on Sundays. Peak demand for members was on Wednesdays. From this observation, it seems casual riders are more inclined towards leisure and recreational activities on weekends while annual members have more routine patterns on work days.

Most Active Start Stations

Streeter Dr & Grand Ave was found out to be the most active start station for casual riders while Kingsbury St & Kinzie St was the most active start station for annual members.

Data Visualization

I used Tableau to create my dashboard and also borrowed a little help from my Adobe Illustrator skills to add some elements to the background of my dashboard.

You can interact with my dashboard here (use Desktop site).

Recommendations

After completing my analysis and generating key findings, here are the recommendations I am presenting to the marketing analytics team:

1. Prices for single-ride and full-day passes on weekends should be increased. Also, a trip duration cap on single-ride passes should be introduced; for example restricting them to just 24 hours per ride whereby longer rides become exclusive to annual members.

2. Priority access for members should be introduced during peak hours (4pm — 6pm) to have more bikes readily available to members over casual riders during rush hours.

3. Paid print and digital media should be employed at the most active start stations and its environs with greater emphasis on weekends. Leisure and recreational parks should also be targeted during the marketing campaign.

Thanks for taking the time out to look at my work. Constructive feedback will be appreciated.

Connect with me on LinkedIn here and check out the Github repository for the SQL code here. You can also access the dataset used for this project here.

--

--

Lanre Akinbo

Data enthusiast here! I'm Lanre, a curious explorer of the countless ways data illuminates our world.