Cyclistic bike-share analysis case study

Julio Vargas
6 min readDec 13, 2022

--

Cyclistic is a bike-share program that features more than 5,800 bicycles and 600 docking stations. Cyclistic sets itself apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use the assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use them to commute to work each day.

Scenario

I am a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, our team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, our team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve our recommendations, so they must be backed up with compelling data insights and professional data visualizations.

Lily Moreno is the director of marketing and our manager. Moreno is responsible for the development of campaigns and initiatives to promote the bike-share program. These may include email, social media, and other channels.

Moreno has set a clear goal: Design marketing strategies aimed at converting casual riders into annual members. In order to do that, however, the marketing analyst team needs to better understand how annual members and casual riders differ, why casual riders would buy a membership, and how digital media could affect their marketing tactics. Moreno and her team are interested in analyzing the Cyclistic historical bike trip data to identify trends.

In order to answer the key business questions, I will follow the steps of the data analysis process: ask, prepare, process, analyze, share, and act.

Ask

Three questions will guide the future marketing program:

  1. How do annual members and casual riders use Cyclistic bikes differently?
  2. Why would casual riders buy Cyclistic annual memberships?
  3. How can Cyclistic use digital media to influence casual riders to become members?

Moreno has assigned me the first question to answer: How do annual members and casual riders use Cyclistic bikes differently?

Prepare

  • Downloaded the previous 12 months of Cyclistic trip data from October 2021 to September 2022 here. (Note: The datasets have a different name because Cyclistic is a fictional company. The data has been made available by Motivate International Inc. under this license.)
  • Created a copy of the original data to organize and rename the CSV files in different folders.

Process

  • Excel was chosen to clean, transform, and check data errors in each file to be analyzed.
  • Changelog:
  • Summarize after clean null and negative values on the ride_length column:

Analyze

Microsoft SQL Server was chosen to perform the analysis of the data.

Followings steps:

  • Import data.
  • Merge all tables to analyze in a single data frame.
  • Create, and explore different statements using this query.
-- Union ALL
SELECT * INTO tripdata_12months
FROM (SELECT *
FROM tripdata_202110
UNION ALL
SELECT *
FROM tripdata_202111
UNION ALL
SELECT *
FROM tripdata_202112
UNION ALL
SELECT *
FROM tripdata_202201
UNION ALL
SELECT *
FROM tripdata_202202
UNION ALL
SELECT *
FROM tripdata_202203
UNION ALL
SELECT *
FROM tripdata_202204
UNION ALL
SELECT *
FROM tripdata_202205
UNION ALL
SELECT *
FROM tripdata_202206
UNION ALL
SELECT *
FROM tripdata_202207
UNION ALL
SELECT *
FROM tripdata_202208
UNION ALL
SELECT *
FROM tripdata_202209)
AS temp;


-- Total rides/minutes per user
SELECT member_casual,
COUNT(member_casual) AS total_rides,
SUM(DATEDIFF(MINUTE, started_at, ended_at)) AS total_minutes
FROM tripdata_12months
GROUP BY member_casual;

-- Rides / minutes by bike type
SELECT rideable_type,
COUNT(rideable_type) AS total_rides,
SUM(DATEDIFF(MINUTE, started_at, ended_at)) AS total_minutes
FROM tripdata_12months
GROUP BY rideable_type;

-- Rides / Minutes bike type member
SELECT rideable_type,
COUNT(member_casual) AS rides_member,
SUM(DATEDIFF(MINUTE, started_at, ended_at)) AS minutes_member
FROM tripdata_12months
WHERE member_casual = 'member'
GROUP BY rideable_type;

-- Rides / Minutes by bike type casual
SELECT rideable_type,
COUNT(member_casual) AS rides_casual,
SUM(DATEDIFF(MINUTE, started_at, ended_at)) AS minutes_casual
FROM tripdata_12months
WHERE member_casual = 'casual'
GROUP BY rideable_type;

-- MODE by day_of_week
SELECT day_of_week,
COUNT(day_of_week) AS rides_total,
SUM(DATEDIFF(MINUTE, started_at, ended_at)) AS minutes_total
FROM tripdata_12months
GROUP BY day_of_week
ORDER BY
CASE
WHEN day_of_week = 'Sunday' THEN 1
WHEN day_of_week = 'Monday' THEN 2
WHEN day_of_week = 'Tuesday' THEN 3
WHEN day_of_week = 'Wednesday' THEN 4
WHEN day_of_week = 'Thursday' THEN 5
WHEN day_of_week = 'Friday' THEN 6
WHEN day_of_week = 'Saturday' THEN 7
END ASC;

-- MODE rides / minutes day_of_week member
SELECT day_of_week,
COUNT(member_casual) AS rides_member,
SUM(DATEDIFF(MINUTE, started_at, ended_at)) AS minutes_member
FROM tripdata_12months
WHERE member_casual = 'member'
GROUP BY day_of_week
ORDER BY
CASE
WHEN day_of_week = 'Sunday' THEN 1
WHEN day_of_week = 'Monday' THEN 2
WHEN day_of_week = 'Tuesday' THEN 3
WHEN day_of_week = 'Wednesday' THEN 4
WHEN day_of_week = 'Thursday' THEN 5
WHEN day_of_week = 'Friday' THEN 6
WHEN day_of_week = 'Saturday' THEN 7
END ASC;

-- MODE rides / minutes day_of_week casual
SELECT day_of_week,
COUNT(member_casual) AS rides_casual,
SUM(DATEDIFF(MINUTE, started_at, ended_at)) AS minutes_casual
FROM tripdata_12months
WHERE member_casual = 'casual'
GROUP BY day_of_week
ORDER BY
CASE
WHEN day_of_week = 'Sunday' THEN 1
WHEN day_of_week = 'Monday' THEN 2
WHEN day_of_week = 'Tuesday' THEN 3
WHEN day_of_week = 'Wednesday' THEN 4
WHEN day_of_week = 'Thursday' THEN 5
WHEN day_of_week = 'Friday' THEN 6
WHEN day_of_week = 'Saturday' THEN 7
END ASC;

-- Rides / minutes by months
SELECT months,
COUNT(months) AS rides_total,
SUM(DATEDIFF(MINUTE, started_at, ended_at)) AS minutes_total
FROM tripdata_12months
GROUP BY months
ORDER BY
CASE
WHEN months = 'January' THEN 1
WHEN months = 'February' THEN 2
WHEN months = 'March' THEN 3
WHEN months = 'April' THEN 4
WHEN months = 'May' THEN 5
WHEN months = 'June' THEN 6
WHEN months = 'July' THEN 7
WHEN months = 'August' THEN 8
WHEN months = 'September' THEN 9
WHEN months = 'October' THEN 10
WHEN months = 'November' THEN 11
WHEN months = 'December' THEN 12
END ASC;



-- Rides / minutes by months member
SELECT months,
COUNT(member_casual) AS rides_member,
SUM(DATEDIFF(MINUTE, started_at, ended_at)) AS minutes_member
FROM tripdata_12months
WHERE member_casual = 'member'
GROUP BY months
ORDER BY
CASE
WHEN months = 'January' THEN 1
WHEN months = 'February' THEN 2
WHEN months = 'March' THEN 3
WHEN months = 'April' THEN 4
WHEN months = 'May' THEN 5
WHEN months = 'June' THEN 6
WHEN months = 'July' THEN 7
WHEN months = 'August' THEN 8
WHEN months = 'September' THEN 9
WHEN months = 'October' THEN 10
WHEN months = 'November' THEN 11
WHEN months = 'December' THEN 12
END ASC;

-- Rides / minutes by months casual
SELECT months,
COUNT(member_casual) AS rides_casual,
SUM(DATEDIFF(MINUTE, started_at, ended_at)) AS minutes_casual
FROM tripdata_12months
WHERE member_casual = 'casual'
GROUP BY months
ORDER BY
CASE
WHEN months = 'January' THEN 1
WHEN months = 'February' THEN 2
WHEN months = 'March' THEN 3
WHEN months = 'April' THEN 4
WHEN months = 'May' THEN 5
WHEN months = 'June' THEN 6
WHEN months = 'July' THEN 7
WHEN months = 'August' THEN 8
WHEN months = 'September' THEN 9
WHEN months = 'October' THEN 10
WHEN months = 'November' THEN 11
WHEN months = 'December' THEN 12
END ASC;

-- MAX ride per user
SELECT member_casual,
MAX(DATEDIFF(MINUTE, started_at, ended_at)) AS max_minutes
FROM tripdata_12months
GROUP BY member_casual;

-- AVG minutes per user
SELECT member_casual,
AVG(DATEDIFF(MINUTE, started_at, ended_at)) AS avg_minutes
FROM tripdata_12months
GROUP BY member_casual;


-- AVG minutes per user statistic
SELECT member_casual,
AVG(DATEDIFF(MINUTE, started_at, ended_at)) AS avg_minutes
FROM tripdata_12months
WHERE DATEDIFF(MINUTE, started_at, ended_at) >= 1
AND DATEDIFF(MINUTE, started_at, ended_at) <= 1440
GROUP BY member_casual;

Share

PowerBI was chosen to create the data visualization and share the key findings.

The analysis was focused on 2 areas rides and, minutes.

Rides:

Minutes:

Full presentation here.

Act

Conclusions based on my analysis are the following:

  • Casual riders are more active on weekends, the preferred bike is the electric one, and spend more minutes per ride than de member rider.
  • Both users match in seasonal months from the end of spring to mid-autumn.
  • Digital media can be used to influence Casual user to become Members users if marketing focus on Weekend and seasons from spring to autumn

--

--

No responses yet