Uber Trips Visualization

Rupal Bhatt
5 min read · Apr 16, 2018

Uber has changed the taxi market and has become a favourite with riders. The cabs arrive promptly and are more comfortable and convenient than regular taxis. However, customers are sometimes left unsatisfied when there are not enough vehicles. It is very difficult to forecast the number of riders at a given point in time, and it gets even more complicated when the weather is uncooperative.

Predicting how many Uber vehicles will be needed in one place at a given time is not an easy task. The demand for Uber services may depend on several factors, including the weather, the time of day and the day of the week.

The purpose of this project is to analyze ridership and the effect of weather on it.

Here we have data for Uber trips in New York City for April 2014. The original data comes in two files. The first, a .csv file, covers the Uber rides.

This file has only four fields: Date/Time, Lat (latitude), Lon (longitude) and Base.
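For readers who want to inspect the same file outside Tableau, here is a minimal Python sketch of reading those four fields. The sample rows, the timestamp format and the `load_trips` helper are assumptions made for illustration; the real file may write its dates differently.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows; only the four column names come from the article.
SAMPLE_CSV = """Date/Time,Lat,Lon,Base
4/1/2014 0:11:00,40.7690,-73.9549,B02512
4/1/2014 0:17:00,40.7267,-74.0345,B02512
4/2/2014 8:05:00,40.6449,-73.7822,B02598
"""

def load_trips(handle, time_format="%m/%d/%Y %H:%M:%S"):
    """Parse trip rows into dicts with typed values (time_format is an assumption)."""
    trips = []
    for row in csv.DictReader(handle):
        trips.append({
            "when": datetime.strptime(row["Date/Time"], time_format),
            "lat": float(row["Lat"]),
            "lon": float(row["Lon"]),
            "base": row["Base"],
        })
    return trips

trips = load_trips(io.StringIO(SAMPLE_CSV))
print(len(trips), trips[0]["when"])
```

To read a real file, pass an open file handle in place of the `io.StringIO` sample.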

The second file is New York City weather data, stored as an Excel file.

This file has several columns describing weather conditions such as temperature, humidity, sea level pressure and visibility, and events such as rain, thunderstorms and fog.

Data Preprocessing

The Uber rides file has data for April only, but the original weather file has data for April, May, June and July. Thus, our first job is to trim the weather data down to the month of April.
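The same trimming step can be sketched in a few lines of Python. The rows, the `high_temp` field and the `keep_month` helper below are hypothetical and only illustrate the filter; they are not taken from the actual weather file.

```python
from datetime import date

# Hypothetical weather rows spanning several months; the values are made up.
weather_rows = [
    {"day": date(2014, 4, 1), "high_temp": 14.0},
    {"day": date(2014, 4, 30), "high_temp": 11.0},
    {"day": date(2014, 5, 1), "high_temp": 19.0},
    {"day": date(2014, 7, 4), "high_temp": 26.0},
]

def keep_month(rows, year, month):
    """Return only the rows whose date falls in the given year and month."""
    return [r for r in rows if r["day"].year == year and r["day"].month == month]

april_weather = keep_month(weather_rows, 2014, 4)
print(len(april_weather), "of", len(weather_rows), "sample rows fall in April")
```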

The date formats in the two files were different, so the dates were copied into a separate field and modified. This was the only modification required in the original file, and it was done in Excel. The rest of the work, joining the tables and slicing the columns, can easily be done in Tableau.

Once we have the file with April weather data only, we can start working in Tableau.

The first step in Tableau is to import the files we need. Here is a snapshot of the Uber file imported into Tableau.

And here is the weather data file in Tableau.

As you may notice, there are two extra columns here: #Calculation Month and #Calculation Day_April. These new columns were created in Tableau to deal with the difference in date settings between the tables.
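The idea behind those two calculated columns, reducing each table's date to a month and a day so the rows can be matched, can be sketched in Python as follows. Both date formats and the `month_day_key` helper are assumptions, since the article does not show how either file writes its dates.

```python
from datetime import datetime

def month_day_key(text, fmt):
    """Parse a date string in the given format and return a (month, day) join key."""
    parsed = datetime.strptime(text, fmt)
    return (parsed.month, parsed.day)

# Assumed formats: one timestamp style for the rides, one date style for the weather.
uber_key = month_day_key("4/4/2014 17:30:00", "%m/%d/%Y %H:%M:%S")
weather_key = month_day_key("2014-04-04", "%Y-%m-%d")

print(uber_key, weather_key, uber_key == weather_key)
```

Once both tables carry the same key, rows from one can be looked up in the other regardless of how the original dates were written.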

The next step is to prepare the sheet in Tableau.

The sheet below shows the relationship between Uber rides and temperature. The red line shows the high temperature for each day and the blue bars show the number of Uber rides for that day. New York City can have extreme weather conditions, and during such events the number of requests for Uber rides is much higher than normal. As you can observe from the chart below, when the temperature on April 4th is lower than usual, there is higher demand for Uber services.
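The table behind such a chart is a count of rides per day placed next to that day's high temperature. Here is a minimal Python sketch of that aggregation; the timestamps, temperatures and the `rides_with_temperature` helper are hypothetical and do not reproduce the real April figures.

```python
from collections import Counter
from datetime import datetime

# Hypothetical trip timestamps and daily highs; none of these come from the real files.
trip_times = [
    datetime(2014, 4, 3, 8, 15),
    datetime(2014, 4, 4, 7, 50),
    datetime(2014, 4, 4, 18, 5),
    datetime(2014, 4, 4, 22, 40),
    datetime(2014, 4, 5, 9, 0),
]
daily_high = {(4, 3): 16.0, (4, 4): 9.0, (4, 5): 15.0}

def rides_with_temperature(times, highs):
    """Count rides per (month, day) and pair each count with that day's high."""
    counts = Counter((t.month, t.day) for t in times)
    return [(key, counts[key], highs[key]) for key in sorted(counts) if key in highs]

table = rides_with_temperature(trip_times, daily_high)
for (month, day), rides, high in table:
    print(f"{month}/{day}: {rides} rides, high {high}")
```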

Tableau has an Analytics tab that puts several analytics tools at your fingertips.

The sheet below uses the Cluster tool. It shows clusters of areas where demand is higher than in the rest of the city; these clusters show more activity and many more requests for rides. Airports are, unsurprisingly, included in these clusters. You will observe that more clearly in the story part of this article. You can also use the Forecast tool to forecast changes in demand.
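Tableau's documentation describes its clustering as based on k-means. To give a feel for what that does with pickup coordinates, here is a plain k-means sketch in Python. It is not Tableau's implementation: the sample points, the seeding from the first k points and the use of squared distance on raw latitude and longitude are all simplifying assumptions.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on (lat, lon) pairs; the first k points seed the centroids."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2 + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, labels

# Hypothetical pickups: three near JFK airport, three in Midtown Manhattan.
pickups = [
    (40.6413, -73.7781), (40.7580, -73.9855), (40.6450, -73.7800),
    (40.7600, -73.9840), (40.6430, -73.7790), (40.7570, -73.9860),
]
centroids, labels = kmeans(pickups, k=2)
print(labels)
```

With real data, a cluster that sits on an airport would show up as a centroid near the airport's coordinates with many member points.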

It is easy to pull different sheets and dashboards together into one with Tableau's story feature.

The story dashboard below is one such dashboard. Here you can see the clusters where there is more demand for rides. You can also see the days of the week when demand is high, plus the variation in demand for Uber services in response to changes in the weather. As expected, people in New York City prefer to take Uber when the weather gets cooler.

Data Analysis:

It’s time to analyze what we have found so far with the given set of data. When the temperature hovers around 18 or higher and there is no rain, fog or other such event, the demand for Uber rides goes down. Certain days and times of day create more demand for rides. Certain clusters, such as airports, have constantly high demand. It appears that in April 2014, Tuesdays were particularly busy. This could be due to some weather event or some other factor, and needs further investigation.
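One simple check for the Tuesday question is whether the month simply contained more Tuesdays than other weekdays. April 2014 began on a Tuesday, so Tuesdays and Wednesdays each occur five times while the other weekdays occur four times. The Python sketch below counts the occurrences and divides monthly totals by them; the ride totals and both helper names are hypothetical, and this is only one possible factor, not a finding from the data.

```python
import calendar
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

def weekday_occurrences(year, month):
    """Count how many times each weekday occurs in the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return Counter(
        WEEKDAYS[date(year, month, d).weekday()]
        for d in range(1, days_in_month + 1)
    )

def average_per_weekday(ride_totals, occurrences):
    """Divide each weekday's monthly ride total by the number of such days."""
    return {day: ride_totals[day] / occurrences[day] for day in ride_totals}

occurrences = weekday_occurrences(2014, 4)
# Hypothetical monthly totals, chosen only to show the normalization.
totals = {"Monday": 4000, "Tuesday": 5000}
averages = average_per_weekday(totals, occurrences)
print(occurrences["Tuesday"], occurrences["Monday"], averages)
```

In this made-up example the Tuesday total is higher, but the per-day average is the same for both weekdays, which is why comparing averages is a fairer test than comparing monthly totals.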

Conclusion:

Tableau is data visualization software that lets you see and understand data in minutes. It is a professional tool for spotting trends and outliers. You can set up filters and isolate areas that require special focus. You can put different sheets together on a storyboard and visualize the story behind the data.
