Q#59 Chipotle total revenue per item
Suppose you are given a data set of Chipotle orders. Using these data, can you make a plot showing total revenue per menu item?
TRY IT YOURSELF
This question tests our ability to wrangle data and plot it using Python, specifically with the data-science-friendly package Pandas.
The first step is to look at the data you are about to use and identify two potential problems. First, each price includes a '$' sign, so the column will load as an object (string) column rather than as a number. Second, the data is tab-separated. So, let's load the data using the standard .read_csv() function in Pandas, with the extra argument sep='\t' for tab separation. Next, let's address the '$' sign issue in the item_price column. We can use the .str.replace() method with the arguments '$' and '' (an empty string) and then chain it with .astype('float') to make the column numeric.
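import pandas as pd

# The file is tab-separated, so pass sep='\t'
df = pd.read_csv('https://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv', sep='\t')
# Strip the literal '$' (regex=False keeps it from being treated as a regex anchor), then convert to float
df['item_price'] = df['item_price'].str.replace('$', '', regex=False).astype('float')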
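To see the cleaning step in isolation, here is a minimal sketch on a small in-memory sample. The three rows and the names sample and sample_df are made up for illustration and are not taken from the real file; only the column layout is assumed to match it. The regex=False argument is there because '$' is also a regular-expression anchor, and we want it treated as a plain character.

```python
import io
import pandas as pd

# Hypothetical three-row sample in a tab-separated layout (not real data).
sample = io.StringIO(
    "order_id\tquantity\titem_name\titem_price\n"
    "1\t1\tChips\t$2.39\n"
    "1\t2\tChicken Bowl\t$16.98\n"
    "2\t1\tChips\t$2.39\n"
)
sample_df = pd.read_csv(sample, sep="\t")

# Before cleaning, the prices are strings such as '$2.39'.
print(sample_df["item_price"].tolist())

# Remove the literal '$' and convert the column to floats.
sample_df["item_price"] = (
    sample_df["item_price"]
    .str.replace("$", "", regex=False)
    .astype(float)
)
print(sample_df["item_price"].dtype)
```

Checking the dtype after the conversion is a quick way to confirm that the column is ready to be summed.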
Now we are ready to create the plot. To do this, we can use the .groupby() method to group by item_name, take the sum of item_price within each group with .sum(), and sort the result with .sort_values(). Finally, let's chain this with .plot() and the argument kind='barh' for a horizontal bar plot, and we are done!
df.groupby('item_name').item_price.sum().sort_values().plot(kind = 'barh', figsize = (12,15))
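Before plotting, it can help to inspect the aggregated values themselves. The sketch below runs the same group, sum, and sort chain on a toy table; the rows and the names toy and revenue are hypothetical and chosen only so the totals are easy to verify by hand.

```python
import pandas as pd

# Hypothetical toy orders (not the real Chipotle data); prices are already numeric.
toy = pd.DataFrame({
    "item_name": ["Chips", "Chicken Bowl", "Chips", "Steak Burrito"],
    "item_price": [2.0, 17.0, 2.0, 9.0],
})

# Total revenue per item, sorted from smallest to largest.
revenue = toy.groupby("item_name")["item_price"].sum().sort_values()
print(revenue)
```

Because a horizontal bar plot draws the first entry at the bottom, the ascending sort puts the highest-revenue item at the top of the chart.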