Humble Bumble — Data Analyst Interview Challenge for Data Queens🐝🍯

Humble Bumble — Data Analyst Interview Challenge using Python, Pandas, and Matplotlib 🐝🍯

Code with Corgis

Dear 👋💻🌎 Data Queen,

MAKING LEARNING HOW TO CODE

✧・゚:* CUTE(◕‿◕✿) and INFORMATIVEᕙ(⇀‸↼‶)ᕗ!!!

Question 1: Count the number of unique words.

Question 1: Please complete the function skeleton below so that, given a string s, it counts how many times each unique word appears, ignoring case and punctuation.

  • The answer should be printed in alphabetical order.
  • No libraries outside of the Python standard library can be used (i.e., no pandas, no sklearn, no nltk, etc.).

Example:

“I’m smart, I’m educated. It would have been a disservice to every woman to go away or hide.” — Whitney Wolfe Founder of Bumble

Input: "I'm smart I'm educated. It would have been a disservice to every woman to go away or hide."
Output:
[
('a', 1),
('away', 1),
('been', 1),
('disservice', 1),
('educated', 1),
('every', 1),
('go', 1),
('have', 1),
('hide', 1),
("i'm", 2),
('it', 1),
('or', 1),
('smart', 1),
('to', 2),
('woman', 1),
('would', 1)
]

Code:

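punctuations = [',', '.', '!', '"', '?']

def word_count(s):
    # Lowercase everything so the count is case insensitive
    sentence = s.lower()
    # Reassign to the same variable so every punctuation mark is removed,
    # not only the last one in the list
    for punctuation in punctuations:
        sentence = sentence.replace(punctuation, '')
    word_list = sentence.split()
    word_dict = {word: word_list.count(word) for word in word_list}
    # Sorting the (word, count) pairs puts them in alphabetical order
    return sorted(word_dict.items())
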
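As a cross-check, here is a minimal sketch of the same idea using collections.Counter from the standard library. The name word_count_counter and the PUNCTUATION constant are made up for this illustration and are not part of the original challenge; the apostrophe is deliberately left out of the punctuation set so that "i'm" stays one word, as in the expected output.

```python
from collections import Counter

# Hypothetical constant: the same five marks the original list strips.
PUNCTUATION = ',.!?"'

def word_count_counter(s):
    # Build a translation table that deletes every punctuation mark.
    table = str.maketrans('', '', PUNCTUATION)
    # Lowercase, strip punctuation, then split on whitespace.
    words = s.lower().translate(table).split()
    # Counter tallies each word in one pass; sorted() orders the pairs by word.
    return sorted(Counter(words).items())

example = "I'm smart I'm educated. It would have been a disservice to every woman to go away or hide."
result = word_count_counter(example)
print(result)
```

Counter walks the word list once, while calling word_list.count(word) inside a comprehension rescans the list for every word. For a single sentence the difference does not matter, but it is a nice thing to mention in an interview.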
Question 2: Given a pandas DataFrame, calculate the ratio of messages sent to messages received.

Question 2: Using the given pandas dataframe, please calculate the ratio of messages sent to messages received (messages_sent / messages_received) split by country and gender, and visualize this in a way that is easy to digest and understand. Please use any libraries you wish.

Step 1: Cleaning the NaNs with Zeros

# Filling NaNs with zeros
messages_df = messages_df.fillna(0)
messages_df

Step 2: Creating a grouped table for Messages Received

Code:

# Creating a grouped table for messages received
total_messages_received_df = (
    messages_df.groupby(['country', 'gender'])['messages_received']
    .sum()            # total messages received per country and gender
    .reset_index()    # turn the group keys back into regular columns
)
total_messages_received_df

Step 3: Creating a grouped table for Messages Sent

Code:

# Creating a grouped table for messages sent
total_messages_sent_df = (
    messages_df.groupby(['country', 'gender'])['messages_sent']
    .sum()            # total messages sent per country and gender
    .reset_index()    # turn the group keys back into regular columns
)
total_messages_sent_df

Step 4: Merging the 2 Tables

Code:

# Merging the two tables on their shared keys
import pandas as pd

bumble_df = pd.merge(total_messages_received_df, total_messages_sent_df, how = 'outer', on = ['country', 'gender'])
bumble_df
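The two grouped tables and the merge can also be collapsed into a single groupby that sums both columns at once. The sketch below runs on a small made-up DataFrame (the values are hypothetical, not the challenge data) and only assumes the column names used above.

```python
import pandas as pd

# Hypothetical toy data; the real messages_df comes from the challenge.
messages_df = pd.DataFrame({
    "country": ["FR", "FR", "UK", "UK", "FR"],
    "gender": ["M", "F", "M", "F", "M"],
    "messages_sent": [10, 2, 3, 4, 5],
    "messages_received": [1, 8, 6, 4, 2],
})

# Sum both message columns in one groupby, so no merge is needed.
bumble_df = (
    messages_df.groupby(["country", "gender"])[["messages_received", "messages_sent"]]
    .sum()
    .reset_index()
)
print(bumble_df)
```

Both approaches should give the same table. The single groupby just has fewer moving parts, while the merge version makes each intermediate table easy to inspect.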

Step 5: Calculating Message Ratio

message ratio = (messages sent) / (messages received), multiplied by 100 to express it as a percentage

# Calculating the message ratio as a percentage
bumble_df['messages_ratio'] = (bumble_df['messages_sent'] / bumble_df['messages_received']) * 100
bumble_df
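One thing to watch: because the NaNs were filled with zeros in Step 1, a group that received no messages at all would make the division return inf. A minimal sketch of one way to guard against that, using a hypothetical toy table (toy_df and its values are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical aggregated totals; the second group received no messages.
toy_df = pd.DataFrame({
    "country": ["FR", "UK"],
    "gender": ["M", "F"],
    "messages_sent": [30, 5],
    "messages_received": [10, 0],
})

# Treat a zero denominator as missing so the ratio becomes NaN instead of inf.
safe_received = toy_df["messages_received"].replace(0, np.nan)
toy_df["messages_ratio"] = toy_df["messages_sent"] / safe_received * 100
print(toy_df)
```

Whether NaN is the right answer for such a group is a judgment call. It simply keeps the chart from being stretched by an infinite bar.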

Step 6: Analysis of the Data

  • French males have the highest send/receive ratio with 92 messages sent and only 11 received back.
  • French females and UK males seem very popular with ratios of 26% and 30%.
  • Both have received a lot more messages than they have sent.
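To make the percentages concrete, here is the arithmetic for the French male figures quoted above, as a tiny sketch (the variable names are just for illustration):

```python
# Figures quoted in the analysis for French males.
messages_sent = 92
messages_received = 11

# Same formula as Step 5: sent / received, times 100.
ratio_percent = messages_sent / messages_received * 100
print(f"{ratio_percent:.1f}%")  # prints 836.4%
```

A value above 100% means a group sent more messages than it received, while values such as 26% or 30% mean it received far more than it sent.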

Step 7: Plotting the Data

# Plotting the data on a chart
import numpy as np
import matplotlib.pyplot as plt

c = 3  # number of countries
# Ratios (already in percent) for each gender, in country order
female_ratio = list(bumble_df[bumble_df["gender"] == "F"].messages_ratio)
male_ratio = list(bumble_df[bumble_df["gender"] == "M"].messages_ratio)
country = np.arange(c)
width = 0.4
fig = plt.figure(figsize = (8, 8))
ax = fig.add_subplot()
# Two bars per country, side by side
bar_1 = ax.bar(country, female_ratio, width, color = 'gold', label = 'Female')
bar_2 = ax.bar(country + width, male_ratio, width, color = '#F9B007', label = 'Male')
ax.set_ylabel('Percent')
ax.set_xticks(country + width / 2)
ax.set_xticklabels(['BR', 'FR', 'UK'])
ax.legend((bar_1, bar_2), ('Female', 'Male'), loc = 'upper right')
ax.set_title('Bumble ratio of messages sent to messages received by country and gender')
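The chart code hard-codes the number of countries and the tick labels, which quietly assumes the rows are ordered BR, FR, UK. A hedged sketch of deriving the labels and bar heights from the data with a pivot, using made-up ratios in the same shape as bumble_df (ratio_table and country_labels are names invented for this example):

```python
import pandas as pd

# Hypothetical ratios in the same shape as bumble_df; the values are made up.
bumble_df = pd.DataFrame({
    "country": ["BR", "BR", "FR", "FR", "UK", "UK"],
    "gender": ["F", "M", "F", "M", "F", "M"],
    "messages_ratio": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
})

# One row per country, one column per gender.
ratio_table = bumble_df.pivot(index="country", columns="gender", values="messages_ratio")

# Labels and bar heights now come from the same table, so they stay aligned.
country_labels = list(ratio_table.index)
female_ratio = list(ratio_table["F"])
male_ratio = list(ratio_table["M"])
print(country_labels, female_ratio, male_ratio)
```

These three lists can be passed to ax.bar and ax.set_xticklabels in place of the hard-coded values, with len(country_labels) standing in for c.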

Thank you for reading my Data Journey ❤ ,

Kody the Coding Corgi & Bits the Adorable A.I.

kody@codewithcorgis.com

P.S.

If you enjoyed this comic strip and it helped you in any way, sign up for our newsletter or buy me a boba, which means a lot, and send us your thoughts and feelings about this work.

Are you interested in collaborating? Follow us on LinkedIn.

D.M. us on Instagram, tweet us on Twitter, or connect with us on LinkedIn.

Please share this with your data friends, corgi friends, and coding corgi friends so we can make more comics in the future with your support. Thank You!

🍑 We make CODING CUTE(◕‿◕✿) and INFORMATIVEᕙ(⇀‸↼‶)ᕗ!