5 Question Series — AI & Data Science 6

Asitdubey
Published in Analytics Vidhya
3 min read · Jul 29, 2022

Hi, this is a continuation of the 5 Question Series of basic Python MCQs. I hope you enjoy it. These questions cover the pandas and seaborn libraries.

Q1. In the DataFrame ‘df’, how many times was “Steak Burrito” in the item_name column ordered? Which option will give incorrect output?

a) df[df.item_name=='Steak Burrito']['quantity'].sum()

b) df.groupby('item_name')['quantity'].sum()

c) df = df.groupby('item_name')['quantity'].sum().reset_index()

df[df['item_name']=='Steak Burrito']

d) df.groupby[['item_name', 'quantity']=='Steak Burrito'].sum()

ans: d
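To see why option d stands out, here is a minimal sketch on a small made-up DataFrame. Only the column names come from the question; the rows and the helper variable names are assumptions for illustration.

```python
import pandas as pd

# Hypothetical order data; only the column names come from the question.
df = pd.DataFrame({
    "item_name": ["Steak Burrito", "Chicken Bowl", "Steak Burrito", "Chips"],
    "quantity": [1, 2, 3, 1],
})

# Option a: filter the rows first, then sum the quantities.
total_a = df[df.item_name == "Steak Burrito"]["quantity"].sum()

# Option b: sum per item; the Steak Burrito total is one entry of the Series.
per_item = df.groupby("item_name")["quantity"].sum()

# Option c: the same totals as a DataFrame, then filtered to one item.
totals = df.groupby("item_name")["quantity"].sum().reset_index()
row_c = totals[totals["item_name"] == "Steak Burrito"]

# Option d: groupby is a method, so indexing it with [...] raises TypeError.
try:
    df.groupby[["item_name", "quantity"] == "Steak Burrito"].sum()
    option_d_error = None
except TypeError as exc:
    option_d_error = exc

print(total_a, per_item["Steak Burrito"], option_d_error)
```

Options a, b and c all contain the Steak Burrito total, while option d fails before any grouping happens.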

Q2. Considering the DataFrame ‘df’, which is the correct option to show the most expensive item and its quantity?

a) df.sort_values(by = "item_price", ascending = False)

b) df.sort_values[by = "item_price", descending = True]

c) df.item_price = df.item_price.apply(lambda x: float(x[1:-1]))

df.sort_values(by = "item_price", ascending = False).head(1)

d) df.sort_values(by = "item_price", ascending = False).head(1)

ans: c
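The slice x[1:-1] in option c suggests that item_price is stored as text with one extra character on each side. The sketch below assumes a leading "$" and a trailing space; the rows are made up, and it shows how sorting the raw text can pick a different row than sorting the converted numbers.

```python
import pandas as pd

# Hypothetical rows; the "$... " price format is an assumption implied by x[1:-1].
df = pd.DataFrame({
    "item_name": ["Chips", "Steak Burrito", "Chicken Bowl"],
    "quantity": [1, 1, 2],
    "item_price": ["$2.39 ", "$11.75 ", "$9.25 "],
})

# Options a and d sort the raw strings, so "$9.25 " ranks above "$11.75 ".
top_as_text = df.sort_values(by="item_price", ascending=False).head(1)

# Option c: strip the first and last character, convert to float, then sort.
priced = df.copy()
priced["item_price"] = priced["item_price"].apply(lambda x: float(x[1:-1]))
top_as_number = priced.sort_values(by="item_price", ascending=False).head(1)

print(top_as_text["item_name"].iloc[0])
print(top_as_number[["item_name", "quantity", "item_price"]])
```

Working on a copy (priced) keeps the original column intact, which is a choice made for this sketch rather than part of option c.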

Q3. Using the DataFrame ‘dA’, which of the following options will give the correct output for selecting all the even rows with the columns ‘Year’, ‘Population’ and ‘Murder’?

a) dA.loc[dA.index[::2],['Year','Population','Murder']]

b) dA.iloc[dA.index[::2],['Year','Population','Murder']]

c) dA.loc[dA.index[0:2:],['Year','Population','Murder']]

d) dA.iloc[dA.index[0:15:2],['Year','Population','Murder']]

ans: a
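The difference between the options comes down to label-based .loc versus position-based .iloc. Here is a minimal sketch, assuming dA has the default integer index; the rows and the extra 'Robbery' column are made up for illustration.

```python
import pandas as pd

# Hypothetical data with the default 0..n-1 index; 'Robbery' is an invented extra column.
dA = pd.DataFrame({
    "Year": [1960, 1961, 1962, 1963, 1964],
    "Population": [100, 110, 120, 130, 140],
    "Murder": [10, 11, 12, 13, 14],
    "Robbery": [50, 51, 52, 53, 54],
})

# Option a: every second index label, three columns selected by name.
even_rows = dA.loc[dA.index[::2], ["Year", "Population", "Murder"]]

# Options b and d: .iloc is positional, so column names are rejected.
try:
    dA.iloc[dA.index[::2], ["Year", "Population", "Murder"]]
    iloc_failed = False
except (IndexError, TypeError, ValueError):
    iloc_failed = True

# Option c: index[0:2:] is just the first two labels, not every second row.
first_two = dA.loc[dA.index[0:2:], ["Year", "Population", "Murder"]]

print(even_rows)
```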

Q4. Using the DataFrame ‘dA’, which option will give the correct output for sorting by the ‘Year’ column in descending order and selecting rows with ‘Murder’ between 10000 and 20000?

a) [dA.sort_values(by = 'Year', ascending = False)]['Murder'].between(10000, 20000)

b) dA.sort_values(by = 'Year', ascending = False)[dA.sort_values(by = 'Year', ascending = False)['Murder'].between(10000,20000)]

c) dA[dA['Murder'].between(10000,20000)]

d) dA.sort_values(by = 'Year', ascending = False)[dA.sort_values(['Murder'].between(10000,20000))]

ans: b
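Option b repeats the sort inside the filter. A sketch of the same logic with the sorted frame stored once is shown below; the data and the name sorted_dA are assumptions for illustration.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the question.
dA = pd.DataFrame({
    "Year": [1960, 1961, 1962, 1963],
    "Murder": [9000, 12000, 25000, 18000],
})

# Sort once by Year, newest first.
sorted_dA = dA.sort_values(by="Year", ascending=False)

# between() is inclusive on both ends by default.
result = sorted_dA[sorted_dA["Murder"].between(10000, 20000)]

# Option c applies the same filter but keeps the original row order.
unsorted_result = dA[dA["Murder"].between(10000, 20000)]

print(result)
```

Storing the sorted frame avoids sorting twice and gives the same rows as option b.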

Q5. Considering the DataFrame ‘tips’, which is the correct option to plot the relation between ‘total_bill’ and ‘tip’ depending on the ‘time’ and ‘smoker’ columns?

a) sns.scatterplot(x = 'total_bill', y = 'tip', data = tips, hue = 'smoker', style = 'time')

b) sns.relplot(x = 'total_bill', y = 'tip', style = 'size', hue = 'smoker', data = tips)

c) sns.regplot(x = 'total_bill', y = 'tip', data = tips, hue = 'smoker', style = 'time', size = 'size')

d) sns.relplot(x = 'total_bill', y = 'tip', data = tips, hue = 'time', style = 'smoker', size = 'size')

ans: d
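A minimal sketch of option d follows, using a small hand-built stand-in for the tips data so that it runs without downloading anything. The rows are made up, and the Agg backend is an assumption chosen so the plot renders off-screen.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; an assumption for non-interactive runs

import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips data; only the column names come from the question.
tips = pd.DataFrame({
    "total_bill": [10.0, 15.5, 20.0, 24.0, 30.5, 35.0],
    "tip": [1.5, 2.0, 3.0, 3.5, 4.5, 5.0],
    "smoker": ["No", "No", "Yes", "No", "Yes", "Yes"],
    "time": ["Dinner", "Dinner", "Lunch", "Lunch", "Dinner", "Lunch"],
    "size": [2, 3, 3, 2, 4, 4],
})

# Option d: colour encodes time, marker style encodes smoker, marker size encodes party size.
g = sns.relplot(x="total_bill", y="tip", data=tips,
                hue="time", style="smoker", size="size")

# Option c: regplot does not accept hue, style or size, so the call raises TypeError.
try:
    sns.regplot(x="total_bill", y="tip", data=tips,
                hue="smoker", style="time", size="size")
    regplot_error = None
except TypeError as exc:
    regplot_error = exc

print(type(g).__name__, regplot_error)
```

Note that sns.scatterplot with hue and style, as in option a, is also a valid seaborn call. relplot is the figure-level function, and option d additionally maps 'size'.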
