Q#48: Judging Smiles

It’s often hypothesized (and backed by some studies) that smiling can increase leniency or reduce the effects of wrongdoing, among other benefits.

A 1995 study by Marianne LaFrance & Marvin Hecht produced a dataset covering 4 different types of smiles, along with how leniently judges treated wrongdoing when they saw each smile.

The dataset can be interpreted as follows:

Smile:

  • 1 — false smile
  • 2 — felt smile
  • 3 — miserable smile
  • 4 — neutral control

Leniency: a measure of how lenient the judgments were; higher values mean the judges were more lenient.

Given the above information:

  • Plot the leniency by smile type in a parallel box plot
  • Based on the box plot above, which smile condition resulted in the highest leniency?
  • Is the median leniency for the false smile lower than the 75th percentile leniency score for the neutral expression?

TRY IT YOURSELF

https://colab.research.google.com/drive/1vwYkpIl0ichJqq-Qw1gRNytvYCwVUw1N?usp=sharing

ANSWER

This question tests our ability to wrangle data in Python and create a visual that answers a series of questions, which is standard day-to-day data science work. As we have seen with similar questions, a convenient way to wrangle data in Python is the Pandas library, and we can use the Seaborn package to create the visualization.

The first step in any data analysis/plotting task is to retrieve the data and examine its first few rows. To do this we can use Pandas .read_csv() and .head(). Additionally, it is good practice to get a summary of the data with .describe(), see if there are missing values with .isna().sum(), and check the data types with .dtypes.

import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/erood/interviewqs.com_code_snippets/master/Datasets/smile_leniency.csv')
df.head()
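The extra checks mentioned above (.describe(), .isna().sum(), and .dtypes) can be sketched as follows. This is a minimal illustration on a small stand-in DataFrame named toy; its values are made up for the example and are not the study's data. The same three calls apply unchanged to df.

```python
import pandas as pd

# Toy stand-in for the real file; these values are made up for illustration.
toy = pd.DataFrame({
    "smile": [1, 2, 3, 4, 1, 2],
    "leniency": [5.5, 4.0, None, 3.0, 6.0, 4.5],
})

summary = toy.describe()    # count, mean, std, min, quartiles, max per numeric column
missing = toy.isna().sum()  # number of missing values in each column
types = toy.dtypes          # data type of each column

print(summary)
print(missing)
print(types)
```

If missing shows a non-zero count for a column, that is worth resolving before plotting, since the summary statistics are computed on the non-missing rows only.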

Here we see that the smile column holds numeric placeholder keys. To create an easy-to-decipher visual, we need to replace each key with the text it represents. To do this, let’s use a dictionary and a list comprehension.

# Replace smile column with text
smile_dict = {1: 'false smile', 2: 'felt smile', 3: 'miserable smile', 4: 'neutral control'}
df['smile'] = [smile_dict[i] for i in df['smile']]
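As an alternative to the list comprehension, the same replacement can be sketched with pandas' Series.map. This is an option to consider, not the approach used above. The toy frame and the extra code 9 are hypothetical, added only to show what happens when a key is missing from the dictionary.

```python
import pandas as pd

smile_dict = {1: "false smile", 2: "felt smile", 3: "miserable smile", 4: "neutral control"}

# Toy frame with made-up values, including a code (9) that is not in the dictionary.
toy = pd.DataFrame({"smile": [1, 2, 3, 4, 9], "leniency": [5.0, 4.5, 4.0, 3.5, 3.0]})

# Series.map looks each code up in the dictionary; unknown codes become NaN
# instead of raising a KeyError as the list comprehension would.
toy["smile_label"] = toy["smile"].map(smile_dict)

unmapped = toy["smile_label"].isna().sum()
print(toy)
print("unmapped codes:", unmapped)
```

Which behavior is preferable depends on the task: a KeyError stops early on unexpected codes, while map lets you count and inspect them afterwards.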

Now that the data is prepared, let’s create the requested visual and answer the questions. We will use the Seaborn package to make a box plot. To do so, we first import the package and then call its boxplot() function with the arguments y = 'smile', x = 'leniency', data = df, and orient = 'h' (to draw the boxes horizontally, one per smile type, which gives us the parallel box plot).

import seaborn as sns

# Boxplot of leniency
sns.boxplot(y='smile', x='leniency', data=df, orient='h').set(title='Leniency by Smile Type');

Using this graph, we see that the false smile has the highest median leniency. That median is also higher than the 75th percentile of the neutral control, so the answer to the last question is no.
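A reading taken from a box plot can also be checked numerically. The sketch below assumes a frame with the same two columns, smile (already converted to text) and leniency. The toy values are made up for illustration and are not the study's data, so the printed results describe the toy frame only.

```python
import pandas as pd

# Made-up values for illustration only; not the study's data.
toy = pd.DataFrame({
    "smile": ["false smile"] * 5 + ["neutral control"] * 5,
    "leniency": [3.0, 5.0, 6.0, 7.0, 8.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# Median and 75th percentile of leniency within each smile type.
medians = toy.groupby("smile")["leniency"].median()
q75 = toy.groupby("smile")["leniency"].quantile(0.75)

false_median = medians["false smile"]
neutral_q75 = q75["neutral control"]

print("highest median:", medians.idxmax())
print("false median below neutral 75th percentile:", false_median < neutral_q75)
```

Running the same groupby calls on df gives the exact medians and quartiles behind the boxes in the plot.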
