Health Insurance Premium

Jeffri
Dec 30, 2022


A. Introduction

Health insurance provides a much-needed financial backup in medical emergencies. To obtain coverage, you pay an insurance premium to the company. The company manages these premiums to cover the risk of loss, damage, or death that the individual may suffer due to unexpected incidents.

Goal: In this project, the Medical Insurance Payout dataset is used to analyze which factors correlate with charges. This analysis can help the company set the “best” premium for each individual, so that it stays profitable and competitive in the marketplace.

B. Description of Dataset

The Medical Insurance Payout dataset contains 1,338 rows and 7 columns:

  1. Age: Age of the primary beneficiary (18–64 years old).
  2. Sex: Gender of the insurance contractor (female or male).
  3. BMI: Body mass index, an objective index of body weight relative to height (kg/m²), computed as weight divided by height squared. It indicates whether weight is relatively high or low for a given height; the ideal range is 18.5 to 24.9.
  4. Children: Number of children covered by the health insurance, i.e. the number of dependents (0–5).
  5. Smoker: Whether the beneficiary smokes.
  6. Region: The beneficiary’s residential area in the US (northeast, southeast, southwest, or northwest).
  7. Charges: Individual medical costs that the insurance company should cover.
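As a rough sketch of how the seven columns above could be read into typed records, the snippet below parses CSV text with Python's standard library. The lowercase column names and the three sample rows are assumptions made for illustration; they are not taken from the dataset.

```python
import csv
import io

# Illustrative rows only (made up for this sketch, not taken from the dataset).
# Column names are assumed to be lowercase forms of the seven fields listed above.
SAMPLE_CSV = """age,sex,bmi,children,smoker,region,charges
23,female,26.4,0,yes,northeast,17000.0
45,male,31.2,2,no,southeast,8200.5
33,male,22.7,1,no,northwest,4500.0
"""


def parse_row(row):
    """Convert one CSV row (all strings) into typed values."""
    return {
        "age": int(row["age"]),
        "sex": row["sex"],
        "bmi": float(row["bmi"]),
        "children": int(row["children"]),
        "smoker": row["smoker"] == "yes",  # assumes a yes/no encoding
        "region": row["region"],
        "charges": float(row["charges"]),
    }


def load_rows(text):
    """Read CSV text into a list of typed dictionaries."""
    return [parse_row(r) for r in csv.DictReader(io.StringIO(text))]


rows = load_rows(SAMPLE_CSV)
print(len(rows), "rows,", len(rows[0]), "columns")
```

With the real file, the same `load_rows` helper could be fed the file's contents instead of `SAMPLE_CSV`, after which the row and column counts can be checked against the figures quoted above.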

C. Variable Analysis

1. Charges Analysis

Charges Distribution

The distribution of charges is positively skewed: the values are concentrated on the left side, with a long tail to the right. Most beneficiaries therefore have relatively small charges (around $9,382).
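One way to check positive skew numerically, rather than by eye, is to compare the mean with the median and compute a skewness coefficient. The sketch below does this on a handful of made-up charges; the `skewness` helper is a hypothetical name, and the numbers are not from the dataset.

```python
from statistics import mean, median, pstdev

# Made-up charges for illustration only; they are not values from the dataset.
charges = [2000, 3000, 4000, 5000, 9000, 12000, 40000]


def skewness(values):
    """Moment-based (population) skewness: mean cubed deviation / std**3."""
    m = mean(values)
    s = pstdev(values)
    return sum((v - m) ** 3 for v in values) / len(values) / s ** 3


charges_mean = mean(charges)
charges_median = median(charges)
charges_skew = skewness(charges)

# In a positively skewed sample, the long right tail pulls the mean above the median.
print(f"mean={charges_mean:.0f} median={charges_median} skew={charges_skew:.2f}")
```

A skewness above zero together with a mean above the median is consistent with a distribution concentrated on the left.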

2. Age Analysis

Age Distribution

Based on the plot, beneficiaries aged 18–19 are the largest group. Is this the main reason for the relatively small charges in the charges distribution? Age and Charges are plotted together to see whether they are correlated.

Correlation between Age and Charges

There is a small positive correlation between age and charges: older beneficiaries tend to have higher charges. So, having more young beneficiaries is one of the reasons why the charges are relatively small.
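The strength of such a relationship is commonly summarized with the Pearson correlation coefficient. The following is a minimal sketch, assuming that Pearson's r is the measure meant here (the article does not say which one was used); the ages and charges are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up values for illustration only; they are not from the dataset.
ages = [18, 19, 25, 40, 52, 64]
charges = [2000, 1800, 3500, 7000, 11000, 14000]

r = pearson(ages, charges)
print(f"r = {r:.3f}")  # r > 0 means charges tend to rise with age
```

Values of r near 0 indicate a weak linear relationship and values near 1 a strong positive one, so the sign and size of r on the real columns would show how much age alone explains.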

3. Sex Analysis

Sex Proportion

There are slightly more male beneficiaries than female beneficiaries (676 male vs 662 female). What about the charges? Which sex has higher charges?

Charges Distribution by Sex

Both distributions are positively skewed, so the median is used as the representative value. Based on the median, female beneficiaries have higher charges than male beneficiaries.
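To illustrate why the choice between median and mean matters for skewed data, the sketch below groups made-up charges by sex and computes both. The numbers are chosen so that the two statistics rank the groups differently; they do not reproduce the dataset's values.

```python
from collections import defaultdict
from statistics import mean, median

# Made-up (sex, charges) pairs for illustration only.
records = [
    ("female", 3000), ("female", 9500), ("female", 12000),
    ("male", 2500), ("male", 9000), ("male", 45000),
]


def group_values(pairs):
    """Collect values into lists keyed by their group label."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


by_sex = group_values(records)
medians = {sex: median(vals) for sex, vals in by_sex.items()}
means = {sex: mean(vals) for sex, vals in by_sex.items()}

# One very large value lifts the male mean but leaves the male median unchanged.
print("medians:", medians)
print("means:", means)
```

In this toy sample the female median is higher while the male mean is higher, which is the kind of disagreement that makes the median the safer summary for skewed distributions.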

4. Smoker Analysis

Smoker Proportion by Sex & Charges Distribution by Smoker and Sex

Based on the smoker proportion, fewer beneficiaries smoke, yet smokers have much higher average charges than non-smokers. So, it is clear that smoking drives charges up.

What if we include the Sex variable? Which sex has higher average charges among smokers? Based on the plots, male smokers have higher average charges, and a higher proportion of male beneficiaries smoke than female beneficiaries.

5. BMI Analysis

BMI Classification (https://wecapable.com/body-mass-index-bmi-importance/)

BMI Distribution by Smoker

There is no clear difference in BMI between beneficiaries who smoke and those who do not. So, in this dataset, smoking is not associated with a higher BMI.

Correlation between BMI and Charges

There is a correlation between BMI and Charges, but it is too small (0.198) to say that beneficiaries with higher BMI will have higher Charges. So, for a deeper analysis, BMI is split into two groups, BMI ≥ 25 and BMI < 25, to see whether higher BMI goes with higher Charges.

Grouped BMI and Charges

Beneficiaries with BMI ≥ 25 generally have higher Charges than beneficiaries with BMI < 25. So, beneficiaries with higher BMI tend to have higher charges.
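The two-group split can be sketched as follows. The threshold of 25 comes from the text; the `bmi_group` helper, the use of the median as the comparison statistic, and the sample values are assumptions for illustration.

```python
from statistics import median

BMI_THRESHOLD = 25  # split point used in the text


def bmi_group(bmi):
    """Label a BMI value as belonging to the high or low group."""
    return "BMI >= 25" if bmi >= BMI_THRESHOLD else "BMI < 25"


# Made-up (bmi, charges) pairs for illustration only.
records = [(21.5, 3000), (24.9, 8000), (25.0, 9000), (30.2, 12000), (35.8, 38000)]

charges_by_group = {}
for bmi, charges in records:
    charges_by_group.setdefault(bmi_group(bmi), []).append(charges)

median_by_group = {group: median(vals) for group, vals in charges_by_group.items()}
print(median_by_group)
```

A value of exactly 25 falls into the higher group, matching the "≥" in the text.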

6. Children Analysis

Children Proportion

Based on the plot, most beneficiaries have few children. Does having fewer children increase the charges? Average charges by number of children are plotted to answer that question.

Average Charges by Children

Based on the plot, beneficiaries with 2 or 3 children have relatively higher average charges, even though those groups have fewer members than the 0 and 1 children groups. However, the upward trend in charges stops after the 3-children category. Let’s dive deeper to see what is happening.

Average Charges by Smoker and Children & Smoker Proportion by Children

There is a pattern between average charges and the proportion of smokers: children categories with a higher smoking proportion tend to have higher average charges.

What about average charges by number of children among non-smokers? Based on the plot, average charges tend to be higher when beneficiaries have more children (given that they do not smoke).
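Holding smoking status fixed while comparing children categories amounts to averaging within each (children, smoker) cell. Here is a small sketch of that idea on made-up rows; `mean_charges_by` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Made-up (children, smoker, charges) rows for illustration only.
rows = [
    (0, False, 4000), (0, False, 6000), (0, True, 30000),
    (1, False, 7000), (1, True, 32000),
    (2, False, 8000), (2, False, 10000), (2, True, 34000),
]


def mean_charges_by(rows):
    """Average charges within each (children, smoker) cell."""
    cells = defaultdict(list)
    for children, smoker, charges in rows:
        cells[(children, smoker)].append(charges)
    return {key: mean(vals) for key, vals in sorted(cells.items())}


avg = mean_charges_by(rows)

# Keep only the non-smoker cells to see the trend across children categories.
non_smoker_avg = {
    children: value for (children, smoker), value in avg.items() if not smoker
}
print(non_smoker_avg)
```

Comparing only the non-smoker cells removes the effect of different smoking proportions across children categories, which is the comparison the paragraph above describes.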

7. Region Analysis

Region Proportion

The regions have roughly equal proportions; only Southeast has a slightly higher proportion than the others. What about the Charges? Which region has higher Charges?

Charges Distribution by Region

The eastern regions (Northeast and Southeast) have higher median values, so charges are generally higher in the east. Let’s dive deeper by plotting the smoking proportion for each region to see why this happens.

Smoking Proportion by Region

Both eastern regions have a higher smoking proportion, which would explain why beneficiaries from the east have higher charges than those from the west.
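The smoking proportion per region is simply the number of smokers divided by the number of beneficiaries in that region. A minimal sketch, using made-up rows and a hypothetical `smoker_share` helper:

```python
from collections import Counter

# Made-up (region, smoker) rows for illustration only.
rows = [
    ("northeast", True), ("northeast", False), ("northeast", False), ("northeast", False),
    ("southeast", True), ("southeast", True), ("southeast", False), ("southeast", False),
    ("southwest", False), ("southwest", False), ("southwest", False), ("southwest", True),
    ("northwest", False), ("northwest", False), ("northwest", False),
    ("northwest", False), ("northwest", True),
]


def smoker_share(rows):
    """Fraction of smokers among the beneficiaries of each region."""
    totals = Counter()
    smokers = Counter()
    for region, smoker in rows:
        totals[region] += 1
        if smoker:
            smokers[region] += 1
    return {region: smokers[region] / totals[region] for region in totals}


share = smoker_share(rows)
for region, fraction in share.items():
    print(f"{region}: {fraction:.0%}")
```

Using a proportion rather than a raw count keeps the comparison fair when regions have different numbers of beneficiaries.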

D. Conclusion

  1. Older beneficiaries have higher charges than younger beneficiaries.
  2. Female beneficiaries have higher charges than male beneficiaries.
  3. Smoking increases charges.
  4. Beneficiaries with higher BMI have higher charges.
  5. Beneficiaries with more children have higher charges.
