Prediction in Various Logistic Regression Models (Part 1)
Statistics in R Series
Introduction
The past several articles covered various types of logistic regression. The goal of all these models is to predict new data points, as well as points that fall between the observed values, as accurately as possible. This article walks through how to carry out that prediction in R for simple and multiple logistic regression, using both binary and ordinal data.
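As a reminder of what "prediction" means for these models, here is a minimal sketch in generic notation. The symbols $x$, $\beta_0$, $\beta_1$ and $\theta_j$ are placeholders chosen for illustration, not names taken from the earlier articles, and the ordinal part assumes the common proportional-odds form, which may differ in sign convention from the function used to fit the model.

```latex
% Binary outcome: the fitted log-odds are linear in x,
% so the predicted probability is the inverse logit of the linear predictor.
\log\frac{\hat{p}(x)}{1-\hat{p}(x)} = \hat\beta_0 + \hat\beta_1 x
\quad\Longrightarrow\quad
\hat{p}(x) = \frac{1}{1 + e^{-(\hat\beta_0 + \hat\beta_1 x)}}

% Ordinal outcome with categories 1, ..., J (proportional-odds assumption):
% the model predicts cumulative probabilities, and each category
% probability is the difference of two neighbouring cumulative ones.
\hat{P}(Y \le j \mid x) = \frac{1}{1 + e^{-(\hat\theta_j - \hat\beta_1 x)}},
\qquad
\hat{P}(Y = j \mid x) = \hat{P}(Y \le j \mid x) - \hat{P}(Y \le j-1 \mid x)

% Boundary conventions: P(Y <= 0 | x) = 0 and P(Y <= J | x) = 1.
```

In multiple logistic regression the single term $\hat\beta_1 x$ is replaced by a sum over all predictors, $\hat\beta_1 x_1 + \dots + \hat\beta_k x_k$; the rest of the calculation is unchanged.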
Dataset
The Adult Data Set from the UCI Machine Learning Repository serves as the case study. It contains demographic data on more than 30,000 individuals, including each person’s race, education, job, gender, salary, hours worked per week, and number of jobs held, as well as the income they earn.
A refresher on the dataset:
- Bachelors: 1 means the person has a bachelor’s degree and 0 means the person doesn’t have a bachelor’s degree
- Income_greater_than_50k_code: 1 means the total family income is greater than $50k and 0 means the total family income is $50k or less
- Marital_status_code: 1 means the person is married and 0 means the person is not married or…