A Study on New Jersey

Sainath Gopathi
5 min read · Apr 28, 2022


Having moved to the USA from India, my first priority is to secure a job once I complete my master's degree. So I set out to study how fair the US job market is: I want to know whether there are opportunities for people who are not American citizens. For this project I selected the New Jersey teachers data set. New Jersey is one of the states with the highest immigrant populations, so I felt it would be a good data set for studying the US job market. I cannot assume the same outcomes hold for every industry, but I still feel this is a good place to start.

New Jersey Teachers Data Set

The New Jersey data set is sourced from Kaggle (obviously). It is comprehensive, with nearly 180,000 entries covering New Jersey school employees. The data set has 15 variables:

First and Last name of the employee, County (employee's school county), District (employee's school district), School (name of employee's school), Primary Job (description of employee's primary job), fte (full-time status indicator), Salary (annual salary), Certificate (teaching certificate description), Subcategory (subcategory of teaching certificate earned by employee), Teaching Route (traditional or alternate teaching route), Highly Qualified (information about teaching qualification), and Experience District, Experience NJ, and Experience Total, which are self-explanatory.

To study the New Jersey teaching job market, I chose to follow employee salary (because money is where all the facts are hidden) and relate the other available variables to it.

Geographical impact on Employee Salary

Teaching jobs in India are not strongly influenced by geographical location, so I was curious whether the same holds in New Jersey. I made a county-wise map of New Jersey showing the average salary. I used the average rather than the total because the number of teachers differs from county to county. I plotted the map in Tableau, using FIPS codes to link each county to its geographical location.

Average NJ Teacher’s Salary Across NJ Counties

The map shows that the average salary is higher in urban regions than in rural regions.
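As a rough sketch of why I preferred the average over the total (this is not the Tableau workflow I used), the same aggregation could be done in base R. The toy values below are made up, and the column names county and salary are assumptions about how the data set is laid out.

```r
# Toy data with made-up values; the real data set has one row per employee.
teachers <- data.frame(
  county = c("Bergen", "Bergen", "Essex", "Essex", "Essex", "Salem"),
  salary = c(70000, 80000, 60000, 65000, 70000, 50000)
)

# Mean salary per county: not affected by how many teachers a county has.
county_avg <- aggregate(salary ~ county, data = teachers, FUN = mean)

# Total salary per county: grows with the number of teachers.
county_total <- aggregate(salary ~ county, data = teachers, FUN = sum)

print(county_avg)
print(county_total)
```

In this toy example Essex has the largest total only because it has the most rows, while Bergen has the highest average, which is the kind of distortion the average avoids.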

Experience Vs Salary

To examine the correlation between salary and experience, I used a scatter plot. I added a regression line and the correlation coefficient to gauge the strength of the relationship.

Though it is quite obvious that salary increases with experience, I feel it is important to check the correlation between them. I used the following code to draw the scatter plot.

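# ggpubr provides ggscatter() and theme_pubclean(); it also attaches ggplot2
library(ggpubr)

ggscatter(
  data = nj_teachers_salaries_2016,
  x = "experience_total",
  y = "salary",
  size = 1,
  color = "grey",
  add = "reg.line",                    # overlay a linear regression line
  cor.coef = TRUE,                     # print the correlation coefficient on the plot
  add.params = list(color = "black"),  # draw the regression line in black
  ggtheme = theme_pubclean(),
  title = "Relation Between Employee Salary and Experience",
  subtitle = "There is a strong positive correlation between Employee salary and Experience"
) +
  theme(legend.position = "none") +
  scale_y_continuous(labels = scales::comma) +  # show salaries as 50,000 rather than 5e+04
  labs(x = "Employee Total Experience", y = "Employee Salary")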
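To put a number on the relationship outside the plot, a minimal sketch in base R could look like the following. The toy values are made up and only stand in for the experience_total and salary columns; I am assuming the Pearson coefficient and an ordinary least-squares line, which is what the plot options above draw by default.

```r
# Toy data with made-up values, standing in for experience_total and salary.
toy <- data.frame(
  experience_total = c(1, 3, 5, 8, 12, 20),
  salary = c(50000, 54000, 58000, 66000, 72000, 90000)
)

# Pearson correlation coefficient (the default method of cor()).
r <- cor(toy$experience_total, toy$salary)

# Least-squares line of salary on experience.
fit <- lm(salary ~ experience_total, data = toy)

print(r)
print(coef(fit))
```

The slope of the fitted line reads as the average salary change per additional year of experience, while r only describes how tightly the points follow a straight line.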

Correlation in between Certificate and Salary

The certificate variable records the credential under which an employee is certified to teach. There are six certificates.

Standard Certificate: A permanent certificate issued to employees who have met all the requirements.

CE: The Certificate of Eligibility (CE) is a permanent certificate issued to employees who did not complete a teacher preparation program but who met all requirements.

CEAS: The Certificate of Eligibility with Advanced Standing (CEAS) is a credential issued to an individual who has completed a teacher preparation program and has met the basic requirements for certification including academic

Provisional: This two-year certificate is requested by the employing school district for a newly hired teacher after an individual obtains a CE or CEAS and a full-time teaching position.

Non-Citizen Standard: A five-year certificate issued to an individual who has met all requirements for state certification, but is not a US citizen.

Emergency: An emergency certificate is simply a CE certificate issued on an emergency basis.

Since certificate is a categorical variable and salary is continuous, I used a box plot.

The following is the code I used to plot the graph.

ggboxplot(
  data = nj_teachers_salaries_2016,
  x = "certificate",
  y = "salary",
  color = "black",
  fill = "certificate",                # one fill color per certificate
  ggtheme = theme_pubclean(),
  title = "Employee Salary Spread Across Certifications"
) +
  scale_y_continuous(labels = scales::comma) +
  theme(legend.position = "none") +    # the x axis already names each certificate
  labs(x = "Employee Certificate", y = "Employee Salary")

Surprisingly, Non-Citizen Standard certificate holders have the second-highest median salary. This suggests that, unlike the Indian job market, the US job market is open even to non-citizens, which is a positive sign. However, I also want to find out whether non-citizen employees are limited to a single job role.
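The ranking read off the box plot can also be checked numerically. Here is a minimal sketch in base R, using made-up toy values rather than the real data, and assuming the columns are named certificate and salary.

```r
# Toy data with made-up values; certificate names follow the list above.
toy <- data.frame(
  certificate = c("Standard", "Standard", "Standard",
                  "Non-Citizen Standard", "Non-Citizen Standard",
                  "CE", "CE"),
  salary = c(60000, 70000, 90000, 62000, 66000, 50000, 54000)
)

# Median salary per certificate, sorted from highest to lowest.
medians <- aggregate(salary ~ certificate, data = toy, FUN = median)
medians <- medians[order(-medians$salary), ]

# Number of employees per certificate, to see how many rows back each median.
counts <- table(toy$certificate)

print(medians)
print(counts)
```

Looking at the counts next to the medians matters because a median computed from only a handful of employees is much less stable than one computed from thousands.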

Density plot of Certificates to Salary

Here is further support for a well-organized job market. In the density plot, the salaries of Non-Citizen certificate holders are well distributed, which suggests that they are treated the same across all roles. More importantly, the spread is similar to that of Standard certificate holders.

The code I have used is

ggdensity(
  data = nj_teachers_salaries_2016,
  x = "salary",
  fill = "certificate",
  facet.by = "certificate",            # one panel per certificate
  legend = "right",
  color = "certificate",
  ggtheme = theme_pubclean(),
  title = "Salary distribution across all certifications",
  subtitle = "CEAS & Provisional Certificate holders are specific to single job"
) +
  theme(legend.position = "none") +    # panel titles already name each certificate
  scale_y_continuous(labels = scales::comma) +
  scale_x_continuous(labels = scales::comma) +
  labs(x = "Salary Distribution", y = "Density")

Highest Paid Job Roles In NJ Schools

Finally, to find the highest-paid job roles in NJ schools, I plotted employee subcategory against employee salary.

The following is the code I used to plot the graph.

ggboxplot(
  data = nj_teachers_salaries_2016,
  x = "subcategory",
  y = "salary",
  color = "black",
  fill = "subcategory",                # one fill color per subcategory
  ggtheme = theme_pubclean(),
  title = "Highest Paid Job in New Jersey Schools",
  subtitle = "On average admin jobs are highest paid roles"
) +
  scale_y_continuous(labels = scales::comma) +
  theme(legend.position = "none") +
  labs(x = "Employee Roles", y = "Employee Salary")

Design Choice

Judged against what I learned from storytelling with data, I have not done a good job of keeping the colors consistent across plots, but I have kept the graphs minimalist, with no axis lines. I used the pubclean theme throughout all the plots to keep the design consistent. Nevertheless, I should have done a better job of highlighting the specific part of each plot that matters.
