Placement Outcomes (Data Analysis using Orange GUI)

Ruchika Parag Barman
8 min read · Sep 5, 2020


Objective : To analyse several factors that influence the recruitment of students and to extract information through plots.

Description : The following analysis presents plots that attempt to link students’ placement prospects, as shaped by how recruiting organisations perceive students, to academic parameters such as the percentage obtained in secondary school, higher secondary school, the undergraduate degree and the postgraduate degree.

Other factors such as the gender of the candidate, the choice of board and the stream opted for in high school and secondary education, and the undergraduate and postgraduate degree specialisations have also been taken into account to predict placement status as well as salary offered.

Several colleges offer employability tests, which help employers evaluate candidates, judge their skills and recruit the right talent. The performance of students in such college-conducted tests and their previous work experience have therefore also been analysed to deduce their relation to recruitment opportunities.

Hypothesis : Students with better scores in secondary education and undergraduate degree have better prospects of getting placed.

Understanding the Project :

Going through the analysis, a reader shall be able to infer :

  1. How the choice of board of education influences placement prospects.
  2. The relative importance of scores obtained in various degrees and streams in the campus recruitment procedure.
  3. The relation of gender and work experience to the salary offered by companies in campus placements.

Acknowledgements:

I, Ruchika Parag Barman, and my teammate Prafful Chauhan created this notebook/blog as part of the coursework for the “Pandas, bamboolib & Orange workshop” at Suven, under the mentorship of Rocky Jagtiani.

Learned from https://datascience.suvenconsultants.com.

Mentored by Rocky Jagtiani.

Dataset:

This dataset consists of placement data of students at XYZ campus. It includes secondary and higher secondary school percentages and specialisations. It also includes degree specialisation and type, work experience, and the salary offered to the placed students.

We have taken 60 observations (rows), from which we extract information through exploratory data analysis and visualisation. There are 8 categorical features and 6 numerical features.

Histograms :

Inference : More male students than female students are placed; the ratio of male to female placements is roughly 2:1.

Inference : With respect to high school education, Central board students have a wider range of salaries than students from other boards, but the ratio of Central board placements to other-board placements is less than 1.

Inference : With respect to secondary education, Central board students have a wider range of salaries than students from other boards.

Inference : Commerce and Arts students have a wider range of salaries, and more of them are placed, compared to Science or other streams.

From the above graphs, one can gather that, in this dataset, gender plays quite an important role in whether or not a candidate is hired. A male candidate is more likely to be placed at a company than a female candidate. Similarly, the board of education and the stream chosen are also related to the salary offered. Students who opted for Commerce and Management studies have been offered higher pay.
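The male-to-female placement ratio quoted in the first inference is a simple count comparison. A minimal sketch of that calculation follows, using Python's standard library rather than Orange; the rows and the labels "M", "F", "Placed" and "Not Placed" are made up for illustration and are not taken from the actual dataset.

```python
from collections import Counter

# Hypothetical (gender, status) rows; NOT the real placement data.
rows = [
    ("M", "Placed"), ("M", "Placed"), ("M", "Placed"), ("M", "Placed"),
    ("M", "Not Placed"), ("F", "Placed"), ("F", "Placed"), ("F", "Not Placed"),
]

# Count placed students per gender.
placed_by_gender = Counter(gender for gender, status in rows if status == "Placed")

# Ratio of placed male students to placed female students.
male_to_female = placed_by_gender["M"] / placed_by_gender["F"]
print(dict(placed_by_gender), male_to_female)
```

Note that this ratio compares raw counts. If there are more male students in the data to begin with, comparing the share of each gender that gets placed would be a fairer measure.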

Correlations :

The correlations table gives us the following ideas :

  1. Students who have scored well in their secondary education are very likely to perform well in their undergraduate degree also.
  2. Students who have scored well in their high school education eventually perform well in their secondary education also.
  3. Again, students who have scored well in their high school education are very likely to perform well in their undergraduate degree also.
  4. Most students who have had a good academic record in their high school education also score high in their MBA degree.
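Each of these statements rests on a correlation coefficient between two percentage columns. The sketch below assumes the Pearson coefficient was used (Orange's Correlations widget also offers Spearman), and the two short lists of percentages are invented purely to show the calculation.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical percentages for five students (illustrative only).
secondary_pct = [62.0, 70.5, 81.0, 55.0, 74.0]
degree_pct = [60.0, 68.0, 77.5, 58.0, 71.0]

r = pearson(secondary_pct, degree_pct)
print(round(r, 3))
```

A value close to 1 means students who score high on one measure tend to score high on the other; it does not by itself say that one score causes the other.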

Boxplots :

Inference : The above boxplot shows the relation between the percentage obtained in the undergraduate degree and placement status. Students who get placed score higher than those who do not. For placed students, the mean score is 68.6925, the standard deviation is 6.189, the median (2nd quartile) is 69.25, the 1st quartile is 64.50 and the 3rd quartile is 72.1150.

For students who were not placed, the mean percentage is 60.8670, the standard deviation is 7.045, the median (2nd quartile) is 61.00, the 1st quartile is 56.65 and the 3rd quartile is 64.00.

From this analysis, undergraduate students and freshers can prioritise and prepare for their degree examinations keeping in mind the average score, mentioned above, that recruiting companies generally appear to consider worthy of a placement.
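The figures quoted for each group (mean, standard deviation and quartiles) are the standard boxplot summary. A rough way to compute the same kind of summary outside Orange is sketched below. The percentages are invented, and two choices are assumptions on our part: the sample standard deviation and the "inclusive" quartile method. Orange may use different conventions, so results on the real data could differ slightly.

```python
from statistics import mean, stdev, quantiles

def summarise(values):
    """Return mean, sample standard deviation and the three quartiles."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {"mean": mean(values), "std": stdev(values),
            "q1": q1, "median": median, "q3": q3}

# Hypothetical degree percentages grouped by placement status.
degree_pct = {
    "Placed": [64.0, 66.0, 68.0, 70.0, 72.0],
    "Not Placed": [56.0, 58.0, 60.0, 62.0, 64.0],
}

for status, values in degree_pct.items():
    print(status, summarise(values))
```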

Inference : Male candidates get higher pay than female candidates. For placed male students, the mean salary is 302608.70, the standard deviation is 144726.4, the median (2nd quartile) is 264000, the 1st quartile is 240000 and the 3rd quartile is 300000.

For placed female students, the mean salary is 267571.43, the standard deviation is 41776.1, the median (2nd quartile) is 250000, the 1st quartile is 240000 and the 3rd quartile is 300000.

Thus, we can see that not only is the placement rate of females lower than that of males, but the salary offered to placed female candidates is also relatively lower than that of male candidates.

Pivot Table :

Inference : As more students opt for Commerce and Management, the numbers of both placed and not-placed students are much higher in it than in Science and other streams. Even the ratio of placed to not-placed students is higher in Commerce and Management than in Science.

Readers can infer that there are relatively more job opportunities for students who opt for Commerce and Management than for those in other streams.
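The pivot table behind this inference is a two-way count of stream against placement status, from which the placed to not-placed ratio follows. A small standard-library sketch is given below; the rows and the stream labels are hypothetical and only mirror the shape of the data.

```python
from collections import Counter

# Hypothetical (degree stream, status) pairs; illustrative only.
rows = [
    ("Commerce and Management", "Placed"), ("Commerce and Management", "Placed"),
    ("Commerce and Management", "Placed"), ("Commerce and Management", "Not Placed"),
    ("Science", "Placed"), ("Science", "Not Placed"),
    ("Others", "Not Placed"),
]

counts = Counter(rows)  # (stream, status) -> number of students

for stream in sorted({s for s, _ in rows}):
    placed = counts[(stream, "Placed")]
    not_placed = counts[(stream, "Not Placed")]
    # Guard against division by zero when every student in a stream is placed.
    ratio = placed / not_placed if not_placed else float("inf")
    print(f"{stream:25} placed={placed} not_placed={not_placed} ratio={ratio:.2f}")
```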

Scatterplots :

For scatterplots, we have used 60% of the data provided. A scatterplot of salary against the percentage obtained in the degree examination is drawn. Here, the points have been coloured according to stream, as shown in the legend.

Inference : The higher salaries have been offered to students whose scores lie in the range 64–74. Moreover, in terms of stream, most of the students who have been offered pay higher than 300,000 belong to Commerce and Management. Very few Science students, and even fewer students of other streams, have crossed the 300,000 threshold.

Inference : Students who specialise in Marketing and Finance and those in Marketing and HR score similarly in MBA percentage. However, the highest-paid students generally have scores in the range 62–70, approximately. Very few students have been offered pay higher than 400,000. The majority of students are offered salaries in the range of 250,000 to 350,000.

We can understand that maintaining an average score that falls in the above-mentioned range should suffice for a decently paying placement.
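The text does not say how the 60% subset used for the scatterplots was chosen. Assuming it was a simple random sample without replacement (which is what Orange's Data Sampler widget can produce with a fixed proportion), the selection can be sketched as follows. The function name, the seed and the stand-in row list are ours.

```python
import random

def sample_fraction(rows, fraction, seed=0):
    """Return a random subset holding roughly `fraction` of the rows."""
    k = round(len(rows) * fraction)
    # A seeded generator keeps the subset reproducible between runs.
    return random.Random(seed).sample(rows, k)

rows = list(range(60))  # stand-in for the 60 observations
subset = sample_fraction(rows, 0.6)
print(len(subset))
```

With only 36 of 60 points plotted, patterns such as the 64–74 score range are best read as tendencies in the sample rather than firm thresholds.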

Mosaic Plot :

Other than academic parameters, recruiting companies may also consider other factors for placement. Employability tests conducted by colleges are key to establishing appropriate labour market linkages and ascertaining that the workforce is industry-ready.

Inference: From the plot above, we can see that of all the students that did not get placed, very few scored above 83.5. Most of the unemployed candidates scored below 83.5.

Moreover, the plot suggests that students with prior work experience are preferred over freshers. Nearly all of the students who were not placed had no prior work experience, whereas those with work experience fall mostly in the placed section on the right.

From this, students can understand that having work experience before campus recruitment proves beneficial. Thus, they can plan and prepare accordingly for their future.
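The two observations read off the mosaic plot are shares within the not-placed group: how many scored above the 83.5 cut point, and how many had no work experience. A minimal sketch of that calculation is shown below; the rows are invented, and only the 83.5 threshold comes from the plot described above.

```python
# Hypothetical (employability test score, work experience, status) rows.
rows = [
    (91.0, "Yes", "Placed"), (86.5, "Yes", "Placed"), (78.0, "No", "Placed"),
    (88.0, "No", "Not Placed"), (70.0, "No", "Not Placed"),
    (65.5, "No", "Not Placed"), (60.0, "Yes", "Not Placed"),
    (55.0, "No", "Not Placed"),
]

THRESHOLD = 83.5  # cut point read off the mosaic plot in the text

def share(group, predicate):
    """Fraction of rows in `group` that satisfy `predicate` (0.0 if empty)."""
    if not group:
        return 0.0
    return sum(1 for row in group if predicate(row)) / len(group)

not_placed = [row for row in rows if row[2] == "Not Placed"]
above = share(not_placed, lambda row: row[0] > THRESHOLD)
no_experience = share(not_placed, lambda row: row[1] == "No")

print(f"not placed, score above {THRESHOLD}: {above:.0%}")
print(f"not placed, no work experience: {no_experience:.0%}")
```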

Classification Tree :

This classification tree has placement status (placed) as its target. It uses the following parameters:

  1. It is an induced binary tree.
  2. Minimum no. of instances in leaves : 2.
  3. Do not split subsets smaller than : 5.
  4. Limit the maximal tree depth to : 100.
  5. Classification stops when majority reaches 95%.
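These settings act as stopping (pre-pruning) rules while the tree is grown. The sketch below is our own reading of them in plain Python, not Orange's implementation: it assumes a node is not split when it holds fewer than 5 instances, when the depth limit is reached, or when one class already makes up 95% of the node, and that a split is rejected if either child would hold fewer than 2 instances. The function and parameter names are hypothetical.

```python
from collections import Counter

def should_stop(labels, depth, min_split=5, max_depth=100, majority=0.95):
    """Return True if a node with these class labels should become a leaf."""
    if len(labels) < min_split:   # too few instances to split further
        return True
    if depth >= max_depth:        # depth limit reached
        return True
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count / len(labels) >= majority  # node is already nearly pure

def split_allowed(left, right, min_leaf=2):
    """A binary split is valid only if both children are large enough."""
    return len(left) >= min_leaf and len(right) >= min_leaf

# A mixed node of 10 students at depth 1 can still be split.
print(should_stop(["Placed"] * 5 + ["Not Placed"] * 5, depth=1))
```

With only 60 rows, the depth limit of 100 is never the binding rule; the leaf size, subset size and majority settings are what actually shape the tree.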

From the tree, students can get a detailed picture of how placement status depends on the various academic and other factors in the data provided. The tree gives a clear explanation of how the different attributes of a particular student influence their placement status.

This classification tree has salary offered as its target. It uses the following parameters:

  1. It is an induced binary tree.
  2. Minimum no. of instances in leaves : 2.
  3. Do not split subsets smaller than : 5.
  4. Limit the maximal tree depth to : 100.
  5. Classification stops when majority reaches 95%.

From this tree, students can get a detailed picture of how the salary offered to a candidate depends on the various academic and other factors. The tree gives a clear explanation of how the different attributes of a particular student influence their pay.

Vote of Thanks :

I would like to humbly and sincerely thank my mentor, Rocky Jagtiani. He is more of a friend to me than a mentor. The data analytics he taught us, and the various assignments we did and are still doing, are the best way to learn and build skills in the Data Science field.

Recommended: https://datascience.suvenconsultants.com/
