Dataset for calculus-based introductory physics unit test scores

Benjamin Obi Tayo Ph.D.
Published in Modern Physics · 3 min read · Jun 13, 2018

Having taught calculus-based introductory physics at Pittsburg State University (Southeast Kansas) for 8 semesters, I decided to put together a dataset containing unit test scores and corresponding letter grades for all students who have taken the course. The course, named Engineering Physics (PHYS 104) at Pittsburg State University, is an introductory-level course taken by science, engineering, and engineering technology students. It covers the basics of mechanics, waves, fluids, and thermodynamics, and is offered every semester. To assess student learning outcomes and course objectives, three to four one-hour unit tests are given every semester. The dataset contains only scores for in-class one-hour unit tests and does not include any contributions from out-of-class assessments such as quizzes, homework, and projects.

For the dataset provided, 86% of the students were male and 14% were female. In terms of residency, 87% were domestic and 13% were international. A majority of the domestic students are from the four-state region (Kansas, Missouri, Oklahoma, and Arkansas), while a majority of the international students are from Saudi Arabia, with a few from elsewhere in Asia and from South America. The student data for the 8 semesters from fall 2014 to spring 2018 was exported from the course management website (Canvas). An R script was written to clean and organize the unit test scores and to group them into the corresponding grade categories: A (90–100), B (80–89), C (70–79), D (60–69), and F (0–59). The final data frame containing test scores and corresponding letter grades was saved in a CSV file, which can be downloaded from: https://github.com/bot13956/datasets
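The grouping step described above can be sketched in base R with cut(). This is not the original cleaning script, which the article does not show; the scores below are made up for illustration, and treating fractional scores such as 89.5 as belonging to the lower grade is an assumption.

```r
# Hypothetical scores for illustration only; not taken from the dataset
scores <- c(95, 83.5, 71, 60, 42, 89.5, 100, 0)

# Map each score to a letter grade using the cutoffs listed in the article.
# right = FALSE makes each interval closed on the left, so [80, 90) is a B.
# Assumption: a fractional score such as 89.5 falls in the lower grade.
grades <- cut(scores,
              breaks = c(0, 60, 70, 80, 90, Inf),
              labels = c("F", "D", "C", "B", "A"),
              right = FALSE)

# Combine scores and grades into one data frame
graded <- data.frame(Score = scores, Grade = grades)
print(graded)
```

Using Inf as the last break keeps a perfect score of 100 inside the A interval.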

The dataset may be used freely by anyone interested in understanding how students from Southeast Kansas perform in calculus-based introductory physics. Because it contains both categorical and numerical data, it is also a good choice for anyone looking for data on which to practice and sharpen basic data analysis skills.

Examples illustrating how the dataset could be used

A) Import the dataset as follows:

library(tidyverse)  # attaches readr, dplyr, and ggplot2, among others
data <- read_csv("https://raw.githubusercontent.com/bot13956/datasets/master/introduction_to_physics_grades.csv")
data %>% head(n = 10)  # preview the first 10 rows

B) Generate a bar plot showing the distribution of grades as percentages:

barplot(round(100*prop.table(table(data$Grade))),col = c('red','blue','green','orange','black'))
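To see the numbers that a bar plot like this draws, the percentage calculation can be run on its own. The sketch below uses a small made-up vector of letter grades rather than the downloaded data, so the values are purely illustrative.

```r
# Hypothetical letter grades for illustration; not taken from the dataset
grades <- c("A", "B", "B", "C", "C", "C", "D", "F", "F", "B")

# table() counts each grade, prop.table() converts counts to proportions,
# and multiplying by 100 and rounding gives whole-number percentages
grade_pct <- round(100 * prop.table(table(grades)))
print(grade_pct)
```

Because each percentage is rounded separately, the bars may not sum to exactly 100 on real data.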

C) Generate a probability density plot for the test scores:

data%>%ggplot(aes(Score))+geom_density(fill='pink')
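A per-grade summary is another quick exploratory step that complements the plots. The sketch below assumes the columns are named Score and Grade, as in the examples above, and uses a small hypothetical data frame (called toy here) in place of the downloaded data.

```r
library(dplyr)

# Hypothetical stand-in for the downloaded data; the column names Score and
# Grade follow the examples above, and the values are made up
toy <- data.frame(
  Score = c(95, 88, 72, 65, 40, 91, 78),
  Grade = c("A", "B", "C", "D", "F", "A", "C")
)

# Count, mean, and median score within each letter grade
grade_summary <- toy %>%
  group_by(Grade) %>%
  summarise(n = n(),
            mean_score = mean(Score),
            median_score = median(Score))

print(grade_summary)
```

Replacing toy with the data frame returned by read_csv should give the same summary for the real scores, provided the column names match.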

In summary, we’ve shown how to access the dataset for calculus-based introductory physics test scores and how it can be used for exploratory data analysis. It is a useful dataset for practicing basic exploratory data analysis skills.

Thanks for reading!


Dr. Tayo is a data science educator, tutor, coach, mentor, and consultant. Contact me for more information about our services and pricing: benjaminobi@gmail.com