Change takes time: An overview of studying student performance longitudinally

By: Edwin Asare, Ghana Learning & Evaluation Manager and Christopher Stanfill, Ph.D., Director of Learning & Evaluation

Our mission at Pencils of Promise (PoP) is to provide a quality education to the communities we serve by creating safe and healthy school environments staffed with well-trained and supported teachers. PoP’s Teacher Support (TS) program aims to equip teachers with the resources and skills needed to improve the quality of education children receive. Therefore, the organization’s key performance indicators (KPIs) focus specifically on student literacy, as we expect that more effective teaching will lead to improved student performance in passage reading fluency and comprehension.

The Early Grade Reading Assessment (EGRA), our primary tool for evaluating student performance, is an individually administered oral assessment of the most basic foundational skills for literacy acquisition in the early grades. At PoP, EGRA is used as a formative assessment tool to evaluate student performance as an indirect effect of the TS program. These data provide the TS team with one of many sources of information, both qualitative and quantitative, to guide decision making for programmatic improvements.

Until the beginning of this academic year (2018–2019), EGRA at PoP was carried out using a cross-sectional design, which compares populations at a single point in time; in our case, one academic year. Historically, a sample of PoP schools was included in the EGRA evaluation, and each year eight students in each primary grade at each school were randomly selected at baseline (i.e., the beginning of the school year). The cross-sectional design presented a central challenge: different schools and students could be selected for the following academic year, so EGRA results could not be compared between years because the samples varied. As a result, student performance could only be tracked within a single academic year, and students’ progress could not be measured as they advanced through the grades.

Reflecting on our approach, our team aspired to collect data in a manner that would capture the changes in student progress that happen over time, given the complexity of skill development in literacy. This led to the development of longitudinal research designs for Ghana, Guatemala and Laos that track individual students over time. Given the differences between countries and the varying hypotheses associated with the TS program in each setting, the introduction of a longitudinal design offered the opportunity to ask complex and tailored research questions. This new approach will help provide more reliable and valid findings for the following country-specific questions:

  • What is the effect of the TS program on different language groups (i.e., Spanish and Mayan languages) in Guatemala?
  • What is the effect of the TS program on students in Laos as we expand the program into the 4th and 5th grades during the coming years?
  • What is the effect of the variations of the TS program (e.g., TS + e-readers vs. TS + books) on students in Ghana?

Case study: Longitudinal design in Ghana

Beginning this academic year (2018–2019), two groups of students, one entering Grade 1 and one entering Grade 3, will be tracked over the course of four years. This means that PoP will be able to collect EGRA data on the same students over four academic years (Table 1), which will enable us to reliably report on the progress of student performance over time. It also means that in all 20 sampled schools, the entire student enrollment for the included grades will be assessed. This differs from the cross-sectional approach, which tested only eight students per grade during one academic year.

Table 1: Grade progression of each group over four-year testing period

Of the 20 schools sampled, 15 were in the TS program (the treatment sample), while 5 were not (the control sample). The detailed distribution of students by grade and gender is presented in Table 2 below:

Table 2: Grade and gender distribution of students

To ensure that only students who took part at the beginning of the longitudinal study (the Year 1 baseline) are tracked throughout the entire four years, the following rules will be applied:

  1. When Grade 1 students move to Grade 2 and new students are admitted into that class, the newly admitted students will be excluded from the EGRA assessment. The same applies when those in Grade 3 move to Grade 4 and new students have been admitted.
  2. If a student repeats Grade 1 while his/her classmates are promoted to Grade 2, the student who remained in Grade 1 will be excluded from the sample, because we will only test those who move on to Grade 2 in order to track progression. The same applies in subsequent grades and years.
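Taken together, these two rules amount to a simple filter: a student stays in the sample only if they were assessed at baseline and are now in the grade their cohort should have reached. A minimal sketch of that logic (the record format and field names such as `student_id` and `grade` are our illustrative assumptions, not PoP’s actual data schema):

```python
# Sketch of the cohort-tracking rules above, assuming a simple record format.
# Field names (student_id, grade) are illustrative, not PoP's actual schema.

def eligible_students(current_roster, baseline_ids, baseline_grade, years_elapsed):
    """Return students who were assessed at baseline AND advanced with their cohort.

    Rule 1: students admitted after baseline (not in baseline_ids) are excluded.
    Rule 2: repeaters (grade below the cohort's expected grade) are excluded.
    """
    expected_grade = baseline_grade + years_elapsed
    return [
        s for s in current_roster
        if s["student_id"] in baseline_ids and s["grade"] == expected_grade
    ]

# Example: the Grade 1 cohort, one year after baseline.
baseline_ids = {"G001", "G002", "G003"}
roster = [
    {"student_id": "G001", "grade": 2},  # promoted with cohort -> kept
    {"student_id": "G002", "grade": 1},  # repeated Grade 1 -> excluded (rule 2)
    {"student_id": "G999", "grade": 2},  # newly admitted -> excluded (rule 1)
]
kept = eligible_students(roster, baseline_ids, baseline_grade=1, years_elapsed=1)
# kept contains only the record for student G001
```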

We recognize that our full baseline sample from Year 1 is unlikely to be fully represented at the completion of the evaluation cycle (i.e., after four years). However, we are confident that our sample size is large enough to absorb this attrition and still provide sufficient statistical power at the end of four years.

As mentioned, variations of PoP’s TS programming will be evaluated by comparing student performance across three treatment groups:

  • PoP-built schools with TS programming and PoP-distributed e-readers in the classroom
  • PoP-built schools with TS programming and PoP-distributed books in the classroom
  • Non-PoP built schools with TS programming and PoP-distributed books in the classroom

Comparing these three groups to a control group (i.e., PoP-built schools without TS programming, e-readers, or books) will enable us to understand the magnitude of effectiveness of each program design. It will also give us a better understanding of the relative contributions of books versus e-readers, and of a PoP-built school itself, to student learning outcomes.
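At its simplest, that comparison means expressing each treatment group’s average EGRA score relative to the control group’s average. A minimal sketch of that computation, where all group labels and scores are invented purely for illustration (a real analysis would add significance testing and covariate adjustment):

```python
# Illustrative comparison of mean EGRA scores across treatment arms vs. control.
# All scores below are invented for this sketch; they are not PoP results.
from statistics import mean

scores = {
    "TS + e-readers (PoP-built)": [40.0, 42.0, 41.0, 41.0],
    "TS + books (PoP-built)":     [37.0, 39.0, 38.0, 38.0],
    "TS + books (non-PoP-built)": [35.0, 37.0, 36.0, 36.0],
    "Control (no TS)":            [30.0, 32.0, 30.0, 32.0],
}

control_mean = mean(scores["Control (no TS)"])

# Raw difference in means between each treatment arm and the control group.
effects = {
    group: round(mean(vals) - control_mean, 2)
    for group, vals in scores.items()
    if group != "Control (no TS)"
}
# effects maps each treatment arm to its mean-score gap over the control group
```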

It is worth noting that all 15 treatment schools (those in the TS program) are in their first year of the Teacher Support program, while the 5 comparison schools (the control schools) have never been used in a comparison assessment before. This matters because we needed samples that had not been contaminated by previous EGRA assessments.

Patiently waiting

Ghana’s case study provides just a brief glimpse into the exciting changes our teams in all countries are making toward evaluating student performance over time. PoP has always invested in ensuring our data and results truly represent the work being implemented. With the celebration of our 10th anniversary and confidence that we have an intervention that is positively impacting students, teachers and communities, our evaluation strategy is now positioned to increase the reliability and validity of results related to student performance. In line with global best practices, we periodically review our research methods and tools to determine whether our programs are on track to achieve our objective of ensuring that every child has access to quality education. Our team looks forward to opportunities to share our methods and results with peer organizations and other stakeholders in the global education arena.