We’d just invited a few thousand students to try our latest experiment in online open-ended response activities — a controlled trial with pre- and post-test questions focused on concept transfer. We started reviewing their work, but the data from the control group stopped us in our tracks. We got page after page of responses like this:
This kind of answer wouldn’t be so surprising, except that all of these students had just solved four numerical “find the area of this polygon” problems perfectly, including one with exactly this shape.
We were testing an activity in which students would interact with each other’s open-ended explanations and ideas. They’d learn about finding the area of polygons by dissecting them into simpler shapes.
Our control group watched a video lecture on the subject, then worked through typical textbook-style exercises, like these:
Plenty of students solved every one of these exercises correctly. Typical standardized tests might suggest that these students had mastered shape dissection. But a third of those students couldn’t give even a partial answer to the “explain why” question above; another third gave poor or incomplete answers.
We didn’t design our experiment to investigate this question: we included these area-finding problems as a “dummy” activity to parallel the richer discussions in our experimental group. It may be that the “explain why” prompt demands more algebraic knowledge than the numerical problems do. Besides, our experimental population is not necessarily representative.
But the stark gap between the procedural and conceptual assessments here still shook me.
Have you ever gotten an “A” in a class but felt you didn’t really understand the material? That seems to be exactly the disparity we’re seeing here. How many numeric-input or multiple-choice tests would exhibit the same problems?
We’ve long discussed our suspicions around scores taken from this kind of problem; we didn’t expect to confront such jarring examples so directly!
Originally published at klr.tumblr.com.