On Silver Bullets and White Rabbits
Learning Analytics sits at the intersection of Computer Science (drawing on sub-disciplines such as Data Mining, Information Retrieval, Information Visualization, Web Semantics) and Education (e.g. Educational Research, Measurement Science, Learning Sciences, Computer-Supported Collaborative Learning, e-Assessment). In my view it is an educational incarnation of Human-Centred Informatics (the effective design of human/digital information systems) and arguably Computational Social Science where social phenomena and computational modelling meet (elegantly introduced by Hannah Wallach in her recent Medium posting, and explored in relation to Complexity Science elsewhere). Extending several recent talks (e.g. EdMedia2014) this post introduces some of the big questions I see arising around the discourse of “educational big data.”
I was speaking at an event called Codes Acts in Education, and asked Siri to find me the website so I could check the agenda. Siri cunningly seized the moment to generate a pun on the very topic of my talk — the fears that many have around the impact should the automated algorithmic analysis of educational data take off.
This tweet sounded a cautionary note on behalf of those for whom the rhetoric around data, algorithms and analytics is (variously) pedagogically vacuous, blatant marketing, or driven by misguided accountability agendas (all of the above options may be checked!).
In the very process of trying to value certain learning qualities by tracking them, will we in fact distort or even destroy a living, organic system, through clumsy efforts to categorise and quantify?
Of course, we don’t need computational analytics to take this worry seriously. Concerns about the damage to learning that can be wreaked by inappropriate assessment regimes long precede Big Data. As chair of governors in an English primary school, working closely with the Head and senior team, I saw the pressure that teachers experience to maintain their visibility in the Department for Education’s quantitative analytics of whole-school performance (RAISEonline). Quite apart from the pressure the teachers and 9–10 year olds experience, if schools drop too low in the stats for a couple of years, Headteachers typically lose their jobs as part of the standard ‘improvement measures’. Not exactly a warm invitation to take risks and experiment creatively (but responsibly) with new ways to engage learners in challenging contexts. School principals need courage and support to break from the instrumental test-driven mindset, and it’s good to see people like Anya Kamenetz making accessible to wider audiences the critical debate and evidence base that now testifies to the damage that such assessment regimes are causing.
Let me tell you about 10 year old “Joe”. Joe’s attendance is at 97%. When his curiosity is sparked his appetite to learn is insatiable. He’s mentoring peers thanks to his good interpersonal skills. When he gets stuck, he’s learnt not to panic, but has figured out ways to push through.
Here’s the class progress visualization of English Key Stage 2 SATs. Joe is presumably one of those light blue Level 4 good progress avatars, or perhaps even a green Level 5+?
Unfortunately not. While Joe is making faster progress than some of his more academic peers who are cruising and not stretching themselves, he is that purple Falling Behind blob.
When Joe started at school, he would often fall asleep in class. He could be aggressive to staff and peers with very little provocation. We’d often have to give him breakfast, or a clean shirt. His reading age was 6 months behind. But it’s no wonder. He was often up several times a night feeding his baby sister while his mother was in a drunken sleep. He was picked up by the police recently on the streets at 3am, on his way to collect drugs for her.
Through their commitment to Joe, the school was able to provide a stable weekday environment (picking up the pieces on Monday after a turbulent weekend), and cultivate a set of qualities that have transformed his attitude. He actually likes learning now. He’s learnt that when something is new and difficult, that’s what learning feels like. He’s learnt that asking good questions is as important as knowing the right answer. He’s learnt what to do when you don’t know what to do. Not all pupils get this (even the ‘high achievers’). Indeed, not all teachers get this. We’d still like to see Joe’s reading and numeracy improve, but we’re winning the strategic battle — he now wants to learn. “Joe” is a composite of some of the children we worked with, and his story will be painfully familiar to many school staff. If you’re in tertiary education, substitute him with one of your students from a tough background but who is resilient enough to have made it into college, still has the right attitude given the opportunity and support, but is still fragile.
This vignette reminds us that the data points in a graph are tiny portholes onto a rich human world, and encapsulates some of the concerns that educators have about the misuse of blunt, blind analytics: proxy indicators that do not do justice to the complexity of real people, and the rich forms that learning takes.
In all sectors, however, progressive practitioners and researchers are emphasising the need to instill higher order competencies in learners to complement the conventional indices. Yet these qualities are presumably even tougher to quantify in a meaningful way.
This is the highly charged social, organisational, political, educational context that Learning Analytics enters, and it must do so with eyes wide open.
While we’re used to declarations of new silver bullets on the airport bookshelves, it’s extraordinary to see serious researchers get so myopic about a new technology that they follow suit, and even manage to get this past the editorial control of serious publishers — it does happen, although thankfully not yet in Learning Analytics.
However, as K-12, higher education institutions, and the associated government departments wake up to data, they are now seen by business intelligence (BI) companies as exciting new markets ready to hear how they can be transformed by Data and Analytics. The lure of the dashboard which shows at a glance how a student, department, institution, region or nation is doing, holds a deep appeal.
Indeed, to the extent that schools and universities are businesses, they can benefit from the sorts of optimisations that BI brings other enterprises. Moreover, there can be no complacency on the part of educational institutions about the risk of educational disruption by analytics-intensive businesses (who now go far beyond traditional BI vendors). Educational startups make mistakes at a furious rate, making them the object of scorn by some academics (“Haven’t they read the literature?!”). They also learn at a furious rate, a design prototyping approach that puts educational institutions to shame.
So no room for complacency as edtech startups rev their engines, but just as we interrogate the assumptions and biases underpinning computational models of other human phenomena (economics; epidemics; crime; migration…) we need to ask how an algorithmic mindset shapes our conception of learning: what assumptions about learning are made in the selection of data, the setting of thresholds, the selection of advice, recommendation or adaptation of curriculum? And who is supposed to make sense of the dazzling dashboards embedded in every e-learning product pitch? If we are to govern algorithms (and not vice-versa), who is equipped to ask questions in the right way, get the attention of those who can answer, and make sense of the responses?
Follow the white rabbit
It’s here that our friend the rabbit has some provocations to offer.
Some worry about Learning Analytics as the Alice in Wonderland Rabbit — an alluring, hard to pin down promise whose ROI is down a frustratingly long, dark hole (but hopefully the next upgrade will fix it…). Sales hype guilty of technological solutionism does not help of course, with so many organisations experiencing less than the promised delights of new information systems.
It’s relatively early days, and financial analyses of the ROI on Learning Analytics are hard to come by at present. Our current educational paradigms and accounting systems make it easy to operationalise $$student in terms of course enrolments. Consequently, one form of analytics attracting a lot of interest is the use of predictive models for identifying ‘at risk’ students based on behavioural data: if an intervention programme increases the student completion rate, that has direct monetary value. Models are validated on historical data of student dropouts, which gives statistical confidence that their deployment on live student data detects genuinely ‘at risk’ students.
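To make the validation step concrete, here is a minimal sketch of checking an ‘at risk’ rule against a past cohort whose outcomes are already known. Everything in it is an assumption for illustration: the record fields, the thresholds, the toy rule and the six invented records do not come from any real institution or product, and a deployed system would use a fitted statistical model and far more data.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    """One student from a past cohort (hypothetical fields)."""
    logins_per_week: float
    assignments_submitted: int
    dropped_out: bool  # known outcome, available only for historical data


def flag_at_risk(record, min_logins=2.0, min_assignments=3):
    """Toy rule: flag a student who is low on both activity measures."""
    return (record.logins_per_week < min_logins
            and record.assignments_submitted < min_assignments)


def validate(records):
    """Compare the rule's flags with what actually happened."""
    tp = sum(1 for r in records if flag_at_risk(r) and r.dropped_out)
    fp = sum(1 for r in records if flag_at_risk(r) and not r.dropped_out)
    fn = sum(1 for r in records if not flag_at_risk(r) and r.dropped_out)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"precision": precision, "recall": recall}


# Invented historical cohort, purely for illustration.
historical = [
    StudentRecord(0.5, 1, True),
    StudentRecord(1.0, 2, True),
    StudentRecord(1.5, 2, False),  # flagged, but completed the course
    StudentRecord(4.0, 1, True),   # dropped out, but was never flagged
    StudentRecord(5.0, 6, False),
    StudentRecord(3.0, 5, False),
]

metrics = validate(historical)
print(metrics)
```

Precision asks how many flagged students really did drop out; recall asks how many of the students who dropped out were flagged at all. Every miss in either direction is a student like Joe, which is why the choice of behavioural data and thresholds deserves as much scrutiny as the headline accuracy.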
Some of the most promising efficiency results also seem to be in the rate at which formally modelled curriculum and skills can be mastered through personalised adaptive tutors and educational games (there’s a significant research literature on this now in the AI in Education community). It may be that adaptive platforms can release pressured curriculum time for other modes of learning that remain beyond the scope of the student modelling algorithms.
This brings us to Learning Analytics as a Magic White Rabbit — in which we ooh and aah when out of the black analytics hat pops a delightful surprise — but nobody’s quite sure where it came from…
Sounds like a non-starter as a sales pitch to a school or university, surely. As Candace Thille (Stanford) has provocatively put it, what educational institution would outsource key core competencies (learning design, assessment, feedback) to an ed-tech analytics platform, without knowing inside-out what was in the black box?
But just a minute. Every day we put our trust in black boxes we don’t understand. Society manages this through accredited professionals who can see inside the boxes and explain to mere mortals (up to a point) what’s going on: in our car, or in a medical test, or why the bank sent us the wrong automated letter.
Black box algorithms certainly seem to be accepted by some educators who have positive experiences with student ‘at risk’ early warning systems using predictive modelling, or with adaptive tutoring environments. Indeed, if they were shown the models, many wouldn’t understand them without a maths and machine learning tutorial, and reading the background research publications. The pragmatic argument says that overstretched educators just want tools that work for them and for students: they care most that it is treating students in a manner which they judge to be appropriate — just as you do with your thermostat or phone.
So should we be content as long as somebody in the school or university is able to explain how the magic works? What if that person isn’t actually in the institution, but in the company you’ve outsourced to? What if no-one’s quite sure why it’s behaving like it is — because it’s been learning autonomously for the last 3 months — but it seems to be doing a great job?
We are of course now in the realms of the oldest of AI dilemmas. But this is hardly Isaac Asimov science fiction. As we saw recently with Facebook’s end of year review algorithm, it had all sorts of assumptions and values baked into it. So the question of how one validates an algorithm cannot escape values-laden issues: it’s flawed if it merely replicates flawed human judgement.
We might look to the open source software movement for important principles and ways of working to ensure transparency around Open Learning Analytics — so that when detailed questions are asked by knowledgeable people, genuine answers can be provided, and when deep changes are needed they can be made. How companies who feel they have IP to protect will respond to questions around transparency remains to be seen. Perhaps they’re banking that clients will enjoy the magic show, and not want to look behind the curtain. Or perhaps they will be caught out by an upsurge in data literacy which makes such a position untenable.
Questions must also be asked about whether part of a university’s mission is to help students discern their calling, which may include discovering that they are on the wrong course; whether a focus on course completion comes at the risk of ignoring deep learning in favour of passing the tests; or whether a recommendation engine based on the historical habits of most students who’ve passed threatens individual innovation and creativity.
Learners will leave an increasingly rich “digital shadow” — but as Plato reflected in his shadowy cave, this is but a pale imitation of vibrant reality. Learning Analytics as a Shadow Rabbit reminds us that a digital footprint necessarily reveals only a filtered record of the rich context in which a complex human took that step. The same step might be taken by different learners for different reasons: the algorithmic hope is that your preceding and subsequent paths reveal enough of your intent or state of mind in order to enable the software to do something useful, or alert a mentor to assess the situation and step in if required.
What we thought was just a shadow is now exercising agency… When the Rabbit Talks Back, we’re reminded that the digital shadow is no longer a passive, causally determined rendering. It is not only getting higher and higher in resolution, updated in real time, but is exercising agency. Data visualizations — especially if they are endorsed by those in power — will shape how we see the world and how we act. Recommendation engines will exploit limited human attention to place certain resources at the top of the page.
In The Matrix, Neo is instructed to follow the white rabbit. This rabbit (and a very odd under-the-counter pill) will reveal to him that what he took to be reality was in fact a digital mirage. This mirage was designed to stop people from asking the real questions, and to lead them to treat the digital map as the territory.
Treating digital renderings as actionable intelligence about reality is of course the ultimate promise, and risk, of all analytics. Bowker and Star help us sort things out. Their critical analysis of how we evolve information infrastructures is a sobering reminder of the compromises that are always made in the processes of abstraction that are required to construct standardised schemas for data processing (often for accountability purposes):
No coding scheme, no analytics. So what do we choose to forget about our learners, and how should we think about the role of computational analytics in learning?
The blessing and the curse…
The reflexiveness of humans is both a blessing and a curse.
Just as digital data and computational analysis have transformed fields such as genetics, astronomy and high energy physics, educational researchers and practitioners have reason to be intrigued by the opportunity to analyse authentic user data at scale, at pace. The tricky point is that the BRCA2 gene, red dwarf stars and the Higgs boson do not hold strong views on being computationally modelled, or who does what with the results. Learners’ awareness that their shadow has a 24/7 audience (both human and increasingly machine) could easily lead them to distort their behaviour, or game the system. For an educator or researcher trying to get a robust measure of change, this might be considered a curse (depending on their epistemology).
On the other hand, the blessing is that if high stakes are associated with demonstrating particular behaviours, evidenced by certain kinds of log files, then no matter how sophisticated the user experience, we can be sure that smart people will invent ways to hack the system. We should be grateful for that, to the extent that we believe that it’s dangerous to attach high stakes outcomes to (always limited) computational models of human behaviour.
This throws us back on the question of what learning analytics are seeking to model, and who gets to interpret and act on them.
This in turn raises the question: what kinds of learners are we trying to nurture? Let’s get clear where we want to go, and then we can talk about selecting and tuning engines, chassis and dashboards.
VUCA / Liminal Space
Volatile. Uncertain. Complex. Ambiguous. These are the conditions we now confront in all sectors, and the educational world is struggling to adapt, such is the systemic inertia to change. We should now be designing our educational systems to build learners who can not only survive, but thrive under unprecedented conditions of complexity and uncertainty.
This goes deep, since we’re talking about liminal space at many levels of the system. A favourite quote is from Richard Rohr, writing at the time about the post-9/11 disorientation in the US, but more broadly, about how we train ourselves emotionally and spiritually to tolerate, and navigate the unknown:
Liminal space is being developed as a concept for educational practice, for instance Johansson & Felten:
The Knowledge–Agency Window (p.71) from Ruth Deakin Crick helps us orient to this new landscape. Different kinds of analytics may help in different parts of this design space. But we can no longer sit primarily in the lower left cell, for which our dominant educational apparatus is tuned. Educators must learn to move fluidly around this space.
My interest is in analytics for the top right quadrant, and future posts will consider how we can prototype analytics suited to building these qualities.
Update, Sept. 2016: For examples of how some of these ideas are playing out, see this edited volume on LEARNING ANALYTICS FOR 21ST CENTURY COMPETENCIES