Do Brief Educational Interventions Work?
Here’s what the research shows
Education is always looking for quick, cost-effective ways to boost student attainment. The problem educators face, however, is that having a significant impact on student outcomes is hard: very little seems to move the needle in the desired direction for large swaths of students.
Often schools will turn to what I call “brief educational interventions”, which are employed from K12 through higher ed. A brief educational intervention is often a short (~30 min) computer-based prompt activity that students complete. Learning styles, growth mindset, plan making, goal setting, value affirmations, and belonging are all examples of interventions that are hoped to improve student outcomes.
These brief interventions are easy to implement, cheap to execute, and make people feel good for doing something. But what do we know about them from a research perspective? Are they effective? On the whole, not really, no. Despite educators intuitively knowing this, there seems to be an unrelenting motivation to solve problems with brief interventions that constantly overpromise and underdeliver.
Over the last couple of years, a few big papers have revealed what I see as three interesting and important characteristics of research on brief educational interventions.
Research Is Underpowered
Stanley et al. (2018) published an impactful and telling paper on the state of statistical power across different areas of psychology, educational psychology included. In this paper, the authors used meta-analysis (a statistical technique for combining data from numerous studies) to estimate the average statistical power (the probability that a study yields a significant effect of a particular size, assuming the effect actually exists) of the research summarized in 200 meta-analyses published in Psychological Bulletin, the premier journal for publishing meta-analyses in psychology.
The results showed that most areas of psychology are severely underpowered and unlikely to be able to detect real effects in observational and experimental research. The authors estimate that only 8% of psychology studies are adequately powered (defined as achieving at least 80% power), with the average power of psychology studies being only 36%.
Although education research fared quite well relative to the other areas of psychology, fewer than 50% of its studies achieved adequate statistical power. The educational studies included in this particular paper were all observational, meaning that they were not intervention studies.
However, the authors noted an interesting pattern in the data: areas of psychology where experiments are dominant, such as social and cognitive psychology, had the lowest overall power in their research, whereas highly observational fields, such as behavior genetics and education, had relatively higher average power. If observational education studies achieve adequate power less than 50% of the time, it’s likely that experimental educational studies have worse power overall.
And they do.
In a meta-analysis of 141 educational trials, the average power was a meager 22-25% across studies.
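To make these power figures concrete, here is a minimal sketch of how power depends on effect size and sample size. It assumes a simple two-group comparison of means with equal group sizes and uses a normal (z-test) approximation. The function name `approx_power` and the sample sizes in the loop are illustrative choices, not anything taken from the papers discussed here, and real educational trials (often randomized by classroom or school) would need a more careful calculation.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Normal approximation with equal group sizes. effect_size is the
    standardized mean difference (in standard deviations).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)            # two-sided critical value
    shift = effect_size * sqrt(n_per_group / 2)  # expected z under the effect
    # Probability of landing beyond either critical value.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Illustrative grid: hypothetical effect sizes and group sizes.
for d in (0.06, 0.20, 0.50):
    for n in (50, 500, 5000):
        print(f"d={d:.2f}, n per group={n}: power ~ {approx_power(d, n):.2f}")
```

Under these assumptions, power climbs quickly with sample size for medium effects but stays low for very small effects unless the groups are very large.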
Effects Are Small
That same meta-analysis of 141 studies mentioned above showed another concerning characteristic of educational trials: the effect sizes are small. An analysis of trials aimed at increasing K12 student attainment, commissioned by the UK Education Endowment Foundation and the US National Center for Educational Evaluation and Regional Assistance and involving more than 1.2 million students, showed that these randomized controlled trials (RCTs) had an average effect size of .06 standard deviations.
To give you an idea of what that difference looks like, see the graphic below. The smallest effect size shown is .20, meaning that in the education context there would be an 83% overlap in the distributions of academic outcomes for students in the control and experimental groups. The average effect size estimated in the meta-analysis is .06, so the difference between control and intervention groups is minimal. The authors conclude: “Given the significant level of educational research funding currently being spent on rigorous large-scale RCTs, it is clearly unsatisfactory that so many trials are uninformative.”
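One way to see why small effects make trials uninformative is to ask how many students a trial would need in order to detect them reliably. The sketch below uses the standard normal-approximation formula for a two-group comparison with individual-level randomization. The function name `n_per_group` and the 80% power and 5% significance defaults are my assumptions for illustration, not figures from the meta-analysis.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, power=0.80, alpha=0.05):
    """Approximate students needed per group to detect effect_size.

    Assumes a two-sided test, equal group sizes, and randomization of
    individual students (cluster designs would need more).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for significance
    z_power = z.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


for d in (0.06, 0.20):
    print(f"effect size {d:.2f}: about {n_per_group(d)} students per group")
```

Under these assumptions, an effect of .06 calls for more than 4,000 students per group, compared with roughly 400 per group for an effect of .20.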
But do small effects mean that the impact of the intervention is meaningless? Maybe not. It comes down to return on investment. If schools are paying an arm and a leg to train teachers in growth mindset techniques, for example, but the effects are small, is it worth it?
Some say yes.
For example, Carol Dweck, the founder of the growth mindset movement in education, makes explicit the potential impact of small effects in her 2019 paper detailing a large-scale growth mindset trial, which found an effect size of .11 on high school students’ GPAs: “The model estimates that 5.3% of 1.5 million students in the United States per year would be prevented from being ‘off track’ for graduation by the brief and low-cost growth mindset intervention, representing a reduction from 46% to 41%, which is a relative risk reduction of 11%” (p. 366).
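The arithmetic behind that quote can be checked in a few lines. This is only a sketch using the rounded figures as quoted. The variable names are mine, and the paper’s own model presumably works from unrounded estimates.

```python
# Figures as quoted above (rounded in the source).
baseline_rate = 0.46           # share of students "off track" without the intervention
treated_rate = 0.41            # share "off track" with the intervention
students_per_year = 1_500_000  # students in the quoted population
share_prevented = 0.053        # quoted share prevented from being "off track"

absolute_reduction = baseline_rate - treated_rate
relative_reduction = absolute_reduction / baseline_rate
students_prevented = round(share_prevented * students_per_year)

print(f"Absolute reduction: {absolute_reduction * 100:.0f} percentage points")
print(f"Relative risk reduction: {relative_reduction * 100:.0f}%")
print(f"Students affected per year: {students_prevented:,}")
```

A small standardized effect can therefore translate into tens of thousands of students per year when the intervention is cheap enough to deliver at national scale, which is the return-on-investment argument in a nutshell.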
Interventions Don’t Generalize
The final characteristic of brief educational interventions worth noting is that they don’t generalize well across contexts. A blockbuster paper published in 2020 used Massive Open Online Courses (MOOCs) offered by Harvard, MIT, and Stanford to embed numerous brief educational interventions into 247 courses taken by 250,000 students over 2.5 years, in order to evaluate whether the interventions affected course completion rates.
The result? No single intervention had beneficial effects on course completion rates across courses and countries, and the effect sizes were in some cases an order of magnitude smaller than initial pilot-like studies suggested. Instead, the effects of interventions varied by country type (e.g., individualistic, developed), or whether the intervention occurred in the first year or the second year of the study. Interventions such as “plan making” only increased course engagement for the first week but had no long-term effects. Overall, the results show that noise and small, inconsistent effects that are highly context dependent are the norm in educational intervention work.
The overall finding is relatively disappointing in that there is no blanket solution intervention that can be successfully applied across diverse learners and contexts. Rather, context matters, and scaled-up research has the benefit of investigating what interventions work under what conditions. Put differently, education interventions will likely be more successful if prescribed based on specific conditions. The question is how to do that. As the authors conclude: “We encourage greater focus on the characteristics of different contexts that induce variation in the effects of interventions to advance the development of a science of context in education. In a new paradigm, the question of ‘what works?’ would be replaced with ‘what works, for whom, right here?’” (p. 14904).
Education is a complex ecosystem: we’re highly unlikely to find a ‘silver bullet’ intervention to increase student attainment. The brief intervention approach to education is unlikely to solve embedded problems in the way we approach education, the incentive structure against teaching excellence, and inequitable public school systems.
Education research needs to identify fruitful strategies that improve learner outcomes at scale. My perspective, based on the state of the research, is that brief educational interventions are not the best path forward. No research that I’ve come across has combined a rigorous design with large, generalizable effects. The reality is that in the case of brief educational interventions, you can have only one.
Do these "interventions" generally not involve learning strategies? I do understand that learning styles seem to make no difference according to research, but what if the intervention teaches students to use flashcards and introduces concepts like spaced repetition, recall, interleaving, etc., which have been shown to help?