You know how it is: you spend months waiting for the next global summit on evidence, and then when the invitations arrive they’re all scheduled at the same time. Recent weeks saw two held in London, and a few people missed them because of a rival summit in the USA! Poor timing, but the headcount at each was a heartening indicator of the level of interest in experimental evidence. However, these conferences are not just a lot of back-slapping among fans of randomised controlled trials (RCTs). Indeed, they have given us randomistas a lot to chew on, particularly off the back of a thought-provoking critique of RCTs from Nobel Prize winner Angus Deaton.
The first of the two summits was focused more on the ‘supplier side’. The What Works Global Summit 2016 brought together many from the academic and research community, particularly from the Campbell and Cochrane collaborations that specialise in systematic reviews. The second, Evidence Works 2016 organised by Results for America and the Alliance for Useful Evidence, catered more to users, with representatives from a large number of governments across the world.
One of the sessions, chaired by Geoff Mulgan and myself, and organised by the Cabinet Office What Works team with the Alliance, sought to make progress on cross-national collaboration around the commissioning and sharing of systematic reviews and evidence. Currently, regional and national governments often unknowingly commission similar reviews in parallel, with each of them separately trawling the same international evidence.
Yet it was striking that many foundations and governments thought it would be nigh impossible for them to enter into a collaboration to jointly commission such reviews. They just wouldn’t get it past their procurement folks, they explained. This is crazy. Fortunately, some government commissioners are embracing such opportunities. Our Department for Education, for example, recently announced the commissioning of a What Works Centre for children at risk, to help guide the difficult decisions that children’s social workers (or ‘child protection’ workers in many countries) have to make. The centre will start its work with a series of systematic reviews, seeking to answer questions that no doubt many other nations and cities will be asking. They would be delighted to work with those interested in jointly commissioning similar reviews, so let them know if you would like to join forces – it’s a chance to save money and get a more extensive review at the same time!
But what about the ongoing rumble about RCTs? Perhaps it was inevitable that with the spread of RCTs from medicine into so many other fields there would be a bit of push-back. Angus Deaton’s lecture last Thursday, on ‘Understanding and misunderstanding RCTs’, was a nuanced critique based on his recent paper with Nancy Cartwright. Angus has long expressed reservations about RCTs – he bent my ear about them regularly when we worked together on the Legatum Commission on Wellbeing – but now that he’s got a Nobel Prize and a Sir in front of his name we’re all taking him even more seriously than we did before.
A number of his reservations will be familiar enough, such as the risks of generalising from one sample population to another, or when randomisation is far from random. We should keep a close eye on one of his concerns in particular: how RCTs behave when treatment effects vary across trial participants. For example, consider a medical treatment that would greatly help, say, 1 in 100, but leave most unaffected or even mildly worse off. The result of an RCT in such a situation is highly unstable – essentially it depends on how many of the handful of strong responders happen to land in the treatment rather than the control group. In such circumstances, conventional sample-size and standard-error calculations break down – suddenly the group that really matters is not the 1,000 people in your sample, but the 10 or so who are highly responsive.
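This instability is easy to see in a toy simulation (a hypothetical sketch, not from Deaton’s paper: the effect sizes, sample of 1,000 and 1-in-100 responder rate are illustrative assumptions). We repeat the same trial many times and watch the estimated effect swing around, driven almost entirely by where the few responders happen to land:

```python
import random
import statistics

def simulate_trial(n=1000, responders=10, big_effect=10.0,
                   small_effect=-0.1, seed=None):
    """One hypothetical RCT: 1 in 100 respond strongly, the rest are mildly worse off."""
    rng = random.Random(seed)
    # Individual treatment effects: a handful gain a lot, everyone else loses a little.
    effects = [big_effect] * responders + [small_effect] * (n - responders)
    rng.shuffle(effects)
    # Randomise half the sample to treatment.
    treated = set(rng.sample(range(n), n // 2))
    treat_outcomes, control_outcomes = [], []
    for i, eff in enumerate(effects):
        noise = rng.gauss(0, 1)  # baseline outcome variation
        if i in treated:
            treat_outcomes.append(noise + eff)
        else:
            control_outcomes.append(noise)
    # Difference-in-means estimate of the treatment effect.
    return statistics.mean(treat_outcomes) - statistics.mean(control_outcomes)

# The true average treatment effect is almost exactly zero...
true_ate = (10 * 10.0 + 990 * -0.1) / 1000  # = 0.001
# ...but across repeated trials the estimate is wildly unstable,
# depending on how many responders end up in the treatment arm.
estimates = [simulate_trial(seed=s) for s in range(500)]
print(f"true ATE:          {true_ate:.3f}")
print(f"mean of estimates: {statistics.mean(estimates):.3f}")
print(f"sd of estimates:   {statistics.stdev(estimates):.3f}")
```

The spread of the estimates dwarfs the true average effect, so individual trials will routinely report ‘significant’ benefits or harms that say more about the luck of the randomisation than about the treatment.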
Most critiques of RCTs can be addressed by careful attention to method, such as making sure that randomisation is performed carefully, looking out for spillover effects into the control group and so on (this recent paper by Roland Fryer provides some further pointers). But some issues, like extreme heterogeneity of response, really do require careful thought, and sometimes should drive you to look to other evaluation methods.
Ultimately, Angus is not advocating against the use of RCTs. But he is cautioning against the idea that they are inherently superior to other techniques. The best evaluation always has and always will draw on wider theory and the broader evidence base, and at the very least, RCTs normally need to be accompanied by a rationale for why the result can be applied outside of the very particular circumstances of the original trial.