On utility of RCTs, Academic Research, and Development

[Gulzar Natarajan has two posts on RCTs, academic research, and development. I commented on the post and Gulzar has replied. I am posting my response to Gulzar's reply here because the comment box wouldn't accept my comment due to its length.]

1. There are different types of RCTs and we need to separate them for analysis. Briefly, they can be categorized as a) RCTs that test hypotheses about binding constraints (Is the problem with the teachers or the pedagogy?); b) RCTs that test programmes or interventions; c) RCTs that test the cost-effectiveness of programmes.

Each of them has different applications.

2. For policy making, the first type of RCT (those that test hypotheses of binding constraints) is extremely useful because it helps us do a systematic 'first-principle' analysis to weed out competing hypotheses regarding binding constraints. An individual RCT may not seem useful here, but a collection of them can be.

A first-principle analysis of an education system can look as follows:

Let's consider a classroom. What's the issue here? Maybe the teacher doesn't have information on the students' level of learning. Now you refer to an RCT (the one done in AP) that tests an intervention on this and finds that giving diagnostic information to teachers doesn't improve outcomes. Then you revise your prior - OK. If this is not the reason, what else is?

Someone might say - maybe the technical know-how (using Deaton's term) of the teacher's pedagogy is the reason. Now, you consider a pedagogy that is good on technical know-how (proved through an RCT - Pratham's TaRL) and implement it in a classroom. You find that there are no gains in outcomes even with a pedagogy that's good on technical know-how.

You then say - maybe the binding constraint is NOT the availability of technical know-how but the teachers (human agency). You then refer to an RCT (Pratham's Bihar RCTs) where the same government teachers teach using the same pedagogy (good on technical know-how) in two different settings - one within the traditional classroom during the academic year and one during summer (outside the usual constraints). You then find that the same government teachers are effective outside the academic year but not during usual school time.

With all these, you infer that the binding constraint is not necessarily the availability of technical know-how, nor human agency, but has something to do with the structure within which the teachers work.

You thus zero in on the structures within which the teachers work - therein comes the argument of state capacity as the binding constraint. (This is the line of argument in my book).
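
The elimination above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not an analysis of the underlying studies: the names `Evidence`, `HYPOTHESES`, `EVIDENCE` and `surviving_hypotheses` are hypothetical, and the findings are paraphrased from the paragraphs above. A hypothesis that survives is not proven; it is merely the one not yet ruled out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str      # study as named in this post
    rules_out: str   # candidate binding constraint the finding argues against
    finding: str     # paraphrase of the finding as described in this post


# Candidate binding constraints, in the order they are considered above.
HYPOTHESES = [
    "teachers lack diagnostic information",
    "technical know-how (pedagogy) is unavailable",
    "teachers (human agency)",
    "structures within which teachers work",
]

# Evidence as summarized in this post (assumption: one finding per hypothesis).
EVIDENCE = [
    Evidence(
        source="AP diagnostic-information RCT",
        rules_out="teachers lack diagnostic information",
        finding="giving diagnostic information to teachers did not improve outcomes",
    ),
    Evidence(
        source="Pratham's TaRL in regular classrooms",
        rules_out="technical know-how (pedagogy) is unavailable",
        finding="a pedagogy known to work still produced no gains in the classroom",
    ),
    Evidence(
        source="Pratham's Bihar RCTs",
        rules_out="teachers (human agency)",
        finding="the same teachers were effective in summer but not in school time",
    ),
]


def surviving_hypotheses(hypotheses, evidence):
    """Return the hypotheses that no piece of evidence argues against."""
    ruled_out = {e.rules_out for e in evidence}
    return [h for h in hypotheses if h not in ruled_out]


if __name__ == "__main__":
    for e in EVIDENCE:
        print(f"{e.source}: {e.finding} -> drop '{e.rules_out}'")
    print("Still standing:", surviving_hypotheses(HYPOTHESES, EVIDENCE))
```

The point of writing it this way is that each hypothesis is dropped only when a specific study argues against it, which is what a collection of RCTs makes possible.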

This kind of systematic first-principle analysis is useful because it helps us be clear in our thoughts and understand the context better. It also helps avoid what I called 'experts' parochial world views'. It often happens that one cites as binding constraints only those things in which he/she is an expert. For example - a pedagogy expert argues that 'pedagogy' is the binding constraint, and so on. They often refuse to look beyond that. In this process, competing hypotheses for binding constraints emerge. (BTW - I used the para teachers example because I recently saw two debates on Lok Sabha TV where this was repeatedly being pointed out by the panelists. I only intended to use it as an example of the phenomenon of false traps regarding binding constraints.)

A systematic first-principle analysis as above, facilitated by knowledge of RCT evidence, helps us peel off these competing hypotheses and get to the core of the problem. RCTs have made such first-principle analysis possible, if not for policy makers, then at least for others.

In the absence of such systematic first-principle analysis, policy makers end up being victims of pedagogy experts (who are traditionally considered educationists) and parachute complex pedagogies into classrooms, which only backfire. Unfortunately, this is a recurring phenomenon.

3. Examples of governments imbibing lessons from RCTs: At the outset, I would like to point to two examples, the scale-up of Pratham's TaRL and deworming programmes, but the issue is deeper.

The 2nd type (RCTs on interventions) and the 3rd type (RCTs on cost-effectiveness) mentioned above have structural limitations. Only those RCT papers that have shown results "across contexts" are publicized and taken up with governments.

The messy nature of development by definition means that there will be very few examples of interventions that have worked across contexts.

The USP of Pratham's TaRL is that, even with the given constraints and the given level of state capacity, gains are still possible if you tweak the style of teaching by grouping kids. Hence, it shows impact across contexts, even low-capacity ones.

The other advantage of the RCTs on Pratham's TaRL is that, for the first time, they questioned the arguments rooted in the philosophy of education, whose proponents were vehemently against separating children by ability even at the initial levels. Even after this evidence, some are still against it, but RCTs have certainly weakened their position.

The deworming example is an instance of the 3rd type of RCT, which pushes governments to take up an action by showing value for money. If we think about it, one can say: giving pills to kids with worms is a no-brainer - if kids have worms, why don't you just give pills? Why do you need an RCT? It's not so simple because, despite such clear logic, governments hadn't taken up such programmes. The cost-effectiveness evidence from RCTs encouraged governments to take them up.

4. RCTs vs. other studies: We have to be careful here again. We need to separate the number of studies from their influence. If we just consider the number of studies, RCTs vs. other types is not a zero-sum game. Growth in RCTs needn't stop people from doing ethnographic studies. RCTs just add to an existing variety of papers and don't necessarily displace others.

Duflo also points this out in one of her lectures, citing numbers on trends in economics papers which suggest that RCTs did not displace other papers but just "added on" to the existing research of other varieties.

Coming to the influence of the studies, rigour is definitely one aspect that makes RCTs seem popular. But, more importantly, a whole institution has been built around RCTs whose only job is to publicize this evidence. Hence, they seem more popular.

5. Though a minor point, RCTs are not a lazy way to publication. There's huge risk and effort involved in carrying them out, and there's a good probability of failure due to execution issues. In fact, Chris Blattman advises people not to do RCTs for a Ph.D. thesis because of the risks involved and the good probability of failure.

Though there are many seemingly spurious RCTs coming up these days, the ones addressing fundamental questions involve huge effort.

Summing up, policy making requires a wide variety of evidence over the life cycle of a policy. RCTs just filled a gap in this process. RCTs help in doing systematic first-principle analysis while designing an MVP. Other types of evidence, like dipstick surveys and ethnographies, help while iterating on the project or coming up with ideas for interventions.

Needless to say, it's unrealistic to expect RCTs for everything. Some things have to be done even if there's no RCT, or even if RCTs say otherwise.

