The fourth caveat of evaluations - Necessary vs Sufficient

In an earlier post, we discussed the nuances of evaluating a theme vs evaluating a product. In this post, we will discuss the causal framework used in interpreting evaluations. Most caveats about evaluations aren't necessarily problems with the evaluation design; they have more to do with how the results are communicated and interpreted.

In a typical evaluation, the outcome of interest depends on multiple factors. Evaluations can be broadly classified into two categories based on how these factors are dealt with.

  1. Evaluations of programs which span across multiple factors: Do cash transfers increase the learning outcomes of children?
  2. Evaluations of individual factors: Does providing textbooks to children improve learning outcomes?

In the second case, textbooks/learning material are a part of the process of education. In the first, cash transfers aren't a direct factor in the process of education; they operate indirectly, by influencing several other factors that are part of that process. This post discusses one of the caveats in interpreting evidence from evaluations of the second kind.

A summary of the evidence on the factors in the process of education might look like the following (hypothetical):
  • Providing free textbooks - No impact
  • Increasing teacher attendance - Positive impact
  • Increasing access - No impact
  • Teacher training - No impact
  • Deworming - Positive impact
  • Decreasing Pupil Teacher Ratio - No impact

This leads us to think of each of these as an individual factor that produces the outcome on its own. Seen together, these studies may suggest a picture like this.

In this framework, each factor is seen as a sufficient condition (it needn't be necessary). The factors are considered independent, and the net outcome is the sum total of the impacts of each of them. This means that you could achieve the outcome by improving access to schools alone, by providing infrastructure alone, and so on.

The reality, however, is like this

The factors are interdependent, and each of them is a necessary condition for achieving the outcomes. It is similar to a circuit of factors: breaking the circuit at any point will affect the outcomes, even if everything else is up to the mark. In this framework, to achieve outcomes, teachers have to come to school, students have to come to school, teachers have to teach, students have to listen, students have to practice, they should be in good health, and so on. Only when all these factors reach a threshold value do we start seeing outcomes.
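The contrast between the two frames can be made concrete with a toy calculation. The sketch below is purely illustrative: the factor names, the 0-to-1 strength scale, the threshold of 0.5, and the functions `additive_outcome` and `necessary_outcome` are assumptions made up for this example, not estimates from any study.

```python
# Hypothetical factor strengths on a 0-to-1 scale (illustrative values only).
factors = {
    "access": 0.9,
    "teacher_attendance": 0.3,
    "textbooks": 0.8,
    "student_health": 0.7,
}

THRESHOLD = 0.5  # assumed minimum strength every factor must reach


def additive_outcome(strengths):
    """Sufficient-condition frame: the outcome is the sum of the factor strengths."""
    return sum(strengths.values())


def necessary_outcome(strengths, threshold=THRESHOLD):
    """Necessary-condition frame: no outcome unless every factor clears the threshold.

    When all factors clear it, the weakest factor determines the outcome.
    """
    weakest = min(strengths.values())
    return weakest if weakest >= threshold else 0.0


# Strengthen textbooks alone and compare what each frame predicts.
better_textbooks = {**factors, "textbooks": 1.0}

additive_gain = additive_outcome(better_textbooks) - additive_outcome(factors)
necessary_gain = necessary_outcome(better_textbooks) - necessary_outcome(factors)

print("additive frame gain:", additive_gain)
print("necessary frame gain:", necessary_gain)
```

Under the additive frame, better textbooks always register as a gain. Under the necessary-condition frame the gain is zero here, because teacher attendance (assumed to be 0.3) is still below the threshold and the circuit stays broken.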

The implications of this difference in frame of thinking are far reaching. Set aside the concerns of external validity and stick to the project site where the evaluation was carried out. Even there, the intervention might actually have been good and might have worked, without the results showing up in the final outcome, because the other factors necessary for that outcome hadn't been worked upon. If the intervention was about increasing access, we might conclude that increasing access doesn't improve outcomes, when the fact is only that access alone may not be enough to produce them. This can have serious consequences for policy. Such an interpretation hurts all the efforts that go into easing access, which is a crucial factor. You can't have outcomes if you don't have students in the schools!

What about the positive results of evaluations? If a wide range of factors is necessary to achieve outcomes, does this mean that in evaluations showing positive impacts, all factors were in place? There are thresholds at work here. Some links in the second picture above might have been very weak, and strengthening one particular link through the intervention might have been what produced the impact. Merely increasing the strength of one link can be enough to take it past its threshold. The positive results are cases where an individual link that was below the threshold was strengthened, while the negative results could be due to other links in the chain still being missing.
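A toy example can show how the same weakest-link logic produces both kinds of result. This is a minimal sketch under stated assumptions: the two sites, their factor strengths, the threshold of 0.5, and the helper names `outcome` and `measured_impact` are all hypothetical.

```python
THRESHOLD = 0.5  # assumed minimum strength every link must reach


def outcome(links, threshold=THRESHOLD):
    """Return 1 if every link clears the threshold, else 0 (weakest-link rule)."""
    return int(all(strength >= threshold for strength in links.values()))


def measured_impact(baseline, factor, new_strength):
    """Outcome after strengthening one factor, minus the outcome before."""
    treated = {**baseline, factor: new_strength}
    return outcome(treated) - outcome(baseline)


# Hypothetical site A: teacher attendance is the only link below the threshold.
site_a = {"access": 0.8, "teacher_attendance": 0.4, "textbooks": 0.7}

# Hypothetical site B: teacher attendance and textbooks are both below it.
site_b = {"access": 0.8, "teacher_attendance": 0.2, "textbooks": 0.3}

# An attendance intervention at site A repairs the only weak link.
impact_attendance_a = measured_impact(site_a, "teacher_attendance", 0.6)

# A textbook intervention at site B leaves attendance below the threshold.
impact_textbooks_b = measured_impact(site_b, "textbooks", 0.9)

print("attendance intervention, site A:", impact_attendance_a)
print("textbook intervention, site B:", impact_textbooks_b)
```

In this sketch the attendance intervention shows an impact of 1 because it fixes the only weak link, while the textbook intervention shows 0 even though textbooks improved a great deal, because attendance is still holding the outcome back.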

These nuances call for a lot of care while interpreting the evidence, especially when deciding policy priorities.
