Overcoming Premature Evaluation

Chris Roche (the koala – I’m the kangaroo, right) is a friend and a brilliant development thinker, even if he has an alarming tendency to reference development jargon like a machine gun. If you can get past the first para, this is well worth your time.

There is growing interest in safe-fail experimentation, failing fast and rapid real-time feedback loops. This is part of the suite of recommendations emerging from the Doing Development Differently crowd, as well as others. It also fits with the rhetoric of ministers responsible for aid budgets in Australia and the UK about quickly closing down projects and programs that are deemed not to be working.

When it comes to complex settings there is a lot of merit in ‘crawling the design space’ and testing options, but I think there are also a number of concerns here that should be getting more air time. Firstly, as Michael Woolcock and others have noted, it can simply take time for a program to generate positive, tangible and measurable outcomes, and it may be that on some measures a program that ultimately succeeds dips below the ‘it’s working’ curve on its way to that success – what some people call the ‘J curve’.
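To make the J curve point concrete, here is a minimal sketch (my own stylised numbers, not from any real program) of why a naive ‘fail fast’ rule can close a program precisely during the dip that precedes its success:

```python
# Illustrative sketch only: a stylised "J curve" trajectory in which a
# programme's measured outcome dips before it rises. All numbers invented.

def j_curve(t: float) -> float:
    """Stylised outcome over time: early dip, later gain."""
    return t ** 2 - 4 * t  # negative for 0 < t < 4, positive afterwards

def evaluate(t: float, threshold: float = 0.0) -> str:
    """A naive 'fail fast' rule: close the programme if outcomes sit below threshold."""
    return "close" if j_curve(t) < threshold else "continue"

# A premature evaluation at year 1 closes a programme that is positive by year 5.
print(evaluate(1))  # → close
print(evaluate(5))  # → continue
```

The point is not the particular curve, but that any evaluation rule keyed to a single early snapshot cannot distinguish a failing program from one mid-dip.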

Furthermore, and more importantly, it ignores some key aspects of the complex adaptive systems in which programs are embedded. In the area of health policy, Alan Shiell and colleagues have noted that non-linear changes in complex systems are hard to measure in their early stages, and that simply aggregating individual observable changes fails to explore the emergent properties that those changes in combination can produce.

They note for example that in many areas of public health policy such as gun control ‘multiple “advocacy episodes” may have no discernible impact on policy but then a tipping point is reached, a phase transition occurs, and new laws are introduced. In the search for cause and effect, the role played by advocates in creating the conditions for change is easily overlooked in favour of prominent and immediately prior events. To minimise the risk of premature evaluation and wrongful attribution, economists must become comfortable working with evidence of intermediate changes in either process or impact that act as preconditions for a phase transition’. Premature evaluation in cases like these could result in potentially successful programs being closed, or funding being stopped.
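The tipping-point dynamic can be sketched in a few lines (an invented toy model, not anything from Shiell and colleagues): advocacy episodes accumulate latent pressure, but the observable outcome – a new law – only flips once a threshold is crossed, so every intermediate evaluation sees ‘no discernible impact’:

```python
# Toy phase-transition model. THRESHOLD and the pressure-per-episode
# value are invented for illustration.

THRESHOLD = 10  # hypothetical tipping point

def law_passed(episodes: int, pressure_per_episode: int = 1) -> bool:
    """Observable outcome flips only once accumulated pressure crosses the threshold."""
    return episodes * pressure_per_episode >= THRESHOLD

# Evaluations after episodes 1..9 all record "no impact"...
print([law_passed(n) for n in range(1, 10)])  # all False
# ...yet episode 10 tips the system.
print(law_passed(10))  # → True
```

Measured episode by episode, the first nine advocacy rounds look like failures; only the accumulated ‘intermediate changes in process’ reveal that the system is moving towards the transition.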

So how might programs better measure and communicate their propensity to seize opportunities (or critical junctures) if and when they arise?

Firstly, in keeping with what a number of complexity theorists and some development practitioners have long suggested, it is often changes in relationships that lie at the heart of the processes which lead to more substantive change. It is these changes in relational practice – i.e. how people, organisations and states relate to each other – that can eventually create new rules of the game, or institutions. However, what those institutional forms will look like is an emergent property and is not predictable in advance. What is therefore required is information on changing relationships, which can provide some reassurance that changes to the workings of the underlying system are occurring. An interesting example of this approach is provided by the What Works and Why (W3) project, which applies systems thinking to understanding the role of peer-led programs in a public health response, and their influence in their community, policy and sector systems.
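One crude but measurable proxy for changing relational practice is network density – the share of possible ties between actors that actually exist. This sketch uses invented actors and ties (not W3 data) to show the kind of signal that can move even when headline outcomes have not:

```python
# Hedged sketch: tracking relational change via simple network density.
# Actor names and ties are hypothetical.

from itertools import combinations

def density(actors: set, ties: set) -> float:
    """Share of possible undirected ties that actually exist."""
    possible = len(list(combinations(sorted(actors), 2)))
    return len(ties) / possible if possible else 0.0

actors = {"NGO", "clinic", "council", "peer_group"}

# Baseline: only the NGO talks to the clinic.
ties_2014 = {frozenset({"NGO", "clinic"})}

# Two years on: new ties have formed, even though headline outcomes
# (health, income) may not yet have moved.
ties_2016 = ties_2014 | {frozenset({"clinic", "council"}),
                         frozenset({"peer_group", "council"}),
                         frozenset({"NGO", "peer_group"})}

print(round(density(actors, ties_2014), 2))  # → 0.17
print(round(density(actors, ties_2016), 2))  # → 0.67
```

Density is only one of many network measures, and an ‘imaginative’ social network analysis would go much further, but even this simple indicator can evidence system-level movement before outcomes register.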

The second thing that needs to be explored is the feedback loops between the project and the system within which it is located. These are often ignored, as monitoring and evaluation approaches tend to be of the ‘project-out’ variety, i.e. they start with the ‘intervention’ and desperately seek directly attributable changes amongst the population groups they seek to benefit. ‘Context-in’ approaches, on the other hand, recognise that the value of an activity or project can change over time in line with changes in the operating environment, and that multiple factors are liable to have induced these changes. So, for example, a mass shooting at a school can radically change the value that a gun control program is perceived to have, which in turn can affect its ability to be successful. Enabling community organisations, local government departments, activists and other actors in the system to be part of this feedback is an important part of the picture. Using citizen-generated data along the lines that organisations like Ushahidi promote can achieve this, under the right circumstances and when politics is adequately factored in.

The third element which needs to be recognised is that simply adding up individual project outcomes, such as changes in health, income or education, is not going to capture the emergent properties of the system as a whole. As has been noted in India, the collective ‘social effect’ of establishing quotas to get women into local government has important multiplier effects. Repeated exposure to female leaders “changed villagers’ beliefs on female leader effectiveness and reduced their association of women with domestic activities”. Outcomes such as these need to be measured at multiple levels, in ways appropriate to the task, factoring in context rather than seeking to eliminate it.

Now of course people putting up money to support progressive social change need some indication – during the process – that what they are supporting can, or more realistically might, become a swan, or indeed a phoenix. I argue that these are some of the things they need to be concerned about: a) the changing nature of relationships (captured, for example, through imaginative social network analysis); b) whether effective learning and feedback loops are in place; and c) ensuring that the demands they place on programs encourage multi-level analysis, learning and sense-making, avoiding simplistic aggregation and adding up of results.

So I am all for the move to iterative, adaptive and experimental development, but if we are serious about going beyond saying ‘context matters’ then exhortations to ‘fail fast’ need to be more thoroughly debated.

My thanks to colleagues Alan Shiell and Penny Hawe for comments on an earlier draft.

And in a similar vein, here’s Oxfam’s Kimberly Bowman with a great post on the new Real Geek blog



4 Responses to “Overcoming Premature Evaluation”
  1. This is a great post, really appreciate all these aspects of evaluating in complex contexts being brought together. Just one additional thought to consider – talking about learning and feedback loops, and fast feedback loops, can suggest that we are focusing on aggregate characteristics – the patterns that have arisen and can be recognised and handled. Complexity theory adds to this idea the importance of detail – of events, particular people, particular chance happenings, variations etc. So sometimes what has to be tracked is either pre-pattern formation (and hence prior to the establishment of feedback loops) or, at the opposite end, it is noticing what is ‘invading’/affecting current patterns. Fast feedback loops sound good, but sometimes what is pertinent is noticing emerging change and how that could potentially shift what is there. So fast feedback loops convey one sort of information, but spotting the quirkiness and particularity of specific factors or people or events that start to have an effect is important too. Incidentally, does anyone have some good examples of using fast feedback loops?

  2. Yes, another great post. I enjoy these systems posts. I find outcome mapping to be a great alternative to (or combination with) the logframe for capturing relationships and nuance. It can also be tied to various stakeholders or “boundary partners.” The Expect to see/Like to see/Love to see framing seems better at humility, better at connections between things and at seeing what is possible in the current reality, rather than an aspirational laundry list of possible outputs and outcomes. I also think the systems tools of current and future context mapping, or Kahane’s transformative scenario building, are extremely helpful, but possibly the process and dialogue of a network grappling with them is more important than the product? Part of the slow change in mindsets that changes the relationships, which is so hard to capture. As one colleague said beautifully, we are counting the trees we plant instead of trying to make sense of how we navigate the jungle.

  3. Nick Pyatt

    I agree in principle with all that is said here. Unfortunately my experience is that development professionals have a history of being insufficiently curious about their own behaviour and the impact of the way they exercise power. NGOs included. The result has too often been that sound systems logic like this results in busy and intellectually stimulating but ineffective outcomes, which undermine the faith that not only funders but, more importantly, the supposed beneficiaries have in their work. The complex systems concept that history matters would support the idea that this is a likely outcome until the momentum of history is different. So, whilst in some ways crass and likely to fail in another way, I wonder if a period that instils a culture that is stronger on accountability isn’t in the long term a good thing, if it redistributes power. This is my conclusion after 30 years in development, less so in recent years, but when I do re-engage I find that things have not changed enough for me to conclude differently. The premature and immature monitoring that the sector has moved to is certainly not the answer, but it does seem to be getting development professionals to listen differently, even if what they have to listen to is not always very sensible. I also wonder if this is more likely in the DFID-dominated part of the development world. Some other donors, and the professionals within them, seem to be able to work in a way which is more able to benefit from this sort of complex systems approach. Our work with GIZ comes to mind, for example.

  4. When I hear people talking about adaptive programming I wonder whether it all boils down to a message as basic as “we need to do things faster”. But isn’t this advice flying in the face of decades of development aid experience, which suggests that we actually need to be a lot more patient and take a much longer time perspective than that encouraged by the average 3-year funding period? So rather than “speed up”, perhaps the latest advice should be “hang in there!”

    And… faster iteration of different versions of an intervention is not the only way to learn. In biological evolution individual organisms can vary their behaviours and learn from trial and error, but equally important is the population-level learning that is taking place, whereby multiple organisms in the same species are trying out different ways of doing things, _at the same time_. In the world of computers this is called parallel processing, in contrast to serial processing. Crowd sourcing of computer power for science projects, like SETI, is all about making massive use of parallel processing.
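    The population-level idea can be sketched in a few lines (invented district names and success rates, purely for illustration): many sites try different variants of the same intervention at once, and a single round of comparison surfaces the strongest, where serial trial-and-error in one site would take many rounds.

    ```python
    # Sketch of "parallel" population-level learning with hypothetical data:
    # each district runs a different variant of the same intervention.

    variant_results = {
        "district_A": 0.42, "district_B": 0.55, "district_C": 0.31,
        "district_D": 0.67, "district_E": 0.48,
    }

    def best_variant(results: dict) -> str:
        """Compare all parallel trials at once and keep the best performer."""
        return max(results, key=results.get)

    print(best_variant(variant_results))  # → district_D
    ```

    The diversity of practice across sites is the learning resource; the sketch only works if that variation is actually compared rather than averaged away.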

    Many large development projects are in practice involved in some form of parallel processing, in as much as their interventions get implemented in multiple locations within a given country. Intentionally and unintentionally there is often a significant amount of diversity in how these interventions get implemented. But the question is whether this diversity of practice in pursuit of a common objective then gets the analysis it deserves? In my own experience, with maternal health projects in multiple districts in Indonesia, the answer was a resounding no. We were too bound up with trying to come to aggregate conclusions about the project as a whole. Mea culpa, mea maxima culpa!