Page contents
- Impact evaluation: an introduction
- Attribution and the counterfactual: the case for more and better impact evaluation
- Randomised Control Trials: the gold standard?
- Adapting to time, budget and data constraints
- Mixed-method designs
- Toolkits
- Theory-based evaluation
Impact evaluation: an introduction
The recent emphasis on accountability and results-based management has stimulated interest in evaluating not just the process, outputs and outcomes of development programmes, but also their impact (ultimate effect) on people’s lives. Impact evaluations go beyond documenting change to assess the effects of interventions on individuals, households, institutions and the environment, relative to what would have happened without them – thereby establishing the counterfactual and allowing more accurate attribution of change to interventions.
This counterfactual approach to evaluation is increasingly advocated as the only reliable way to develop an evidence base on what works and what does not in development. Some 800 quantitative impact evaluations now exist across a wide range of sectors, and more are in progress or being commissioned. There is growing consensus that more rigorous quantitative approaches, such as randomised control trials, should be used more widely, although they are not appropriate in all contexts.
Where RCTs are not appropriate, a range of other quantitative counterfactual approaches remains available for large n interventions (those with many units of assignment, such as families, communities, schools, health facilities or even districts). Outcomes data can also be collected using qualitative methods within a counterfactual evaluation design. For small n interventions (those with few or only one unit of assignment, such as an intervention carried out in just one organisation, or one which affects everyone in the relevant population), mixed methods that combine quantitative and qualitative approaches may be appropriate. All impact evaluations should collect information along the causal chain to explain not just whether the intervention was effective, but why, and so enhance applicability and generalisability to other contexts.
Lucas, H. and Longhurst, H., 2010, ‘Evaluation: Why, for Whom and How?’, IDS Bulletin, vol. 41, no. 6, pp. 28-35
This article discusses theoretical approaches to evaluation and draws on experiences from agriculture and health. It notes that different stakeholders may have varying expectations of an evaluation and that alternative approaches to evaluation are more suited to meeting some objectives than others. Randomised control trials, or well-designed quasi-experimental studies, probably provide the most persuasive evidence of the impact of a specific intervention, but if the primary aim is systematic learning, a Theories of Change or Realistic Evaluation approach may be of greater value. If resources permit, different approaches could be combined to cover both accountability and learning objectives. As there will be trade-offs between objectives, transparency and realistic expectations are essential in evaluation design.
Access full text: via document delivery
White, H., 2009, ‘Some Reflections on Current Debates in Impact Evaluation’, Working Paper 1, 3ie, New Delhi
There is a debate in the field of impact evaluation between those promoting quantitative approaches and those calling for a larger range of approaches to be used. This paper highlights four misunderstandings that have arisen in this debate. They involve: 1) crucially, different definitions of ‘impact’ – one based on outcomes and long term effects, and one referring to attribution; 2) confusion between counterfactuals and control groups; 3) confusion of ‘attribution’ with sole attribution; and 4) unfounded criticism of quantitative methods as ‘positivist’ and ‘linear’. There is no hierarchy of methods, but quantitative approaches are often the best available.
Access full text: available online
Asian Development Bank, 2011, ‘A Review of Recent Developments in Impact Evaluation’, Asian Development Bank, Manila
How can impact be credibly attributed to a particular intervention? This report discusses the merits and limitations of various methods and offers practical guidance on impact evaluation. A rigorously conducted impact evaluation produces reliable impact estimates of an intervention through careful construction of the counterfactual using experimental or non-experimental approaches.
Access full text: available online
Attribution and the counterfactual: the case for more and better impact evaluation
Development interventions are not conducted in a vacuum. It is extremely difficult to determine the extent to which change (positive or negative) can be attributed to the intervention, rather than to external events (such as economic, demographic, or policy changes), or to interventions by other agencies.
Impact evaluations attempt to attribute change to a specific programme or policy and establish what would have happened without the intervention (the counterfactual) by using scientific, sometimes experimental, methodologies such as randomised control trials or comparison groups.
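To illustrate the logic of a counterfactual estimate, the sketch below simulates a simple randomised trial in Python and recovers the intervention’s effect as the difference in mean outcomes between treatment and control groups. All names and numbers (1,000 households, an assumed true effect of +8 units) are hypothetical and purely illustrative; this is a minimal sketch of the reasoning, not a recipe for a real evaluation.

```python
# A minimal, hypothetical sketch of how randomisation supports attribution:
# with random assignment, the control group's mean outcome approximates the
# counterfactual, so the average treatment effect can be estimated as a
# simple difference in means. All numbers below are illustrative only.
import random
import statistics

random.seed(1)

# Simulate 1,000 households; half are randomly assigned to the intervention.
households = list(range(1000))
random.shuffle(households)
treated = set(households[:500])

def observed_outcome(h):
    baseline = random.gauss(100, 15)      # outcome without the intervention
    effect = 8 if h in treated else 0     # assumed true impact of +8 units
    return baseline + effect

outcomes = {h: observed_outcome(h) for h in households}
treat_mean = statistics.mean(outcomes[h] for h in treated)
control_mean = statistics.mean(outcomes[h] for h in households if h not in treated)

# The control mean stands in for "what would have happened without the intervention".
print(f"Estimated impact (difference in means): {treat_mean - control_mean:.1f}")
```

Because assignment is random, the two groups are comparable on average, so the control group’s mean outcome serves as an estimate of the counterfactual.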
A number of organisations and networks have emerged in recent years to make the case for more rigorous evaluation methods. These include the Network of Networks on Impact Evaluation (NONIE), the Abdul Latif Jameel Poverty Action Lab (J-PAL) and the International Initiative for Impact Evaluation (3ie). The final report (2006) of the Evaluation Gap Working Group at the Center for Global Development is a seminal document calling for greater use of impact evaluation:
Center for Global Development, 2006, ‘When Will We Ever Learn? Improving Lives Through Impact Evaluation’, Evaluation Gap Working Group, Center for Global Development, Washington DC
Despite decades of investment in social development programmes, we still know relatively little about their net impact. So why are rigorous social development impact evaluations relatively rare? This paper examines this question and provides recommendations for more and better evidence for policymaking and programme planning. A new, collective approach is needed, in which developing country governments, bilateral and multilateral development agencies, foundations and NGOs work together to define an agenda of enduring questions and fund the design and implementation of rigorous impact evaluations in key sectors.
Access full text: available online
The NONIE guidelines reflect the views of a number of impact and evaluation networks. They stress that there is no single ‘best’ method for assessing the impacts of interventions, but that some methods have a comparative advantage over others in analysing particular objectives. Quantitative methods (experimental or quasi-experimental) have a comparative advantage in large n interventions and in addressing the issue of attribution. Certain tools or approaches may complement each other, providing a more complete ‘picture’ of impact. The guidelines present a list of eight experimental and quasi-experimental methods for causal attribution, but acknowledge that other methods may be needed for more complex or small n interventions.
Leeuw, F. and Vaessen, J., 2009, ‘Impact Evaluations and Development: NONIE Guidance on Impact Evaluation’, NONIE – The Network of Networks on Impact Evaluation, Washington DC
This report provides a guide to evaluating the impact of a project or programme. Impact evaluation is about attributing impacts to interventions, and can play a key role in development effectiveness. No single analytical method is best for addressing all aspects of impact evaluations, but some methods have an advantage over others in addressing a particular question or objective. Different methods can complement each other to provide a more complete picture of impact. Impact evaluations provide the greatest value when there is an articulated need to obtain the information they generate.
Access full text: available online
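Where randomisation is not possible, quasi-experimental designs of the kind referred to above construct the counterfactual from a comparison group. One commonly cited example is difference-in-differences. The sketch below uses purely illustrative figures and assumes the programme and comparison groups would have followed parallel trends in the absence of the intervention.

```python
# A minimal, hypothetical difference-in-differences sketch: compare the
# before/after change in a programme group with the change in a comparison
# group, so that shared trends (e.g. economy-wide shifts) are netted out.
# The figures below are purely illustrative.

# Mean outcomes (e.g. school enrolment rate, %) before and after the programme.
programme_before, programme_after = 62.0, 74.0
comparison_before, comparison_after = 60.0, 65.0

programme_change = programme_after - programme_before      # 12 points
comparison_change = comparison_after - comparison_before   # 5 points (shared trend)

# Impact attributed to the programme under the parallel-trends assumption.
impact = programme_change - comparison_change
print(f"Difference-in-differences impact estimate: {impact:.1f} percentage points")
```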
Randomised Control Trials: the gold standard?
Randomised Control Trials (RCTs) are often referred to as the ‘gold standard’ of impact evaluation, but whether or not they are always feasible, appropriate and rigorous is the subject of some debate. Since RCTs seek to measure a counterfactual, they often require data collection before the start of an intervention, which makes it difficult to apply this approach after an intervention has begun. One of the key issues with RCTs is the problem of ‘external validity’, or the degree to which the findings of one study can be generalised to other contexts. Some argue that while RCTs may be suitable for measuring simple, short-term development interventions, they are less suitable for more complex, long-term interventions, where many factors interact to produce change.
Duflo, E., and Kremer, M., 2003, ‘Use of Randomization in the Evaluation of Development Effectiveness’, Paper prepared for the World Bank Operations Evaluation Department Conference on Evaluation and Development Effectiveness, 15-16 July 2003, Massachusetts Institute of Technology, Cambridge, US
Just as randomised pharmaceutical trials revolutionised medicine in the 20th Century, randomised evaluations could revolutionise social policy in the 21st. This paper draws on evaluations of educational programmes. It argues that there is an imbalance in evaluation methodology and recommends greater use of randomised evaluations. As credible impact evaluations, these could offer valuable guidance in the search for successful programmes.
Access full text: available online
Case Study: Progresa in Mexico
The Progresa case is considered one of the most successful examples of the application of a randomised control trial in a development context:
Attanasio, O., et al., 2005 ‘Education Choices in Mexico: Using a Structural Model and a Randomized Experiment to Evaluate Progresa’, Institute for Fiscal Studies, London
What impact have monetary incentives had on education choices in rural Mexico? How can the design of educational interventions aimed at improving educational participation be improved? This paper analyses the education component of the Mexican government’s welfare programme, Progresa, which aims to reduce rural poverty. It argues that increasing the grant for secondary school children while eliminating it at the primary age would strengthen Progresa’s impact.
Access full text: available online
For further case studies of randomised control trials, see: www.povertyactionlab.org.
Adapting to time, budget and data constraints
Ideological positions can obscure the issue of which methodologies are actually feasible. Scientific approaches can be costly and time-consuming, and therefore unrealistic. Many organisations do not have the resources to carry out the ideal evaluation, and an M&E framework needs to be designed with organisational capacity, human and financial resources and political context in mind. Although it will not be feasible to rigorously evaluate all projects, donors such as DFID now require that all projects be considered for some form of evaluation as part of the design process. The rationale for the decision to evaluate or not must be defensible to the UK’s new Independent Commission for Aid Impact.
It is important to understand the minimum methodological requirements for evaluation rigour in cases where it is not possible to use strong evaluation designs.
Bamberger, M., 2006, ‘Conducting Quality Impact Evaluations Under Budget, Time and Data Constraints’, World Bank, Washington DC
How do cost, time and data constraints affect the validity of evaluation approaches and conclusions? What are acceptable compromises and what are the minimum methodological requirements for a study to be considered a quality impact evaluation? This booklet provides advice for conducting impact evaluations and selecting the most rigorous methods available within the constraints faced. It provides suggestions for reducing costs and increasing rigour and clarifies the nature of trade-offs between evaluation rigour and budgets, time and data.
Access full text: available online
Evaluation designs need to be adapted to local realities: experience demonstrates that no single methodology is applicable in all cases.
White, H., 2006, ‘Impact Evaluation: The Experience of the Independent Evaluation Group of the World Bank’, World Bank, Washington DC
Aid spending is increasingly dependent on proof that interventions are contributing to the attainment of Millennium Development Goals (MDGs). Yet there is still debate over the definition of impact evaluation and how it should be carried out. This paper defines impact evaluation as a ‘counterfactual analysis of the impact of an intervention on final welfare outcomes’ and recommends a theory-based approach. Two sources of bias are highlighted: contamination and self-selection bias.
Access full text: available online
Mixed-method designs
As discussed in the introduction to this section, quantitative and qualitative methods can be used to answer different evaluation questions. Quantitative methods are more useful for answering ‘what works’ questions, while qualitative methods are more useful for answering ‘why’ questions. Quantitative methods are more suited to large n interventions, while mixed or purely qualitative methods are better suited to small n interventions. Mixed methods can help to bolster findings where there are gaps in the data. These approaches are particularly useful for assessing aspects of poverty that are not easy to quantify, such as governance, trust, empowerment and security.
Garbarino, S., and Holland, J., 2009, ‘Quantitative and Qualitative Methods in Impact Evaluation and Measuring Results’, Issues Paper, Governance and Social Development Resource Centre, Birmingham
This paper reviews the case for promoting and formalising qualitative and combined methods for impact evaluation and measuring results. The case for qualitative and combined methods is strong. Qualitative methods have an equal footing in evaluation of development impacts and can generate sophisticated, robust and timely data and analysis. Combining qualitative research with quantitative instruments that have greater breadth of coverage and generalisability can result in better evaluations that make the most of their respective comparative advantages.
Access full text: available online
A growing issue in development evaluation is the need for M&E practices to reflect and embrace the insights of complexity science. This involves recognising that the contexts within which governance and development interventions are conducted are unstable and unpredictable, and that existing linear models of causality are ill-equipped to understand change in these contexts. A complexity perspective implies the need for more flexible and adaptive approaches.
Ramalingam, B. and Jones, H. et al., 2008, ‘Exploring the Science of Complexity: Ideas and Implications for Development and Humanitarian Efforts’, Working Paper 285, Overseas Development Institute, London, Second Edition
What is complexity science? How can it contribute to development and humanitarian efforts? This paper explores the key concepts of complexity science and shows how they might help development practitioners engaged in reform. The concepts highlight that the best course of action will be context-dependent, and they offer new ways to think about questions that should be posed. Development practitioners need to recognise that they live with complexity on a daily basis, and to use the ‘complexity lens’.
Access full text: available online
Toolkits
Baker, J., 2000, ‘Evaluating the Impact of Development Projects on Poverty: A Handbook for Practitioners’, World Bank, Washington DC
There is broad evidence that development assistance benefits the poor, but how can we tell if specific projects are working? Have resources been spent effectively? What would have happened without intervention? This comprehensive handbook seeks to provide tools for evaluating project impact. It advises that effective evaluations require financial and political support, early and careful planning, participation of stakeholders, a mix of methodologies and communication between team members.
Access full text: available online
World Bank, 2006, ‘Impact Evaluation and the Project Cycle’, PREM Poverty Reduction Group, World Bank, Washington DC
The goal of an impact evaluation is to attribute impacts to a project using a comparison group to measure what would have happened to the project beneficiaries had it not taken place. The process of identifying this group, collecting the required data and conducting the relevant analysis requires careful planning. This paper provides practical guidance on designing and executing impact evaluations. It includes some illustrative costs and ideas for increasing government buy-in to the process.
Access full text: available online
Duflo, E., Glennerster, R., and Kremer, M., 2006, ‘Using Randomization in Development Economics Research: A Toolkit’, J-PAL paper.
Access full text: available online
Theory-based evaluation
Impact evaluations have increasingly focused not simply on the question of what works, but also why an intervention achieved its intended impact – or why it did not. There has been growing emphasis on incorporating analysis of causal chains and a growing interest in ‘theory-based’ impact evaluation. Despite this interest, few studies apply the approach in practice.
White, H., 2010, ‘Theory-Based Impact Evaluation: Principles and Practice’, 3ie Working Paper 3, New Delhi
How can impact evaluation identify not just what does – or does not – work, but why? A theory-based approach to impact evaluation maps out the causal chain from inputs to outcomes and impact, and tests the underlying assumptions. Despite wide agreement that this approach will address the why question, it has not often been effectively used. This paper outlines six principles for successful theory-based impact evaluation: (1) map out the causal chain (programme theory); (2) understand context; (3) anticipate heterogeneity; (4) rigorously evaluate impact using a credible counterfactual; (5) use rigorous factual analysis; and (6) use mixed methods.
Access full text: available online
One popular technique has been to use a ‘theory of change’ to help design and implement development programmes. While a logical framework graphically illustrates programme components and helps stakeholders to identify inputs, activities and outcomes, a theory of change links outcomes and activities to explain how and why the desired change is expected to come about, drawing on underlying assumptions.
Retolaza, I., 2011, ‘Theory of Change: A thinking and action approach to navigate in the complexity of social change processes’, Humanist Institute for Development Cooperation / United Nations Development Programme
What is a Theory of Change (ToC) and why is it important? This guide to understanding and developing a ToC shows how a ToC helps to configure the conditions needed to achieve desired change, using the experience of a given context. This is done partly by making assumptions explicit and by analysing them critically. It is also a monitoring tool that facilitates accountability. A good ToC allows development practitioners to handle complexity without over-simplification.
Access full text: available online
More detail about the ‘theory of change’ approach and examples of how it can be applied can be found at www.theoryofchange.org.