Strengthening the Evaluation Design and the Validity of the Conclusions

How can threats to the validity of evaluations be identified and addressed? This chapter from RealWorld Evaluation: Working Under Budget, Time, Data and Political Constraints outlines some of the most common threats to the validity of both quantitative (QUANT) and qualitative (QUAL) evaluation designs. It offers recommendations on how and when corrective measures can be taken to protect validity.

The concept of validity is closely related to that of accuracy: actual conditions must be represented in the evaluation data. In QUANT methodology, the accuracy of the data is referred to as internal validity or reliability, and in QUAL methodology, as descriptive validity or credibility. The validity of evaluation findings based on data is referred to as interpretive or evaluative validity (QUAL), and the applicability of findings beyond the site context as generalisability (QUAL) or external validity (QUANT). The validity of an evaluation is affected by: (a) the appropriateness of the evaluation focus, approach and methods; (b) the availability of data; (c) how well the data support valid findings; and (d) the adequacy of the evaluation team to collect, analyse and interpret data.

Evaluation designs can be assessed for potential threats to the validity of conclusions. Steps can then be taken to strengthen the likelihood of adequate and appropriate data collection and of valid evaluation findings. The Integrated Checklist for assessing evaluation validity, which includes more specific information, may be helpful.

To assess and strengthen QUAL evaluation designs:

consider the comprehensiveness of data sources
consider the cultural competence of data collectors
consider the adequacy of ongoing and overall data analysis techniques and team capacity.

To assess and strengthen QUANT evaluation designs:

consider whether random sample selection is appropriate and, if so, whether there is sufficient sample size or any potential sampling bias
consider whether key indicators have been appropriately identified and whether measures or estimates of them are likely to be accurate
consider whether statistical procedures have been appropriately selected and whether there is sufficient expertise for their use.
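
As one illustration of the sample-size check above, the following is a minimal sketch rather than a method taken from the chapter. It assumes a simple two-group comparison of means, a two-sided test and the standard normal approximation, n per group = 2 * ((z_alpha + z_power) / d)^2, where d is a standardised effect size. The function name required_n_per_group and the default values for alpha and power are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def required_n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided comparison of two means.

    Hypothetical helper for illustration only. It uses the normal
    approximation and assumes equal group sizes and equal variances.
    effect_size is the standardised difference between group means.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Smaller expected effects require larger samples.
    for d in (0.2, 0.5, 0.8):
        print(f"effect size {d}: {required_n_per_group(d)} per group")
```

Under these assumptions, a standardised effect of 0.5 needs about 63 respondents per group. Clustered sampling, expected attrition or unequal group sizes would all raise the figure, so the result is best read as a lower bound when judging whether a planned sample is sufficient.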

To assess and strengthen all evaluation designs:

consider and use, as available, triangulation, validation, meta-evaluation and peer review
consider the likelihood that a thoughtful combination of QUAL and QUANT approaches in a mixed-method design would improve the comprehensiveness of data and validity of findings
consider the attitudes of policymakers and how they may affect data access and utilisation.

Summary