The usual suspects: Common challenges on evaluation projects and how to manage them 

Idea In Brief

Evaluations that go awry tend to face a similar set of challenges

These can be mitigated by a range of practices across the setup, design, and delivery of the project. 

The most prevalent challenge is data access and quality

This can stem from inadequacies in data collection, inconsistencies in data reporting, or simply the absence of relevant data altogether. 

Balance collaborative working and independent analysis

Addressing the challenges head-on not only enhances the credibility of the evaluation but also ensures that it serves as a valuable tool for decision-making and development. 

In our 25 years conducting evaluations for governments, not-for-profit organisations and businesses, we have observed that evaluations that go awry tend to face a similar set of challenges. While issues inevitably arise in evaluations, the likelihood that they arise, and their impact when they do, can be mitigated by a range of practices across the setup, design, and delivery of the project. These are relevant both for those commissioning an evaluation and those conducting it. 

This article explores the most common challenges we find in evaluations – the usual suspects, if you will – and how they can be most effectively managed and avoided. 

Figure 1 | The usual suspects: Common challenges on evaluation projects

1. High-quality data is not readily available

Perhaps the most prevalent challenge in evaluation projects is data access and quality. This can stem from inadequacies in data collection, inconsistencies in data reporting, or simply the absence of relevant data altogether. Sometimes, access to datasets can require lengthy or complex approvals from data custodians, or access fees that procuring organisations may not have factored into timelines or budgets. Quality issues are especially common for outcomes data: often, the data available does not measure the right outcome, or measures outputs that cannot be used to discern impact in a meaningful way. Evaluators thus find themselves grappling with incomplete or low-quality datasets, which can hinder their ability to draw meaningful and accurate conclusions. Data deficiencies not only undermine the rigour of the evaluation but can also erode stakeholder trust in the findings and recommendations.

Mitigating this challenge is no mean feat. Ideally, monitoring and evaluation frameworks are developed during the design of a program so that appropriate indicators and metrics are collected from early implementation and monitored during program delivery. But evaluation considerations are not always front of mind early in the policy lifecycle or program design, especially when policy teams are working at speed. 

Even so, this challenge can still be mitigated. For example, being clear at the outset about the type and quality of data required allows a realistic evaluation method to be designed. A preliminary assessment of a program's evaluability – given the data that is likely to be available – can help determine what kind of evaluation is possible, and avoid commissioning and conducting resource-intensive evaluations that are unable to draw meaningful conclusions. 

Evaluators can also develop alternative options, in addition to a preferred data collection and analysis approach. This can help manage risks around data availability and quality, and ensure that those commissioning and conducting evaluations do not let the perfect become the enemy of the good. 

In addition, making relevant data sources available to evaluators in an accessible format that allows immediate analysis can ensure a productive and efficient start to the evaluation, and help identify whether alternative or proxy indicators are required to optimise the methodology design.

2. Stakeholder engagement is not insightful or representative

Another common issue is stakeholder engagement that does not yield the insights one hoped for. The insights derived from speaking to stakeholders are essential to a thorough evaluation, but engagement can fall short if the wrong people, or an unrepresentative sample of people, are consulted, or if it is treated as a mere formality. The absence of a shared understanding among all stakeholders about the scope, purpose and timing of an evaluation – or a lack of buy-in among key stakeholders – can get things off on the wrong foot. Getting engagement right is especially important when evaluating programs affecting First Nations peoples, which often requires specialised skills, cultural competency, and dedicated time and investment.  

Mitigating this risk starts with making internal and external stakeholders aware of the evaluation, including its purpose, timelines, and expectations regarding their participation. In most cases, Nous has had success in collaboratively designing evaluation methodologies with stakeholders to gain buy-in for future participation. Clarifying the extent of an evaluation’s independence is critical, a point we discuss further below. If complex internal governance forums are overseeing the evaluation, it is important to clarify how they will work with the evaluation team and who will ultimately sign off on the evaluation design and reporting. 

Careful design of a stakeholder engagement plan is also important. With limited time and resources, there will inevitably be trade-offs, but consultation is inherently valuable: giving people a forum to voice their experiences, explain their perspectives, and reflect on how a program is valued or could be improved can be intrinsically good, independent of the insight it produces. In such contexts, broad engagement through a range of forums, such as workshops, focus groups and surveys, can obtain diverse perspectives efficiently. 

Often, however, it is best to spend the time identifying and engaging extensively with those who are truly ‘in the know’ about a program or policy. We find that a targeted set of interviews with stakeholders who have deep knowledge often produces more insight than a large number of survey responses or focus group participants.   

In the context of evaluations involving First Nations people, communities and organisations, ensuring cultural safety in the engagement process is non-negotiable. There are many frameworks and guidelines for evaluation in a First Nations context, including the Australian Evaluation Society’s First Nations Cultural Safety Framework, which sets out principles of culturally safe evaluation. In Nous’ experience, the principles in Figure 2 below are particularly useful to achieve culturally safe and meaningful engagement.

Figure 2 | Principles that guide Nous' approach to stakeholder engagement in First Nations evaluations

3. Evaluative reasoning underpinning findings is not made explicit

Evaluations are not simply lists of objective claims about the outcomes of an intervention or the implementation of a program: they typically involve evaluative assessments about how good or appropriate the outcomes or implementation were and why they matter or don’t. In other words, they involve value judgements.

A common challenge in evaluations is that the standards used to assess the performance of a policy or program – the basis for determining whether, or to what extent, a program is excellent, okay, or unacceptable – are not made explicit during the delivery of an evaluation. A lack of transparency around standards can lead to reasonable questions about the evidence and reasoning that inform findings and recommendations. 

To mitigate this issue, evaluators should strive for explicitness and transparency in their value judgments. Good evaluative reasoning requires clarity on the standards used to measure the performance of a policy or program. For example, clearly defining what good, great, and unacceptable performance looks like ensures transparency and consistency. Performance levels can be defined using a range of sources, including program theory, agreed indicators of progress, program targets, academic literature, performance checklists, or industry standards or benchmarks.

Nous finds it useful to have rubrics that set out criteria for performance and describe what different levels of performance look like. These can be a handy addition to program logics, theories of change, and key evaluation questions, which tend not to make transparent how evidence is synthesised and analysed to form an overarching evaluative judgement. Clearly articulating the evaluative criteria and the rationale behind their selection helps stakeholders understand the basis of the conclusions drawn, and builds buy-in to and acceptance of evaluation findings. 

4. Preconceived notions impede methodological flexibility

Given the high stakes and often public profile of many evaluations, government agencies and organisations understandably require detailed research designs from evaluators before the evaluation commences. This can be valuable to ensure independence, quality and credibility in the outcome. 

Often, however, evaluators are required to develop research designs and methodologies with little understanding of the agenda, context, constraints, and data availability they will face. Remaining wedded to the original proposal – even as new information comes to light – can be counterproductive. To achieve an evaluation’s intended outcomes, evaluators will often need to adjust their approach over the course of the project. 

This challenge can be managed by incorporating flexibility as a core principle underpinning design, procurement and delivery of an evaluation. This enables teams to adapt their approach and methodology as new information is acquired. In particular: 

  • During the design phase it is important to recognise that tools like a theory of change, program logic and outcome indicators and measures will often be refined over the course of an evaluation.
  • During procurement, it is generally better to have a descriptive rather than a prescriptive request for quote (RFQ) that enables methodological flexibility and innovative approaches. It is also helpful to have processes that enable revisions in scope and size as new information emerges.
  • During delivery, it is helpful to have budgets that allow resources to be reallocated as needs and processes evolve, and adaptive methodologies that can change in real time to suit project needs and timelines. For example, where data quality or availability issues are significant, it can be better to go back to the drawing board or discontinue an element of the research, rather than persist with data that will not support credible findings. 

5. The evaluator’s independence is not adequately protected

An evaluator’s independence is important to the quality and credibility of their findings. A common challenge in evaluation projects is the misalignment between ways of working and the required level of evaluator independence. This can manifest in various ways, such as evaluators being overly influenced by stakeholders or not having the autonomy needed to conduct thorough and unbiased assessments. 

More generally, there are risks that evaluators become unduly influenced by the perspectives of those commissioning the work, who often have a vested interest in an evaluation finding that the program or policy has been effective. As a result, the evaluation's credibility and objectivity may be compromised, leading to findings that stakeholders may perceive as biased or untrustworthy. 

It is crucial to clearly define the level of independence required for an evaluation at the outset and adopt ways of working that ensure it is achieved. At Nous, we begin with collaborative or participatory design and formal agreement on methodology, before shifting to independent analysis of data and reporting. Developing and adhering to protocols that safeguard evaluator autonomy is essential to ensure that a collaborative approach to working (for example, sharing progress, risks, opportunities for change or enhancement, and emerging insights) does not impede the independence of findings. This may include creating formal agreements that outline the boundaries of stakeholder involvement. Questions about the commissioner’s involvement in shaping recommendations, designing evaluation tools, and contributing to the report are key factors to consider. 

Good evaluators must of course be committed to providing a fair, balanced and impartial view about the performance of a program or policy. But given that people are prone to a range of well-known cognitive biases that impact the quality of reasoning, a good faith commitment to independence is often not enough: it is necessary but not sufficient. Attention to structural and methodological factors in the design and delivery of an evaluation can help to ensure that an evaluator’s independence is protected. 

Avoid the pitfalls…

Successful evaluations require everyone involved to be aware of, and proactively address, these common challenges. Ensuring clarity about data access and quality, engaging stakeholders with integrity, employing transparent evaluative reasoning, and being flexible and adaptable on methodology are all key steps. Establishing shared intent and maintaining a balance between collaborative working and independent analysis also strengthens evaluations. 

By being aware of these ‘usual suspects’ at the outset, evaluators can deliver robust insights that have a real-world impact on policies and programs. Addressing the challenges head-on not only enhances the credibility of the evaluation but also ensures that it serves as a valuable tool for decision-making and development. 

Get in touch to discuss how your organisation can manage and avoid risks in delivering evaluations. 

Connect with Robert Sale on LinkedIn.

Prepared with input from Heidi Wilcoxon and Andrew Benoy.

This is the second in our four-part series of articles on good practice evaluation. This series focuses on the steps you can take to ensure rigorous, high-quality evaluations. Download the full series here.