UNDP United Nations Development Programme برنامج الأمم المتحدة الإنمائي
Programme on Governance in the Arab Region برنامج إدارة الحكم في الدول العربية POGAR
Publications: Judiciary

- Introduction

- Assessments
- Elements of an Overview Assessment
- Monitoring
- Evaluation
- Research
- Conclusions

- References
- Annex I: List of Basic Statistics for an Assessment
- Annex II: A Proposed Instrument for Assessing Judicial Operations

Assessments, Monitoring, Evaluation, and Research: Improving the Knowledge Base for Judicial Reform Programs
By Linn Hammergren

Monitoring:
Until the recent emphasis on results management, monitoring of judicial reforms was conspicuous by its near absence. When it was done, the emphasis was on tracking inputs or outputs delivered, not the impact on overall goals and objectives. While the shift to tracking results is important, it poses problems for this kind of project because of how institutional change is accomplished. The underlying logic of any institutional change project is as follows:

  • Certain external impacts are pursued by changing the ways institutions operate.
  • Institutional change is effected by shifting the mix, composition, and quality of internal variables.
  • Until internal behaviors have responded to these shifts, outcomes and impacts are not likely to vary, and if they do, there may be a temporary decline in quality or quantity while the new patterns are learned and perfected.

The process described thus has two main stages: first, alterations in internal operations are effected, and once they are in place, outcomes should improve. This suggests a sort of progress by plateaux in which the second-order outcome changes will not occur incrementally, or will not begin to do so, until the first-order changes are relatively complete. In reality, there is also a third stage, in which improved outcomes (court performance variables) produce changes in extrajudicial behaviors and thus in overall societal well-being. We would not expect to detect outcome changes (and the subsequent impacts on societal goals) until the internal change process is well advanced. The resulting dilemma can be phrased as two related questions:

  • How do we know anything is happening in the meantime?
  • How can we be sure the long-term goals will ever be achieved (if, for example, the hypothesized links between internal change, outcomes, and impacts do not hold)?

Only the first is really a monitoring problem, but its solution is closely related to the second. If one cannot directly monitor progress through incremental changes in the final objective (whether reduction of delay, improved decision making, greater trust in courts, or further downstream, higher growth rates, poverty or crime reduction), then one will have to monitor it by tracking progress in making intermediate process modifications. In essence, this means monitoring by benchmarks – the accomplishment of the various stages in the execution of a change strategy which in the end will produce fundamental alterations in how things are done and thus in the value of outcomes. For example, a delay reduction effort might start with the establishment of baseline data, proceed through an analysis of process bottlenecks, develop means for changing those not regulated by law and encouraging any needed legal regulations, introduce improved manual or automated systems, and so on. The benchmarks generally coincide with the project strategy and plan, following the chronological order they establish.
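The idea of monitoring by benchmarks can be illustrated with a minimal sketch. It assumes the stages of the delay reduction example are recorded as an ordered list and that each is simply marked as reached or not. The `Benchmark` class, the `progress_report` function, and the completion flags are hypothetical, introduced only for illustration; they are not drawn from any actual project.

```python
from dataclasses import dataclass


@dataclass
class Benchmark:
    """One stage in the execution of a change strategy."""
    name: str
    done: bool = False


def progress_report(benchmarks):
    """Summarize progress through an ordered list of benchmarks.

    Returns the number completed, the total number, and the name of the
    first pending benchmark (None when every stage is complete).
    """
    completed = sum(1 for b in benchmarks if b.done)
    next_pending = next((b.name for b in benchmarks if not b.done), None)
    return completed, len(benchmarks), next_pending


# Stages follow the delay reduction example in the text; the completion
# flags are invented for illustration and describe no real project.
delay_reduction = [
    Benchmark("Establish baseline data", done=True),
    Benchmark("Analyze process bottlenecks", done=True),
    Benchmark("Develop means for changing bottlenecks not regulated by law"),
    Benchmark("Encourage any needed legal regulations"),
    Benchmark("Introduce improved manual or automated systems"),
]

completed, total, next_pending = progress_report(delay_reduction)
print(f"{completed} of {total} benchmarks reached; next: {next_pending}")
```

A report of this kind shows only that the strategy is being executed in order. It says nothing about whether the strategy itself will produce the intended outcomes.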

It is also possible, going to a much lower level of detail, to track the impact of these partial internal changes on lower-order behaviors. For example, in a project aimed at improving the coordination of police and prosecutorial operations in Panama, and thus their success in detecting, investigating, and prosecuting crimes [9], intermediate progress was measured by tracking changes in specific procedures – types of evidence collected, rate of consultations between police and prosecutors, or even something as simple as improvements in the format of the police charge sheet. These are extraordinarily context-specific indicators, as regards not only the specific country but also the problems identified for resolution. In Panama, it was thought important to encourage police to number the pages of the charge sheet, and this became an indicator of progress. In the delay reduction project mentioned above, behavioral changes might include participants’ adoption of the new systems; evidence that, once the means to do so are installed, those responsible are keeping track of the progress of individual cases and trying to meet deadlines; evidence that the records of individual judges or courtrooms are being monitored; and so on. In both examples, internal changes should be visible long before the outcome goal (more investigations completed within a shorter time, a higher clearance rate for criminal cases, or reduced average time to disposition and backlog reduction for all cases) shows any signs of impact.

In the end, the significance of achieving the benchmarks or effecting the intermediate behavior changes hinges on the validity of the overall strategy. If training judges is not going to produce better decisions, then monitoring stages in implementing a training program or judges’ absorption of new skills and knowledge will not have much point. This type of monitoring only makes sense in the context of a viable change strategy. The solution to this quandary is not to return to an effort to monitor outcomes and impacts. However much those responsible might want to do so, they are not going to have visible system-wide results for years. Instead, the emphasis will have to shift to improving the quality of strategies and finding some way to test the hypotheses on which they rest. There is, most would agree, considerable room for improvement here. Much of it rests on the other elements of better knowledge management, including improved analyses of country-specific problems (assessments), a better use of the lessons of past experience (starting with more systematic evaluations to bring them forth), and a concerted effort to explore hypothesized linkages for which empirical evidence is scant (research). This reinforces the relationship already noted among the topics explored in this article, as well as the importance of increased attention to knowledge consolidation, dissemination, and debate.

Although the topic of improving strategies extends beyond the themes discussed here, two further measures warrant noting: an effort to ensure that reform programs do have explicit strategies, thus lending themselves to monitoring, ex ante quality control, and ex post evaluation; and the use of pilot programs to do a quick check on the anticipated outcomes. As regards the first measure, as critics have frequently noted, large and small programs alike are too often characterized as collections of activities with only the most tenuous links to their presumed downstream goals. They embody strategies only in the loosest meaning of the term. Forcing them to articulate the means-ends changes they presumably incorporate might be a way of better aligning outcomes with inputs (how much impact will a code or a course on ethical behavior have on curbing rampant corruption?), weeding out some of the less likely proposals (will $19 million of new buildings and $1 million of computer equipment really increase access to justice in a country where judges are believed to be corrupt, incompetent, vulnerable to political pressures, and guided by outdated rules and procedures?), and establishing a better base for both monitoring and evaluation. This does not eliminate space for innovation. It does mean that an innovative approach which produces no results will be less likely to be reattempted in the future.

Novel approaches which cannot be tested by logic or past experience can be tried out in pilot projects. Rather than build all the new, computer-equipped courtrooms, or train the entire judiciary, the proposed approach can be applied in one district to see not only whether it has an impact, but also how that impact might be enhanced. Pilots can also be used to assess the costs of successful implementation, evaluate competing approaches, and make adjustments so that replication, if desired, will be more within the means of the country. Pilots, of course, are special cases, and should be evaluated with this understanding. What works for a group of judges who see themselves as, or perhaps are selected for, being in the vanguard will probably have less impact on the entire universe of judges. Pilot projects are themselves no novelty in judicial reform programs. Unfortunately, they have too often become ends in themselves – neither created with an eye to their replicability, nor utilized as inputs to a broader knowledge base. This accounts for a growing skepticism about their value, for which their backers must take a large part of the responsibility. The other part lies in an inadequate evaluation of their results and potential impact, which leads to a third element of the knowledge management agenda.

____________________________
[9] This was a USAID project, undertaken in the early 1990s. I am indebted to Tim Cornish, a consultant with the project, for providing documents and explanations of the benchmarking process.
