UNDP United Nations Development Programme
Programme on Governance in the Arab Region (POGAR)
Publications: Judiciary

- Introduction

- Assessments
- Elements of an Overview Assessment
- Monitoring
- Evaluation
- Research
- Conclusions

- References
- Annex I: List of Basic Statistics for an Assessment
- Annex II: A Proposed Instrument for Assessing Judicial Operations

Assessments, Monitoring, Evaluation, and Research: Improving the Knowledge Base for Judicial Reform Programs
By
Linn Hammergren

Elements of an Overview Assessment:
1) Descriptive section: Description is often seen as a mere prelude to the real assessment, and consequently either given short shrift or done with no particular sense of what is relevant and what is not. It is not uncommon to find entire assessments composed of little more than a repetition of legal facts, a restatement of the laws with little attention to how they really operate. Thus, to facilitate cross-country comparison and to ensure the information gathered will be the most useful for country planning, the following guidelines are recommended:

    General Discussion of national context: There are some basic elements of the national setting which are critical to evaluating judicial performance. These include the population size, distribution (rural and urban, as well as regional dispersion), per capita income and income distribution, economic base, literacy rates, ethnic and other divisions, and so on. Certain historical information will also be needed – colonial tradition, form of government and any recent changes (e.g. redemocratization), recent ethnic conflicts and civil wars, and salient political issues. The treatment should be brief, and is largely for outsiders, but it may even be helpful to local planners in interpreting what follows.

    General Description of Legal Tradition and Basic Organization: While this general description is primarily for “outsiders” who will either review or make inputs to the reform plan, it may be a useful reminder to insider judges and others of what the entire system comprises. Emphasis here is less on how the system operates than on establishing its basic organizational and procedural parameters. Although a few basic statistics will enter here (how many members on the Supreme Court, number of court districts, etc) most of these are reserved for the next section.

First off, what is the legal tradition (common, civil or some subcategory or other mix)? Is there an indigenous or traditional source of law and what is its legal and organizational relationship to the state system? What are the basic laws shaping the court or justice system, when were they written, and how often have they been revised? How much of the organization, powers, etc. of the sector is governed by the Constitution, by statutory law, or by regulations issued by the organizations themselves? What are the major institutions in the ordinary justice system, and what are the peripheral ones (administrative and other special courts, administrative police, bar associations, independent regulatory agencies with judicial powers, etc.)? How is legal representation handled? Are there traditional or communal adjudicatory bodies outside the ordinary court system? If so, how are they organized and where are they located?

As this is the most free-form part of the exercise, it would be helpful to establish a general outline to be used by those without a strong independent preference for their own descriptive scheme. This would make the results more useful for comparative purposes and also avoid the omission of important categories. Even experienced reviewers frequently overlook administrative or parallel court systems, or ignore things like the presence, organization and compulsory nature of bar associations and memberships. The section analyzing operational workings may remind them of their omissions, but a basic checklist of what should be included in the descriptive system would also be helpful. It might be a joint product of agencies doing many of these assessments. It then could be made available to all interested in undertaking one.

Basic Statistics (so far as available) on Sector operations: It is extremely helpful, although often very difficult, to include statistics ranging from the number of judges, court staff, prosecutors, private lawyers, etc to the number of cases filed (by category, district, individual court), the caseload per judge, annual filings and dispositions, backlog, and if possible disposition rates and times. Figures on budgets, the funding sources (how much from the national budget, how much from judicial fees or taxes?), salaries, and expenditure categories are also useful as are crime rates (at the various points of reporting, including, if available, victim surveys). The reason for collecting these statistics is to flesh out the narrative descriptions of judicial and sector operations and potential problems, and to allow comparison with figures drawn from data bases on other systems.

As further elaborated below, single statistics tell us little, and even a complete statistical profile of a country must be interpreted with care. However, for a reader with some knowledge of worldwide trends, a few key numbers (judges and cases per 100,000 population, ratio of judges to court staff, prosecutors, defenders and private lawyers, budgets and salaries) may provide a much better handle on performance and performance problems than the longest narrative description. This also suggests the utility of a standardized format for data capture and presentation. Experience suggests that some statistics are more meaningful than others, and that the manner of their presentation also facilitates their use. The lump sum spent on salaries is far less useful than the salary range for different kinds of employees (and this in turn is meaningful only judged against per capita income and salary scales for similar public and private sector employees). An understanding of the real workload usually requires several statistics: cases pending, filed and disposed on an annual basis, broken down by level of court, and if possible by major types of proceedings. A list is provided in the annex outlining the usual categories and conventional presentations of data.
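The key numbers mentioned above can be derived from a small standardized record. The following is a minimal sketch in Python, not a prescribed format: the class name, field names, and sample figures are all hypothetical, and the disposition rate is assumed here to mean dispositions divided by filings in the same year.

```python
from dataclasses import dataclass


@dataclass
class CourtYearStats:
    """One year of basic statistics for one court level (hypothetical fields)."""
    population: int      # inhabitants served
    judges: int          # sitting judges
    pending_start: int   # cases pending at the start of the year
    filed: int           # cases filed during the year
    disposed: int        # cases disposed during the year

    def per_100k(self, count: int) -> float:
        """Scale a raw count to a rate per 100,000 population."""
        return count * 100_000 / self.population

    def judges_per_100k(self) -> float:
        return self.per_100k(self.judges)

    def filings_per_100k(self) -> float:
        return self.per_100k(self.filed)

    def filings_per_judge(self) -> float:
        return self.filed / self.judges

    def disposition_rate(self) -> float:
        """Assumed definition: dispositions / filings; below 1.0 the backlog grows."""
        return self.disposed / self.filed

    def pending_end(self) -> int:
        """Cases still pending at year end."""
        return self.pending_start + self.filed - self.disposed


# Invented figures, for illustration only.
stats = CourtYearStats(population=2_000_000, judges=160,
                       pending_start=30_000, filed=48_000, disposed=42_000)
print(stats.judges_per_100k(), stats.filings_per_judge(),
      stats.disposition_rate(), stats.pending_end())
```

With these invented figures the sketch gives 8 judges per 100,000 inhabitants, 300 filings per judge, a disposition rate of 0.875, and 36,000 cases pending at year end. None of these numbers means much alone; their value lies in being captured the same way for each court level and each country so that they can be compared.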

Obviously, the availability and quality of statistics are highly variable. In some countries, it may be impossible to get a good count even of the judges and lawyers. Individual court districts may have no idea of the size of their caseload or what is being resolved and at what speed. Furthermore, available statistics are often of dubious quality. Used largely for end-of-year speeches by agency heads, they may never be checked and, as a result, are supplied quite casually. Finally, whatever their accuracy, they may not be provided in very useful forms. Socio-economic information on court users is frequently unavailable even in the most sophisticated systems, as are useful breakdowns by type of case, or comparable categories across jurisdictions. Despite long years of statistical studies of judicial operations in the United States, researchers still find it difficult to make comparisons across state systems. A recent effort [3] to accumulate comparable statistical data bases on a number of European courts has footnotes as long as the statistical tables, allowing the authors to stress the difficulty, if not impossibility, of making precise comparisons. Even in such more developed systems, something as simple as an appeals rate is often impossible to calculate.

Nonetheless, numbers are important, and efforts should be made to collect them. They are helpful in determining the relative importance of phenomena identified more impressionistically and may also help put a different slant on problems reported by local observers. It is rare to find a judiciary that is not described as swamped with work, but figures for annual filings or dispositions often present another picture. In the worst of cases, assessors may have to supply their own statistics, conducting inventories or studies based on smaller samples. Most assessments will not have funding to do this, and thus the absence of a good, reliable statistical system may be among their most important findings. In that situation, they will just have to collect what they can, note what is absent, and interpret their findings with extreme caution.

It is also important to recognize that the numbers don’t speak for themselves, and where they do speak it is usually cumulatively, contextually, and comparatively. There has been an on-going effort in recent years to collect and use court statistics as basic indicators of judicial performance. This, it should be stressed, is not why their collection is recommended here, and it brings its own problems. Efforts to elevate certain statistics (average caseload, number of judges per 100,000 population, ratio of judges to court staff or to private lawyers, conviction rates, etc.) to the level of indicators (representative of some aspect of the quality of judicial performance) or, even worse, to develop one or ten basic indicators of the same have yet to be abandoned, although most who have attempted this have soon discovered the practical and logical shortcomings [4]. Are 22 judges per 100,000 inhabitants better or worse than 8 or 30? What is a reasonable caseload for individual judges? A good disposition rate? A good appeals rate? To begin to develop answers to these questions we need far more information on the range of variations (one additional reason for urging the collection of statistics, and in some fairly standard categories), and even then, the answer is far more likely to be a range rather than a single figure, and still subject to contextual constraints.

In short, statistics, when available, should be considered primarily as ways of enriching the description of the target system, not as a direct means of evaluating its performance. That there is not, and probably never will be, one statistic that can be used to measure the quality of justice is a consequence not only of the various values pursued, but also of the different contextual situations of specific systems. These descriptive statistics may, once a problem has been identified, be a means, collectively and comparatively, of deriving some hypothetical explanations. The number of judges absolutely or per 100,000 inhabitants means nothing on its own. Combined with litigation rates, number of lawyers, and some understanding of the procedural rules and requirements for representation, it may begin to give us a picture of whether a complaint about delay or lack of access is valid or not.

The same need for caution should be applied to more complex statistics and statistical trends. Although they look more like good indicators, they are usually subject to a variety of interpretations and explanations. Ironically, the same statistical profile might characterize a very good and a very bad system, and as the situation develops over time, rates and trends may take unexpected directions. More or less is not always better. A high conviction rate on criminal cases brought to trial might mean a well-trained prosecution with a good nose for what should go to court, or just an easily pressured judiciary. Where the latter is the case, an effort to impose due process rules and curb other abuses may well lower the conviction rate, at least temporarily. A drop in the percentage of untried prisoners (a usual goal for human rights groups) might mean a very selective use of pretrial detention, or that other, less desirable means have been found for reducing the prison population. As has been frequently observed, crime rates depend on many things besides an effective criminal justice system. A system may be so ineffective that citizens simply stop reporting crimes. As improvements are made, and citizens begin to have more faith in it, reported crime rates are likely to rise. The highly desirable 50-50 rate for overturns on appeals (indicating that only very difficult cases go to appeal) could also be produced by a system that operates like a lottery or, even worse, by a corrupt one.

For those used to dealing with general purpose indicators like infant mortality rates or GDP, the ambiguous significance of candidate indicators for judicial performance will be perplexing. It is a logical consequence of the judicial role, the variations in how it is defined and enacted in different countries, and the complex nature of its output. The judiciary is a reactive rather than proactive institution, and the raw material with which it works (conflicts and rule violations) is shaped by a variety of other forces. Its own statistical systems track the tip of the iceberg (the cases getting to the courts), not the broader social phenomena. Limiting performance measures to how well it deals with the business it receives would be tantamount to judging a public health system only by the number of sick people treated. Both measures are important, but both, if used exclusively, would also create perverse incentives. Courts, like public health systems, have a preventive as well as a curative role, and if performance evaluation hinges only on the latter, they may be encouraged to waste their resources by attracting more business than is necessary and focusing on processing the easy, not the important, cases [5].

2) Analysis of Internal Operations: Because this is the area where most reforms will be directed, it is an especially critical part of the assessment. The objective here is not to focus on output problems (e.g. delays, bias, barriers to access), but rather to understand how the organization or organizations function. The immediate challenge is to select those characteristics which are most influential in shaping sector output and defining its quality. Early assessments often did this by measuring organizational and operational details against an implicit model, usually drawn from the assessors’ own country. The recognition that function can and should be separated from structure and that there are consequently various paths (or path dependencies) to improving performance has made this less acceptable. However, it has also removed the implicit yardstick. Fortunately, there are any number of suggested substitutes. In fact, over the past two decades, those called upon to do assessments have often seen part of their task as leaving a format that might be used by others [6]. There is still no agreement on which is best, and a decided tendency to reinvent rather than adopt some existing proposal.

Presumably this sort of checklist should be short, and its elements should have a direct relationship to predicting output. As many normal inclusions are already covered in the above sections (statistics, general description) most of the suggested templates could be substantially condensed. The suggestion here is to treat the judiciary (and other sector institutions covered) as an organization with all the typical requirements for effective functioning. This means a way of selecting appropriate staff, provisions for supervising and directing their performance (without, in the case of judges, interfering with the necessary degree of independent decision making), adequate resources and a system for administering them, and procedural rules congruent with quantitative and qualitative standards for output. The attached table is one suggestion of how this might be covered. It is divided into three dimensions, one covering simple organizational functionality (here called institutional governance) and the others relating to independence and accountability. It is particularly tailored for the judiciary, and would have to be slightly modified for other sector institutions, especially as regards the independence and accountability dimensions.

Aside from a reluctance to adopt existing schemes, the biggest obstacle to constructing this sort of assessment guide has been the tendency for legal professionals to define the problem in terms of the adequacy of the legal framework. Given the often imperfect coincidence between legal requirements and what really transpires, this has not been a very productive approach. It has not been improved by an accompanying faith in certain doctrinal principles believed intrinsic to better performance [7]. At least in explaining internal operations, the principles of neo- and classical organizational (or institutional) analysis are arguably better guides. Judiciaries may have special characteristics, but they still have to provide adequate incentives to their professional and administrative staff, monitor performance, and administer their resources effectively. Their failure to do so appears to account for many common problems and thus a good part of reform programming will inevitably focus on these functions.

3) Discussion of the Major Problems Attributed to Sector Performance: While it may be a hard message for their strongest proponents to swallow, judicial reforms are usually not supported for their own sake but as instrumental means of achieving some larger goal. At the very least such goals involve improving the provision of services already delivered (greater efficiency and efficacy in deciding conflicts and otherwise dealing with existing demand). Often they refer to downstream events, the judiciary’s impact on such things as crime, economic investment and growth, incorporation of marginalized groups, citizen security and so on. The links between the two levels of goals are not very well defined. Efforts to articulate them in any given country often rest on scant and frequently inaccurate understandings of what courts actually do. The ultimate solution, a good, empirically based theory of judicial roles and impacts, is, however, a long-term project. For specific assessments, the immediate task is to inventory and test current complaints, and to identify other problems which may be less obvious to those standing too close to the system.

Any reform project must logically start with a problem or with a desired change in the status quo ante. Hence, early on and throughout the assessment, information will have to be gathered on what people believe to be wrong, as this will be the motivating force for the reform. Initial responses are likely to be vague: the judiciary is not modern, the laws are out of date, there is too much delay, judges are overworked, biased, corrupt, or incompetent. Assessors will have to attempt to elicit more precision or at the very least some concrete examples of the alleged ills. Ideally, this will provide them with a list of more targeted areas for improvement with easily identified causes (measurable delays or paralysis of cases caused by procedural bottlenecks, inefficient administrative practices, or too few judges or support staff; limited access caused by the physical distribution of courts, high court fees, or requirements for representation; widespread use of speed money or petty corruption caused by low salaries, insufficient supervision, or court practices which facilitate bribes). However, they are just as likely to find that problems are exaggerated, misdiagnosed, or entirely ignored, or that widely touted remedies are unlikely to produce much of any effect. Corruption or delay may be less common, or more localized, than usually believed. What looks like an excessive workload may be vastly inflated by a large number of inactive cases remaining in the courts. Judges may make good decisions, but execution may be avoided. Or reformers may believe that all it will take is an up-to-date law to increase the efficacy of the criminal justice system or bring foreign investment flooding into the country.
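The point about inactive cases inflating the apparent workload can be tested with very little data. The sketch below, in Python, assumes that a court can supply the date of the last recorded activity for each pending case; the function name, the two-year cutoff, and the sample dates are hypothetical choices made only for illustration.

```python
from datetime import date


def split_pending(last_activity_dates, as_of, inactive_after_days=730):
    """Return (active, inactive) counts of pending cases.

    A case counts as inactive when its last recorded activity is more than
    `inactive_after_days` before `as_of`. The 730-day default is an assumed
    cutoff, not a standard.
    """
    inactive = sum(
        1 for d in last_activity_dates
        if (as_of - d).days > inactive_after_days
    )
    return len(last_activity_dates) - inactive, inactive


# Invented sample: last activity date of four pending cases.
pending = [date(2020, 1, 15), date(2023, 6, 1),
           date(2018, 3, 10), date(2023, 11, 20)]
active, inactive = split_pending(pending, as_of=date(2024, 1, 1))
print(active, inactive)
```

In this invented sample, half of the nominally pending cases have seen no activity for more than two years, so the working caseload is much smaller than the raw pending count suggests. Where no activity dates are recorded at all, that absence is itself a finding worth reporting.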

To continue the medical analogy, assessors are very much like a doctor confronted with a patient with a variety of complaints and symptoms who has already decided what is wrong and what needs to be done. The patient’s complaints and symptoms cannot be ignored, but the self-diagnosis is frequently completely in error. What the patient would like to see fixed with a pill, a shot, or perhaps an operation may require a much more complex kind of treatment, including lifestyle changes and other less pleasant kinds of therapy. However, because judicial health is a much more subjective state, reported ills and even suggested remedies will necessarily be given more weight. On the basis of international experience, advisors should note the costs and other difficulties associated with making certain changes, or the likelihood that proposed solutions will not bring the desired results, but in the end what constitutes a better system is in the eye of its users, not in any sort of universal standards.

4) Analytic Summary: Taken as a whole, the three parts of the assessment provide an overview of the situation of the sector or organizations reviewed, list and prioritize the principal problems attributed by the local stakeholders or found by the assessors, and identify traits and practices accounting for current patterns of operations and output. The analysts’ further task is to use this information to derive an overall diagnostic of sector or organizational performance and suggest how it might be improved. Their reworking of the basic information will also separate spurious or second order problems from more basic ones, and note where solutions might be more practically pursued outside the sector or organization [8]. Areas where further investigation is required should also be noted; in this sense, the initial assessment should be regarded as a first cut at diagnosis. While the quality of the analysis hinges on the skills and knowledge of the assessment team, the emphasis on more standardized formats is intended to assist their efforts. It should help call their attention to details they might otherwise overlook, simplify the collection of information, and provide a basis for evaluating the significance of the immediate findings.

Because assessments (and evaluations) collectively constitute our best source of information on sector operations and reform programs, it is particularly important that they be made available to the entire reform community. There are evident obstacles to adopting this in practice, but unless they can be overcome, the discipline as a whole as well as individual reform efforts will find their progress limited.

____________________________
[3] See Blankenburg
[4] USAID developed such a list of 75 indicators in the mid-1990s. The list has since been relegated to the status of “suggested” indicators when field testing revealed that most were relevant only for a few systems, and that even the desirable direction of change was contextually determined. See USAID …
[5] In a study of the Mexican federal courts, two researchers found that an evaluation system based on cases resolved tended to encourage the admission of cases which would ultimately be dismissed for lack of merit. Since they counted as dispositions, judges had little reason to reject them out of hand (Magaloni and Negrete, n.d.).
[6] A sample of the efforts is found in Hammergren (1998).
[7] Many of these are now entrenched in international conventions which often seem to incorporate arbitrary, and occasionally ethnocentric, assumptions about best practices. While much of this is so vague as to constitute only a nod toward good intentions, efforts like those in Latin America to stipulate a guaranteed percentage of the national budget for the judiciary are potentially counterproductive.
[8] For example, research in Mexico (Hammergren et al., 2001) on debt collection proceedings found many problems originating outside the courts, such as poor practices in issuing credit, absence of credit bureaus, corrupt property registries, and lack of debtor education.
