Improving the Quality of Mass Justice

Agencies need to pursue systematic efforts to provide quality assurance in their adjudicatory processes.

Based on leading work by Jerry L. Mashaw, the Administrative Conference of the United States (ACUS) in 1973 issued a seminal recommendation urging federal agencies to use “statistical quality assurance reporting systems” anchored in “positive caseload management systems” to ensure timely, accurate, and fair adjudications. Since then, the challenges agencies face as they struggle to decide large numbers of cases, matters, and claims have only deepened.

The backlog the country’s immigration courts face has grown at record-breaking speed in recent years. Enormous spikes in caseloads have prompted the U.S. Department of Health and Human Services’ Office of Medicare Hearings and Appeals to test untried case management methods. The pandemic also forced fundamental changes to the hearings the U.S. Social Security Administration (SSA) convenes for hundreds of thousands of disability benefits claimants each year.

These challenges threaten to create a crisis in the quality of agency decision-making. Individual claimants can always appeal erroneous decisions. But if agencies err wholesale, any retail efforts at appeal, pursued one-by-one and haphazardly, will still leave many errors undetected. Matters as important as asylum or access to vital social welfare benefits hang in the balance.

There is good news. Agencies are as well-positioned as they have been since 1973 to improve the quality of their decision-making systematically. Agencies large and small have experimented with innovative methods to strengthen adjudicator performance. Several agencies have developed sophisticated case management systems to capture data they use to identify recurring problems and inform effective fixes. Artificial intelligence holds considerable promise.

To renew the work it started in 1973, ACUS asked us to report on quality assurance practices and how federal agencies might further develop them to ensure accurate adjudication. This work culminated in a new ACUS recommendation entitled “Quality Assurance Systems in Agency Adjudication.” Agencies that implement this recommendation will take important steps toward ensuring that their adjudicators accurately decide the millions of claims, cases, and matters that come before the federal government each year.

Our report drew upon prior work each of us has done to study and implement quality assurance systems. One of us, for example, spearheaded extensive efforts at the SSA to improve the quality of disability benefits adjudication. We have also studied and evaluated quality assurance programs used by numerous other agencies, including the Executive Office for Immigration Review and the Board of Veterans’ Appeals. Our research has included extensive analyses of data drawn from an agency’s case management system, a comprehensive historical survey of agency engagements with quality assurance, and reviews of thousands of pages of internal agency documents. We have also interviewed officials responsible for quality assurance at numerous adjudicatory agencies.

Our research yielded important lessons. One of these was that the heterogeneity in agency adjudication systems—from disability adjudication to patent examination—means that there is no one-size-fits-all approach to quality assurance. A meaningful program of quality assurance requires answers to several key questions:

  • Who participates in the review of adjudicator decision-making, and for how long?
  • Whose work gets reviewed?
  • How do cases get selected for review?
  • What standard should reviewers use to measure the quality of decisions?
  • At what stage in the adjudication process should decisions get reviewed?
  • How should a quality assurance program provide feedback to adjudicators?

These questions involve important matters of institutional design. For instance, a large agency handling a high caseload might opt for a formal quality review program, with dedicated staff who serve years-long terms and who sample a small but significant percentage of cases for review. A small agency may not be able to justify this resource investment. But its adjudicators might still participate in a peer review program, as we observed at several agencies, in which adjudicators assess and comment on drafts authored by colleagues.

Institutional concerns also involve where reviewers fit into an agency’s hierarchy. Reviewers must have the requisite expertise to command the respect of adjudicators whose work they review. They need the capacity and willingness to exercise independent judgment. At the same time, constructive engagement with feedback may depend on adjudicators’ sense that quality reviewers understand and sympathize with the day-to-day demands that decision-making imposes.

Other matters that should influence a quality assurance program’s design include an agency’s relationship with a reviewing court and whether internal agency reviewers ought to calibrate their reviews to how decisions might fare if appealed. The SSA, for instance, has a program in which judges from its Appeals Council and frontline administrative law judges, adjudicators at different levels of the appeals process, work together to resolve differences in interpretation. Also important are the mechanisms agencies can use to provide feedback. These mechanisms may depend on the agency’s information infrastructure and the relationships between supervisors and subordinates.

Another important lesson from our research centers on emerging tools for quality assurance that agencies have pioneered since the 1973 ACUS recommendation. These include, importantly, the development of case management systems and case analysis tools that enable agencies to capture large amounts of data on decision-making. If properly designed, these systems can enable agencies to identify issues that cause outsized numbers of errors. Rather than relying on individual appeals to correct mistakes one-by-one, agencies can leverage lessons from data to develop targeted interventions against recurring flaws.
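
To make the idea concrete, the short Python sketch below shows one way an analyst might rank recurring error categories from records exported out of a case management system. The file name, field names, and error categories are hypothetical illustrations, not features of any particular agency’s system.

    import csv
    from collections import Counter

    def rank_error_categories(path):
        """Count how often each error category appears in quality-review
        records and return the categories ranked by frequency."""
        counts = Counter()
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                # Each record lists the categories reviewers flagged, e.g.
                # "missed-evidence;inadequate-explanation" (hypothetical).
                if row["errors_found"]:
                    counts.update(row["errors_found"].split(";"))
        return counts.most_common()

    # The most frequently flagged categories can then inform targeted
    # interventions such as training or policy clarification.
    for category, count in rank_error_categories("quality_review_sample.csv"):
        print(category, count)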

At the frontier of quality assurance lie uses of artificial intelligence and data-driven techniques. Natural language processing and even more ambitious uses of machine learning can enable agencies to steer decision-making in real time to avoid errors. The SSA, for instance, has developed a tool that can review draft decisions and flag over 30 types of possible errors for adjudicator review before a decision’s issuance.
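
The SSA’s production tool is far more sophisticated than anything that fits in a few lines, but a toy sketch can convey the basic idea of screening a draft decision before issuance. The checks below are invented pattern rules, a stand-in for the natural language processing a real system would use, and the error types are hypothetical.

    import re

    # Hypothetical, illustrative checks: each names an error type and a phrase
    # the draft is expected to contain. A real system would rely on trained
    # natural language processing models rather than simple patterns.
    REQUIRED_LANGUAGE = {
        "missing onset-date finding": r"\bonset date\b",
        "missing vocational analysis": r"\bvocational\b",
    }

    def flag_draft(draft_text):
        """Return possible issues for the adjudicator to review before the
        decision issues; an empty list means no checks fired."""
        flags = []
        for issue, pattern in REQUIRED_LANGUAGE.items():
            if not re.search(pattern, draft_text, re.IGNORECASE):
                flags.append(issue)
        return flags

    print(flag_draft("The claimant's alleged onset date is January 3, 2020."))
    # ['missing vocational analysis']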

These tools may affect institutional design choices. If an agency’s personnel and resources would otherwise permit quality review of only a small sample of decisions after issuance, artificial intelligence tools may allow the agency to subject every decision to continuous evaluation.

Prompted by our report, ACUS’s Committees on Adjudication and on Administration and Management proposed a recommendation, which ACUS adopted in December 2021. The recommendation renews ACUS’s 1973 call for quality assurance in agency adjudication.

Most importantly, the recommendation advises federal agencies to consider developing quality assurance systems “to promote fairness, the perception of fairness, accuracy, inter-decisional consistency, timeliness, efficiency, and other goals relevant to their adjudicative programs.” Agencies with existing programs should review them to ensure that they advance these goals.

The ACUS recommendation includes a number of guidelines to aid agencies as they design and evaluate quality assurance programs best suited to their personnel, structures, and dockets. These guidelines correspond to the institutional design questions enumerated above. For instance, agencies should determine whether personnel should rotate through quality review assignments or remain permanently on a quality review team, considering how best to guarantee both their independence of judgment and their requisite expertise.

Agencies should also consider how to select cases for quality review, and when. The recommendation identifies several options, including random or stratified sampling, targeted selection of cases identified by specific case characteristics, and review of every case adjudicators decide. These choices will also depend on institutional capacity and constraints.
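
As a rough illustration of the sampling options, the sketch below draws both a simple random sample and a stratified sample (stratified here by adjudicator, one possible stratum among many) from a docket of decided cases. The docket, case identifiers, and strata are invented for the example.

    import random
    from collections import defaultdict

    def simple_random_sample(cases, k, seed=0):
        """Draw k cases uniformly at random for quality review."""
        return random.Random(seed).sample(cases, k)

    def stratified_sample(cases, stratum_of, per_stratum, seed=0):
        """Draw the same number of cases from each stratum (here, each
        adjudicator), so low-volume strata are not crowded out."""
        rng = random.Random(seed)
        strata = defaultdict(list)
        for case in cases:
            strata[stratum_of(case)].append(case)
        sample = []
        for group in strata.values():
            sample.extend(rng.sample(group, min(per_stratum, len(group))))
        return sample

    # Hypothetical docket of (case id, adjudicator) pairs.
    docket = [(f"case-{i}", f"adjudicator-{i % 5}") for i in range(100)]
    print(simple_random_sample(docket, 5))
    print(stratified_sample(docket, stratum_of=lambda case: case[1], per_stratum=2))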

The ACUS recommendation emphasizes the crucial importance of data collection and analysis. Agencies should design and administer case management systems that capture data capable of supporting systemic efforts to improve adjudicative quality. Data-driven findings should inform interventions designed to improve decision-making systemically, while preserving adjudicator independence.

Finally, and crucially, the recommendation notes that agencies should disclose how they design and administer quality assurance systems. Agencies should further consider disclosing de-identified data captured by case management systems. These disclosures would permit research by individuals outside the agency to ensure that quality assurance systems achieve their stated objectives.

Agencies cannot rely on appeals brought by individual claimants to improve decision-making systematically. Especially as caseloads increase, agencies need to pursue systemic efforts to ensure that their decision-making achieves an acceptable threshold of quality. The design and implementation of rigorous programs, informed by the guidelines in the ACUS recommendation and by innovations in quality assurance since 1973, will help agencies meet this challenge.

Daniel E. Ho

Daniel E. Ho is the William Benjamin Scott and Luna M. Scott Professor of Law at Stanford Law School.

David Marcus

David Marcus is a professor at the University of California, Los Angeles School of Law.

Gerald K. Ray

Gerald K. Ray was previously an administrative appeals judge and the Deputy Executive Director of the Office of Appellate Operations at the Social Security Administration.

This essay is part of a six-part series on the Administrative Conference of the United States, entitled Improving Transparency and Administrative Accountability.