[Image: A gloved hand holds a handgun next to a tray of bullets.]

Firearm Forensics on Trial

Defense attorneys appealing a murder conviction enlisted the expertise of statisticians to determine whether the field of firearm forensics is grounded in solid science. The answer? Not even close.

By Alexander Gelfand

One day last March, biostatistician Michael Rosenblum got an email from the chief attorney in the Forensics Division of the Maryland Office of the Public Defender.

He was more than a little surprised.

Jeff Gilleran wanted to know if Rosenblum would assess the scientific validity of forensic firearms examination, a discipline better known as “ballistics” to fans of police procedurals like CSI and Law & Order.

“I initially thought, How could I help?” says Rosenblum, PhD.

It was an understandable response. A professor in Biostatistics, Rosenblum has spent the last 13 years developing improved statistical methods and free software for designing and analyzing clinical trials. His research has informed FDA guidance for designing COVID-19 treatment trials, and he’s collaborated with clinical investigators to design trials to evaluate treatments for Alzheimer’s disease and stroke. 

Firearm examiners, meanwhile, are tasked with determining whether the bullets and cartridge cases found at a crime scene were fired from a suspect’s gun. Most of them work in police department crime labs or for state or federal law enforcement agencies. Their job entails test-firing the gun in question and comparing the resulting bullets and cartridge cases side by side under a microscope with those collected at the scene. They also serve as expert witnesses—typically for the prosecution—by testifying that the fired evidence found at the scene came from the suspect’s weapon.

“It’s deadly evidence,” says Gilleran, who notes that juries, which tend to view any kind of forensic evidence as objective and scientific, rarely doubt a firearms examiner’s testimony. “The bullets match the gun. The gun was found in the defendant’s apartment. What else do I have to know as a juror?” he asks rhetorically.

‘We’re Not Scientists’

The number of cases involving forensic firearms evidence is hard to pin down, according to Jeff Salyards, PhD, a researcher with the Center for Statistics and Applications in Forensic Evidence. But the best source of data—a 2016 report by the federal Bureau of Justice Statistics—found that in 2014, publicly funded forensic crime labs completed some 154,000 requests for what is sometimes called “firearms/toolmarks” examination.

Gilleran and his colleague Molly Ryan explained to Rosenblum that they needed someone who could evaluate the quality of the studies that are cited in court. 

“We’re not scientists,” Ryan says, “so we need help identifying the issues.”

Initially, Gilleran simply told Rosenblum that he would be one of several scientists contributing to an amicus brief for the Supreme Court of Maryland, the state’s highest court of appeals. (Filed by amicus curiae, or “friends of the court,” an amicus brief is submitted by third parties who are not directly involved in a case but seek to offer information in support of one side or the other.) 

Only later would Rosenblum learn that the brief was being filed on behalf of Kobina Ebo Abruquah, who had already spent a decade in prison for the shooting death of Ivan Aguirre-Herrera in 2012. Aguirre-Herrera’s body was found in the house where both men lived in Riverdale Park, Maryland. At trial, a forensic firearms examiner testified that the bullets recovered from Aguirre-Herrera’s body had been fired from Abruquah’s gun. A jury found Abruquah, then 40, guilty of first-degree murder and the use of a handgun in the commission of a crime of violence. A judge sentenced him to life plus 20 years. 

Abruquah’s lawyers have appealed his conviction by questioning the scientific validity of the methodology used by firearms examiners and of the studies that support their conclusions. 

“Trying to establish scientific validity is something I think about all the time,” Rosenblum says. “This is part of my and my colleagues’ everyday work to ensure that the research being done to evaluate new medical treatments is scientifically sound.”

A Litany of Flaws

In the case of firearms examination, establishing scientific validity turned out to be problematic. The central issue, Rosenblum explains, is that firearm examiners claim to be able to determine whether the bullets found at a crime scene were fired from one particular gun and no other. Doing so, however, presupposes that every gun imprints a set of unique physical characteristics on bullets and cartridge casings during the firing process.

Yet as Rosenblum discovered, the notion that every single gun has a signature distinct from every other gun has not been established scientifically—nor has the ability of firearm examiners to reliably and accurately say that a bullet was fired from one gun to the exclusion of all others.

“There have been many studies of this, but each one is lacking in at least one important aspect … to establish scientific validity,” Rosenblum says.

Some of those deficiencies were first spelled out in a 2009 National Academy of Sciences report that criticized the scientific basis of a variety of forensic disciplines. (The sole exception was DNA testing, which grew out of biomedical science rather than law enforcement.) Arturo Casadevall, MD, PhD, chair of Molecular Microbiology and Immunology at the School and a former member of the National Commission on Forensic Science, says the NAS report was a “shock to the system.” It led to an even more detailed 2016 critique by the President’s Council of Advisors on Science and Technology.

Although both reports garnered pushback from prosecutors and law enforcement, they were eagerly taken up by defense attorneys—and prompted efforts by some forensics experts to put their field on more solid scientific footing. “Everyone that I have dealt with in the forensic community are good people who are really trying to do the right thing,” says Casadevall.

Nonetheless, Rosenblum turned up a litany of flaws in the studies that prosecutors and ballistics experts use to support the practice. These range from inadequate sample sizes (most studies involve 30 to 40 guns, whereas the total number of firearms circulating in the U.S. is estimated at approximately 400 million) to a lack of transparency and peer review (most have been published in trade journals, and data are rarely shared).

‘It’s Not a Real Test’

Perhaps the most troubling issue had to do with the calculation of error rates, which indicate the likelihood that an examiner will correctly conclude that a bullet did or did not come from a particular gun. 

Current validation studies, most of which were conducted by forensic firearms experts themselves, report error rates ranging from 0% to 11.3%. (In their simplest form, these studies involve presenting examiners with a pair of bullets that have been fired under laboratory conditions from various guns, and asking them to determine whether the bullets came from the same gun or from different guns.) 

But as Rosenblum points out, the manner in which most validation studies have been conducted means that the true error rate remains unknown.

For example, the examiners who participate in such studies are typically permitted to reach one of three conclusions: The two bullets were not fired from the same gun (exclusion); they were fired from the same gun (identification); or they can’t tell (inconclusive). An inconclusive finding, Rosenblum says, clearly means that the examiner didn’t reach the correct answer, since every pair of bullets must have been fired either from the same gun or from different guns. Yet most validation studies count inconclusives as correct or exclude them entirely.

Rosenblum compares the situation to one in which students taking the SAT were allowed to skip any question they couldn’t answer. “It’s not a real test; everyone will get a perfect score,” he says. Indeed, when inconclusives are counted as incorrect, error rates suddenly climb as high as 93%. 
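The arithmetic behind that objection is simple enough to sketch in a few lines of code. The counts below are invented purely for illustration and come from no actual validation study; the point is only how the same set of responses yields very different error rates depending on how inconclusives are scored.

```python
# Illustration only: how the scoring policy for "inconclusive" responses
# changes a reported error rate. All counts here are hypothetical.

def error_rate(correct: int, incorrect: int, inconclusive: int, policy: str) -> float:
    """Error rate under a chosen policy for handling inconclusive responses."""
    if policy == "count_as_correct":
        # Inconclusives stay in the denominator but are never errors.
        return incorrect / (correct + incorrect + inconclusive)
    if policy == "exclude":
        # Inconclusives are dropped from the test entirely.
        return incorrect / (correct + incorrect)
    if policy == "count_as_incorrect":
        # Inconclusives are treated as wrong answers.
        return (incorrect + inconclusive) / (correct + incorrect + inconclusive)
    raise ValueError(f"unknown policy: {policy}")

# Suppose 100 comparisons: 60 correct, 5 incorrect, 35 inconclusive.
for policy in ("count_as_correct", "exclude", "count_as_incorrect"):
    print(policy, round(error_rate(60, 5, 35, policy), 3))
# count_as_correct -> 0.05, exclude -> 0.077, count_as_incorrect -> 0.4
```

With these made-up numbers, the reported error rate ranges from 5% to 40% depending entirely on the scoring convention, which is the crux of Rosenblum’s SAT analogy.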

In previous testimony at trial, James Hamby, a forensic firearms examiner and expert witness for the state, contended that inconclusives should not be counted as errors. 

“We’re talking about a mechanical issue where bullets fly through the air, they hit bodies, they bounce off things,” he said, implying that the damage to a given bullet could make it impossible to say with certainty whether it did or did not come from a particular weapon. 

As the defense noted, however, while that may be the case in field work, it is not in validation studies, which use pristine samples fired under controlled conditions. 

What’s more, one of the best-designed studies, known as Ames II, revealed that firearms examiners often reach different conclusions when presented with the same sets of bullets in several rounds of testing. Different firearms examiners, meanwhile, disagree with one another’s conclusions 32% to 69% of the time. (When questioned about the Ames II study, Hamby professed not to know that it had in fact been designed to measure not only accuracy but also repeatability and reproducibility.)

Firearm Forensics’ Future

Given these shortcomings, Rosenblum says he doesn’t see how firearms examination could be used as a reliable source of evidence in a criminal trial—though he has begun thinking about how to improve the situation. Test samples, for example, could be randomly slipped into the workflow of forensic firearms examiners to assess their error rates, as is already done in the case of DNA testing. The number and types of guns could be increased. And validation studies could be modeled after clinical trial templates and subjected to peer review. 

In the meantime, however, it’s up to the courts to decide how to treat forensic firearms evidence—which is precisely why the amicus brief Rosenblum helped write, signed by 12 other independent scientists including Casadevall, is so important.

Abruquah’s lawyers drew heavily on it in oral arguments before the Supreme Court of Maryland in November. There is no deadline for the court to rule, but other lawyers can now cite the brief, providing them with a powerful tool for educating judges and juries about the current weaknesses of forensic firearms examination—and perhaps preventing other defendants from being convicted on the basis of evidence that may be considerably less trustworthy than its proponents suggest.

“It’s important for Mr. Abruquah,” says Gilleran. “But the importance of that brief goes way beyond this single case.”