Lost Knowledge: Open Science is One Solution to Hidden Data

The progress of science depends on how we preserve and share what we know.

By Kay Dickersin • Illustrations by Michael Glenwood

Some of the most valuable studies on clinical trials disappeared in the 1990s. Clinical trial findings and studies that transformed systematic review research, for example, effectively vanished.

In 1992, when I became an associate editor of the Online Journal of Current Clinical Trials—one of the first electronic peer-reviewed biomedical journals—I joined an effort to speed the sharing of knowledge among scientists and researchers. Just two years later, however, the journal was sold. It published a few more articles, and then it went dark. Although the journal’s article titles and abstracts were accessible through Medline, the full articles—many of which were groundbreaking—were lost to potential readers.

True knowledge is hard won. It takes time and resources to design and carry out a good clinical trial or systematic review of intervention effectiveness and safety. Then the knowledge gained in the study must be shared through reporting and publication. Each decision made along the way—about what (or whether) information is reported, where the study is reported, whether the report is accessible—creates an opportunity for knowledge to be lost.

We lose knowledge when researchers design studies without first knowing about previous research into the question. We lose knowledge when studies or parts of studies (including negative or null results) are not reported. Knowledge may be lost if we report research findings in a way that makes them hard to find, such as in languages other than English (which may be the only language the searcher reads well) or in journals not indexed by Medline. And even when we publish in Medline-indexed journals, knowledge can be lost if the full publications are not generally available.

If we are to accumulate, preserve and pass down knowledge, we cannot afford to lose any part of what we know. If knowledge is lost, the foundation for research is lost as well. Future investigators might spend time and resources reinventing the wheel. Patients might not benefit from potential therapies or might be harmed by ineffective or dangerous ones. Clinical practice guidelines may be built on shaky ground. The progression of science may be needlessly stalled.

There are a few things we can do to ensure that we continue science’s forward progress. First, we need to refocus the academic reward system. Researchers should be rewarded for research that is reproducible and reported completely and well. Academia’s existing system, however, rewards investigators for getting grants to support their research and for publishing the findings of that research, but not for the quality of the research reporting. Encouragingly, organizations such as the National Academy of Medicine have recommended steps to refocus academic rewards, for example, by assigning credit to investigators whose research data are made publicly available and are used by other investigators. Still, changing such an entrenched system will take time.

Second, we need open science. If investigators publicly register every trial they initiate, each trial’s design becomes “open” information. U.S. law requires registration on ClinicalTrials.gov for FDA-approved products. Investigators in academia in the U.S. who conduct clinical trials without industry support, however, have been slow to do this voluntarily. In addition, although the same U.S. law requires investigators of FDA-approved products to report the trial’s findings on ClinicalTrials.gov, investigators are not required to make available trial findings from products that are not FDA-approved. We know from research, for example, that adverse events associated with drug interventions are not consistently reported. We do not yet have, however, practical and actionable solutions to this very real problem, given the literally hundreds of possible harms that are identified in any one trial.

Fortunately, because the scientific community is increasingly concerned about the reproducibility of research, we have been moving for the last decade toward open science for all clinical trials. Questions remain about how to make this a reality: What is the best way to share trial data and metadata, and how can we require it worldwide? Who will pay for making all data and metadata available? Who will pay for using the information? Answers to these complex questions will require collaboration among disciplines and sectors.

Third, we need a way to help scientists organize and preserve, not just index, knowledge. The state of knowledge translation would improve, for example, if the U.S. had a czar of scientific knowledge. That office should provide national guidelines on what researchers and institutions preserve. In turn, institutions should employ archivists who create and implement local guidelines. New researchers should be required to meet with an archivist during the orientation process, as well as with a librarian. Librarians make excellent collaborators for those who need to learn about existing knowledge—and that should include all of us.

In fact, it was a librarian who helped us resurrect the lost OJCCT articles. When I realized those papers had been lost, I sought help from Johns Hopkins librarian Mariyam Thohira. Mariyam told me about Portico, a digital preservation service that tags, manages, updates and provides access to content. She brokered a deal with Portico to do this for the OJCCT, and today, we have acquired all 52 Medline-indexed articles and made them available online.

Sustaining the knowledge we have gained is hard. As stakeholders, we have to take knowledge seriously. We have to value it, preserve it and pass it down like our future depends on it.

Because it does.