Using computer processing to make data from medical records easier to use in research
An electronic health record (EHR) is a digital medical chart that contains health data such as doctor visits, lab results, and other information. These data are useful for understanding health trends, including how Long COVID affects people. However, different healthcare settings record this information in different ways. Because of this, EHR data from different settings can be difficult to compare and use in research.
Researchers from the National COVID Cohort Collaborative (N3C) looked at over 15 million EHRs from 75 hospitals and clinics. Their goal was to make data from different healthcare settings more compatible. To do this, they first had to understand and describe how definitions of patient visits differ between healthcare settings.
The researchers focused on identifying patterns in EHR data. They hoped these patterns would help them gain a better understanding of a patient’s complete care experience, including:
- How long the patient received care
- The number and types of treatments and medical procedures the patient received
- The order in which the patient received these treatments and procedures
To detect these patterns, researchers created two sets of rules for computer processing of EHR data. These sets of rules are called algorithms. The first algorithm allowed researchers to group EHR data in new ways. This grouping makes it easier to understand how the information in an EHR is related, to analyze individual EHRs, and to compare EHRs from different sources.
The second algorithm allowed researchers to identify when EHRs indicated that a patient had been admitted to the hospital. Better data about hospitalizations will help future researchers study COVID and its long-term effects, including Long COVID.
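For readers who want a concrete picture, the two ideas above can be sketched in a few lines of Python. This is only an illustration and not the N3C researchers' actual algorithms. The record layout (`Visit`), the visit type labels, the rule that overlapping visits belong to one care episode, and the rule that an episode containing an "inpatient" visit counts as a hospital admission are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Visit:
    # Hypothetical record layout; real EHR sources differ.
    patient_id: str
    start: date
    end: date
    visit_type: str  # assumed labels such as "inpatient", "outpatient", "lab"


def group_visits(visits):
    """Merge each patient's overlapping visit records into care episodes."""
    episodes = []
    for v in sorted(visits, key=lambda v: (v.patient_id, v.start)):
        last = episodes[-1] if episodes else None
        # Assumed grouping rule: same patient and the dates overlap.
        if last and last["patient_id"] == v.patient_id and v.start <= last["end"]:
            last["end"] = max(last["end"], v.end)
            last["visit_types"].add(v.visit_type)
        else:
            episodes.append({
                "patient_id": v.patient_id,
                "start": v.start,
                "end": v.end,
                "visit_types": {v.visit_type},
            })
    return episodes


def looks_like_hospitalization(episode):
    """Assumed rule for illustration: any inpatient visit marks an admission."""
    return "inpatient" in episode["visit_types"]


# Made-up example records, not real patient data.
sample_visits = [
    Visit("A", date(2021, 1, 1), date(2021, 1, 5), "inpatient"),
    Visit("A", date(2021, 1, 3), date(2021, 1, 3), "lab"),
    Visit("A", date(2021, 1, 20), date(2021, 1, 20), "outpatient"),
    Visit("B", date(2021, 1, 2), date(2021, 1, 2), "outpatient"),
]
sample_episodes = group_visits(sample_visits)
sample_flags = [looks_like_hospitalization(e) for e in sample_episodes]
```

In this made-up example, patient A's hospital stay and the lab test done during it are combined into one episode, while the later clinic visit stays separate. Grouping first and flagging second means the admission rule is applied to a patient's whole care experience instead of to scattered individual records.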
By using algorithms, N3C researchers are trying to make large amounts of EHR data more consistent, manageable, and understandable. Algorithms like the two tested in this study can help other researchers improve the quality of EHR data so that it can produce important insights about conditions like Long COVID.