Recent widespread adoption of electronic health records (EHRs) is revolutionizing healthcare. Larger volumes of electronic healthcare data promote the discovery of new evidence and the delivery of evidence-based healthcare interventions. EHR data-driven algorithms continue to evolve, providing clinical insights that help healthcare professionals identify patients who may benefit from certain healthcare services. However, EHR algorithms that rely on patient data can only assist individuals for whom the required data are available. Populations who experience data poverty (those who have disproportionately incomplete and inaccurate EHR data) are overlooked, exacerbating disparities in healthcare outcomes, particularly for medically underserved and marginalized populations.[1]
Individualized cancer risk evaluation is one area that could benefit from population-based approaches driven by algorithms over EHR data. Early identification of individuals at higher inherited risk for developing cancer is critical for personalizing cancer prevention and for reducing disparities in morbidity and mortality, particularly among individuals from historically marginalized groups.[2] For example, the US Preventive Services Task Force recommends that genetic testing be incorporated into risk assessment of patients with a personal or family history of breast or ovarian cancer in order to identify individuals whose cancer risk warrants increased screening or risk-reducing surgery.[3] Similarly, the US Multi-Society Task Force on Colorectal Cancer recommends including family history in tailoring colorectal cancer screening.[4] Estimates based on family history indicate that the prevalence of individuals with familial risk is 13% for breast cancer and 5% for colorectal cancer.[5] However, despite increased availability and lower cost of genetic testing, the majority of individuals meeting evidence-based criteria for genetic testing of hereditary syndromes have not received genetic services.[6], [7]
The Genetic Cancer Risk Detector (GARDE) platform is an EHR innovation that uses algorithms to identify patient populations that meet criteria set by National Comprehensive Cancer Network (NCCN) guidelines for genetic testing of hereditary cancer syndromes using patients' family health history from the EHR.[8], [9] GARDE has been used to support the Broadening the Reach, Impact, and Delivery of Genetic Services (BRIDGE) trial. BRIDGE is a randomized controlled trial with 3,073 patients who receive primary care at the University of Utah Health (UHealth) and New York University (NYU) Langone Health. The trial compared two models of patient outreach and education (enhanced standard of care versus automated chatbot) to offer eligible patients access to genetic testing for hereditary breast, ovarian, and colorectal cancer syndromes.[10] In a recent study that analyzed family history data extracted from UHealth and NYU, our group discovered substantial disparities across sex, race, ethnicity, and preferred language in the availability and completeness of family health history documentation and consequently in the identification of NCCN-eligible patients at both organizations.[11]
Given the effect of information presence bias discovered in family history data and GARDE’s dependence on structured family history data, we formulated two methods to mitigate missing data with the goal of reducing the discovered disparities: 1) extracting family history attributes, such as age of disease onset, using natural language processing (NLP) over family history comment fields; and 2) relaxing algorithm criteria to identify individuals who partially match the criteria. The objective of this study was to investigate these new methods by comparing 1) identification rates of eligible patients; and 2) demographic differences in identification according to sex, race, ethnicity, and preferred language.
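To make the two mitigation methods concrete, the following minimal Python sketch illustrates the general idea. It is not the GARDE implementation: the regular expressions, the `Relative` record, the `meets_criterion` function, and the age cutoff of 45 are all hypothetical and chosen only for illustration, and they do not reproduce any actual NCCN criterion. Simple pattern matching stands in here for whatever NLP technique is applied in practice.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical patterns for illustration; real comment fields are far more varied.
AGE_PATTERNS = [
    re.compile(r"\b(?:dx|diagnosed|onset)\b[^0-9]{0,20}(\d{1,3})\b", re.IGNORECASE),
    re.compile(r"\bat\s+(?:age\s+)?(\d{1,3})\b", re.IGNORECASE),
    re.compile(r"\bage\s+(\d{1,3})\b", re.IGNORECASE),
]


def extract_onset_age(comment: str) -> Optional[int]:
    """Return an age of onset mentioned in a free-text comment, if one is found."""
    for pattern in AGE_PATTERNS:
        match = pattern.search(comment)
        if match:
            age = int(match.group(1))
            if 0 < age <= 120:  # discard implausible values
                return age
    return None


@dataclass
class Relative:
    degree: int                       # 1 = first-degree relative
    condition: str                    # e.g., "breast cancer"
    onset_age: Optional[int] = None   # structured field, often left empty
    comment: str = ""                 # free-text family history comment


ONSET_CUTOFF = 45  # illustrative threshold only, not an actual guideline value


def meets_criterion(rel: Relative, use_nlp: bool = False, relaxed: bool = False) -> bool:
    """Check one simplified, hypothetical family history criterion."""
    if rel.degree != 1 or rel.condition != "breast cancer":
        return False
    age = rel.onset_age
    if age is None and use_nlp:
        # Method 1: recover the missing attribute from the comment field.
        age = extract_onset_age(rel.comment)
    if age is None:
        # Method 2: when relaxed, accept a partial match with unknown age of onset.
        return relaxed
    return age <= ONSET_CUTOFF


mother = Relative(degree=1, condition="breast cancer", comment="dx at 42")
print(meets_criterion(mother))                # structured data only
print(meets_criterion(mother, use_nlp=True))  # with comment extraction
```

In this sketch, a relative whose age of onset appears only in the comment field is missed by the structured-data check and recovered once comment extraction is enabled, while the relaxed mode also accepts relatives for whom no age of onset is documented anywhere. Comparing the patients identified under each mode across demographic groups corresponds to the comparison described above.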
Problem. Computer algorithms over EHR data are promising approaches to identify patients who may benefit from certain healthcare services, such as genetic testing, but have the potential to exacerbate health disparities.
What is already known. A previous study has shown significant disparities in family health history documentation in the EHR in terms of sex, race, ethnicity, and language preference.[11]
What this paper adds. This study investigated EHR algorithms to help address demographic differences in the identification of patients meeting family history-based criteria for genetic testing of hereditary cancer syndromes. The study provides a method that could be used as a part of EHR algorithm development to deliberately assess potential algorithm disparities.