A novel approach selected small sets of diagnosis codes with high prediction performance in large healthcare datasets.

Thomas E Cowling; David A Cromwell; Linda D Sharples; Jan van der Meulen (2020) A novel approach selected small sets of diagnosis codes with high prediction performance in large healthcare datasets. Journal of Clinical Epidemiology, 128. pp. 20-28. ISSN 0895-4356 DOI: 10.1016/j.jclinepi.2020.08.001

OBJECTIVES: The objective of the study was to examine an approach for selecting small sets of diagnosis codes with high prediction performance in large datasets of electronic medical records.

STUDY DESIGN AND SETTING: This was a modeling study using national hospital and mortality records for patients with myocardial infarction (n = 200,119), hip fracture (n = 169,646), or colorectal cancer surgery (n = 56,515) in England in 2015-2017. One-year mortality was predicted from ICD-10 codes recorded for at least 0.5% of patients using logistic regression ('full' models). An approximation method was used to select fewer codes that explained at least 95% of variation in full model predictions ('reduced' models).

RESULTS: One-year mortality was 17.2% (34,520) after myocardial infarction, 27.2% (46,115) after hip fracture, and 9.3% (5,273) after colorectal surgery. Full models included 202, 257, and 209 ICD-10 codes in these populations. C-statistics for these models were 0.884 (95% confidence interval (CI) 0.882, 0.886), 0.798 (0.795, 0.800), and 0.810 (0.804, 0.817). Reduced models included 18, 33, and 41 codes and had c-statistics of 0.874 (95% CI 0.872, 0.876), 0.791 (0.788, 0.793), and 0.807 (0.801, 0.813). Performance was also similar when measured using Brier scores. All models were well calibrated.

CONCLUSION: Our approach selected small sets of diagnosis codes that predicted patient outcomes comparably to large, comprehensive sets of codes.
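
The two selection steps named in the abstract (keep codes recorded for at least 0.5% of patients, then keep a smaller set that explains at least 95% of the variation in full model predictions) can be illustrated with a minimal Python sketch. The abstract does not say which approximation method was used, so the greedy forward selection on an ordinary least squares fit shown here is an assumption made for illustration, not the authors' procedure. It is also an assumption that "predictions" means the full model's linear predictor. All function names and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def frequent_codes(records: Sequence[Sequence[str]], min_prevalence: float = 0.005) -> List[str]:
    """Return codes recorded for at least `min_prevalence` of patients."""
    n = len(records)
    counts: Dict[str, int] = {}
    for codes in records:
        for code in set(codes):  # count each code once per patient
            counts[code] = counts.get(code, 0) + 1
    return sorted(c for c, k in counts.items() if k / n >= min_prevalence)


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design (collinear or empty code)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(columns: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """R^2 of an OLS fit (with intercept) of y on the given columns."""
    n = len(y)
    design = [[1.0] * n] + [[float(v) for v in col] for col in columns]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in design]
    beta = solve(xtx, xty)
    fitted = [sum(b * col[i] for b, col in zip(beta, design)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)  # assumes y is not constant
    return 1.0 - ss_res / ss_tot


def select_codes(indicators: Dict[str, Sequence[int]], full_lp: Sequence[float],
                 threshold: float = 0.95) -> Tuple[List[str], float]:
    """Greedily add the code that most raises R^2 until `threshold` is reached."""
    selected: List[str] = []
    r2 = 0.0
    remaining = sorted(indicators)
    while remaining and r2 < threshold:
        scores: Dict[str, float] = {}
        for code in remaining:
            cols = [indicators[c] for c in selected + [code]]
            try:
                scores[code] = r_squared(cols, full_lp)
            except ValueError:  # code adds no independent information
                pass
        if not scores:
            break
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
        r2 = scores[best]
    return selected, r2


# Toy data: 8 patients, 4 hypothetical codes; the "full model" linear
# predictor depends only on codes A and B (1 + 2*A + 0.5*B).
toy = {
    "A": [0, 0, 0, 0, 1, 1, 1, 1],
    "B": [0, 0, 1, 1, 0, 0, 1, 1],
    "C": [0, 1, 0, 1, 0, 1, 0, 1],
    "D": [1, 0, 0, 1, 0, 1, 1, 0],
}
toy_lp = [1 + 2 * a + 0.5 * b for a, b in zip(toy["A"], toy["B"])]
chosen, r2 = select_codes(toy, toy_lp)
```

In the toy data, code A alone explains about 94% of the variation in the linear predictor, so the loop adds a second code before stopping. With real data, the reduced model would then be refitted or evaluated on the selected codes only, and its c-statistic and Brier score compared with those of the full model.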


Cowling-etal-2020-A_novel_approach_selected_small_sets.pdf
Accepted Version
Available under Creative Commons: NC-ND 3.0
