Chest X-ray Analysis With Deep Learning-Based Software as a Triage Test for Pulmonary Tuberculosis: An Individual Patient Data Meta-Analysis of Diagnostic Accuracy.

Gamuchirai Tavaziva; Miriam Harris; Syed K Abidi; Coralie Geric; Marianne Breuninger; Keertan Dheda; Aliasgar Esmail; Monde Muyoyeta; Klaus Reither; Arman Majidulla; Aamir J Khan; Jonathon R Campbell; Pierre-Marie David; Claudia Denkinger; Cecily Miller; Ruvandhi Nathavitharana; Madhukar Pai; Andrea Benedetti; Faiz Ahmad Khan (2021) Chest X-ray Analysis With Deep Learning-Based Software as a Triage Test for Pulmonary Tuberculosis: An Individual Patient Data Meta-Analysis of Diagnostic Accuracy. Clinical infectious diseases : an official publication of the Infectious Diseases Society of America, 74 (8). pp. 1390-1400. ISSN 1058-4838 DOI: 10.1093/cid/ciab639

BACKGROUND: Automated radiologic analysis using computer-aided detection software (CAD) could facilitate chest X-ray (CXR) use in tuberculosis diagnosis. There is little to no evidence on the accuracy of commercially available deep learning-based CAD in different populations, including patients with smear-negative tuberculosis and people living with human immunodeficiency virus (HIV; PLWH).
METHODS: We collected CXRs and individual patient data (IPD) from studies evaluating CAD in patients self-referring for tuberculosis symptoms, with culture or nucleic acid amplification testing as the reference. We reanalyzed CXRs with three CAD programs (CAD4TB version (v) 6, Lunit v3.1.0.0, and qXR v2). We estimated sensitivity and specificity within each study and pooled them using IPD meta-analysis. We used multivariable meta-regression to identify characteristics modifying accuracy.
RESULTS: We included CXRs and IPD of 3727/3967 participants from 4/7 eligible studies. 17% (621/3727) were PLWH. 17% (645/3727) had microbiologically confirmed tuberculosis. Despite using the same threshold score for classifying CXRs in every study, sensitivity and specificity varied from study to study. The software had similar unadjusted accuracy (at 90% pooled sensitivity, pooled specificities were: CAD4TBv6, 56.9% [95% confidence interval {CI}: 51.7-61.9]; Lunit, 54.1% [95% CI: 44.6-63.3]; qXRv2, 60.5% [95% CI: 51.7-68.6]). Adjusted absolute differences in pooled sensitivity between PLWH and HIV-uninfected participants were: CAD4TBv6, -13.4% [-21.1, -6.9]; Lunit, +2.2% [-3.6, +6.3]; qXRv2, -13.4% [-21.5, -6.6]; between smear-negative and smear-positive tuberculosis, they were: CAD4TBv6, -12.3% [-19.5, -6.1]; Lunit, -17.2% [-24.6, -10.5]; qXRv2, -16.6% [-24.4, -9.9]. Accuracy was similar to that of human readers.
CONCLUSIONS: For CAD CXR analysis to be implemented as a high-sensitivity tuberculosis rule-out test, users will need threshold scores identified from their own patient populations and stratified by HIV and smear status.
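
The idea of identifying a threshold score from a local population can be illustrated with a minimal sketch. It is not the pooled IPD meta-analysis used in the study. It only shows, for a single site, how one might pick the highest CAD score cut-off that still reaches a target sensitivity (90% here) and then read off the resulting specificity, separately for each stratum. The function names, the record layout (stratum, score, reference label), and the rule "score >= threshold means CAD-positive" are assumptions made for illustration, and the toy numbers are synthetic.

```python
from collections import defaultdict
from math import ceil


def threshold_for_sensitivity(scores, labels, target=0.90):
    """Highest cut-off whose sensitivity is at least `target`.

    Assumes a CXR is called CAD-positive when score >= cut-off and that
    label 1 means reference-standard (culture/NAAT) positive.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("no reference-positive cases in this group")
    # Number of true cases that must be flagged; the small epsilon guards
    # against floating-point round-up in target * n.
    k = max(1, ceil(target * len(positives) - 1e-9))
    return positives[k - 1]


def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity at a given cut-off."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def thresholds_by_stratum(records, target=0.90):
    """records: iterable of (stratum, score, label); one cut-off per stratum."""
    groups = defaultdict(lambda: ([], []))
    for stratum, score, label in records:
        groups[stratum][0].append(score)
        groups[stratum][1].append(label)
    return {
        stratum: threshold_for_sensitivity(scores, labels, target)
        for stratum, (scores, labels) in groups.items()
    }


# Synthetic toy data (not from the study): 10 reference-positive and
# 10 reference-negative participants with made-up CAD scores.
toy_scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60, 0.55, 0.20,
              0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.15, 0.25, 0.35]
toy_labels = [1] * 10 + [0] * 10

toy_threshold = threshold_for_sensitivity(toy_scores, toy_labels, target=0.90)
toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_labels, toy_threshold)
```

In practice the cut-off would be chosen on one sample and its sensitivity and specificity checked on a separate one, since a threshold tuned and evaluated on the same CXRs tends to look better than it is.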