PLOS Digital Health, 6(1), p. e0000067, 2022
DOI: 10.1371/journal.pdig.0000067
Computer-aided detection (CAD) was recently recommended by the WHO for TB screening and triage on the basis of several evaluations, but unlike traditional diagnostic tests, CAD software is updated frequently and requires continual evaluation. Since that recommendation, newer versions of two of the evaluated products have already been released. We used a case-control sample of 12,890 chest X-rays to compare performance and model the programmatic effect of upgrading to newer versions of CAD4TB and qXR. We compared the area under the receiver operating characteristic curve (AUC) overall and with data stratified by age, TB history, gender, and patient source. All versions were compared against radiologist readings and the WHO's Target Product Profile (TPP) for a TB triage test. Both newer versions significantly outperformed their predecessors in terms of AUC: CAD4TB version 6 (0.823 [0.816–0.830]) versus version 7 (0.903 [0.897–0.908]), and qXR version 2 (0.872 [0.866–0.878]) versus version 3 (0.906 [0.901–0.911]). The newer versions met the WHO TPP values; the older versions did not. All products equalled or surpassed human radiologist performance, and the newer versions showed improved triage ability. Both human readers and CAD performed worse in older age groups and among people with a history of TB. New versions of CAD outperform their predecessors. Before implementation, CAD should be evaluated using local data because the underlying neural networks can differ significantly. An independent rapid evaluation centre is needed to provide implementers with performance data on new versions of CAD products as they are developed.
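As an illustration of the kind of comparison summarised above, the following is a minimal sketch of how the AUC of two software versions scored on the same images could be compared. It is not the authors' analysis code: the abstract does not say how the bracketed intervals or the significance of the differences were computed, so the paired, stratified bootstrap with a 95% percentile interval used here is an assumption, as are all function names and the synthetic scores (no study data are used).

```python
import random


def auc(pos_scores, neg_scores):
    """AUC via the Mann-Whitney rank statistic; tied scores get midranks."""
    labelled = sorted(
        [(s, 1) for s in pos_scores] + [(s, 0) for s in neg_scores],
        key=lambda t: t[0],
    )
    n = len(labelled)
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the block of tied scores starting at index i.
        j = i
        while j < n and labelled[j][0] == labelled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += midrank * sum(label for _, label in labelled[i:j])
        i = j
    n_pos, n_neg = len(pos_scores), len(neg_scores)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def bootstrap_auc_difference(labels, old_scores, new_scores, n_boot=500, seed=0):
    """Percentile 95% interval for AUC(new) - AUC(old) on the same images.

    Cases and controls are resampled separately (a stratified bootstrap),
    which keeps the case-control ratio fixed in every replicate.
    """
    rng = random.Random(seed)
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    diffs = []
    for _ in range(n_boot):
        p = rng.choices(pos_idx, k=len(pos_idx))
        q = rng.choices(neg_idx, k=len(neg_idx))
        auc_new = auc([new_scores[i] for i in p], [new_scores[i] for i in q])
        auc_old = auc([old_scores[i] for i in p], [old_scores[i] for i in q])
        diffs.append(auc_new - auc_old)
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


# Synthetic example only: 300 cases and 300 controls, with the hypothetical
# "new" version separating the two groups more strongly than the "old" one.
data_rng = random.Random(42)
labels = [1] * 300 + [0] * 300
old_scores = [data_rng.gauss(1.0 * y, 1.0) for y in labels]
new_scores = [data_rng.gauss(2.0 * y, 1.0) for y in labels]

ci_low, ci_high = bootstrap_auc_difference(labels, old_scores, new_scores)
print(f"AUC difference (new - old), 95% bootstrap interval: [{ci_low:.3f}, {ci_high:.3f}]")
```

An interval that excludes zero would suggest that the newer version discriminates better on that particular sample. A local evaluation of the kind the abstract recommends could apply the same procedure to locally collected scores and reference results.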