Screening mammography performance according to breast density: a comparison between radiologists versus standalone intelligence detection

Table 3 Performance of screening mammography compared between radiologists and standalone AI by BI-RADS breast density category

Outcome	Radiologists (BI-RADS category 0, 3, 4, 5): Estimate	Radiologists: 95% CI	Standalone AI (cutoff 10%): Estimate	Standalone AI: 95% CI	P value
Non-dense
CDR, per 1000 examinations	1.2	0.7–2.0	1.2	0.7–2.0	1.000
Sensitivity, %	77.8	52.4–93.6	77.8	52.4–93.6	1.000
Specificity, %	86.5	85.9–87.1	96.1	95.8–96.5	< 0.001
PPV, %	0.9	0.5–1.5	3.0	1.7–5.1	< 0.001
Recall rate, %	13.6	13.0–14.2	4.0	3.6–4.4	< 0.001
AUC	0.82	0.72–0.92	0.87	0.77–0.97	0.234
Heterogeneously dense
CDR, per 1000 examinations	1.2	0.9–1.6	1.0	0.8–1.4	0.059
Sensitivity, %	75.8	63.6–85.5	63.6	50.9–75.1	0.059
Specificity, %	77.9	77.5–78.3	93.6	93.4–93.8	< 0.001
PPV, %	0.6	0.4–0.7	1.6	1.1–2.1	< 0.001
Recall rate, %	22.2	21.8–22.6	6.5	6.3–6.7	< 0.001
AUC	0.77	0.72–0.82	0.79	0.73–0.85	0.575
Extremely dense
CDR, per 1000 examinations	1.0	0.7–1.3	1.1	1.1–1.5	0.346
Sensitivity, %	61.0	47.4–73.5	67.8	54.4–79.4	0.346
Specificity, %	74.5	74.1–75.0	91.5	91.2–91.7	< 0.001
PPV, %	0.4	0.3–0.5	1.2	0.9–1.7	< 0.001
Recall rate, %	25.5	25.1–26.0	8.6	8.4–8.9	< 0.001
AUC	0.68	0.62–0.74	0.80	0.74–0.86	0.297

AI, artificial intelligence; AUC, area under the receiver operating characteristic curve; BI-RADS, Breast Imaging Reporting and Data System; CDR, cancer detection rate; CI, confidence interval; PPV, positive predictive value
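
The measures reported in Table 3 other than AUC can be derived from a 2x2 cross-tabulation of test result against cancer status. The following is a minimal sketch under the conventional definitions, which this excerpt does not spell out and which may differ in detail from the study's own. The function name `screening_metrics` and the counts passed to it are hypothetical and are not taken from the study data.

```python
def screening_metrics(tp, fp, fn, tn):
    """Compute screening performance measures from 2x2 counts.

    tp, fp, fn, tn are the numbers of true-positive, false-positive,
    false-negative, and true-negative examinations. Conventional
    definitions are assumed; the study's exact definitions may differ.
    """
    n = tp + fp + fn + tn   # all screening examinations
    recalled = tp + fp      # examinations called positive
    return {
        "cdr_per_1000": 1000 * tp / n,            # cancers detected per 1000 exams
        "sensitivity_pct": 100 * tp / (tp + fn),  # detected cancers / all cancers
        "specificity_pct": 100 * tn / (tn + fp),  # negatives / all non-cancers
        "ppv_pct": 100 * tp / recalled,           # cancers / positive calls
        "recall_rate_pct": 100 * recalled / n,    # positive calls / all exams
    }


# Hypothetical counts for illustration only (not from Table 3).
example = screening_metrics(tp=8, fp=90, fn=2, tn=900)
for name, value in example.items():
    print(f"{name}: {value:.1f}")
```

Under these definitions, CDR and sensitivity share the same numerator (detected cancers), and within a density category both readers are scored on the same examinations and the same cancers. A paired comparison of the two readers would therefore be expected to give the same P value for both measures, which is consistent with the matching P values in the CDR and sensitivity rows of each category.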

ISSN: 1465-542X