Screening mammography performance according to breast density: a comparison between radiologists versus standalone intelligence detection

Table 4 Performance of screening mammography compared between radiologists and standalone AI according to AI-based breast density

Outcome	Radiologists (BI-RADS category 0, 3, 4, 5), estimate	Radiologists, 95% CI	Standalone AI (cutoff 10%), estimate	Standalone AI, 95% CI	P value
Non-dense (A or B)
CDR, per 1000 examinations	1.2	0.7–1.8	1.0	0.7–1.7	0.317
Sensitivity, %	83.3	62.6–95.3	75.0	53.3–90.2	0.317
Specificity, %	78.8	78.1–79.4	96.3	96.0–96.6	< 0.001
PPV, %	0.5	0.3–0.8	2.8	1.6–4.3	< 0.001
Recall rate, %	21.3	20.7–22.0	3.8	3.5–4.1	< 0.001
AUC	0.81	0.73–0.89	0.86	0.77–0.95	0.268
Heterogeneously dense (C)
CDR, per 1000 examinations	1.1	0.9–1.4	1.0	0.8–1.3	0.257
Sensitivity, %	69.2	58.7–78.5	62.6	51.9–72.6	0.257
Specificity, %	77.1	76.7–77.4	93.2	92.9–93.4	< 0.001
PPV, %	0.5	0.4–0.6	1.5	1.1–1.9	< 0.001
Recall rate, %	23.0	22.7–23.4	6.9	6.7–7.2	< 0.001
AUC	0.73	0.68–0.78	0.78	0.73–0.83	0.103
Extremely dense (D)
CDR, per 1000 examinations	1.0	0.6–1.6	1.3	0.8–1.9	0.103
Sensitivity, %	60.7	40.6–78.5	75.0	55.1–89.3	0.103
Specificity, %	78.3	77.7–78.9	89.2	88.8–89.7	< 0.001
PPV, %	0.5	0.3–0.8	1.2	0.7–1.8	0.004
Recall rate, %	21.8	21.1–22.4	10.9	10.4–11.3	< 0.001
AUC	0.70	0.60–0.79	0.82	0.74–0.90	0.003

AI, artificial intelligence; AUC, area under the receiver operating characteristic curve; BI-RADS, Breast Imaging Reporting and Data System; CDR, cancer detection rate; CI, confidence interval; PPV, positive predictive value
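
The outcomes in Table 4 can be related to one another through a 2×2 confusion matrix. The sketch below assumes the conventional screening definitions, which the table itself does not spell out: a positive examination is one recalled by the reader (BI-RADS 0, 3, 4, or 5) or scored at or above the AI cutoff, CDR is true positives per 1000 examinations, and recall rate is all positives over all examinations. The `Counts` class, the `screening_metrics` function, and the example counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """2x2 confusion-matrix counts for one reader and one density stratum."""
    tp: int  # cancers with a positive examination
    fp: int  # non-cancers with a positive examination
    fn: int  # cancers with a negative examination
    tn: int  # non-cancers with a negative examination


def screening_metrics(c: Counts) -> dict[str, float]:
    """Compute the Table 4 style outcomes under the standard definitions."""
    n = c.tp + c.fp + c.fn + c.tn  # total examinations
    return {
        "cdr_per_1000": 1000 * c.tp / n,
        "sensitivity_pct": 100 * c.tp / (c.tp + c.fn),
        "specificity_pct": 100 * c.tn / (c.tn + c.fp),
        "ppv_pct": 100 * c.tp / (c.tp + c.fp),
        "recall_rate_pct": 100 * (c.tp + c.fp) / n,
    }


# Hypothetical counts chosen for easy arithmetic (NOT study data):
# 10,000 examinations, 10 cancers, 200 positive examinations.
example = Counts(tp=8, fp=192, fn=2, tn=9798)
metrics = screening_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.1f}")
```

Under these definitions CDR and sensitivity share the same numerator (true positives), which is consistent with the identical P values reported for the two outcomes within each density category. Because cancers are rare in a screening population, the recall rate is also close to 100% minus specificity, as the table shows.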

ISSN: 1465-542X