
Table 3 Performance (area under the ROC curve, AU-ROC, with standard deviation in parentheses) of BCRAT and ML algorithms in predicting breast cancer lifetime risk from simulated datasets (n = 1200) and the US population-based sample (n = 1143)

From: Machine learning techniques for personalized breast cancer risk prediction: comparison with the BCRAT and BOADICEA models

| Dataset | BCRAT | ML: random forest | ML: logistic regression | ML: adaptive boosting | ML: linear model | ML: K-nearest neighbors | ML: linear discriminant | ML: quadratic discriminant | ML: MCMC GLMM |
|---|---|---|---|---|---|---|---|---|---|
| A. Sim_no_signal | 0.5333 | 0.5016 (0.0231) | 0.5133 (0.0271) | 0.5067 (0.0307) | 0.5015 (0.0220) | 0.5054 (0.0211) | 0.5158 (0.0276) | 0.5133 (0.0323) | 0.5090 (0.0210) |
| B. Sim_artificial_signal | 0.5261 | 0.9308 (0.0171) | 0.9417 (0.0103) | 0.9292 (0.0095) | 0.7859 (0.0197) | 0.9125 (0.0109) | 0.9312 (0.0154) | 0.9188 (0.0111) | 0.9329 (0.0087) |
| C. Sim_artificial_signal + 20% missing | 0.5068 | 0.9275 (0.0179) | 0.9217 (0.0259) | 0.9258 (0.0113) | 0.7807 (0.0227) | 0.9012 (0.0120) | 0.9213 (0.0202) | 0.9104 (0.0237) | 0.9191 (0.0210) |
| D. Sim_artificial_signal + 20% missing + imputation | 0.5035 | 0.9167 (0.0184) | 0.9300 (0.0111) | 0.9213 (0.0119) | 0.7824 (0.0200) | 0.9058 (0.0117) | 0.9275 (0.0148) | 0.9121 (0.0081) | 0.9232 (0.0099) |
| US population-based sample | 0.6240 | 0.8889 (0.0201) | 0.7192 (0.0314) | 0.8828 (0.0229) | 0.6813 (0.0378) | 0.8089 (0.0217) | 0.8692 (0.0284) | 0.8675 (0.0241) | 0.8234 (0.0189) |
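
As a reading aid for the "mean (standard deviation)" entries, the following is a minimal sketch of how an AU-ROC value and its spread across repeated evaluations could be computed. The table does not state how the standard deviations were obtained, so the repeated-evaluation setup is an assumption; the function names `auroc` and `summarize` and all numbers in the usage lines are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """AU-ROC via its rank interpretation: the probability that a randomly
    chosen case (label 1) receives a higher score than a randomly chosen
    non-case (label 0), with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def summarize(aucs):
    """Format per-run AU-ROC values as 'mean (sample SD)' with four decimals."""
    return f"{mean(aucs):.4f} ({stdev(aucs):.4f})"


# Hypothetical toy data, for illustration only.
single_run = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 3 of 4 pairs ranked correctly
report = summarize([0.90, 0.92, 0.94])  # assumed AU-ROC values from three repeated runs
```

An AU-ROC of 0.5 corresponds to random ranking, which is the level seen in the no-signal row, while values near 1 indicate that cases are almost always scored above non-cases.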