Table 5. Top five important risk factors, in descending order of importance, for different ML algorithms, based on the US population-based training samples in 10-fold internal statistical cross-validations

From: Machine learning techniques for personalized breast cancer risk prediction: comparison with the BCRAT and BOADICEA models

| Rank | ML: random forest | ML: logistic regression | ML: adapt boosting | ML: linear model | ML: K-nearest neighbors | ML: linear discriminant | ML: quadratic discriminant | ML: MCMC GLMM |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Number of biopsies | Number of first-degree relatives with breast cancer | Number of biopsies | Age | Number of biopsies | Age | Number of first-degree relatives with breast cancer | Number of biopsies |
| 2 | Age | Age | Age | Number of biopsies | Number of first-degree relatives with breast cancer | Number of biopsies | Number of biopsies | Age |
| 3 | Number of first-degree relatives with breast cancer | Number of biopsies | Number of first-degree relatives with breast cancer | Number of first-degree relatives with breast cancer | Age | Ethnicity | Age | Number of first-degree relatives with breast cancer |
| 4 | Age at menarche | Ethnicity | Age at menarche | Age at menarche | Ethnicity | Number of first-degree relatives with breast cancer | Ethnicity | Age at first live birth |
| 5 | Ethnicity | Age at first live birth | Ethnicity | Age at first live birth | Age at first live birth | Age at first live birth | Age at menarche | Age at menarche |