Fig. 4 | Breast Cancer Research

Fig. 4

From: A knowledge-based framework for the discovery of cancer-predisposing variants using large-scale sequencing breast cancer data

Polygenic age-dependent model breakdown. a Feature ranking of the random forest model according to the mean decrease in Gini index. The most important variable, shown at the top, is the deleteriousness score. b ROC curve of the random forest model on the training data. The model reaches an AUC of 0.84 when trained on a dataset of variants reported as pathogenic or non-pathogenic in the ClinVar and Humsavar databases. c Distribution of the deleteriousness score among variants. The top predictor in our random forest model is shown in isolation, without the influence of the other predictors. Although this single variable cannot represent the actual tree structure of the model, there is a clear positive trend between increasing deleteriousness score (DS, x-axis) and the number of trees classifying a variant as pathogenic (RF score, y-axis).
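The three quantities named in the caption (Gini decrease, AUC, and RF score) can be illustrated with a minimal sketch in plain Python. It is not the authors' pipeline: the scores, labels, threshold, and function names below are all hypothetical, and the Gini decrease is computed for a single split only, whereas a random forest aggregates such decreases over every split on a feature across all trees.

```python
def gini(labels):
    """Gini impurity of binary labels (1 = pathogenic, 0 = non-pathogenic)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def gini_decrease(scores, labels, threshold):
    """Impurity decrease from splitting variants at one score threshold."""
    left = [y for s, y in zip(scores, labels) if s <= threshold]
    right = [y for s, y in zip(scores, labels) if s > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


def roc_auc(scores, labels):
    """AUC as the probability that a pathogenic variant outscores a
    non-pathogenic one (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def rf_score(votes):
    """Number of trees voting 'pathogenic' (1) for one variant."""
    return sum(votes)


# Hypothetical toy data: deleteriousness scores and known labels.
ds = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

print("Gini decrease at DS = 0.5:", gini_decrease(ds, labels, 0.5))
print("AUC of DS alone:", roc_auc(ds, labels))
print("RF score for votes [1, 1, 0, 1]:", rf_score([1, 1, 0, 1]))
```

A feature that separates the two classes well yields a larger impurity decrease, which is why it ranks higher in panel a. The AUC here is computed from the single score alone, so it is not comparable to the 0.84 reported for the full model.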
