- Letter
- Open access
Letter to the editor: a response to Ming’s study on machine learning techniques for personalized breast cancer risk prediction
Breast Cancer Research volume 22, Article number: 17 (2020)
A recent paper [1] compared two well-known breast cancer risk prediction models (BCRAT and BOADICEA) with eight different machine learning (ML) methods. The authors found a striking improvement in cancer prediction with ML. While their comparative assessment against more classical approaches is timely, we are skeptical about the results presented.
A recent review of ML methods in clinical epidemiology shows that the apparent benefits of ML tend to arise in biased comparisons [2]. In the analyses of Ming et al., two aspects are problematic: the ML methods were not designed for survival data, and the validation process was unfair. Regarding the first aspect, the ML methods used predict a binary outcome (having the disease or not), whereas BOADICEA and BCRAT compute the probability of developing breast cancer over time.
Regarding the second aspect, a fair comparison of the validity of the models would require data on unaffected women with prospective follow-up, with like-for-like risk predictions (over the same time period) for all methods. The comparisons in [1] were based on retrospective data from families of unaffected and affected individuals, and in the context of BCRAT and BOADICEA it is unclear what the observed and predicted events are. Furthermore, the study assessed the external validity of the existing models, whereas the ML methods were fitted and evaluated on the same samples using a tenfold cross-validation procedure, which amounts only to internal validation [3]. Internal validation often gives overoptimistic results compared with external validation [4].
Although the authors indicate the most important risk predictors in the ML approaches, the final models are not provided. A fair comparison would require evaluating the final ML models against the existing models in external, prospective datasets. Moreover, discrimination is only one measure of model performance: good calibration and an assessment of clinical utility are also crucial [5]. Finally, Ming et al. did not state which version of BOADICEA was used for the comparison with the ML methods. In conclusion, the practical relevance of ML methods in this specific context needs further investigation based on more rigorous methodology.
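The point that discrimination is only one measure of performance can be illustrated with a minimal sketch. It is not taken from [1] or from any of the models discussed: the cohort is synthetic, and the function names (`auc`, `observed_expected_ratio`) and the two "models" are hypothetical. Model B doubles every risk predicted by model A, so the ranking of individuals, and hence the discrimination, is unchanged, while the predicted number of events is far too high.

```python
import random


def auc(y_true, y_prob):
    """C-statistic: probability that a randomly chosen case has a higher
    predicted risk than a randomly chosen non-case (ties count as 1/2)."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    concordant = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                concordant += 1.0
            elif p_case == p_control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def observed_expected_ratio(y_true, y_prob):
    """Calibration-in-the-large: observed events divided by expected events
    (the sum of predicted risks). A value near 1 indicates good calibration."""
    return sum(y_true) / sum(y_prob)


# Hypothetical synthetic cohort (assumption: not real patient data).
rng = random.Random(0)
true_risk = [rng.uniform(0.01, 0.30) for _ in range(2000)]
outcome = [1 if rng.random() < r else 0 for r in true_risk]

model_a = true_risk                       # predicts the true risk
model_b = [2.0 * r for r in true_risk]    # same ranking, risks doubled

auc_a, auc_b = auc(outcome, model_a), auc(outcome, model_b)
oe_a = observed_expected_ratio(outcome, model_a)
oe_b = observed_expected_ratio(outcome, model_b)

print(f"Model A: AUC={auc_a:.3f}, O/E={oe_a:.2f}")
print(f"Model B: AUC={auc_b:.3f}, O/E={oe_b:.2f}")
```

By construction the two models have the same AUC, while the observed/expected ratio of model B is about half that of model A. A comparison based on discrimination alone would therefore not distinguish them.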
Availability of data and materials
Not applicable
References
1. Ming C, Viassolo V, Probst-Hensch N, Chappuis PO, Dinov ID, Katapodi MC. Machine learning techniques for personalized breast cancer risk prediction: comparison with the BCRAT and BOADICEA models. Breast Cancer Res. 2019;21(1):75.
2. Christodoulou E, Ma J, Collins GS, Steyerberg EW, Verbakel JY, Van Calster B. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019;110:12–22.
3. Steyerberg EW, Harrell FE Jr. Prediction models need appropriate internal, internal-external, and external validation. J Clin Epidemiol. 2016;69:245–7.
4. Siontis GC, Tzoulaki I, Castaldi PJ, Ioannidis JP. External validation of new risk prediction models is infrequent and reveals worse prognostic discrimination. J Clin Epidemiol. 2015;68(1):25–34.
5. Vickers AJ, Van Calster B, Steyerberg EW. A simple, step-by-step guide to interpreting decision curve analysis. Diagn Progn Res. 2019;3:18.
Acknowledgements
Not applicable
Funding
Not applicable
Author information
Contributions
DG, AAC, LM, DFE, and EWS conceived and drafted the letter to the editor. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable
Consent for publication
Not applicable
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Giardiello, D., Antoniou, A.C., Mariani, L. et al. Letter to the editor: a response to Ming’s study on machine learning techniques for personalized breast cancer risk prediction. Breast Cancer Res 22, 17 (2020). https://doi.org/10.1186/s13058-020-1255-4
DOI: https://doi.org/10.1186/s13058-020-1255-4