Letter to the editor: a response to Ming’s study on machine learning techniques for personalized breast cancer risk prediction

Giardiello, Daniele; Antoniou, Antonis C.; Mariani, Luigi; Easton, Douglas F.; Steyerberg, Ewout W.

doi:10.1186/s13058-020-1255-4

Letter
Open access
Published: 10 February 2020

Daniele Giardiello (ORCID: orcid.org/0000-0002-9005-9430)^1,2, Antonis C. Antoniou^3, Luigi Mariani^4, Douglas F. Easton^3,5 & Ewout W. Steyerberg^2,6

Breast Cancer Research volume 22, Article number: 17 (2020)

The Original Article was published on 20 June 2019

A recent paper [1] compared two well-known breast cancer risk prediction models (BCRAT and BOADICEA) with eight different machine learning (ML) methods. The authors found a striking improvement in cancer prediction with ML. While their comparative assessment against more classical approaches is timely, we are skeptical about the results presented.

A recent review of ML methods in clinical epidemiology shows that the apparent benefits of ML tend to arise in biased comparisons [2]. In the analyses of Ming et al., two aspects are of concern: the ML methods were not specific to survival data, and the validation process was unfair. Regarding the first aspect, the ML methods used predict a binary outcome (having the disease or not), whereas BOADICEA and BCRAT compute the probability of developing breast cancer over time. Regarding the second aspect, a fair comparison of the validity of the models would require data on unaffected women with prospective follow-up, with like-for-like risk predictions (over the same time period) for all methods. The comparisons in [1] were based on retrospective data from families of unaffected and affected individuals, and in the context of BCRAT and BOADICEA, it is unclear what the observed and predicted events are.

Furthermore, the study assessed the external validity of the existing models, while the ML methods were fitted on the same samples using a tenfold cross-validation procedure, which amounts only to internal validation [3]. Internal validation is often overoptimistic compared with external validation [4]. Although the authors indicate the most important risk predictors in the ML approaches, the final models are not provided. A fair comparison would require evaluating the final ML models against the existing models in external, prospective datasets. Moreover, discrimination is only one measure of model performance: good calibration and an assessment of clinical utility are also crucial [5]. Last but not least, Ming et al. did not mention which version of BOADICEA was used for the comparison with the ML methods.

In conclusion, the practical relevance of ML methods needs to be further investigated in this specific context, based on more rigorous methodology.
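
The point that discrimination alone does not establish model performance can be illustrated with a minimal sketch on simulated data. Everything here is an assumption made for illustration: the risks and outcomes are randomly generated (they are not the data of Ming et al.), and "model A" and "model B" are hypothetical models, not BCRAT, BOADICEA, or any ML method from [1]. Model A predicts the simulated true risk; model B doubles every predicted risk. The c-statistic measures discrimination, and the observed/expected (O/E) ratio is used as a simple summary of calibration.

```python
import random


def c_statistic(outcomes, risks):
    """Probability that a randomly chosen case has a higher predicted risk
    than a randomly chosen non-case (ties count as one half)."""
    cases = [r for y, r in zip(outcomes, risks) if y == 1]
    non_cases = [r for y, r in zip(outcomes, risks) if y == 0]
    concordant = sum((c > n) + 0.5 * (c == n) for c in cases for n in non_cases)
    return concordant / (len(cases) * len(non_cases))


def observed_expected_ratio(outcomes, risks):
    """Observed number of events divided by the number the model expects."""
    return sum(outcomes) / sum(risks)


rng = random.Random(0)  # fixed seed so the simulation is repeatable
n_women = 2000

# Simulated true risks and outcomes (purely illustrative, not real data).
true_risk = [rng.uniform(0.01, 0.30) for _ in range(n_women)]
outcome = [1 if rng.random() < p else 0 for p in true_risk]

# Hypothetical model A predicts the true risk; model B doubles every risk.
model_a = true_risk
model_b = [2.0 * p for p in true_risk]

auc_a, auc_b = c_statistic(outcome, model_a), c_statistic(outcome, model_b)
oe_a = observed_expected_ratio(outcome, model_a)
oe_b = observed_expected_ratio(outcome, model_b)

print(f"Model A: c-statistic {auc_a:.3f}, O/E ratio {oe_a:.2f}")
print(f"Model B: c-statistic {auc_b:.3f}, O/E ratio {oe_b:.2f}")
```

Because doubling every risk preserves the ranking of the women, both models have the same c-statistic. Model B nevertheless expects twice as many events as model A, so its O/E ratio is half as large. A comparison based on discrimination alone would not distinguish the two.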

Availability of data and materials

Not applicable

References

1. Ming C, Viassolo V, Probst-Hensch N, Chappuis PO, Dinov ID, Katapodi MC. Machine learning techniques for personalized breast cancer risk prediction: comparison with the BCRAT and BOADICEA models. Breast Cancer Res. 2019;21(1):75.
2. Christodoulou E, Ma J, Collins GS, Steyerberg EW, Verbakel JY, Van Calster B. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019;110:12–22.
3. Steyerberg EW, Harrell FE Jr. Prediction models need appropriate internal, internal-external, and external validation. J Clin Epidemiol. 2016;69:245–7.
4. Siontis GC, Tzoulaki I, Castaldi PJ, Ioannidis JP. External validation of new risk prediction models is infrequent and reveals worse prognostic discrimination. J Clin Epidemiol. 2015;68(1):25–34.
5. Vickers AJ, van Calster B, Steyerberg EW. A simple, step-by-step guide to interpreting decision curve analysis. Diagn Progn Res. 2019;3:18.

Acknowledgements

Not applicable

Funding

Not applicable

Author information

Authors and Affiliations

Division of Molecular Pathology, The Netherlands Cancer Institute - Antoni van Leeuwenhoek Hospital, Amsterdam, The Netherlands
Daniele Giardiello
Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
Daniele Giardiello & Ewout W. Steyerberg
Department of Public Health and Primary Care, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, UK
Antonis C. Antoniou & Douglas F. Easton
Unit of Clinical Epidemiology and Trial Organization, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy
Luigi Mariani
Department of Oncology, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, UK
Douglas F. Easton
Department of Public Health, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
Ewout W. Steyerberg

Contributions

DG, ACA, LM, DFE, and EWS conceived and drafted the letter to the editor. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Daniele Giardiello.

Ethics declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Giardiello, D., Antoniou, A.C., Mariani, L. et al. Letter to the editor: a response to Ming’s study on machine learning techniques for personalized breast cancer risk prediction. Breast Cancer Res 22, 17 (2020). https://doi.org/10.1186/s13058-020-1255-4

Received: 28 December 2019
Accepted: 26 January 2020
Published: 10 February 2020
DOI: https://doi.org/10.1186/s13058-020-1255-4
