Overdiagnosis and overtreatment of breast cancer: Microsimulation modelling estimates based on observed screen and clinical data
© BioMed Central Ltd 2005
Published: 21 December 2005
There is a delicate balance between the favourable and unfavourable side effects of screening in general. Overdiagnosis, the detection of breast cancers by screening that would otherwise never have been clinically diagnosed but are now consequently treated, is such an unfavourable side effect. To correctly model the natural history of breast cancer, one has to estimate mean durations of the different pre-clinical phases, transition probabilities to clinical cancer stages, and sensitivity of the applied test based on observed screen and clinical data. The Dutch data clearly show an increase in screen-detected cases in the 50 to 74 year old age group since the introduction of screening, and a decline in incidence around age 80 years. We estimated that 3% of total incidence would otherwise not have been diagnosed clinically. This magnitude is no reason not to offer screening for women aged 50 to 74 years. The increases in ductal carcinoma in situ (DCIS) are primarily due to mammography screening, but DCIS still remains a relatively small proportion of the total breast cancer problem.
Breast cancer screening has been effective in reducing breast cancer mortality. Both randomised controlled trials and nation-wide screening programmes have shown a roughly 25% reduction in disease-specific mortality for women aged 50 years and over invited to screening [1–4]. This benefit applies to the group as a whole, but at the individual level it is impossible to determine who will actually benefit or who will receive more harm than benefit from such a programme: there is a delicate balance between the favourable and unfavourable side effects of screening in general. For example, detecting breast cancers by screening that would otherwise never have been clinically diagnosed, but are now treated, is such an unfavourable side effect. Because of lead time and length-biased sampling, the screening test will generally detect more early lesions with possibly different biological behaviour and also more slowly growing tumours, especially ductal carcinoma in situ (DCIS). Screening at older ages will lead to the detection of clinically relevant disease; however, because of existing co-morbidity, women may not necessarily benefit, as they more often die of other diseases.
This paper presents quantitative estimates of overdiagnosis in breast cancer screening based on microsimulation modelling, with special emphasis on DCIS. In this study, overdiagnosis is defined as diagnosing cancers that would not have been diagnosed clinically if there were no screening programme.
These increases in incidence represent real overdiagnosis to only a limited extent. From the observed rates, one cannot easily determine to what extent overdiagnosis is involved because screening is still being continued. In these circumstances, modelling of the natural history of breast cancer and its early lesions, and of what screening is estimated to detect, is crucial and provides a 'best guess'. Using the microsimulation model MISCAN [6, 9], we first simulate individual life histories for women in the absence of screening and then assess how these histories would change as a consequence of a screening programme. The natural history is modelled as a progression from no breast cancer through pre-clinical disease (DCIS, T1a, T1b, T1c, T2+) to clinical disease (same stages). From a given pre-clinical state, a cancer may be detected by screening or become clinically apparent or, if undiagnosed, progress to the next pre-clinical state. To correctly model this natural history of breast cancer for women in a certain age group, one has to estimate mean durations of the different pre-clinical phases, transition probabilities, and sensitivity of the applied test. One therefore needs data from two sources: observed screen data and clinical data. These data include clinical incidence data by age and stage in the situation without screening, data on screen-detected cancers by stage, screening round (and interval) and age, and corresponding clinical incidence data when screening is being implemented. The observed data can often be explained by a small range of parameters (e.g., a somewhat higher sensitivity and shorter mean duration of the stage may also result in a good fit). With more detailed data from several screening rounds, from screening different age groups and/or from using different screening intervals, the best-fitting parameters often fall into a smaller range.
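The two-step logic described above (first a life history without screening, then the same history with screening overlaid) can be illustrated with a minimal Python sketch. This is not MISCAN and does not use its fitted values: every parameter below (mean durations, progression probabilities, sensitivities, the age range of onset, the crude other-cause death model, and the biennial screening schedule) is a hypothetical placeholder, and for simplicity every cancer is assumed to start in a screen-detectable DCIS phase.

```python
import random

# Pre-clinical states in order of progression, as named in the text.
STATES = ["DCIS", "T1a", "T1b", "T1c", "T2+"]

# Hypothetical illustration values only; NOT the fitted MISCAN parameters.
MEAN_DURATION = {"DCIS": 5.0, "T1a": 0.5, "T1b": 1.0, "T1c": 1.5, "T2+": 1.0}  # years
P_PROGRESS = {"DCIS": 0.9, "T1a": 0.9, "T1b": 0.8, "T1c": 0.6, "T2+": 0.0}
SENSITIVITY = {"DCIS": 0.4, "T1a": 0.6, "T1b": 0.7, "T1c": 0.85, "T2+": 0.95}


def natural_history(rng, onset_age):
    """Life history without screening: list of (state, start_age, end_age).

    The last entry is the state in which the cancer becomes clinically
    apparent, at its end_age.
    """
    path, age = [], onset_age
    for state in STATES:
        end = age + rng.expovariate(1.0 / MEAN_DURATION[state])
        path.append((state, age, end))
        if rng.random() >= P_PROGRESS[state]:
            break  # surfaces clinically in this stage
        age = end
    return path


def apply_screening(rng, path, screen_ages):
    """Overlay screening on a fixed natural history.

    screen_ages must be sorted. Returns (mode, stage, age_at_diagnosis).
    """
    for state, start, end in path:
        for s in screen_ages:
            if start <= s < end and rng.random() < SENSITIVITY[state]:
                return ("screen", state, s)
    state, _, end = path[-1]
    return ("clinical", state, end)


def run(n=10000, seed=1):
    """Count screen-detected, clinical, and overdiagnosed cases among n women."""
    rng = random.Random(seed)
    schedule = list(range(50, 76, 2))  # assumed biennial screens, ages 50 to 74
    counts = {"screen": 0, "clinical": 0, "overdiagnosed": 0}
    for _ in range(n):
        onset = rng.uniform(40.0, 85.0)                # hypothetical onset ages
        death = onset + rng.expovariate(1.0 / 15.0)    # crude other-cause death
        path = natural_history(rng, onset)
        clinical_age = path[-1][2]
        screens = [s for s in schedule if s < death]   # no screens after death
        mode, _stage, _age = apply_screening(rng, path, screens)
        if mode == "screen":
            counts["screen"] += 1
            if clinical_age >= death:
                # Would never have surfaced clinically during her lifetime.
                counts["overdiagnosed"] += 1
        elif clinical_age < death:
            counts["clinical"] += 1
    return counts
```

Calling `run()` returns the three counts; the share `overdiagnosed / (screen + clinical)` is the quantity of interest under the definition of overdiagnosis used in this paper. In a real analysis the parameters would be fitted to the observed screen and clinical data rather than fixed by hand.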
In the Netherlands, such detailed data have been used: pilot data in the past, and more recently data from the annual monitoring by the National Evaluation Team for Breast cancer screening.
We also used the MISCAN approach to analyse the results of the Health Insurance Plan (HIP) trial. These comparisons show the potential power of modelling: the parameter values for the invariant part of the natural history of pre-clinical breast cancer are indeed the same, whereas the increase in sensitivity reflects the improvement in mammography. Taking the obvious differences between HIP and Nijmegen (one of the two Dutch pilot studies) into account, the model shows that there is a good correspondence between the screening data from these studies. The findings about the duration of pre-clinical disease and the sensitivity of screening can be compared with results from other modelling approaches. Day and colleagues applied their model to data from Utrecht (the other Dutch pilot study). The study reports a good fit of the model (chi-square of 7.2 with 7 degrees of freedom) when assuming a sensitivity of 99% and a mean duration of 2.8 years. It is not indicated exactly what data from Utrecht were used, but it is clearly a less detailed subset of the data than we used for testing model assumptions. An adapted version of the Day and Walter model was applied to the Nijmegen data. In general, the estimated parameters are comparable to the values found with the MISCAN approach presented here, especially regarding the age-dependency of the estimated duration of the pre-clinical stage. The reported average duration is somewhat shorter, however: for example, 2.5 years in the 50 to 64 year old age group.
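To put the reported fit statistic in context, the upper-tail probability of a chi-square value of 7.2 with 7 degrees of freedom can be computed directly. The helper below is an illustrative sketch using only the Python standard library (the function name `chi2_sf` is ours, not part of any cited analysis); it relies on the standard recurrence for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting points: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        q, nu = math.exp(-half), 2
    else:
        q, nu = math.erfc(math.sqrt(half)), 1
    # Recurrence: Q(x | nu + 2) = Q(x | nu) + (x/2)^(nu/2) * exp(-x/2) / Gamma(nu/2 + 1)
    while nu < df:
        q += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return q


p_value = chi2_sf(7.2, 7)  # roughly 0.41
```

A p-value of roughly 0.41 means the data give no indication of lack of fit, which is consistent with the description of the fit as good. It does not by itself show that the assumed parameter values are the only ones compatible with the data.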
Data on the natural history at older ages have been very limited, but are slowly emerging now that the Dutch programme includes women aged 70 to 74 years. Data on the natural history of DCIS are scarce, but parameters concerning the screen-detectable pre-clinical period can be estimated, based on the aforementioned data.
In our first analyses, we assumed that 10% of invasive breast cancers are preceded by a screen-detectable DCIS phase and that the chance of progressing to invasive cancer or clinical DCIS is almost 90% in the long term. Recent data from randomised treatment trials support a high progression rate in the long term. The observed screen data are then consistent with a mammography sensitivity of 40% and a mean screen-detectable duration of 5 years.
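As a rough illustration of what a sensitivity of 40% and a mean screen-detectable duration of 5 years imply together, the sketch below computes the probability that a lesion is picked up by screening before it leaves the screen-detectable DCIS phase. It rests on assumptions that are ours and not stated in the text: an exponentially distributed duration, a fixed 2-year screening interval, full attendance, screening that continues indefinitely, independent test results at successive screens, and an onset time uniformly distributed within the interval.

```python
import math


def prob_screen_detected(mean_duration, sensitivity, interval):
    """Probability that a lesion is screen-detected before leaving the phase.

    Assumes an exponential phase duration, onset uniform within a screening
    interval, and independent screens with constant sensitivity.
    """
    # Chance of still being in the phase one full interval later.
    q = math.exp(-interval / mean_duration)
    # Chance of still being in the phase at the first screen after onset,
    # averaged over a uniformly distributed waiting time.
    reach_first = (mean_duration / interval) * (1.0 - q)
    # Geometric series over successive screens: detected at the first, or
    # missed, still present, and detected at a later one.
    return reach_first * sensitivity / (1.0 - (1.0 - sensitivity) * q)


p_dcis = prob_screen_detected(mean_duration=5.0, sensitivity=0.40, interval=2.0)
# roughly 0.55 under these assumptions
```

Under these simplifying assumptions, a little over half of the lesions entering the screen-detectable phase would be found by screening. The same detection probability can be produced by other combinations, such as a higher sensitivity with a shorter duration, which is why detailed data from several screening rounds are needed to narrow the parameter range.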
During the first years of screening, the increase in newly diagnosed cases in the age group invited to screening will not yet be reflected by a decrease in incidence at older ages, as these are different cohorts of women. In the later years of screening, the increase in newly diagnosed cases and the decrease in incidence should be at a steady state, although this is not always the case because of other changes in the screening programme.
Figure 3b shows that the change in DCIS detection is especially striking, although some lesions would have progressed to invasive disease (not shown in the figure). The amount of overdiagnosis, the increase in primary surgery/radiotherapy, and the longer time frame since diagnosis have to be weighed against the favourable effects of screening: about 750 breast cancer deaths prevented per year (16%), reduction of treatments for advanced disease and its consequences for quality of life, and 15 life-years gained per breast cancer death prevented. We consider this to be a very acceptable balance at the population level.
Overdiagnosis is inherent to screening. The crucial issue is the extent to which it happens and what the consequences are for the population involved. This then has to be balanced against the favourable effects of screening in order to decide on an appropriate screening policy. In breast cancer screening, overdiagnosis is not negligible but is relatively limited. The increases in DCIS are primarily due to mammography screening, but they remain a relatively small proportion of the breast cancer problem. The screen data observed in this study provide workable assumptions on the natural history of DCIS and do not lead to a major difference in conclusions regarding overdiagnosis. More and more women with DCIS are being treated by breast conservation, and in the Netherlands screen-detected DCIS is more often treated by conservation than clinically diagnosed DCIS. Categorisation of DCIS lesions into high risk versus low risk lesions (by screening) is still urgently needed.
DCIS = ductal carcinoma in situ.