Validation of the Gail model for predicting individual breast cancer risk in a prospective nationwide study of 28,104 Singapore women

Introduction The Gail model (GM) is a risk-assessment model used to estimate an individual's absolute risk of invasive breast cancer, and has been applied to both clinical counselling and breast cancer prevention studies. Although the GM has been validated in several Western studies, its applicability outside North America and Europe remains uncertain. The Singapore Breast Cancer Screening Project (SBCSP) is a nationwide prospective trial of screening mammography conducted between Oct 1994 and Feb 1997, and is the only such trial conducted outside North America and Europe to date. With the long-term outcomes from this study, we sought to evaluate the performance of the GM in prediction of individual breast cancer risk in a developed Asian country.
Methods The study population consisted of 28,104 women aged 50 to 64 years who participated in the SBCSP and did not have breast cancer detected during screening. The national cancer registry was used to identify incident cases of breast cancer. To evaluate the performance of the GM, we compared the expected number of invasive breast cancer cases predicted by the model to the actual number of cases observed within 5-year and 10-year follow-up. Pearson's Chi-square test was used to test the goodness of fit between the expected and observed cases of invasive breast cancers.
Results The ratio of expected to observed number of invasive breast cancer cases within 5 years from screening was 2.51 (95% confidence interval 2.14 - 2.96). The GM over-estimated breast cancer risk across all age groups, with the discrepancy being highest among older women aged 60 - 64 years (E/O = 3.53, 95% CI = 2.57-4.85). The model also over-estimated risk for the upper 80% of women with highest predicted risk. The overall E/O ratio for the 10-year predicted breast cancer risk was 1.85 (1.68-2.04).
Conclusions The GM over-predicts the risk of invasive breast cancer in the setting of a developed Asian country as demonstrated in a large prospective trial, with the largest difference seen in older women aged between 60 and 64 years old. The reason for the discrepancy is likely to be multifactorial, including a truly lower prevalence of breast cancer, as well as lower mammographic screening prevalence locally.


Introduction
The Gail model (GM) for evaluation of the absolute individual risk of invasive breast cancer in women has been used extensively in Western populations for individual counseling and cancer prevention trial design. Given that international breast cancer incidence varies widely and that the validation of GM has been conducted predominantly in Western populations, it is relevant to examine its performance in different settings. The age-standardized rate of breast cancer was 59.5 per 100,000 persons per year among Singaporean women in the period of 2004 to 2008 [1]. This is considerably lower in comparison with the West, where the estimated age-standardized breast cancer incidence rates for 1998 through 2002 were 86.3 per 100,000 women in Italy and 76.7 per 100,000 women in the US [2]. Given these differences, it is possible that a breast cancer prediction model such as the GM, which uses a small number of risk factors to estimate an absolute risk of invasive breast cancer, yields estimates that are less well calibrated when applied to varying populations. Indeed, the accuracy of the calibration of the GM might depend on the extent of the prevalence of risk factors as well as differences in the use of screening mammography. In the US, 70.1% of women who were older than 39 years of age in 2000 reported having had a mammogram within the previous 2 years [3]; this figure is distinctly higher than that seen in Asian populations. For example, in Singapore, an estimated 40.9% of women between the ages of 50 and 69 years had a mammography screen in the last 2 years [4].
The original GM was derived from data of the Breast Cancer Detection Demonstration Project (BCDDP), a program consisting of five annual mammography screens across 29 centers in the US [5]. By means of multivariate relative risk components and baseline age-specific breast cancer risks estimated from the BCDDP population, Gail model 1 was formed to estimate the absolute risk of both invasive and in situ breast cancer. This model was subsequently modified by investigators from the Breast Cancer Prevention Trial [6] by using invasive breast cancer rates from the Surveillance, Epidemiology, and End Results (SEER) program of the National Cancer Institute (NCI) to better estimate the risks of invasive breast cancers. This modified model, referred to as 'Gail model 2' [7], has been validated by using independent data from the US and hereafter is referred to as GM in our study. Rockhill and colleagues [8] used data from the Nurses' Health Study for the period from 1992 to 1997 for calibration of the calculated average absolute breast cancer risk predicted by the GM. Overall, Rockhill and colleagues concluded that the GM was well calibrated but had only modest discriminatory accuracy at the individual level. This is consistent with the findings of Costantino and colleagues [7], who found an overall ratio of expected to observed (E/O) breast cancer cases of 1.03 for women in the control arm of the Breast Cancer Prevention Trial and also concluded that the GM was well calibrated. Interestingly, although mammographic screening for women older than 50 years has become routine practice in many Western countries, no prospective data from large studies have examined mammography in Asian women. Relatively few publications have evaluated the GM in a population outside the US; even then, these validation studies were based primarily on limited case control data, which may perform better for assessment of relative rather than absolute risk [9][10][11].
From 1968 to 2007, Singapore experienced an almost threefold increase in breast cancer incidence. The incidence rate of breast cancer from 2003 to 2007 was 2.9 times that in 1968 to 1972 and 5.65% higher than in 1998 to 2002 [12]. Our study was undertaken to evaluate the utility of the GM in an Asian population by using 10-year follow-up data from the Singapore Breast Cancer Screening Project (SBCSP), which is a large-scale prospective community mammography screening program. To the best of our knowledge, ours is the first study evaluating the systemic application and validation of the GM by using long-term outcomes in a prospectively studied Asian population. The goal was to evaluate the performance of the GM as an appropriate breast cancer risk assessment tool in the Asian population.

Materials and methods
Singapore breast cancer screening project
The SBCSP was a population-based mammography screening project conducted between October 1994 and February 1997 among female Singaporeans. The design of the SBCSP has been described in detail previously [13]. Briefly, all women who were 50 to 64 years old on 1 October 1994 were first identified from a comprehensive population registry (n = 166,600). A total of 69,473 (41.7%) of these women were randomly selected and invited for a single free mammography screening from 1 October 1994 to 28 February 1997. Prior to mammography, all attendees completed a questionnaire to determine eligibility, and only those who were eligible proceeded to further interview and mammography. Women who had cancers of the breast or other sites (except non-melanoma skin cancer), who had undergone mammography or breast biopsy in the 12 months prior to screening, or who were pregnant were excluded from screening. In total, 28,235 women from 50 to 64 years of age participated in this initial mammography screening program with a single two-view mammogram examination. No subsequent free mammography screening was offered.

Study population
Among the 28,235 participants, 131 women were detected with breast cancer during initial screening and were excluded from the study population. Therefore, a total of 28,104 women provided the basis of this study. Electronic matching with the Singapore Cancer Registry and the National Death Register was used to identify any breast cancers occurring among these women from the date of the screen until December 2007. Permission was obtained from the Ministry of Health (Singapore) and the Ministry of Home Affairs (Singapore), respectively, for data access. One hundred forty-four and 409 invasive breast cancers were diagnosed within 5 years and 10 years from screening, respectively. Women not diagnosed with breast cancer during each period formed the corresponding control group for comparison with the diagnosed cases.

Formation of Gail model risk factor categories
Risk factor categories in the GM were formed by using information from the screening questionnaire: age at menarche (at least 14 years, 12 to 13 years, or fewer than 12 years), age at first live birth (nulliparous, fewer than 20 years, 20 to 24 years, 25 to 29 years, or at least 30 years), previous breast biopsy, and number of first-degree relatives with breast cancer. Number of previous breast biopsies and presence of atypical hyperplasia in biopsy were not ascertained in the questionnaire, which asked only whether women had had any previous breast biopsy. The questionnaire asked each woman whether she had a mother, any sister(s), or any daughter(s) with breast cancer and not the actual number of affected first-degree relatives. We assumed that women who reported only sister(s) or only daughter(s) with breast cancer had one affected first-degree relative, and on that basis recategorized our information on family history into three categories: 0, 1, or 2 or more affected first-degree relatives.
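The family-history recoding described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the three boolean inputs are hypothetical, and the rule that two or more reported relation types map to the "2 or more" category is our reading of the text rather than something the questionnaire description states explicitly.

```python
def family_history_category(mother: bool, sister: bool, daughter: bool) -> int:
    """Return 0, 1, or 2 (meaning '2 or more') affected first-degree relatives.

    Assumption: each relation type reported as affected counts as exactly one
    relative, because the questionnaire did not record the actual number.
    """
    # Count how many relation types were reported as affected.
    reported_types = int(mother) + int(sister) + int(daughter)
    # Collapse counts of 2 and 3 into the single top category.
    return min(reported_types, 2)


if __name__ == "__main__":
    # A woman reporting only affected sister(s) is treated as having one relative.
    print(family_history_category(mother=False, sister=True, daughter=False))
```

Under this coding a woman with several affected sisters but no other affected relative is still placed in category 1, which is the simplification the text acknowledges.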

Statistical methods
The 5-year and 10-year absolute risks of each individual woman were calculated from their age of screening until end of follow-up at 5 years and 10 years, respectively. The calculation was done by using publicly available SAS codes for GM prediction from the NCI's Breast Cancer Risk Assessment Tool website [14]. The expected number of cases (E) among women in a category was calculated by summing the estimated absolute risk across all women in the category.
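As a rough illustration of how the expected count is assembled, the sketch below sums individual predicted risks within each category. The function name, the record layout, and the risk values in the example are hypothetical and are not taken from the study or from the NCI SAS code.

```python
from collections import defaultdict


def expected_cases_by_category(records):
    """Sum predicted absolute risks within each category.

    records: iterable of (category_label, predicted_absolute_risk) pairs,
    one pair per woman. Returns a dict mapping category -> expected count E.
    """
    totals = defaultdict(float)
    for category, risk in records:
        totals[category] += risk
    return dict(totals)


if __name__ == "__main__":
    # Illustrative risks only; real values would come from the GM for each woman.
    example = [("50-54", 0.010), ("50-54", 0.012), ("60-64", 0.018)]
    print(expected_cases_by_category(example))
```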
Model calibration was assessed by comparison of the expected (E) and observed (O) number of breast cancer cases by GM risk factor categories, 5-year age group, and quintiles of predicted risk. Quintiles were formed by ranking all women by their predicted risk from the GM in ascending order. Confidence intervals (CIs) were calculated for E/O ratio on the basis of exact theory, and O was assumed to follow a Poisson distribution. The Pearson chi-squared test was used to test the goodness of fit between E and O. Comparison of categorical variables between diagnosed cases and the control group was performed by using the chi-squared test or Fisher exact test as appropriate. A P value of less than 0.05 was considered statistically significant. All analyses were performed by using SAS 9.1 (SAS Institute Inc., Cary, NC, USA).
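The E/O ratio and its confidence interval can be sketched as below, assuming (as stated above) that O is Poisson distributed and E is treated as fixed. The exact limits for the Poisson mean are found here by bisection on the Poisson distribution function using only the standard library; this is one common construction of an exact interval and is not necessarily the procedure implemented in the authors' SAS analysis, so results may differ slightly from the published intervals. All function names are hypothetical.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), with each term computed in log space."""
    if k < 0:
        return 0.0
    total = 0.0
    for i in range(k + 1):
        total += math.exp(i * math.log(lam) - lam - math.lgamma(i + 1))
    return min(total, 1.0)


def _bisect(f, lo, hi, tol=1e-10):
    """Root of a monotone function f on [lo, hi] where f changes sign."""
    f_lo = f(lo)
    for _ in range(200):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2


def exact_poisson_ci(observed, alpha=0.05):
    """Exact two-sided (1 - alpha) limits for the mean of a Poisson count."""
    search_max = observed + 10 * math.sqrt(observed + 1) + 10
    if observed == 0:
        lower = 0.0
    else:
        # Lower limit: the mean at which P(X >= observed) equals alpha / 2.
        lower = _bisect(
            lambda lam: (1 - poisson_cdf(observed - 1, lam)) - alpha / 2,
            1e-12, search_max)
    # Upper limit: the mean at which P(X <= observed) equals alpha / 2.
    upper = _bisect(
        lambda lam: poisson_cdf(observed, lam) - alpha / 2,
        1e-12, search_max)
    return lower, upper


def eo_ratio_ci(expected, observed, alpha=0.05):
    """Return (E/O, lower limit, upper limit), treating E as fixed."""
    if observed <= 0:
        raise ValueError("observed count must be positive for an E/O ratio")
    o_lower, o_upper = exact_poisson_ci(observed, alpha)
    # A larger plausible O gives a smaller ratio, so the limits swap.
    return expected / observed, expected / o_upper, expected / o_lower


if __name__ == "__main__":
    # Overall 5-year counts reported in the Results: E = 362, O = 144.
    ratio, low, high = eo_ratio_ci(362, 144)
    print(f"E/O = {ratio:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Because E is rounded to a whole number here and the interval construction is an assumption, the output should be close to, but need not exactly match, the interval reported for the overall 5-year comparison.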
Matsuno and colleagues [15] recently reported the creation and use of an Asian American Breast Cancer Study (AABCS) model that was based on data from the AABCS combined with use of ethnicity-specific data from the NCI's SEER program. This model was calibrated by using data from the Asia Pacific women in the Women's Health Initiative. Using this model, the authors reported lower rates of breast cancer incidence in Chinese-American women compared with Caucasian-American women. Applying this AABCS model to the SBCSP population, we attempted to predict the breast cancer risk by using the Chinese-American AABCS model for the SBCSP Chinese women and the other Asian-American AABCS model for Malays, Indians, and other racial groups in the SBCSP.

Results
Up to December 2007, 575 of 28,104 individuals developed ductal carcinoma in situ (DCIS) or invasive breast cancer. The distribution of the study cohort by risk factors in the GM is shown in Table 1. Early menarche, previous breast biopsy, a late age at first childbirth, and at least one first-degree relative with breast cancer were proportionally more common among the breast cancer cases than among the controls in the SBCSP study cohort.
The various breast cancer risk factors had a similar influence on the incidence of both DCIS and invasive breast cancers as compared with the corresponding estimated impact on risks as calculated by the GM (Table 2). However, the relative risk for each risk factor was larger on the basis of SBCSP data as compared with the GM. Women with early menarche, women with a previous breast biopsy, and women with age of first live birth after 30 years of age appeared to have a much higher risk of breast cancer. The GM showed the greatest overestimation of risk in women who had late menarche, women who had no affected first-degree relative, and women who were more than 50 years old and had no prior breast biopsy at screening.
We compared the breast cancers observed in the 5 years after screening, within categories defined by breast cancer risk factors, with the expected numbers of invasive breast cancers calculated from the GM's 5-year predicted risk for Caucasian women (Table 3). Overall, 362 invasive breast cancers were expected and 144 were observed. This corresponds to a ratio of expected to observed cases (E/O) of 2.51 (95% CI 2.14 to 2.96), indicating that the GM predicted 2.51 times as many breast cancers as were observed over this period. Over-prediction of the number of breast cancers by GM was higher for women who were 60 to 64 years old (E/O ratio = 3.53) than for their younger counterparts who were 50 to 59 years old (E/O ratio = 2.15) (Table 4). The E/O ratios were fairly similar across the 5-year predicted risk quintile groups. Breakdown by 5-year age group and 5-year predicted risk quintile group further emphasized the over-prediction by the GM relative to our study population in certain subpopulations. The E/O ratio was greater than 5 for women who were 60 to 64 years old in the 41% to 80% quintile group.
Four hundred nine breast cancers developed in the 10 years after screening as compared with the 758 expected cancer cases predicted by using the GM (Table 5). The extent of over-prediction of the number of invasive breast cancers by GM within 10 years from screening (E/O ratio = 1.85) was smaller than that for the 5-year period (E/O ratio = 2.51). Over-prediction of the number of breast cancers by GM was again higher for women who were 60 to 64 years old (E/O ratio = 2.54) than for their younger counterparts who were 50 to 59 years old (E/O ratio = 1.61). Trends of E/O ratio by age group and predicted risk quintile group which were based on 10-year prediction were broadly similar to those based on 5-year prediction (Table 6).
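The two overall ratios quoted above follow directly from the reported counts, as the short check below shows. The helper name is hypothetical; the counts are those given in the text (362 expected and 144 observed at 5 years, 758 expected and 409 observed at 10 years).

```python
def eo_ratio(expected: float, observed: int) -> float:
    """Ratio of expected to observed cases; values above 1 mean over-prediction."""
    return expected / observed


if __name__ == "__main__":
    five_year = eo_ratio(362, 144)
    ten_year = eo_ratio(758, 409)
    print(round(five_year, 2), round(ten_year, 2))  # prints: 2.51 1.85
```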

Use of Gail model in an Asian setting
The GM is widely used in Western countries for predicting the absolute risk of invasive breast cancer. Several other prospective studies looking at the accuracy of the GM were previously performed but these studies were done predominantly in Western populations: New York [16], Canada [17], Edinburgh [18], Malmo [19,20], Kopparberg and Ostergotland (Swedish Two-County) [21], Stockholm [22], Gothenburg [20,23], and Turkey [11]. Recruitment of Asian women into such trials conducted in Western populations has been rare and difficult. Relatively few publications have evaluated the GM in a population outside the US, and these were based primarily on case control data, which are more appropriately used for assessment of relative rather than absolute risk [9,10].
The SBCSP was a national prospective study of 28,235 predominantly Asian women. This is the only prospective trial conducted outside North America and Europe. Our results demonstrate, for the first time, an association between breast cancer incidence and the GM risk factors in an Asian population. On the basis of the mature 10-year follow-up data of this study, the number of invasive breast cancer cases expected under the GM was 1.85 times the number actually observed in our local population. We conclude that the use of the GM considerably overestimates population risk for breast cancer incidence in Singaporean women who are not in a structured program of regular mammography screening. Because our study was conducted in an Asian population, this result is directly relevant to our local population and may affect local chemoprevention studies and screening policies. In our cohort, mammography was performed only once; thereafter, patients continued usual care. In Singapore, the incidence of breast cancer was 59.5 per 100,000 per year in the period from 2004 to 2008 [1]. From the International Agency for Research on Cancer's GLOBOCAN database, we obtained corresponding rates of 76 per 100,000 per year for the US [2]. Noting the differences in age-adjusted incidence rates between Singapore and the US, we considered the possibility that differences in distribution of the GM risk factors may account for this discrepancy if the GM included all relevant risk factors.
The use of screening would often result in a transient rise in numbers of pre-invasive and invasive lesions and a resultant fall in events in the years immediately following the period of screening because prevalent cases that were detected at initial screening were eliminated in subsequent years. This would result in fewer cases observed in the year immediately following the screening period and thus result in over-prediction. This was also observed in our cohort: the incidence of invasive breast cancer (per 10,000 woman-years) fell from an initial 37.5 to a mere 9.2 in the second year and 12.8 in the third year following screening. In the unscreened population, the invasive breast cancer incidence per 10,000 women remained relatively similar throughout all follow-up years after screening. However, the original model reported by Gail and colleagues [7] had computed the age-specific breast cancer incidence rates with inclusion of the first 3 years of the study. As such, we adhered to the same approach for model validation by including the first few years of follow-up in our data analysis.
Singapore is a small, dense, and highly electronically networked society. All Singaporeans have a unique identification number that enables accurate matching of electronic data with a central electronic cancer registry. Cancer notification is mandatory in Singapore and is managed through pathology departments. This enables near complete ascertainment of breast cancer incidence by linkage of SBCSP members with the cancer registry data.
The minor differences in coding of covariates between the SBCSP data and the original GM are likely to have had minimal impact on the results of our study. While the original GM includes the number of previous breast biopsies (0, 1, and 2 or more) and presence of atypical hyperplasia as risk factors, our analysis used an indicator of whether a woman had a previous breast biopsy (no, yes) in place of these factors for the SBCSP. These coding differences are unlikely to affect predictions meaningfully. In the NCI's SAS macro, for unknown presence of atypical hyperplasia, the relative risk of all women with a previous breast biopsy done is multiplied by a factor of 1. In addition, the percentage of women with a previous breast biopsy in SBCSP is 5.2, which is lower than the 15% based on the BCDDP data used for the estimation of relative risk in the GM, further lowering the influence of this covariate on our study results. The models overestimate risk in most of the categories. The discrepancy may arise from additional risk factors varying across populations not taken into account by the GM. In our study, the GM overestimates risk for women in whom age of menarche was at least 12 years, women who were more than 50 years old and who had no biopsy, or women without a first-degree relative affected and age at first live birth of 20 to 29 years, and these particular groups would benefit less from the use of mammography. It is of interest that the relative risks associated with family history are larger in our study as compared with those found in other studies performed in Western populations. This is especially so for women with more than one affected first-degree relative with breast cancer, in whom the larger relative risks could be due to the small number of such women in our study. Matsuno and colleagues [15] also reported breast cancer risks in Asian-American patients.
In that study, the relative risks associated with family history were not larger in an Asian-American population when compared with the original GM. However, the study by Matsuno and colleagues is much smaller than ours, and the subjects are derived from an immigrant Asian population in the West. Hence, our study observation is likely to be representative and meaningful. The good performance of the IT-GM and the IT1-GM using independent data of Florence-EPIC (Florence-European Prospective Investigation into Cancer and Nutrition) indicates that the data from the Italian Multicentre Case Control Study of Diet and Breast Cancer might be useful for revising the GM to include additional risk factors, particularly modifiable risk factors, such as dietary consumption patterns. In principle, the case control data used in this study, together with cancer registry data, can be used to construct such models of absolute risk, and the current findings encourage us to do so. The extent of over-prediction of number of invasive breast cancers by GM within 10 years from screening (E/O ratio = 1.85) was smaller than that for the 5-year period (E/O ratio = 2.51). There are a few possible reasons for this decrease in E/O ratio. One possible explanation is the increase in breast cancer incidence with age. This results in larger numbers of breast cancer diagnosed in an older population. A larger incidence of breast cancer would better approximate the risks of breast cancer incidence calculated by the GM. In addition, it is possible that, over time, an increased proportion of women adopted regular mammographic screening, thereby resulting in an increase of breast cancer detection and consequent narrowing of our E/O ratio as calculated by using the GM. There was a significant lack of fit between the observed and expected breast cancer numbers based on GM for the SBCSP study cohort. 
By age cohort, females who were 50 to 59 years old appeared to derive a better benefit from a one-time screening mammography in comparison with females who were 60 to 64 years old. The Matsuno models gave a breast cancer number that was closer to the observed count as compared with the Caucasian-American GM model, but overestimation remained at 5-year follow-up (Tables 3, 4, 5 and 6). Based on the Matsuno models, the 5-year E/O ratio was 1.43. This was much lower than the corresponding ratio of 2.51 based on GM. For the 10-year prediction period, the E/O ratio was 0.99 based on the Matsuno models and 1.85 based on GM. There was no significant difference between the expected count and observed count of breast cancers by presence of previous biopsy, 5-year age at screening age group, and predicted risk quintile groups when the predictions are based on the Matsuno models. Possible reasons to explain the presence of an over-prediction at 5 years and its subsequent disappearance at 10 years are multifactorial.

Calibration of the Gail model
Decarli and colleagues [24] pointed out, in a recent article establishing the validity of the GM in a Southern European population, that the GM may be well calibrated for Western European populations in which mammographic screening is common but that its applicability remains uncertain in populations in which screening is less frequent. Although the good calibration of the GM in the Florence-EPIC cohort suggests that it may be a useful model for predicting risk in other Western European populations in which mammographic screening is common, our results suggest that, at the minimum, the GM may be less useful in our prospective study population in which screening is less frequent.
Based on rapid fertility and lifestyle changes in the Singapore population (particularly in the 1970s), an age-cohort model is thought to provide possibly the best prediction for breast cancer trends seen [25]. In our study population, the GM provided the best predictions for individuals from the 50- to 54-year age group (E/O ratio = 1.52, 95% CI 1.29 to 1.80). This may be compared with the setting of older women who were 60 to 64 years old (E/O ratio = 2.54, 95% CI 2.10 to 3.07), in whom the model significantly over-predicts the risk by up to three times. There has been a steady increase in breast cancer incidence in Singapore to match that of Western countries [25]. The reason for this trend is multifactorial and includes increased adoption of Western lifestyle and changes in fertility.

Risk of over-diagnosis
Over-diagnosis is used to describe a condition that is diagnosed but that otherwise would not cause symptoms or go on to cause death. This occurs when the cancer either does not progress or grows at such a slow rate that the individual succumbs to other comorbidities. Given the increasing prevalence of cancer screening together with the improvements in screening technologies and interventions, the issue of over-diagnosis is now more evident. The 15-year follow-up results of the Malmo mammographic screening trial suggest that the risk of over-diagnosis of a mammographically detected cancer is 24% [26]. While early detection of invasive cancers aids in the treatment and improved survival for breast cancer, the role of early treatment of DCIS has not been well studied [27]. Four of the five large trials that found breast cancer mortality reductions, however, found no overall mortality reductions [28]. In addition, DCIS has been reported to have excellent survival [17], and it has been suggested that one in three 'invasive' cancers detected by mammography is non-threatening [29].
For patients, early detection adds to the increased risks of procedures such as breast biopsies as well as the risks of unnecessary treatment and medical expenses. Over-diagnosis also adds to patient anxiety and has social implications. A thorough discussion between clinician and patient should be carried out to ensure that the patient understands the implications and limitations of screening prior to proceeding with it. Strategies to reduce the risk of over-diagnosis would include a review of the current risk factors in our local population and a refinement of our risk prediction models to allow us to better advise our patients with regard to the need for screening.
Much of our data are based on information obtained from our national cancer and diseases registry, which has excellent coverage. Individuals who migrated from the country and changed citizenship would, however, be lost to follow-up in such a national database. Nevertheless, the numbers of migrants remain small and should not affect our results significantly. It should be noted that our study represents the only systematic one-time mammography screening study performed to date and differs from other studies that were conducted on the basis of annual/biennial mammograms. With increased ascertainment of incidental tumors, the incidence of breast cancer would be higher with annual screening as compared with a single screening mammogram.
In summary, we report long-term results from a large national prospective breast cancer mammographic screening study from Singapore, the only one conducted outside North America and Europe. Our study validates the GM risk factors individually but demonstrates that the GM overestimates individual risk for breast cancer in the setting of a developed country in Asia. With industrialization and its accompanying lifestyle shifts, Asia has witnessed a dramatic rise of breast cancer incidence, representing a huge global burden. Consequently, it has become imperative to evaluate local breast cancer epidemiology in order to validate existing models of breast cancer risk. With our results, future work should focus on the development of appropriately calibrated models for better prediction of risk, which would benefit individual counseling and cancer prevention research.