Risk prediction for estrogen receptor-specific breast cancers in two large prospective cohorts

Background Few published breast cancer (BC) risk prediction models consider the heterogeneity of predictor variables between estrogen-receptor positive (ER+) and negative (ER-) tumors. Using data from two large cohorts, we examined whether modeling this heterogeneity could improve prediction. Methods We built two models, for ER+ (ModelER+) and ER- tumors (ModelER-), respectively, in 281,330 women (51% postmenopausal at recruitment) from the European Prospective Investigation into Cancer and Nutrition cohort. Discrimination (C-statistic) and calibration (the agreement between predicted and observed tumor risks) were assessed both internally and externally in 82,319 postmenopausal women from the Women’s Health Initiative study. We performed decision curve analysis to compare ModelER+ and the Gail model (ModelGail) regarding their applicability in risk assessment for chemoprevention. Results Parity, number of full-term pregnancies, age at first full-term pregnancy and body height were only associated with ER+ tumors. Menopausal status, age at menarche and at menopause, hormone replacement therapy, postmenopausal body mass index, and alcohol intake were homogeneously associated with ER+ and ER- tumors. Internal validation yielded a C-statistic of 0.64 for ModelER+ and 0.59 for ModelER-. External validation reduced the C-statistic of ModelER+ (0.59) and ModelGail (0.57). In external evaluation of calibration, ModelER+ outperformed the ModelGail: the former led to a 9% overestimation of the risk of ER+ tumors, while the latter yielded a 22% underestimation of the overall BC risk. Compared with the treat-all strategy, ModelER+ produced equal or higher net benefits irrespective of the benefit-to-harm ratio of chemoprevention, while ModelGail did not produce higher net benefits unless the benefit-to-harm ratio was below 50. The clinical applicability, i.e. the area defined by the net benefit curve and the treat-all and treat-none strategies, was 12.7 × 10− 6 for ModelER+ and 3.0 × 10− 6 for ModelGail. Conclusions Modeling heterogeneous epidemiological risk factors might yield little improvement in BC risk prediction. Nevertheless, a model specifically predictive of ER+ tumor risk could be more applicable than an omnibus model in risk assessment for chemoprevention.


Background
Breast cancer (BC) screening and chemoprevention strategies should prioritize women who are expected to benefit from the interventions. Risk prediction models could be useful assessment tools to facilitate this strategy, as long as the models themselves possess sufficient predictive power. So far, more than 20 risk prediction models have been developed for BC since the first model developed by Gail in 1989 [1,2]. Initially, the Gail model (hereinafter referred to as Model Gail ) was based on age, age at menarche and at first live birth, previous breast biopsy, and family history of BC, yielding moderate discriminatory power (C-statistic) of 0.58 in external validations [3,4]. New predictors, such as breast density, hormone replacement therapy (HRT), anthropometric measures, and lifestyle factors (e.g. alcohol intake), were continuously introduced into the succeeding models, resulting in marginal improvements in prediction [5].
BC comprises etiologically distinct subtypes defined by molecular factors. Hormonal and reproductive factors, such as elevated circulating sex hormone levels, early menarche, delayed childbirth, and nulliparity, are only or are more strongly related to increased risks of subtypes expressing estrogen receptor (ER+) and progesterone receptor (PR+) [6]. Further, ER+ breast tumors respond more favorably to hormone therapy than ER-/PR-tumors [6][7][8]. It has been hypothesized that combining etiologically distinct subtypes as one single outcome undermines BC prediction [9]. However, most of the published BC risk prediction models are omnibus models and only one model differentiates risk associations by hormone receptor status [10].
In the current analysis, using data from the European Prospective Investigation into Cancer and Nutrition (EPIC) and the Women's Health Initiative (WHI) study in the USA, we examined whether modeling heterogeneous risk associations by ER status, which entails building ER-specific risk prediction models, could yield better prediction of BC risk.

Study population for model derivation and internal validation
The study population for model derivation consisted of women recruited into the EPIC cohort from 1992 to 2000 in 10 European countries (Norway, Sweden, Denmark, the UK, the Netherlands, Germany, France, Spain, Italy, and Greece) [11,12]. Women with one or more of the following characteristics were excluded: (1) < 40 or > 70 years of age at recruitment (n = 49,410); (2) diagnosed with cancer before recruitment (n = 39,760); and (3) no information on censoring date and/or disease status (n = 142). All women recruited in the study center of Malmö, Sweden were also excluded due to lack of information on ER status for all BC diagnoses (n = 14,396). After these exclusions, 281,330 women (51% postmenopausal at recruitment) were included in the analysis.

Study population for external validation
The WHI study was launched in 1993 and recruited 161,808 postmenopausal women aged 50-79 years into either an observational study or one of the three clinical trials that tested the health effects of HRT, a low-fat diet, and calcium-vitamin D supplementation, respectively [13]. For the purpose of the present study, we excluded non-Caucasian women (n = 28,267), women in the HRT trial (n = 27,347), women who had mastectomy or a history of cancer at recruitment (n = 16,501), and women with incomplete information on the risk factors considered in our models (n = 29,431), resulting in a validation population of 82,319 women.
All women in the EPIC and WHI studies provided written informed consent. In the WHI study, Human Subjects Committee approval at each participating institution was provided. The present study was approved by the Ethical Review Board of the International Agency for Research on Cancer (Lyon, France).

Risk factors and disease outcomes
Among the most frequently included predictors in current BC risk prediction models [5], the following variables were available in EPIC and WHI, and were therefore included in this study: menopausal status, age at menopause, age at menarche, duration of HRT, duration of breastfeeding, full-term pregnancy (FTP), number of FTPs, age at first FTP, body height, body mass index (BMI), interaction between BMI and menopausal status, alcohol intake, and country. Table  5 in Appendix provides the coding of these predictor variables. We retained all the women for analysis and handled the missing values by five-time multiple imputations with chained equations [14]. Three predictor variablesin the Gail model were not included in our models, i.e. family history of BC in first-degree relatives, previous breast biopsy, and history of atypical hyperplasia. In the EPIC study, family history of BC was only available for 49% of women, while information on previous breast biopsy and history of atypical hyperplasia were not collected.
Sensitivity analyses that included effect modification of parity by menopausal status in the EPIC study showed no evidence of statistically significant interactions. Similarly, no effect modifications were observed for HRT by BMI and breastfeeding by parity. These interactions were hence not retained further. Incident BC diagnoses among the EPIC women were ascertained through national cancer registries or a combination of health insurance records, pathology registries, and regular questionnaire surveys. The definition of positive hormone receptor status was standardized using the following criteria: ≥ 10% cells stained, any "plus-system description", ≥ 20 fmol/mg, an Allred score of ≥ 3, immunoreactive score (IRS) ≥ 2, or an H-score ≥ 10. Among the WHI women, centrally trained, locally based physician adjudicators verified BC diagnoses by medical record and pathology report review, and positive hormone receptor status was defined as ≥ 10% cells stained [15].

Absolute risk modeling
Using the EPIC data, we fitted cause-specific piecewise-constant hazards models [16] for ER+ and ERtumors separately (hereinafter referred to as Model ER+ and Model ER-). The cutoffs were placed at 45, 50, 55, 60, 65, 70, and 75 years of age. Whether a risk association is heterogeneous by ER status was examined using the likelihood ratio test [17].
Tumors with unknown ER status, primary cancers at other sites, and deaths from non-cancer causes were modelled as competing events to ER+ tumors and ERtumors. A Gompertz model with age as the time scale was fitted for all these competing events combined. In addition, ER+ and ER-tumors were considered mutually competing.
To evaluate the improvement in risk prediction by modeling the heterogeneous risk associations, an omnibus model was also fitted following the same methodology described above, treating ER+ and ER-tumors as one single disease outcome.

Model validation
First, we validated our ER-specific models internally by fivefold cross-validation [18] and then externally using WHI data. For external validation using the WHI data, we combined the model coefficients derived from the EPIC women and the ER-specific baseline hazards of the WHI women to project 5-year ER-specific absolute risks. We calculated C-statistics to assess discriminatory accuracy and the ratio of expected-to-observed number of tumors occurring in the first 5 years (E/O) to assess overall calibration. In the WHI women, the 5-year absolute risk of developing BC was projected using Model Gail , enabling us to compare the performance of our model with that of Model Gail . We performed decision curve analysis in the WHI women to compare the clinical applicability of Model ER+ and Model Gail for identification of women for chemoprevention.
Let B denote the benefit of receiving chemoprevention for an individual who would develop BC, H the harm of receiving chemoprevention for an individual who would never develop BC, and p i indicates an individual risk. The rationale of decision curve analysis is that positive net benefits is guaranteed at the population level if chemoprevention only covers individuals with risk projections p i above the risk threshold p t , where: p t × B = (1 − p t ) × H [19,20]. Given the fact that quantities of B and H of chemoprevention remain unknown, net benefits are calculated through all the possible risk thresholds between two extremes, i.e. zero and the maximal risk estimate, representing a treat-all strategy and a treat-none strategy, respectively. The clinical applicability of a risk prediction model is indicated by how much the model's net benefit curve is above the treat-all and treat-none strategies, i.e. the area formed by the model's net benefit curve and the two extreme strategies.

Cohort description
Country-specific distributions of the risk factors among the EPIC women are shown in Table 6 in Appendix. Distributions of the same risk factors among the WHI women are shown in Table 7 in Appendix. During an average follow-up period of 14.7 years, 12,067 BC cases (7210 ER+ tumors, 1598 ER-tumors, and 3259 tumors with unknown ER status), 16,929 primary cancers at other sites, and 6548 deaths from non-cancer causes were ascertained among the EPIC women, as reported in Table 1.

The ER-specific absolute risk models
Among the risk factors with identical associations by ER status (Table 2), being postmenopausal compared with premenopausal at recruitment was associated with a reduced tumor risk after controlling for age (hazard ratio (HR) = 0.66, 95% confidence interval (CI) = 0.60 to 0.74). For postmenopausal women, a statistically significant and monotonically increasing tumor risk was observed with older age at menopause compared with reaching menopause before the age of 45 years (p trend < 0.001). No statistically significant association was observed for breastfeeding and breast cancer risk among parous women. Later age at menarche (≥ 15 vs ≤ 11 years of age) was statistically significantly associated with decreased tumor risk (HR = 0.85, 95% CI = 0.79 to 0.92). Duration of HRT was statistically significantly associated with increased breast cancer risk (p trend < 0.001). BMI was associated with breast cancer and exhibited a statistically significant interaction with menopausal status: for postmenopausal women, HRs (95% CIs) for the BMI categories in ascending order were 1.11 (1.04 to 1.18), 1.21 (1.10 to 1.34), and 1.30 (1.11 to 1.53), respectively. For alcohol intake, exceeding one drink per day, compared with nondrinking, was statistically significantly associated with an increased breast cancer risk.
Tests for heterogeneity showed differential risk associations for FTP, number of FTPs, age at first FTP, body height, and country by ER status ( Table 2 and  Table 8 in Appendix). Parity (one single FTP, age at FTP ≤ 20 years) compared with nulliparity was associated with a statistically significant reduction in ER+ tumor risk (HR = 0.81, 95% CI = 0.71 to 0.91). Among parous women, having three or more FTPs was associated with a further risk reduction for ER+ tumors compared with one single FTP (HR = 0.87, 95% CI = 0.80 to 0.95), and delayed age at first FTP was associated with increased ER+ tumor risk (p trend < 0.001). In addition, every 10-cm increment in body height was associated with a 19% increase in ER+ tumor risk (95% CI = 1.15 to 1.24). None of these factors, however, was statistically significantly associated with ER-tumor risk. Table 8 in Appendix shows the coefficients for different countries by ER status. Based on the same heterogeneous risk factor profiles, we also estimated the risk associations using the WHI data (Table 2), which were largely comparable to those from the EPIC study, with the exception of age at menarche, and especially for ER-tumors, FTP, number of FTP, and age at first FTP. Table 3 shows the predictive performance of the ER-specific models (C-statistic and E/O) corrected by the fivefold cross-validation. Model ER+ , Model ER-and the omnibus model shared a C-statistic of 0.68. Elimination of the country effect reduced the C-statistic notably to 0.64 for Model ER+ , 0.59 for Model ER-, and 0.63 for the omnibus model. A minor difference in C-statistic was observed between premenopausal and postmenopausal women. The omnibus model exhibited a higher C-statistic for ER+ than for ER-tumors (0.64 vs 0.59). Model ER+ significantly overestimated the 5-year tumor risk by 10% (E/O = 1.10, 95% CI = 1.05 to 1.14), particularly among premenopausal women (13%). Model ERnon-significantly underestimated the risk (E/O = 0.96, 95% CI = 0.88 to 1.05) overall and by menopausal status.

Model validation
External validation with the WHI data resulted in a C-statistic of 0.59 (95% CI = 0.58 to 0.60) for Model ER+ and 0.53 (95% CI = 0.50 to 0.57) for Model ER- (Table 4). Model Gail yielded an overall C-statistic of 0.57 (95% CI = 0.56 to 0.59) with a markedly lower C-statistic of 0.53 (95% CI = 0.50 to 0.57) for ER-tumors. Regarding calibration, an overestimation was observed for ER+ tumors (E/O = 1.09, 95% CI = 1.03 to 1.14) whereas a statistically non-significant underestimation was   . 1a); for ER-tumor risk, overestimation was observed mainly among low-risk individuals whereas underestimation was observed mainly among high-risk individuals (Fig. 1b). Among WHI women, the overestimation by Model ER+ and the underestimation by Model Gail were largely systematic ( Fig. 1c and e). The statistically non-significant underestimation by Model ER-in the WHI women showed no clear pattern (Fig. 1d). Figure 2 shows the net benefit curves of Model ER+ and Model Gail . The net benefit curves of the two models started to diverge from the treat-all strategies at the risk threshold of 0.55%, which was roughly the minimal risk projected by both models. Model ER+ would yield higher net benefits than both the treat-all strategy and the treat-none strategy (denoted by the x-axis at y = 0) if the risk threshold lay between 0.55% and 2.5%, corresponding to an assumption that the benefit of chemoprevention was 180 to 40 times the    harm. In contrast, Model Gail would yield lower net benefits than the treat-all strategy if a risk threshold below 2% were selected, including 1.67%, the currently adopted risk threshold for chemoprevention in the USA, and would yield negative net benefits if a risk threshold above 4% (i.e. benefit ≈ 25 × harm) were selected. The clinical applicability of Model Gaili , as indicated by the sum of Area A and Area B shown in Fig. 2, was 3.0 × 10 − 6 . The clinical applicability of Model ER+ was 12.7 × 10 − 6 (Area C).

Discussion
The heterogeneous risk associations in our ER-specific risk prediction models are consistent with the established knowledge that FTP, number of FTPs, and delayed childbirth are associated with ER+ tumors but not with ER-tumors [6][7][8]. Our study also confirms a largescale meta-analysis of epidemiological data showing that BC risk increases with prolonged duration of HRT use [21]. Data from the WHI randomized trial showed a statistically significant increase in the incidence and mortality of invasive BC in the estrogen-plus-progestin arm compared with the placebo arm [22,23], whereas estrogen alone decreased BC incidence and mortality among postmenopausal women with prior hysterectomy [24,25]. Stronger positive associations for estrogen plus progestin than for estrogen alone were reported for BC [26,27]. In the present study, we could not separate estrogen alone and estrogen plus progestin due to unknown HRT compounds among former users in EPIC. Among current HRT users at baseline, use of estrogen plus progestin was more common in EPIC than in the WHI cohort (76% vs 44%, respectively). However, similar associations between the duration of lifetime HRT use and BC risk were observed in both the EPIC and the WHI study.
In ER-specific risk models, statistically significant and homogeneous risk associations were fitted for age at menopause and age at menarche, in line with a pooled analysis of previous investigations where nearly identical effects were observed for ER+ tumors and ER-tumors [28]. The present study demonstrated a null association between breastfeeding and BC risk, inconsistent with previous investigations where inverse associations were reported [6,8,29]. We note that most previous studies were case-control studies, which were subject to recall bias. In fact, the inverse association disappeared in some cohort studies [30,31]. In a more recent pooled analysis, breastfeeding was not associated with ER+ and/or PR+ tumors but was inversely associated with ER-/PR-tumors [32].
In a pooled analysis of prospective cohort data, every 10-cm increment in body height was statistically significantly associated with ER+ tumor risk (HR = 1.18) but had null association with ER-tumor risk [33], supporting the way we modeled body height in the present study.
Prediction of ER+ tumor risk might be practically more useful than prediction of overall BC risk [3]. The reason for this is twofold. First, projecting subtype-specific risks allows for accurate estimation of the risk associations of factors that are etiologically heterogeneous and as a result might increase the discriminatory power. Second, since currently used chemoprevention only reduces the risk of ER+ tumors [34], there is a need for a model that can specifically predict the risk of developing ER+ tumors.
The discriminatory accuracy of Model ER+ in internal validation performed no better than most of the current omnibus models using questionnaire-derived data, suggesting limited improvement in discrimination after accounting for etiological heterogeneity. This was not surprising given that ER+ tumors are the dominant subtype and the omnibus model shared nearly equivalent parameters (data not shown) with Model ER+ in the present study. According to the only study so far that has modeled ER-specific risks, the discriminatory power of the ER+/PR+ model was moderately higher than that of the ER-/PR-model (0.64 vs 0.61) [10]. In that study, risk factors with heterogeneous associations included age, menopausal status, BMI, age at first birth, and past use of postmenopausal HRT, and its subtype-specific models were based on a relatively small number of tumors (1281 ER+/PR+ tumors, 417 ER-/PR-tumors). Notably, in that study there was no correction for potential overfitting by either internal or external approaches.
When externally validated in the WHI cohort, Model ER+ exhibited moderate discriminatory accuracy comparable to that of Model Gail . Women in the USA with 5-year BC risk of 1.67% or higher, projected by Model Gail , are considered potentially eligible for chemoprevention [35]. This risk threshold would lead to coverage of 36,265 (44.0%) women in our WHI validation population, of whom 1239 were subsequently diagnosed with ER+ tumors and 194 with ER-tumors. According to Model ER+ , a risk threshold of 1.97% would cover the same number of women with 16 more prospective ER+ tumors and 2 fewer prospective ER-tumors.
The decision curve analysis provided some interesting insight into the clinical applicability of Model ER+ and Model Gail . As indicated by the net benefit curves, Model ER+ would demonstrate no advantage over the treat-all strategy if the benefit-to-harm ratio of chemoprevention were higher than 180, equivalent to any risk threshold below the minimal risk projection (≈ 0.55%), while such a boundary benefit-to-harm ratio was 50 for Model Gail . Interestingly, the treat-all strategy would even outperform Model Gail when the risk threshold was situated at 1.67%. In contrast to Model Gail , Model ER+ had a wider threshold range where higher net benefits could be obtained by a model-based decision-making than by either the treat-all or the treat-none strategy. Considering the unknown benefit and harm associated with chemoprevention, Model ER+ thus has broader applicability than Model Gail , as indicated by the areas formed by the two models' net benefit curves and the two extreme strategies. As shown in Fig. 2, the lowest benefit-to-harm ratio for chemoprevention against BC to produce a positive net benefit is 25, whereas such a benefit-to-harm ratio for chemoprevention against ER+ tumors is 40, suggesting that chemoprevention against ER+ tumors might be 1.6 times (40/25) more efficient than chemoprevention against all types of BC.
Among both the EPIC women and the WHI women, Model ER+ overestimated the 5-year risk by about 10%, possibly due to potential misspecifications of our models, such as imperfect fit of the baseline hazard functions (the baseline hazard estimates are given in Table 9 in the Appendix). More importantly, this overestimation was systematic rather than in an overfitting pattern, i.e. underestimation occurs in low-risk individuals and overestimation occurs in high-risk individuals [36].
We derived ER-specific models from a large prospective cohort and validated them in another large independent cohort for external validation. This is a strong approach to robust parameterization and assessment of model performance. However, some limitations characterize the present study. Our models did not include some established risk factors such as family history of BC (FHBC) and previous breast biopsy, as these variables were not available in the EPIC study. A complete-case analysis of EPIC women with known FHBC (n = 138,257, 49% of the sample) showed positive homogenous associations between FHBC and tumor subtypes (HR ER+ = 1.64, 95% CI = 1.49 to 1.81; HR ER-= 1.50, 95% CI = 1.23 to 1.91; p heterogeneity = 0.57), suggesting that inclusion of this factor would increase the predictive power of the model, though not differentially across the hormonal receptor status of the tumors. Another limitation of our study was the underestimation of baseline hazards due to EPIC tumors with unknown ER status, which accounted for about 25% of BC diagnoses. Under the assumption of ER-status data missing at random, parameter estimates are expected to be unbiased, a necessary requisite to carry out proper external validation, whereas the underestimated baseline hazard would be replaced with the actual baseline hazard function of the test population.

Conclusions
In summary, we found that modeling heterogeneous risk associations of epidemiological factors yields little improvement in BC risk prediction. Nevertheless, compared with the current omnibus models, a model specifically predictive of ER+ tumor risk could be more applicable in risk assessment for chemoprevention.

Appendix
Description of the Gail model The Gail model, also known as the Breast Cancer Risk Assessment Tool, has been adopted to estimate the 5-year absolute risk of developing invasive breast cancer among women aged 35 years or older. Women with a 5-year absolute risk of 1.67% or higher as projected by the Gail model are regarded as eligible for chemoprevention by tamoxifen. The Gail model includes the following predictor variables: age, ethnicity, age at menarche, age at first live birth, number of first-degree relatives with breast cancer, number of previous breast biopsies, and history of atypical hyperplasia. The relative risks of these risk factors were estimated from a case-control study within the Breast Cancer Detection Demonstration Project (BCDDP). The baseline age-specific hazard rates were also calculated from the BCDDP as the observed age-specific hazard rates times 1 minus the population attributable fraction [1]. Five-year breast cancer risk projection in the Women's Health Initiative study using the Gail model has been detailed elsewhere [3].     Values are percentages unless otherwise indicated BMI body mass index, ER estrogen receptor, FTP full-term pregnancy, HRT hormone replacement therapy, WHI Women's Health Initiative  Risk factor profiles for baseline hazard: country = France (only for EPIC), menopausal status = 1 (postmenopausal), age at menopause ≤ 45 years, duration of breastfeeding = 0, age at menarche ≤ 11 years, duration of hormone replacement therapy = 0, full-term pregnancy = 0 (nulliparous), body mass index < 25 kg/m 2 , height_d10 (body height in cm divided by 10) = 16, alcohol intake (drinks/day) = 0