Novel 18-gene signature for predicting relapse in ER-positive, HER2-negative breast cancer

Background: Several prognostic signatures for early oestrogen receptor-positive (ER+) breast cancer have been established with a 10-year follow-up. We tested the hypothesis that signatures optimised for 0–5-year and 5–10-year follow-up separately are more prognostic than a single signature optimised for 10 years.
Methods: Genes previously identified as prognostic or associated with endocrine resistance were tested in publicly available microarray data sets using Cox regression of 747 ER+/HER2− samples from post-menopausal patients treated with 5 years of endocrine therapy. RNA expression of the selected genes was assayed in primary ER+/HER2− tumours from 948 post-menopausal patients treated with 5 years of anastrozole or tamoxifen in the TransATAC cohort. Prognostic signatures for 0–10, 0–5 and 5–10 years were derived using a penalised Cox regression (elastic net). Signature comparison was performed with likelihood ratio statistics. Validation was done by a case-control (POLAR) study in 422 samples derived from a cohort of 1449.
Results: Ninety-three genes were selected by the modelling of microarray data; 63 of these were significantly prognostic in TransATAC, most similarly across each time period. Contrary to our hypothesis, the derived early and late signatures were not significantly more prognostic than the 18-gene 10-year signature. The 18-gene 10-year signature was internally validated in the TransATAC validation set, showing prognostic information similar to that of Oncotype DX Recurrence Score, PAM50 risk of recurrence score, Breast Cancer Index and IHC4 (score based on four IHC markers), as well as in the external POLAR case-control set.
Conclusions: The derived 10-year signature predicts risk of metastasis in patients with ER+/HER2− breast cancer similarly to commercial signatures. The hypothesis that early and late prognostic signatures are significantly more informative than a single signature was rejected.
Electronic supplementary material The online version of this article (10.1186/s13058-018-1040-9) contains supplementary material, which is available to authorized users.


Introduction
Five years of adjuvant endocrine therapy is standard treatment for patients with primary oestrogen receptor positive (ER+) breast cancer, and clearly improves prognosis (1). Multiparameter molecular assays are increasingly used to estimate prognosis and guide treatment decisions of patients with primary ER+ breast cancer. These include the Oncotype DX Recurrence Score (RS) (2), Prosigna PAM50 (3), Breast Cancer Index (BCI) (4), EndoPredict (5) and IHC4 (6). All of them have been evaluated in the TransATAC series of samples that were established from patients with ER+ primary breast cancer randomised to treatment with 5 years' anastrozole or tamoxifen in the ATAC trial (7).
It has become clear that, following surgery, the risk of recurrence in ER+ primary breast cancer is not constant across all breast cancer subtypes, as is evident between intrinsic subtypes. For example, data from 2985 tumours showed that basal subtypes experienced high early recurrence rates that declined from 5 years after surgery, whereas patients categorised as luminal A experienced a lower but relatively constant recurrence rate during the first decade (8).
There are additional molecular differences beyond the subtype of a tumour. In TransATAC we have previously shown that the oestrogen module of RS was prognostic within five years of surgery (during endocrine therapy); however, it became non-informative for recurrences beyond five years, thus weakening the overall prognostic value of RS (9). In the same data set, patients with high ER expression by RT-PCR were twice as likely to have a relapse 5-10 years after surgery as within the first 5 years. Bianchini et al. reported risk stratification by integrating the mitotic kinase score (MKS) and an ER-related score (ERS), both based on genes constituting the proliferation and oestrogen modules of RS. Women with high MKS and ERS tumours were at greater risk of late recurrence (10). More recently, improved risk estimation beyond 5 years by RS was reported when integrated with dichotomised ER expression assessed by RT-PCR (11).
Extending endocrine therapy beyond five years has been shown to reduce the late recurrence rate (12,13); however, those most likely to benefit from such therapy need to be identified. While some of the widely used prognostic assays for ER+ patients have been shown to be prognostic for risk beyond five years (14–17), none of them has been optimised to quantify residual risk after five years free from recurrence, and their ability to predict late relapse varies substantially (18). The different time-dependent performance of multiparameter molecular signatures indicates that molecular features of ER+ breast cancers may be identified to improve prediction of residual risk in order to spare those patients with a significantly low risk of late recurrence from extended endocrine therapy.
We therefore hypothesised that prognostic signatures optimised specifically for the early (0-5 years) and late (5-10 years) follow-up periods respectively would be more prognostic than a single signature optimised for the whole 10-year follow-up period. To test this hypothesis we developed time-dependent prognostic signatures in patient samples from the TransATAC series for early, late and 10-year follow-up periods. The prognostic performance was tested in an independent sample set and against commercial signatures already assessed in TransATAC. Our primary aim was to compare the prognostic value of the newly developed signature(s) added to clinical treatment score (CTS) (6) and that of PAM50 ROR-P added to CTS.

Patient cohorts
An in silico analysis drew on four published breast cancer cohorts (GSE6532, GSE9195, GSE17705, GSE26971) analysed on either Affymetrix HG-U133A (GPL96) or HG-U133 Plus 2.0 (GPL570) microarray platforms. The two platforms shared 22,277 probes to which we restricted our analyses. This cohort had 747 unique patient samples that matched our selection criteria: ER+, HER2-, treated with five years of endocrine therapy, chemotherapy naive, with either distant metastasis-free survival (DMFS) or relapse-free survival (RFS) available with a long follow-up. Details of the inclusion criteria are listed in Supplementary Methods and a full list of samples included in the analysis is shown in Supplementary Table S1.
In the TransATAC cohort, RNA was available from 948 formalin-fixed paraffin-embedded (FFPE) tumours from the ATAC (Arimidex, Tamoxifen, Alone or in Combination) trial, previously extracted by Genomic Health Inc. (GHI) (19). Eligibility required hormone receptor (HR)+, HER2− disease, without chemotherapy treatment and at least 500 ng RNA available. One hundred and eighty-three recurrence events were recorded for this cohort. This study was approved by the South-East London Research Ethics Committee and all patients gave informed consent.
The POLAR (Predictors Of early versus LAte Recurrence in ER+ breast cancer) samples were identified from archives of The Royal Marsden Hospital (RMH), London, United Kingdom and Lund University Hospital (LUH) Biobank, Sweden. Eligibility criteria were patients with ER+, HER2− early breast cancer diagnosed between January 2000 and December 2004, treated with curative intent and with a follow-up data cut-off at May 2014. Patients must have received five years of adjuvant endocrine therapy (unless relapse occurred within this time); (neo)adjuvant chemotherapy was permitted. A case-control design with 422 samples was used; controls were randomly selected according to matching criteria from the remaining cohort of patients who did not relapse during follow-up. The total cohort drawn upon comprised 1449 patients. The four matching criteria used in this study were: 1) age at diagnosis (≤50 or >50 years), 2) NPI category (<3.4; 3.4-5.4, >5.4), 3) type of adjuvant endocrine therapy (tamoxifen only vs. any aromatase inhibitor (AI)) and 4) chemotherapy use (yes or no). Two hundred and forty-seven recurrence events were recorded. The POLAR study was approved by the RMH Research Ethics Committee (CCR: 4122) and the ethics committee of Lund University Hospital (LU 240-01).

Study end points
The primary endpoint was time to any recurrence that was defined as either locoregional (ipsilateral breast, contralateral breast and regional lymph nodes) and/or distant recurrence. Secondary endpoint was time to distant recurrence, which was the time from diagnosis until metastasis from the primary tumour at distant organs, excluding contralateral disease and locoregional and ipsilateral recurrences. Death before recurrence was treated as a censoring event for both endpoints.
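The endpoint definitions above can be illustrated with a minimal sketch. The `Patient` record and its field names are hypothetical and are not taken from the study data; the sketch only encodes the stated rules that any recurrence counts locoregional or distant events, that distant recurrence ignores locoregional events, and that death before recurrence is censoring.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # Times in years from diagnosis; None means the event was not observed.
    # Field names are illustrative assumptions.
    locoregional: Optional[float]
    distant: Optional[float]
    death: Optional[float]
    last_follow_up: float


def time_to_event(event_times, patient):
    """Return (time, event_flag); death before recurrence is a censoring event."""
    observed = [t for t in event_times if t is not None]
    censor = min(t for t in (patient.death, patient.last_follow_up) if t is not None)
    if observed and min(observed) <= censor:
        return min(observed), 1
    return censor, 0


def any_recurrence(patient):
    # Primary endpoint: first locoregional or distant recurrence.
    return time_to_event([patient.locoregional, patient.distant], patient)


def distant_recurrence(patient):
    # Secondary endpoint: locoregional events are not counted as events.
    return time_to_event([patient.distant], patient)
```

For a patient with a locoregional recurrence at 2 years and a distant recurrence at 6 years, the two endpoints therefore yield different event times.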

Analytic procedures
In the in silico data set 454 probes representing 454 genes (Supplementary table 3) were analysed at univariate level; those significant in univariate analyses in a particular setting were entered into multivariate analyses. Additional details are in the Supplementary Methods. For TransATAC, RNA was extracted by GHI for the RS study (19). One-hundred ng RNA was used with the nCounter platform (NanoString Technologies, Seattle, WA) to assay the 93 endogenous and 7 reference genes selected in the process of the in silico analysis in 948 TransATAC samples.
For POLAR, RNA was extracted from three 10-micron unstained sections with more than 40% tumour cellularity using the RNeasy FFPE kit (Qiagen) according to the manufacturer's instructions. RNA was quantified by Nanodrop (Thermo Fisher Scientific). Between 50 and 200 ng RNA was used to profile the expression of 27 endogenous and 5 reference genes with the NanoString nCounter.
NanoString expression data were background corrected by subtracting the mean of the eight negative control probes and normalised with the geometric mean of five reference genes that had a correlation of Pearson's r>0.8 with all endogenous genes. The data set was then log (base 2)-transformed and z-score transformed. The KIF20A gene was detected in <10% of samples in the TransATAC cohort and was removed from the data set. CTS, which carries information on tumour size, nodal status, grade, age and type of endocrine therapy, was calculated as published previously (6).
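The normalisation steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline actually used: the function names, the probe names in the usage note, the floor of 1 applied after background subtraction, the use of division by the reference geometric mean, and the per-gene z-score across samples are all assumptions. The selection of reference genes by correlation (r>0.8) is not shown.

```python
import math
from statistics import mean, stdev


def normalise_sample(counts, negative_probes, reference_genes, floor=1.0):
    """Background-correct, reference-normalise and log2-transform one sample.

    counts: dict mapping probe or gene name -> raw count.
    floor: assumed lower bound after background subtraction so that the
           logarithm is defined (not specified in the text).
    """
    background = mean([counts[p] for p in negative_probes])
    corrected = {
        name: max(value - background, floor)
        for name, value in counts.items()
        if name not in negative_probes
    }
    # Geometric mean of the reference genes, computed on the log scale.
    ref_geomean = math.exp(mean([math.log(corrected[g]) for g in reference_genes]))
    return {
        name: math.log2(value / ref_geomean)
        for name, value in corrected.items()
        if name not in reference_genes
    }


def zscore_by_gene(samples):
    """Z-score each gene across samples (list of dicts with identical keys)."""
    genes = list(samples[0])
    centre = {g: mean([s[g] for s in samples]) for g in genes}
    spread = {g: stdev([s[g] for s in samples]) for g in genes}
    return [{g: (s[g] - centre[g]) / spread[g] for g in genes} for s in samples]
```

As a usage example, a sample with negative controls at 10 counts, reference genes at 110 and 410 counts and an endogenous gene at 210 counts gives a background-corrected reference geometric mean of 200 and a normalised log2 value of approximately 0 for that gene.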
We trained separate early, late and 10-year signatures by performing elastic net analysis in the TransATAC training cohort. Our objective was to test if the early and late signatures had statistically significantly more prognostic power than the 10-year signature. If so, we would test the validity of the early and late signatures in the non-chemotherapy-treated subpopulation of POLAR and also test their performance in the chemotherapy-treated POLAR cohort. If the early and late signatures were not statistically significantly more prognostic than the overall signature, we would test the validity of the overall signature in the chemotherapy-naïve POLAR group and explore its performance in the chemotherapy-treated POLAR group.
Statistical analyses in the in silico cohort were carried out at the ICR using R version 3.0.3. Statistical analyses using the TransATAC cohort were performed at Queen Mary University of London with STATA version 13.1 and R version 3.0.3. Statistical work on POLAR was carried out at the Royal Marsden Hospital using the Statistical Analysis Plan version 2.0 and PRISM 6.0c. The statistical analysis plan for the TransATAC study was approved by the Long-term Anastrozole vs Tamoxifen Treatment Effects (LATTE) committee and for the POLAR study by the RMH Committee for Clinical Research before data analysis took place and is described in the Supplementary Methods. All statistical tests were two-sided.

Candidate gene selection and in silico analysis
In order to derive time-dependent prognostic signatures we shortlisted 585 candidate genes representing proliferation, oestrogen signalling, immune infiltration and immune signalling. These genes were tested for prognostic significance in silico in publicly available gene expression sets from ER+ endocrine-treated breast cancer. A flow-chart illustrating the approach is shown in Figure 1.
Sixty-seven genes of interest that are part of the PAM50, OncotypeDX RS, EndoPredict and BCI profilers were also included. Additional genes likely to be related to benefit from endocrine therapy were identified from 81 patients by reanalysing our previously published neoadjuvant endocrine therapy treated set of samples (20) (https://www.synapse.org/#!Synapse:syn16243). From this dataset we identified 164 candidate genes by examining correlation of individual gene expression from untreated biopsies with change in: 1) Ki67, 2) proliferation-associated gene cluster, 3) oestrogen-associated gene cluster, 4) expression of mGIDE (modified version of the Global Index of Dependence on Estrogen) (21) genes following two weeks of AI treatment. An additional 354 genes were selected based on literature searches. Genes from published gene modules of the proliferation-associated gene cluster, oestrogen-associated gene cluster and inflammatory response signature (20), the tumour invasion/metastasis module (PLAU) (22) and IGG-14 module (immunoglobulin gamma) (23) were also included. The complete list of candidate genes and the reason for their inclusion is detailed in Supplementary Table 1.
A microarray expression data set of 747 samples was compiled from four publicly available breast cancer cohorts to investigate the relationship between genes and outcome (Supplementary Table 2) (5, 24–26). Expression data were available for 454 genes (Supplementary Table 3). We performed univariate Cox proportional hazards regression analyses for early, late and 10-year follow-up periods using relapse-free survival (RFS) and distant metastasis-free survival (DMFS) as endpoints respectively; these identified 212 genes that were significant at p<0.01 in any of the analyses (Supplementary Table 4). Genes significantly prognostic in a particular time period were taken forward for multivariate analyses performed by Cox proportional hazards regression with DMFS and RFS as endpoints respectively in the early, late and 10-year follow-up settings. This resulted in 88 genes being selected in the models (Supplementary Table 5), of which 17 genes were removed due to high correlation of expression with other candidates already selected (Supplementary Table 6). An additional 29 genes were added that included candidates without probes available in the in silico analyses, some recently emerging candidates and also seven reference genes (Supplementary Table 7).

Expression profiling and signature building in TransATAC
Sample availability in TransATAC is shown in Figure 2a. Expression data for the 100 selected genes (including housekeeping genes; Supplementary Table 9) were obtained for 948 patient samples in TransATAC using the NanoString nCounter. We assessed the prognostic value of these molecular variables in TransATAC for early, late and 10-year time periods for RFS. Sixty-three genes were statistically significant in at least one of the time-windows assessed (Supplementary Figure 1, Supplementary Table 7). We found different prognostic properties between early and late periods for 20 genes. Six genes were prognostic early but not in the late period (CD79, IL6ST, LRRC48, MPZL1, PGR, PIGV); 14 genes were not significantly prognostic early but gained prognostic significance in the late setting (ANP32E, ANXA1, CTSL2, EPB41L2, ESR1, FOXA1, ICOS, IL17RB, MMP9, MYCBP2, NR2F1, PDZK1, SLAMF8 and TCF7L2).
The TransATAC cohort was then randomly split into 2/3 (n=634) training and 1/3 (n=314) validation sets while ensuring that the recurrence rate was similar in the two subgroups. Demographics for the training, validation and overall cohorts are presented in Supplementary Table 10. We aimed to select prognostic variables independent of clinico-pathological features that are commonly used for prognosis. To achieve this, on top of the 63 statistically significant genes in univariate analyses, CTS was also entered into multivariate selections for early, late and 10-year time-periods respectively. Elastic Net Penalised Cox regression with leave one out cross-validation was used for feature selection in the TransATAC training set. CTS was selected in all three signatures in addition to 18 genes in the 10-year, 16 genes in the early and 15 genes in the late follow-up analyses. The variables and their coefficients derived from the elastic net models are listed in Table 1. CTS had the highest coefficient in each of the time periods.
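A random split that keeps the recurrence rate similar in both subsets can be sketched as a split stratified by event status. This is a simplified illustration, not the procedure actually used: the function name and seed are assumptions, and the matching on nodal status and tumour size described in the Supplementary Methods is not reproduced, so the resulting set sizes need not equal 634 and 314.

```python
import random


def stratified_split(event_flags, train_fraction=2 / 3, seed=0):
    """Split sample indices into training and validation sets.

    event_flags: list of 0/1 recurrence indicators, one per patient.
    Patients with and without an event are shuffled and split separately,
    so both subsets have a similar recurrence rate.
    """
    rng = random.Random(seed)
    train, valid = [], []
    for flag in (0, 1):
        indices = [i for i, e in enumerate(event_flags) if e == flag]
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        valid.extend(indices[cut:])
    return sorted(train), sorted(valid)
```

Applied to 948 patients with 183 events, the two subsets are disjoint, contain roughly two thirds and one third of the patients, and have recurrence rates that agree to within rounding of the stratum sizes.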
Given that the early and late signatures were not statistically significantly more prognostic than the 10-year signature in the respective periods they were optimised for, we rejected our primary hypothesis that signatures optimised separately for the early and late follow-up periods are more prognostic than a 10-year signature. We nevertheless proceeded to assess the validity of the 18-gene, 10-year signature in an independent cohort and to compare its performance with that of commercial signatures.

Signature test of 10-year validity in POLAR cohort
A matched case-control set of samples was compiled from the Royal Marsden Hospital and Lund University Hospital archives (POLAR) to validate the 10-year signature (Figure 2b, Supplementary Table 11). Our aims were to test the validity of the 10-year signature in an endocrine-therapy only cohort similar to the training set, and also to explore if the prognostic property (if any) extends to a higher risk, chemotherapy-treated population. The latter cohort was of interest in the 5-10 year period because of the potential for its use in selecting patients for extended adjuvant endocrine therapy.
We also assessed whether the 10-year signature added significant prognostic information above CTS alone using likelihood ratio tests (Table 3). In the overall POLAR cohort (n=422), CTS was prognostic across 10 years and in the early follow-up period (CTS 0-10 years period LRχ²=11.23; 0-5 years period LRχ²=22.09) but not in the 5-10 year period. The 10-year signature was prognostic in all three follow-up periods and contributed to CTS with significant prognostic information in the 10-year and early periods (0-10 years period ΔLRχ²: CTS+10-year signature vs. CTS=7.74; 0-5 years period ΔLRχ²: CTS+10-year signature vs. CTS=7.59) but not in the 5-10 year period. Both CTS and the 10-year signature were marginally more informative across the 10 years in the chemotherapy-treated POLAR cohort compared with the endocrine only population despite the latter having more patients and events (patients: n=170 vs. n=252; events: 99 vs. 148). Additionally, the 10-year signature added significantly more prognostic information to CTS in the chemotherapy-treated group (ΔLRχ²: CTS+10-year signature vs. CTS=6.71) when compared to those receiving endocrine therapy only (ΔLRχ²: CTS+10-year signature vs. CTS=2.47).
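The ΔLRχ² values reported above can be converted to approximate p-values with a short calculation. The sketch assumes that adding the signature as a single continuous score to a model already containing CTS corresponds to a likelihood ratio test with 1 degree of freedom; the degrees of freedom are not stated in the text, and the function names are illustrative.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def added_information(lr_combined, lr_base):
    """Return (delta LR chi-square, p-value) for nested models differing by one term.

    Assumes lr_combined >= lr_base, as holds for nested models.
    """
    delta = lr_combined - lr_base
    return delta, chi2_sf_1df(delta)
```

Under this assumption, a ΔLRχ² of 7.74 or 6.71 exceeds the 1% critical value of 6.63, whereas 2.47 falls below the 5% critical value of 3.84, in line with the pattern of significance described for the POLAR subgroups.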
Prognostic properties of the 18 individual genes constituting the 10-year signature were assessed in POLAR and compared to data obtained in TransATAC. In POLAR only 8 out of the 18 genes were significantly prognostic at the univariate level (Supplementary Figure 2) but all genes except tumour necrosis factor alpha (TNF) showed the same prognostic direction both in TransATAC and POLAR.

Comparison of the 10-year signature with CTS, RS, PAM50 ROR, BCI and IHC4 in TransATAC
We have previously published data on the prognostic performance of CTS, RS, PAM50 ROR, BCI and IHC4 in TransATAC (6,16,19,27); data for all scores were available for 271 patients in the validation cohort. We assessed their prognostic information for 10 years after surgery using any recurrence and distant recurrence as endpoints respectively and compared them with the newly developed 10-year signature (Table 4). For both any and distant recurrence the BCI provided the most added information beyond the CTS in this set (any recurrence: CTS: LRχ²=37.4; BCI: ΔLRχ²=9.5; distant recurrence: CTS: LRχ²=46.7; BCI: ΔLRχ²=14.5, respectively). The novel 10-year signature performed similarly to the other three scores in this respect.

Discussion
We developed novel time-specific prognostic signatures for early, late and 10-year follow-up periods for ER+, HER2− patients treated with endocrine therapy alone to allow us to test the hypothesis that sequentially applying early and late signatures could be more prognostic for risk of relapse than a single newly developed 10-year signature. This hypothesis was largely based on our observation that the performance of some components in many of the commercially available signatures varied between these time periods. For example, we found that ESR1 and the oestrogen module overall in the RS were less prognostic in years 5-10 than 0-5 (9). Analogous findings were made by Bianchini et al. (10). Very recently the EBCTCG published clinicopathological and limited immunohistochemical data on over 60,000 women who were treated with 5 years of endocrine therapy (28). While progesterone receptor showed strong prognostic performance in years 0-5, it showed no significant relationship with prognosis thereafter. These data on markers associated with hormone responsiveness support the contention, but by no means prove, that cessation of endocrine treatment at 5 years may lead to increased recurrence risk in more hormonally responsive tumours. We therefore included in our assessment genes that we and others have found to be associated with the antiproliferative response of primary ER+ breast cancer to oestrogen deprivation. Our work involved an in silico discovery set of 747 samples, training and test sets of 634 and 314 TransATAC samples, respectively, and an independent case-control series from 1449 eligible samples. As such this was one of the largest original gene expression analyses undertaken for evaluating prognosis in ER+ breast cancer.
Of the 92 genes selected from in silico data and assessed in univariate analyses in TransATAC, we found 63 to be significantly prognostic (P<0.05) in any of the three time periods, which is considerably more than expected by chance after allowing for multiple testing errors. For most genes the same prognostic pattern was observed for early and late periods; however, we observed some possibly different prognostic properties for 20 genes. Notably, consistent with the above arguments, higher levels of ESR1 and its pioneer factor FOXA1 showed a shift at 5 years to be associated with worse prognosis beyond 5 years, but surprisingly over the 10-year period the two genes were associated with poor prognosis. The complementary role whereby upon stimulus ER binding to chromatin is dependent on the presence of FOXA1 is well established (29). In our dataset FOXA1 and ESR1 correlated highly (Pearson's r: 0.65); the possibility that increased expression of one or both may put patients at increased risk of late relapse merits further investigation, particularly with regard to whether the genes also identify patients who benefit from extended adjuvant therapy.
The optimised time-dependent signatures derived in the TransATAC training set were rather similar to one another in makeup. All genes in the 10-year signature featured in either (or both) of the early and late signatures with their coefficients being in the same direction. The early and late signatures had five and three variables respectively not present in the 10-year signature suggesting that the early and late signatures may not have captured time-specific features or that such time-specific features that exist exert a minor modulatory influence on the overall prognosis over 10 years. It is notable that CTS was consistently the most prognostic variable in the three time-dependent models and that its contribution was similar in both early and late recurrence. This is consistent with the data of the EBCTCG that classical clinico-pathologic features retain their strong prognostic influence beyond 5 years (28).
Given that the 10-year signature captured prognostic features of both early and late events, it is perhaps not surprising that no improvement was seen in the use of early and late signatures compared to the overall 10-year signature, which led to the rejection of our hypothesis. Also, it should be noted that splitting the 0-10 year time period into 0-5 and 5-10 year periods markedly reduces the power to detect prognostic contributions. A contributory factor in the lack of improvement may be the dominance of proliferation-related genes in our and other signatures. As shown in our earlier analysis of the RS, each of the individual proliferation genes and the integrated module are equally prognostic before and after 5 years (9). Notably this is also supported by the observation by the EBCTCG that Ki67 was equally prognostic before and after 5 years in their overview analysis of late recurrence (28).
The 10-year signature was nonetheless validated in the POLAR sample set and provided significant prognostic information in both chemotherapy-naïve and -treated cohorts. Moreover, it added independent prognostic information beyond that of CTS in the POLAR cohort. Comparison of the information provided by each gene showed that eight out of the 18 genes were significantly prognostic at univariate level in POLAR (four genes at P<0.05, two genes at P<0.01 and three genes at P<0.001). TNF showed opposite prognostic direction in training and validation sets, thus weakening the performance of the signature in POLAR. TNF is a versatile pro-inflammatory cytokine that has both pro- and anti-tumour activities, promoting lymphocytic infiltration and activating the nuclear factor-κB, c-Jun N-terminal kinase and mitogen-activated protein kinase pathways, and is capable of inducing apoptosis through TNF receptor 1 and 2 (30). It may be that the inclusion of higher risk, chemotherapy-treated patients in POLAR contributed to the difference in TNF's prognostic pattern; further investigation is needed to explain the relationship of TNF and risk of relapse in these cohorts.
The 10-year signature was compared to established prognostic signatures in the TransATAC validation set. Importantly, the 10-year signature was developed for the endpoint of any recurrence, in contrast to the endpoint of distant recurrence used in the development of RS, PAM50 ROR, BCI and IHC4. In univariate assessments, BCI and the 10-year signature were the most informative for both any and distant recurrence. When added to CTS, all signatures assessed provided a similar amount of information, with CTS+BCI being the most informative for distant recurrence. This new signature did not outperform the established signatures even though it was based on a large and wide-ranging analysis of both established prognostic genes and novel genes with a clear rationale for inclusion. It seems unlikely that a step-change in prognostic performance will be achieved with any further elaboration of gene expression profiles in this context; other approaches that assess response to treatment, integrate mutational profiles or use circulating tumour DNA (ctDNA) are likely to be more fruitful.
The results presented here support the mounting evidence that better risk estimation can be achieved by combining molecular profilers with clinico-pathological factors. CTS was the most prognostic variable in all three time-dependent signatures derived in TransATAC and provided more prognostic information than RS, ROR, BCI and IHC4 respectively. Additionally, all profilers added significant prognostic information to CTS, leading to combined signatures being significantly more informative.
Our study has strengths and limitations. An advantage was that a large discovery cohort of 634 samples was used for signature training. All tumours were ER+, HER2−, from postmenopausal patients who had five years of endocrine therapy without chemotherapy. This was a homogeneous group of breast cancers, which reduced confounding factors such as tumour subtype and differing treatment lengths and types. Data for the clinical prognostic tests were obtained by the same methods as set out by the tests' developers. The same batch of RNA was used for the newly developed signatures presented here and for the clinical prognostic tests used in the comparisons, reducing intra-sample variation. The clinical data were derived from a registration standard trial with comprehensive follow-up over 10 years. Limitations include that CTS, IHC4 and the 10-year signature were derived in TransATAC; therefore their performance in the comparisons was slightly overestimated compared to what we would see in independent cohorts. Also, although this study was relatively large compared to others, the splitting of the data into early and late signatures decreased the statistical power for comparisons within those time periods.
In summary, we found that early and late signatures are unlikely to be more informative for predicting relapse than a single signature optimised for 10 years. Further development of gene expression signatures for prognosis in endocrine-treated ER+ breast cancer patients is unlikely to achieve a substantial improvement in performance.

Acknowledgements
We would like to acknowledge Professor Mårten Fernö and Professor Per Malmström for their work on providing data and samples for the Lund cohort. We thank Genomic Health Inc., NanoString Technologies and BioTheranostics for the data of their respective gene signatures.

Table 1. Variables and corresponding beta-coefficients of the time-dependent 10-year, early and late signatures.
Table 2. HRs (95% CI) per standard deviation, LRχ², p-values and c-indices for CTS, 10-year, early and late signatures in TransATAC validation cohort.
Table 3. LRχ² and p-values for CTS and 10-year signature in three groups of POLAR validation set.

Patient cohorts
In silico cohort: 102 duplicate samples that were originally published in GSE6532 and subsequently reanalysed under GSE17705 were included in our cohort only once. The ER status of samples in the GSE6532, GSE9195 and GSE17705 cohorts was defined in the original publications by immunohistochemistry. For GSE26971, probe intensity of 205225_at (ESR1) was used with the cut-off of 1000 also used in the original publication, resulting in the removal of 7 samples from GSE26971. The HER2 status in all four cohorts was defined by using HER2 probe intensity (216836_s_at) with the cut-off of 6000. Forty-four samples were found to be HER2-positive by this criterion and subsequently removed from the cohort. In the 0-10 year follow-up period 741 samples had 168 DMFS events; the 318 samples with RFS data available had 83 relapses recorded. Nodal status was available for 696 samples.

Analytic procedures
For POLAR, archival formalin-fixed paraffin-embedded (FFPE) tissue blocks from either surgical excision specimens or core biopsies were identified. For patients who had been treated with neoadjuvant therapy, the diagnostic core was used for analysis. Oestrogen receptor (ER), progesterone receptor (PgR) and HER2 status were determined from histopathology reports at diagnosis. For patients in whom HER2 status was unknown at the original diagnosis, HER2 staining was performed initially by immunohistochemistry (IHC) (graded from 0 to 3+) with confirmation of HER2 2+ tumours by in situ hybridization (D-DISH). If tumours were subsequently identified as HER2-positive, they were excluded from the cohort.
Batch effects were corrected using the ComBat empirical Bayes method (sva R package, Surrogate Variable Analysis) (Johnson et al, 2007), directly removing known batch effects. Expression probes that had a <10% intensity of all probes were removed. Seventy-five genes had no corresponding probes in the assay; the remaining 510 genes had 933 associated probes in the assay, of which the probe with the highest variance was selected for each gene. Cox proportional hazards regression was used with both continuous and median-split expression to identify significantly prognostic genes. Hazard ratios and odds ratios were derived from the standard deviation of the Cox-model regression coefficient. Analyses were performed in the early, late and 10-year time periods with distant metastasis-free survival (DMFS) and relapse-free survival (RFS) as endpoints respectively. Statistically significant genes in univariate analyses were entered into multivariate Cox proportional hazards models.
In TransATAC the 92 genes were evaluated in the 948-sample set by continuous univariate Cox proportional hazards regression; genes significant at p<0.05 were taken forward for signature generation. For this, the 948 patients were randomly split into 2/3 training (n=634) and 1/3 validation (n=314) sets. The number of events was split similarly, and nodal status and tumour size were matched between training and validation sets. Genes statistically significant in univariate analyses in either the early, late or 10-year periods and clinical treatment score (CTS) were entered into the multivariate selection process. Elastic net penalised Cox regression was used for feature selection with leave-one-out cross-validation. The minimum partial likelihood deviance was estimated for different alpha values. This was done by varying the lambda tuning parameter, which controls the overall level of shrinkage.
Leave-one-out cross-validation of the partial likelihood deviance was used to estimate the best lambda. The partial likelihood deviance for a given alpha and lambda was obtained by fitting a model to all data except one observation. The deviance difference was then calculated between all data and a calculation not using the left-out observation. This was repeated for all data points. The lambda for each alpha was chosen based on a 'one-standard-error' rule, which selects the model with deviance one standard error away from the minimum. Alpha was set at 0.2 for all three model selections. Composite scores were built using the selected features and the beta-coefficients were determined. Beta-coefficients were normalised by dividing by the gene standard deviation of the training population.
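The rule of keeping, for each gene, the probe with the highest variance can be sketched as below. The function name, the probe identifiers and the use of the sample variance are illustrative assumptions; the text does not specify how variance was computed or how ties were handled.

```python
from statistics import variance


def pick_probe_per_gene(probe_to_gene, expression):
    """Select one probe per gene: the probe with the highest variance.

    probe_to_gene: dict mapping probe ID -> gene symbol.
    expression: dict mapping probe ID -> list of expression values across samples.
    Ties keep the first probe encountered (an arbitrary choice for this sketch).
    """
    best = {}
    for probe, gene in probe_to_gene.items():
        spread = variance(expression[probe])
        if gene not in best or spread > best[gene][1]:
            best[gene] = (probe, spread)
    return {gene: probe for gene, (probe, _) in best.items()}
```

With two hypothetical probes for one gene whose values are `[1, 2, 3]` and `[1, 5, 9]`, the second probe is retained because its variance is larger.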
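For orientation, the roles of alpha and lambda can be written out explicitly. The form below is the parameterisation commonly used for elastic net penalised Cox regression (for example in the glmnet R package); it is assumed here, since the text does not state the exact form of the penalty or the software used for model fitting. Here $\ell(\beta)$ denotes the Cox log partial likelihood, $n$ the number of patients and $p$ the number of candidate variables.

```latex
\hat{\beta} \;=\; \arg\max_{\beta}
\left[ \frac{2}{n}\,\ell(\beta)
\;-\; \lambda \left( \alpha \sum_{j=1}^{p} \lvert \beta_j \rvert
\;+\; \frac{1-\alpha}{2} \sum_{j=1}^{p} \beta_j^{2} \right) \right]
```

In this form $\lambda$ sets the overall amount of shrinkage, while $\alpha$ mixes the lasso ($\alpha = 1$) and ridge ($\alpha = 0$) penalties; a small $\alpha$ weights the penalty towards ridge, which tends to retain groups of correlated genes rather than a single representative.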
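The lambda selection and score construction described above can be illustrated with a minimal sketch. It assumes the common convention for the one-standard-error rule, namely choosing the largest lambda whose cross-validated deviance lies within one standard error of the minimum; the text does not state which side of the minimum was used. Function and variable names are illustrative, and the composite score is assumed to be a weighted sum using beta-coefficients divided by the training-set gene standard deviation.

```python
def one_se_lambda(lambdas, cv_mean, cv_se):
    """Largest lambda with mean CV deviance within one SE of the minimum."""
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    eligible = [lam for lam, dev in zip(lambdas, cv_mean) if dev <= threshold]
    return max(eligible)


def composite_score(expression, betas, gene_sd):
    """Weighted sum of expression values using SD-normalised coefficients.

    expression: dict gene -> expression value for one patient.
    betas: dict gene -> coefficient from the penalised model.
    gene_sd: dict gene -> standard deviation in the training population.
    """
    return sum(betas[g] / gene_sd[g] * expression[g] for g in betas)
```

For example, with candidate lambdas `[0.01, 0.05, 0.1, 0.2]`, mean deviances `[10.2, 10.0, 10.3, 11.0]` and a standard error of 0.4 throughout, the minimum is at 0.05 and the rule selects 0.1, the most heavily penalised model that is still within one standard error.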
In POLAR the primary analyses were performed on all POLAR patients for whom NanoString data were available and passed QC criteria. A control was defined as a patient who did not relapse during follow-up. Controls were randomly selected according to matching criteria from the remaining cohort of patients who did not relapse during follow-up. The four matching criteria used in this study were: (i) age at diagnosis (≤50 years, >50 years), (ii) Nottingham Prognostic Index (NPI) category (<3.4; 3.4-5.4, >5.4), (iii) type of adjuvant endocrine therapy (tamoxifen only, any aromatase inhibitor), (iv) chemotherapy use (yes, no). All parametric unpaired t-tests were performed using PRISM software (Version 6.0c). Conditional logistic regression was performed using STATA to test whether the 10-year signature, CTS and individual genes were associated with risk of recurrence in a non-pairwise fashion. A multivariate conditional logistic regression analysis was performed to see whether CTS and the 10-year signature were independent variables in a forward selection manner with a 5% significance level in a non-pairwise fashion. A log-likelihood test was used to test whether a model containing the 10-year signature and CTS provides a better fit than CTS alone, at a 5% significance level. The two separate hospital cohorts constituting POLAR were also analysed separately; no significant difference in the results was found (data not shown).