Skip to main content

Gene expression modules in primary breast cancers as risk factors for organotropic patterns of first metastatic spread: a case control study



Metastases from primary breast cancers can involve single or multiple organs at metastatic disease diagnosis. Molecular risk factors for particular patterns of metastastic spread in a clinical population are limited.


A case-control design including 1357 primary breast cancers was used to study three distinct clinical patterns of metastasis, which occur within the first six months of metastatic disease: bone and visceral metasynchronous spread, bone-only, and visceral-only metastasis. Whole-genome expression profiles were obtained using whole genome (WG)-DASL assays from formalin-fixed paraffin-embedded (FFPE) samples. A systematic protocol was developed for handling FFPE samples together with stringent data quality controls to identify robust expression profiling data. A panel of published and novel gene sets were tested for association with these specific patterns of metastatic spread and odds ratios (ORs) were calculated.


Metasynchronous metastasis to bone and viscera was found in all intrinsic breast cancer subtypes, while immunohistochemically (IHC)-defined receptor status and specific IntClust subgroups were risk factors for visceral-only or bone-only first metastases. Among gene modules, those related to proliferation increased the risk of metasynchronous metastasis (OR (95% CI) = 2.3 (1.1–4.8)) and visceral-only first metastasis (OR (95% CI) = 2.5 (1.2–5.1)) but not bone-only metastasis (OR (95% CI) = 0.97 (0.56–1.7)). A 21-gene module (BV) was identified in estrogen-receptor-positive breast cancers with metasynchronous metastasis to bone and viscera (area under the curve = 0.77), and its expression increased the risk of bone and visceral metasynchronous spread in this population. BV was further orthogonally validated with NanoString nCounter in primary breast cancers, and was reproducible in their matched lymph nodes metastases and an external cohort.


This case-control study of WG-DASL global expression profiles from FFPE tumour samples, after careful quality control and RNA selection, revealed that gene modules in the primary tumour have differing risks for clinical patterns of metasynchronous first metastases. Moreover, a novel gene module was identified as a putative risk factor for metasynchronous bone and visceral first metastatic spread, with potential implications for disease monitoring and treatment planning.


Development of metastatic breast cancer is a complex multi-step process manifesting with diverse temporal patterns involving single or multiple organs [1]. Metastases to multiple bone or visceral sites may be recorded as synchronous (reported at the same time), metasynchronous (where reported metastases are separated by a short time period, typically months) or asynchronous events with a significant delay between distant recurrences [2,3,4]. The median survival of patients with metastatic breast cancer is 18 to 24 months, although the range in survival spans between a few months and many years and often depends on the pattern or burden of metastatic spread. Most clinicians recommend initial treatment with chemotherapy for rapidly progressive visceral disease or in women with severe symptoms related to metastatic breast cancer [5]. Patients with bone metastases are often treated with osteoclast inhibitors, as these agents have been shown to reduce the risk of skeletal related events such as fractures, the need for surgery or radiation to bone, spinal cord compression, and hypercalcaemia of malignancy. However, patients at risk of metasynchronous metastatic spread to bone and viscera may benefit from an alternative treatment strategy at the time of first metastatic presentation. Clinicians may pursue more aggressive therapy immediately (e.g. chemotherapy instead of endocrine therapy) in patients with bone metastasis who are at high likelihood of imminent visceral metastasis, or similarly add bone-directed therapy in patients with visceral metastasis who are at high likelihood of imminent bone metastasis. Identification of these patients at an early stage after primary diagnosis or during early metastatic disease is not well-established.

Various prognostic factors influence the overall survival of patients with metastatic breast cancer, including hormone receptor status and axillary lymph node status at diagnosis, previous adjuvant chemotherapy, and the number of involved organs [6, 7]. ER-positive disease has a predilection to metastasise to bone, whereas basal-like and claudin-low breast cancers are associated with brain and lung relapses as first site of metastasis and human epidermal growth factor receptor 2 (HER2)-positive tumours have a predilection to cerebellar metastasis [8,9,10,11,12,13,14]. Transcriptional features present in the primary invasive breast carcinoma can be intrinsic to metastatic progression [15] and are currently tested in clinical trials for patient stratification for treatment regimens [16, 17]. IntClust subtypes can stratify patients by disease-specific survival [18], but without attribution to specific metastatic patterns. Recently small-scale studies suggested circulating tumour-derived exosomes to be predictive of metastasis to individual bone or visceral metastatic sites [19]. It remains unclear, however, to what extent the primary tumour at the time of diagnosis confers risk factors for clinical patterns of disease progression, manifesting as diverse temporal patterns of metastatic spread to single or multiple different organ sites.

With the increasing use of small diagnostic biopsy procedures prior to systemic treatment in cancer patients, there are limited opportunities to also collect frozen tissue. The latter has been the prime resource for genomic analysis and subsequent publication of gene signatures for prognostic and predictive use. However, despite the potentially vast resource available within diagnostic formalin-fixed paraffin-embedded (FFPE) archives, they have remained largely untapped for exploratory genome-scale biomarker studies. The quality of RNA from FFPE material has been the key limitation to its subsequent use. Whilst modifications to the extraction techniques continue to make slight improvements to the degraded RNA, there have been greater developments in array-based gene expression profiling assays and emerging technologies for transcriptome sequencing from FFPE samples [20]. The Illumina Whole-Genome DASL (WG-DASL) assay is one such assay [21]. Several technical studies reported that DASL assays can produce reliable expression profiles from FFPE tumour tissue samples, given adequate RNA quality, and design and preprocessing of the resulting data [21,22,23,24,25,26,27,28].

Here, we designed a whole-genome expression profiling study using a case-control design to include primary invasive breast cancers with three clinically observed patterns of metastatic spread: (1) metasynchronous bone and visceral metastases (within 6 months of first metastasis); (2) bone only, with delayed or no visceral metastasis; and (3) viscera only, with delayed or no bone metastasis. Given the comprehensive time and detailed organ-site information in our cohort from which cases and controls were selected, we aimed to identify intrinsic molecular features of primary breast carcinomas associated with the distinct patterns of first metastatic spread observed in the clinic, and in particular those with metasynchronous bone and visceral metastases.


Study population, study design, and patient selection

The study population comprised 5061 patients diagnosed with invasive primary breast cancer without distant metastasis at the time of diagnosis between 1975 and 2005 from Guy’s Hospital, London UK. All patients had given consent for analysis of their tumour tissues. Median follow up was 11 years (time between entry and exit dates in the case-control design).

Three metastatic populations were defined according to the site and specificity of recurrence: (1) first recurrence to bone only, with no other metastatic site within 6 months (“bone-only”), (2) first recurrence to viscera (other organs) only, with no other metastatic site within 6 months (“visceral-only”), and (3) recurrence to bone and viscera within a period of 6 months (“bone and visceral”). Bone metastasis was defined as distant metastasis to bone or bone marrow, spinal cord compression, pathological fracture, or hypercalcaemia. Visceral metastasis included distant metastasis to lung, liver, brain, or ascites. In the study population, 1598/5061 (32%) developed distant metastasis: 413/1598 (26%) to bone as first site (bone-only), 747 (47%) to viscera as first site (visceral-only), and 438 (27%) to bone and viscera within a 6-month period (bone and visceral).

For each of the three metastatic populations (visceral-only, bone and visceral, and bone-only), individually matched controls were randomly sampled using a case-control incidence-based approach (Additional file 1: Supplementary methods). Briefly, for each calendar time, T (e.g. 15 January 1999) that a case is diagnosed, one or more controls are randomly selected from the other members of the cohort who, at time T, are still at risk of developing the outcome (distant metastasis). The controls are therefore matched to the case by time of event. A patient who is a control at one time can later become a case and/or a control again, and each of the control series therefore includes a combination of patients still at risk, which enables efficient estimation of risks in the clinical population (among 1200 case-control sets, 75% of patients selected as controls at any calendar time did not metastasize at all, and 25% went on to have a distant metastatic event). Case-control sets (n = 400) were randomly selected for each of the three metastatic populations, giving a total of 1200 selected case-control sets. Case-control sets were then selected for tissue assessment and RNA extraction from FFPE tissue blocks. Extracted RNA was available for a total of 742 case-control (1:1) pairs: 246 case-control pairs for visceral-only cases, 258 for bone and visceral cases, and 238 for bone-only cases. An overview of patient cases is shown in Fig. 1 and a detailed overview of case-control sampling, random selection, and extracted case-control pairs is shown in Additional file 2: Figure S1A and B. Additional file 3: Table S1 tabulates the number of cases with extracted case:control pairs (1:1), and the number of matched and unmatched cases and controls with available gene expression data.

Fig. 1
figure 1

Overview of cases and gene expression datasets. For each metastatic population (Visceral only, Bone + Visceral, and Bone only) 400 cases were sampled, and three possible controls were matched to each case by calendar time of event. A random sample of case-control sets was taken forward to tissue assessment for RNA extraction (Additional file 1: Supplementary methods). Extracted RNA was available for a total of 742 case-control (1:1) pairs, comprising a total of 1277 individual patients. A detailed overview of patients and samples included in the design and in the gene expression datasets is shown in Additional file 2: Figure S1A, B

RNA extraction and gene expression profiling

FFPE samples of breast carcinomas were micro-dissected following tissue review. A total of 1575 FFPE tissue blocks were assessed (H&E; Additional file 1: Supplementary methods). Primary tumour blocks from 1357 patients were taken forward to micro-dissection and RNA extraction, with a total of 1370 RNA samples (Additional file 2: Figure S1A). In addition, RNA was extracted from 100 matched positive axillary lymph nodes and from FFPE samples of six breast cancer cell lines. RNA extraction was outsourced to Gen-Probe Life Sciences Ltd (Manchester, UK). RNA sample quality, quantity and integrity were assessed before proceeding to Illumina HT-12 v4 BeadChips WG-DASL microarray. A detailed description of tissue selection, micro-dissection, RNA sample selection, hybridisation design and microarray data processing is provided in Additional file 1: Supplementary methods. Two gene expression datasets were produced following rigorous quality assessment: GWDb (containing primary tumour samples from 527 patients) and GWDa (containing primary tumour samples from 124 patients, after removing patients also present in GWDb). Patient characteristics for GWDb are provided by case-control series in Table 2. An overview diagram of each dataset is provided in Additional file 2: Figure S1A. Gene expression microarray data have been deposited to Array Express E-MTAB-4003.

Intrinsic subtype assignments and gene module scores

Prediction analysis of microarray 50 (PAM50) intrinsic subtype was assigned in accordance with Weigelt et al. [29] using median-centred data and matching probes to centroid identifiers by gene symbol. The nearest centroid identified by Spearman correlation was assigned to each sample. IntClust subtypes were assigned using the iC10 package (v1.1.2) [30] for R/Bioconductor. Gene module scores for a panel of previously reported gene modules were estimated using the Denoising Algorithm based on Relevance network Topology (DART) method [31] and further compared with weighted sum (weights (+1, −1) according to the direction of expression in the gene signature) (Additional file 1: Supplementary methods). Previously reported gene expression signatures were mapped to WG-DASL probes using Ensembl Gene ID, Entrez Gene ID or gene symbol, according to their original source (Additional file 3: Table S2). Where multiple microarray probes mapped to a single Entrez Gene ID, the probe with the most variable gene expression across the datasets was used (based on standard deviation in the relevant dataset).

Derivation and expression summary of gene module

To identify a candidate gene module for bone and visceral metastasis (BV gene module), GWDb was reduced to ER-positive case-control pairs, and top-ranked genes were identified using an exploratory differential expression analysis of the bone and visceral and the no metastasis groups, as follows. Top-ranked genes were identified by comparing bone and visceral vs. no metastasis (threshold false discovery rate (FDR)-adjusted p < 0.2; Mann-Whitney U test). This procedure resulted in a list of 21 genes (19 up, 2 down) together with the direction of differential expression between the bone and visceral and the no metastasis groups (up, down) which defines the BV gene module. To inspect the candidate BV module within expression datasets, the expression of the gene module was summarised using a weighted sum with weights (+1, −1) according to the original direction of differential expression in the gene module.

NanoString gene expression analysis

For a subset of 192 samples, expression was validated for 150 selected genes by analysing total RNA (200 ng) with the nCounter platform (NanoString Technologies). Expression data were normalised using the NanoStringNorm package in R [32]. Background correction was performed by subtracting the negative control probes (‘mean.2sd’). Expression values were normalised to the geometric mean of fifteen housekeeping genes. Expression values were log2-transformed and standardised within each sample (geometric mean). An expression score for the BV gene module was calculated among ER-positive samples using a weighted sum (weights (+1, −1) according to the direction in the BV module) of mean-centred, standard deviation-scaled BV genes.

Statistical analysis

For each case-control series, conditional logistic regression models (modelling individually matched pairs) and logistic regression models (unconditional, disregarding the case-control matching) were used to estimate odds ratios (ORs) and 95% confidence intervals (CIs). For intrinsic molecular subtypes, ORs were estimated for each subtype compared with a baseline subtype. Immunohistochemically (IHC)-derived subtypes were compared with ER-positive, HER2-negative tumours [33]. PAM50 subtypes were compared with the luminal A subtype (a good prognosis group [34]). IntClust subtypes were compared with the baseline IntClust3 cluster [30]. Gene module scores were scaled within each case-control series so that 95% of values lay within the range [−1, 1] [35]. FDR/Benjamini-Hochberg correction for multiple testing was applied to p values across the panel of reported gene modules within each test [36]. A quartile analysis, in which cases were binned according to the quartile thresholds of the respective control series and conditional logistic regression models fitted for each quartile compared with the first quartile, showed similar trends in OR to the models that treat gene module scores as continuous variables (not shown). The Wilcoxon test for matched pairs and Mann-Whitney U test (unpaired) was used to test for differences in gene module scores between cases and controls in each series. All statistical analysis was conducted in the R environment (v3.1.2) ( Conditional regression models were fitted using the function clogistic() in the package Epi (v2.0) [37]:

(Case_Control ~ x + strata(

Logistic regression models were fitted using the function glm(family = ‘binomial’) in the base package stats. A Sweave document is provided in Additional file 1: Supplementary methods.


Patient characteristics and sample processing

Clinico-pathological information for extracted RNA samples from 742 cases and their case-matched controls are summarised in Table 1. A rigorous inspection of extracted RNA and WG-DASL data was performed to ensure that expression profiles could be obtained across the span of storage times and inferior-quality data were excluded from further analysis (Additional file 2: Figures S5-7). Primary tumour samples that passed rigorous WG-DASL quality controls were assigned to a discovery set (GWDb, 527 patients) or a smaller independent dataset (GWDa, 124 patients) (Additional file 2: Figure S1A, B). Clinico-pathological information for the three case-control series in dataset GWDb is shown in Table 2. Before proceeding to the analyses of the three case-control series, the clinico-pathological characteristics for each set were inspected. Patient characteristics in GWDb and GWDa retained the originally selected distribution of organ-specific metastatic spread (Additional file 3: Table S1A lists the number of cases for the case:control pairs (1:1), and the number of matched and unmatched cases and controls with available gene expression data). On inspection of GWDb, there were predominantly IHC-defined ER-positive breast cancers (~68%), and 18% IHC-defined HER2-positive and 32% IHC-defined ER-negative breast cancers. Primary carcinomas were predominantly treatment-naïve and invasive ductal carcinoma of no special histological type. Patients with a bone-only pattern of first metastatic spread were more likely to report a visceral metastasis beyond a 6-month period from first metastasis (37%) than the visceral-only group to a later bone metastasis (16%) (Additional file 3: Table S1B). The visceral-only case series had a greater proportion of grade-3 primary tumours and a smaller proportion receiving endocrine therapies than the other two case series in GWDb, while the bone-only case series had the lowest proportion of patients treated with chemotherapy (Table 2). Additional file 2: Figure S2 provides an illustration and descriptive summary of the temporal patterns of the single and multiple sites of metasynchronous metastatic spread present amongst all carcinomas in GWDb.

Table 1 Patient characteristics in the case-control series: patients with extracted RNA sample
Table 2 Patient characteristics in the case-control series: patients in the discovery cohort GWDb

Metastatic spread among breast cancer subtypes

Next we asked to what extent patients with particular molecular subtypes of breast cancer, as currently defined in the research setting, were at risk of metasynchronous bone and visceral, bone-only, or visceral-only patterns of first metastasis observed in the clinic. Molecular subtypes in the GWDb set were defined by the IHC-defined status of ER/HER2 [33], and assigned to the PAM50 [34] and the IntClust subtypes [18, 30]. Initially, PAM50 estimates for each tumour were compared with IHC-defined subtypes and overall good accordance was observed: luminal A samples were 89% IHC-defined ER-positive; 62% of basal-like samples were of the triple-negative phenotype (IHC-defined ER-negative, PgR-negative, and HER2-negative; TNBC); and 73% of HER2-enriched samples were IHC-defined HER2-positive among samples with available IHC status.

Second, the molecular subtypes were tested within each case-control series using conditional logistic regression (Additional file 4: Table 3). IHC-defined ER-positive patients had increased risk of bone-only metastatic spread and decreased risk of visceral-only and bone and visceral metastatic spread. In GWDb, patients with the HER2-enriched PAM50 subtype of breast cancer had increased risk of visceral-only metastatic spread, and patients with the luminal B subtype had increased risk of bone-only metastatic spread, compared with patients with the luminal A subtype as baseline. Patients with tumours classified as IntClust5 had an increased risk of visceral-only spread compared to those with IntClust3 baseline tumours. Unconditional logistic regression models had similar OR point estimates (Additional file 5: Table 4). Patients with the IHC-defined ER-negative/HER2-positive subtype had an increased risk of visceral-only and bone and visceral spread compared to those with ER-positive/HER2-negative carcinomas. Among IntClust classes with IntClust3 as the reference class, patients with bone and visceral spread had similar risks to the visceral-only group with the exception that IntClust2 and IntClust4 showed increased risk for visceral-only and bone-only but not bone and visceral spread. Subtypes found to be risk factors for bone and visceral spread were also risk factors for visceral-only or bone-only events from either the conditional or unconditional logistic regression models, indicating that molecular subtypes did not confer risks specifically for bone and visceral events in this study.

Third, the tumour molecular subtypes of all patients in GWDb were tabulated and compared to the metastatic pattern of every patient irrespective of the case-control design, with the aim of providing a descriptive overview [30] of all primary tumours in GWDb (Fig. 2). As predicted, IHC-defined ER-negative and HER2-positive tumours were enriched for the visceral-only cases, and TNBC was decreased for bone-only events (Fig. 2, IHC). The breakdown of IHC subtypes in the bone and visceral group lies in between the bone-only and visceral-only groups and does not appear to be dominated by the IHC associations that would be expected for one case type or the other (bone only or visceral only). The differences were less clear across the PAM50 subgroups (Fig. 2, PAM50). IntClust5 had the highest prevalence of visceral-only events, whereas IntClust9 followed by IntClust6 had the most bone and visceral events (Fig. 2, IntClust). The patients in the bone-only group had mainly ER-positive breast cancers and were predominantly assigned to IntClust3, 4 and 7 subtypes. In contrast, patients with no reported metastases were enriched for luminal A and IntClust3 subtypes (IntClust in the “no metastasis” group; χ2, p < 1e-5), and bone and visceral events were present in all IntClust subtypes (IntClust in the bone and visceral group; χ2, p = 0.3). In summary, primary breast cancers with bone and visceral metasynchronous metastatic pattern were found across multiple molecular breast cancer subtypes in GWDb, indicating that there was no evidence of increased risk specifically for bone and visceral events among these current breast cancer classifications (Additional file 4: Table 3 and Additional file 5: Table 4).

Fig. 2
figure 2

Distribution of three metastatic groups across breast cancer subtypes within the dataset GWDb. Barplots illustrate the proportion of immunohistochemically defined (IHC), prediction analysis of microarray 50 (PAM50) and IntClust breast cancer subtypes present in each metastatic group (a) and the proportion of each metastatic group assigned to IHC, PAM50 and IntClust subtypes (b). The patient number for each group is shown on the top of each column. ER estrogen receptor, HER2 human epidermal growth factor receptor, TNBC triple-negative breast cancer

Prognostic gene modules are indicative of organ-specific metastatic predilection

Several studies have reported gene expression modules indicative of particular organ-specific metastatic spread (reviewed in [38, 39]). We therefore asked whether some of those modules were also activated across our three metastatic groups and to what extent the activation of individual gene modules in the primary tumour is a risk for the clinically observed patterns of metasynchronous bone and visceral, bone-only, or visceral-only first metastasis. Primary tumour expression modules were selected if they were previously reported to be associated with: (i) features of proliferation, cell motility, presence of stem-cell-like cells and immune/lymphocytic infiltration, and (ii) organ-specific metastasis (Additional file 3: Table S2).

The gene modules were each tested as a risk for each pattern of metastatic spread within GWDb using conditional logistic regression (Fig. 3a) and logistic regression on complete case-control pairs (Fig. 3b). In addition, logistic regression models were fitted using all controls and all cases (to avoid discarding both samples from a pair due to missing data from GWDb), and using ER-stratified data with or without discarding samples from ER-mismatching pairs (Additional file 2: Figure S3). Pairwise correlation of gene modules confirms that proliferation signatures are highly correlated in this dataset and there is also correlation between other modules previously reported to represent metastasis to individual sites and between immune-related signatures (Fig. 3c), consistent with other studies [35, 40]. The expression of genes associated with proliferation has been repeatedly shown to be associated with the prognosis in ER-positive breast cancer [41]. In our study, gene modules related to proliferation increased the risk of metasynchronous bone and visceral metastasis (e.g. in the conditional logistic regression model of PTEN module, OR (95% CI) = 2.3 (1.1–4.8); Additional file 6: Table S3) and visceral-only first metastasis (PTEN, OR (95% CI) = 2.5 (1.2–5.1); Additional file 6: Table S3) but not bone-only metastasis (PTEN, OR (95% CI) = 0.97 (0.56–1.7); Additional file 6: Table S3). Risk associations were observed by logistic regression modelling within ER-positive or ER-negative tumours (Additional file 6: Table S3, Additional file 2: Figure S3A).

Fig. 3
figure 3

Log (OR) estimated from univariate conditional logistic regression and logistic regression models of illustrative gene modules in the GWDb dataset for each metastatic group. a Conditional logistic regression (matched pairs). b Logistic regression using complete case-control pairs. Alternating grey and green bars indicate the broad categorisation of illustrative gene modules (“Proliferation”, “Immune”, etc.) as shown in the module labels on the left of the plot. c Heatmap of pairwise correlation for a panel of gene modules (Pearson correlation, “complete” clustering; irrespective of case type or case-control status). Gene modules are listed in Additional file 3: Table S2. ER, estrogen receptor; V, "visceral-only"; BV, "bone and visceral"; B, "bone-only"; Mets, distant metastasis at any site

In addition, with the alternative aim of providing a descriptive summary of gene module activation among all breast carcinomas present in GWDb, two exploratory analyses were performed irrespective of the case-control design: tumours with each pattern of metastatic spread were compared with all tumours with no metastasis using a logistic regression model (Additional file 2: Figure S3B), and time to metastasis to individual sites (lung, liver, bone, brain) was modelled irrespective of the patterns of first metastatic spread or any other metastases at any time during follow up (Additional file 2: Figure S3C, D). In agreement with other studies, neither the logistic regression with reference to tumours with no metastasis nor the time-to-event analyses can be interpreted in the standard epidemiological sense estimating associations between exposures and outcomes, due to the sample selection methods employed in this study. These models are presented here as exploratory hypothesis-generating tools only with no inference implied for the breast cancer population. Metastasis to any site was associated with proliferation signatures irrespective of ER status (Additional file 2: Figure S3C). A number of gene modules indicated nominal significance but would not pass a multiple testing correction (Additional file 2: Figure S3). In IHC-defined ER-negative breast cancers transforming growth factor (TGF)-β response [42] and hypoxia response gene sets [43] were activated in carcinomas with visceral-only metastases compared with those that did not metastasise (Additional file 2: Figure S3B), while TGF-β response and hypoxia response gene sets were associated with time to lung metastasis (Additional file 2: Figure S3D). A stem cell module, which is a strong indicator of short relapse in TNBC [44], was present in ER-negative breast cancers with visceral-only metastases, and a module related to intermediate tissue burden and progression from stemness/basal-like cells [45] was associated with the visceral-only and bone and visceral groups, while the low tissue-burden/basal-like module (derived from metastatic cells from tissues with low metastatic burden [45]) was under-expressed in the bone-only group compared with cancers with no reported metastases. Taken together, while we were able to recapitulate previously reported associations between gene signatures and metastasis to bone or visceral organs within our study, there were no specific gene module with distinctive risk factors for the group with clinically observed metasynchronous bone and visceral spread.

A prognostic gene module for metasynchronous metastatic spread

As the question remains whether any molecular features could be identified in primary tumours at the time of diagnosis for metasynchronous bone and visceral metastases, we then aimed to extract specific gene expression patterns associated with the bone and visceral metastasis group. In order to control for interactions between ER status and metastatic group, the discovery set (GWDb) was stratified by ER status by removing those case-control pairs with differing IHC-defined ER status (Additional file 1: Supplementary methods). We proceeded with a total of 175 ER-positive breast cancer patients. Exploratory differential expression analysis was performed between the bone and visceral group and tumours that did not metastasise, and the top-ranked genes (FDR-adjusted p < 0.2; see “Methods”) were taken forward for further exploratory analysis (Additional file 3: Table S4). In receiver operating characteristic (ROC) curve analysis the area under the curve (AUC) was 0.77 for a 21-gene set for the bone and visceral metastasis group (“BV module”; Fig. 4a, Additional file 3: Table S4), in comparison to an AUC of 0.66 and 0.56 for visceral-only and bone-only metastatic spread, respectively, while when combining all three series, the overall AUC was 0.83 for any metastatic site (Additional file 2: Figure S4).

Fig. 4
figure 4

BV module for metasynchronous metastatic spread. a Discovery of the BV module. BV module scores, shown as density plots for each metastatic group in the estrogen-receptor-positive case-control paired breast cancer cases from the discovery set (GWDb) together with the corresponding receiver operating characteristic curve for the bone and visceral group. b Estimated BV expression score based on NanoString nCounter quantification compared with those obtained from the WG-DASL platform. c Correlation plot displaying the BV gene set in primary tumours and their matched lymph node metastases. Points are colour-coded according to metastatic group. d BV module scores in lymph node metastases of the GSE46141 data, displayed according to their reported patterns of first metastatic spread

The risk of bone and visceral spread from the BV gene module was next estimated using the bone and visceral case-control series within GWDb and GWDa (Additional file 3: Table S5). A significant risk of bone and visceral spread was observed within GWDb (OR (95% CI) = 6.0 (3.1–12.2). In the independent dataset GWDa, the OR estimates were also positive (OR (95% CI) = 1.9 (0.38–9.7), and there was a shift towards increased BV module scores in the cases compared to the controls (Mann-Whitney U test, p = 0.3), however due to the small sample size it was not significant (Additional file 3: Table S5). Together these results indicated that this BV gene module might confer an increased risk of the bone and visceral pattern of metastatic spread.

To further explore the relevance of our BV gene module, orthogonal validation of the discovery was obtained on three levels: (i) with NanoString nCounter, by testing the expression of these genes in a representative subset of 192 samples, in which BV module scores were highly concordant with WG-DASL values (Fig. 4b; Pearson’s correlation = 0.78, p < 1e-4); (ii) the expression of the BV gene set was reproduced in matched lymph node metastases within our cohort (Fig. 4c); and (iii) we investigated the gene expression of the BV module in an external dataset of lymph node metastases from breast cancers with known metastatic disease [46]. The BV gene module exhibited increased expression in the lymph node metastases of those patients with a bone and visceral pattern of first metastatic spread compared with visceral-only (Fig. 4d; Mann-Whitney U test, p = 0.04) and bone-only groups (Fig. 4d; Mann-Whitney U test, p = 0.1 (not significant)). These results are in line with our exploratory analyses and together are the first demonstration towards developing an intrinsic risk factor in primary breast cancer for metasynchronous bone and visceral first distant recurrence. Inspection of the genes comprising the candidate BV module indicated enrichment for association with condensing chromosomes and the kinetochore (Additional file 2: Figure S8).


Metastasis represents the major cause of death in breast cancer patients. Over the last few years, numerous molecular-based prognostic tests of varying specificity have emerged, indicating that primary breast carcinomas display expression profiles associated with organ-specific dissemination; however, few studies have addressed synchronous and metasynchronous patterns of metastatic spread [47]. Treatment strategies and monitoring of patients could potentially be tailored if prediction of single or multi-organ metastasis could be estimated at an early stage. As a step towards this goal, this study estimated potential risks for particular patterns of metastatic spread associated with intrinsic subtypes and gene modules in the primary tumour. A gene expression module present in primary invasive carcinomas associated with concurrent or short-term delays between the development of bone and visceral metastasis was identified and validated in an independent series of lymph node metastases.

Predilection for metastatic spread in breast cancer has previously been associated with gene modules enriched in the primary tumour. A common feature of these are markers of cell proliferation, such as the GENE70, PTEN, the centrosomal kinase AURKA [48, 49] and multiple processes related to chromosomal instability including CIN70 [50]. In this study, we found that a gene module containing components of the kinetochore (CENPO, SPC25, CASC5, SKA3, CENPE; Gene Ontology (GO) CC term “kinetochore”) was associated with the occurrence of metasynchronous bone and visceral metastases within 6 months of the first metastasis. The regulation of genes encoding kinetochore components has been hypothesised to drive chromosomal instability [51], whereby the upregulation of kinetochore genes may reflect the activation of a cell division programme [52]. We speculate that the association of this gene module with rapid multitropic bone and visceral spread after first metastasis points to a mechanism of chromosomal instability, enabling the development of subclones and selection of metastatic tumour cells for invasion and adaptation at multiple bone and visceral sites. Gene modules related to proliferation or mammary stem cells might be expected to influence the synchronicity of multiple metastases, and were indeed found to be significant risks for multiple bone and visceral first metastases.

Limitations of this study include the imposition of a timeline that defines metasynchronous metastases: for example, in our datasets a change in the definition of metasynchronous from 6 months to 12 months would have led to 10% of bone-only and 5% of visceral-only metastases to have been considered metasynchronous (Additional file 2: Figure S2A), and other definitions of metasynchronicity could be imposed, which may affect the estimated risks for each case type. The AUC for the BV signature was higher for all metastases than for metasynchronous bone and visceral metastases: from the point of view of clinical translation, further work would need to establish whether BV or other putative signatures for patterns of metastatic spread could add value over existing signatures such as Oncotype or Mammaprint [16, 17].

Metastasis is a complicated, multi-step process and our understanding of the multiple factors involved is still partial. In the last decade, genomic profiling has attempted to fill this knowledge gap; however, these studies have primarily used fresh-frozen tissue, had restricted numbers of primary and metastatic cases, and incomplete information on the site and time to development of the metastatic spread. This has limited the utility and clinical applicability of these modules. There is evidence that some tumours have a predilection for colonising specific tissues in clinical populations (e.g. [10]), while animal model and recent next-generation sequencing studies also support a role for subclonal adaptations to the metastatic niche (e.g. [53]). In this study, we focused on the tumour as one part of this complex metastatic cascade, which is close to clinical diagnostic practise and patient management. We hypothesised that intrinsic subtypes and gene modules confer risk of particular patterns of metastatic spread in some tumours. We addressed this question by designing a case-control study for particular patterns of metastatic spread.

Pre-clinical models have contributed to our understanding of metastastic spread but they might not capture many of the processes that are important in a clinical setting - including alterations to the immune system or incorporation of specific latency periods to study multi-organ metastatic spread in parallel - and many gene signatures originate from ER-negative cell line and patient-derived xenograft (PDX) models. Multiple lines of evidence indicate that intrinsic subtypes and gene modules have different metastatic potential within clinical populations [18, 54, 55]. We sought to address whether gene modules could confer risk for specific temporal patterns of metastatic spread using a large tumour archive with detailed clinical follow up. An efficient case-control design is required given that multiple breast cancer subtypes and metastatic patterns are present in any clinical population. This study therefore focused on three specific patterns of metastatic spread based on epidemiological observations from the same clinical population [56].

As recently shown by Iddawela et al. [28] and others, the WG-DASL platform together with stringent quality control and data processing steps can produce reliable results from FFPE breast tumour tissue. Here, by starting with a well-characterised cohort of 5061 patients with long-term follow up of whom 1598 developed metastasis, we have created a well-annotated and sufficiently large cohort to investigate molecular risk factors for single and multiple organotropic metasynchronous metastatic disease. While many samples were excluded during the quality control steps, the processed gene expression datasets enabled an investigation of gene expression modules as risk factors for specific patterns of first metastatic spread.

The use of an incidence-based case-control design ensured that non-metastasising primary tumour samples were also included from the same range of calendar dates of diagnosis, and enabled efficient estimation of effects in a standard clinical population. Cunha et al. [57] recently reported using a case-control design to estimate the effect of ALK1 expression in frozen breast tumours as a molecular risk factor for metastatic spread. Here, our use of an incidence-based case-control design enabled the estimation of genome-wide molecular risk factors for three clinically observed patterns of first metastatic spread (bone-only, visceral-only, or bone and visceral within a 6-month time period). Further work on patterns of long-term disease progression may focus on a more defined population, such as ER-positive/HER2-negative or high-grade tumours, and make use of appropriate platforms for assaying fewer FFPE tumour samples from a stratified clinical population.

Patients at risk of metasynchronous bony and visceral metastases could potentially profit from closer disease monitoring and may benefit from a more radical bisphosphonate and chemotherapeutic combination treatment strategy up front in the metastatic setting. In ER-positive disease the benefits of hormonal therapy in managing visceral metastases from breast cancer are much lower than those offered by chemotherapy. Conversely, Perez et al. [58] suggest that, in patients with bone metastases, efforts should be made to select the least aggressive therapy to avoid excessive toxicity. Progress towards the optimisation of disease monitoring and treatment strategies fundamentally requires a better understanding of risks that could be estimated at primary diagnosis, together with an improved understanding of additional emerging risks as breast cancers progress to a metastatic setting (for example, establishment of a metastatic niche). Initiatives such as AURORA, a large multinational, collaborative metastatic breast cancer molecular screening programme [59] will further shed light on our knowledge of whether the gene expression patterns found in primary breast cancer is similar to those in metastatic material. This would add to our understanding of the metastatic process and guide treatment regimens for metastatic breast cancer. A liver-selective gene module was among the set of gene modules, where these genes were under-expressed in primary tumours from patients who subsequently developed liver metastases, but displayed high expression in the liver metastases [46]. We observed the low-level expression of these 17 genes in ER-positive cancers, which metastasised to the liver. Of note, this inverse correlation in the direction of transcript levels of some genes between primary tumours and metastases was not unique to this gene module, but was also seen in another gene module associated with infiltration of the blood-brain barrier [60] and has also been recently reported in ovarian cancers [61]. Due to the scarcity of available samples, a clear biological conclusion cannot be drawn about such inverse correlation and we hypothesise that the process of adaption to the new microenvironment or development of the pre-metastatic niche [62, 63] favours those primary tumour traits. Ultimately, larger cohorts with matched primary and metastatic lesions are required to elucidate the clinical relevance of these transcriptional changes after primary diagnosis.


In conclusion, by analysing gene expression profiles in a large cohort of well-characterised primary breast carcinomas and lymph node metastases in patients with long-term follow up and detailed information on metastatic spread, we were able to identify patterns informative of multiple organotropic metasynchronous metastatic spread. Further investigations are necessary in order to tease out the contributing components, which could be relevant for tailoring systemic therapeutic regimens, monitoring response or resistance to therapy, and warranting close imaging/biomarker follow up with the institution of early intervention strategies as required. These genome-wide expression data across extensive case-control series will provide a useful resource facilitating further studies of the biology and clinical relevance of single and metasynchronous metastatic spread, and might enable rational personalised treatment strategies to be developed for patients at risk of bone or visceral metastasis with subsequent metasynchronous metastasis.



Area under the curve


Benjamini-Hochberg method for multiple testing correction


Estrogen receptor


False discovery rate


Formalin-fixed, paraffin-embedded


Human epidermal growth factor receptor 2




Prediction analysis of microarray 50


Progesterone receptor


Triple-negative breast cancer


  1. Kimbung S, Loman N, Hedenfalk I. Clinical and molecular complexity of breast cancer metastases. Semin Cancer Biol. 2015;35:85–95.

    Article  CAS  PubMed  Google Scholar 

  2. Plunkett TA, Smith P, Rubens RD. Risk of complications from bone metastases in breast cancer: implications for management. Eur J Cancer. 2000;36(4):476–82.

    Article  CAS  PubMed  Google Scholar 

  3. Coleman RE. Clinical features of metastatic bone disease and risk of skeletal morbidity. Clin Cancer Res. 2006;12(20 Pt 2):6243s–9s.

  4. Marlow R, Honeth G, Lombardi S, Cariati M, Hessey S, Pipili A, Mariotti V, Buchupalli B, Foster K, Bonnet D, et al. A novel model of dormancy for bone metastatic breast cancer cells. Cancer Res. 2013;73(23):6886–99.

    Article  CAS  PubMed  Google Scholar 

  5. Wilcken N, Hornbuckle J, Ghersi D. Chemotherapy alone versus endocrine therapy alone for metastatic breast cancer. Cochrane Database Syst Rev. 2003;(2):CD002747.

  6. Largillier R, Ferrero JM, Doyen J, Barriere J, Namer M, Mari V, Courdi A, Hannoun-Levi JM, Ettore F, Birtwisle-Peyrottes I, et al. Prognostic factors in 1,038 women with metastatic breast cancer. Ann Oncol. 2008;19(12):2012–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  7. Alanko A, Heinonen E, Scheinin T, Tolppanen EM, Vihko R. Significance of estrogen and progesterone receptors, disease-free interval, and site of first metastasis on survival of breast cancer patients. Cancer. 1985;56(7):1696–700.

    Article  CAS  PubMed  Google Scholar 

  8. Lai R, Dang CT, Malkin MG, Abrey LE. The risk of central nervous system metastases after trastuzumab therapy in patients with breast carcinoma. Cancer. 2004;101(4):810–6.

    Article  CAS  PubMed  Google Scholar 

  9. Harrell JC, Prat A, Parker JS, Fan C, He X, Carey L, Anders C, Ewend M, Perou CM. Genomic analysis identifies unique signatures predictive of brain, lung, and liver relapse. Breast Cancer Res Treat. 2012;132:523–35.

    Article  CAS  PubMed  Google Scholar 

  10. Kennecke H, Yerushalmi R, Woods R, Cheang MC, Voduc D, Speers CH, Nielsen TO, Gelmon K. Metastatic behavior of breast cancer subtypes. J Clin Oncol. 2010;28(20):3271–7.

    Article  PubMed  Google Scholar 

  11. Metzger-Filho O, Sun Z, Viale G, Price KN, Crivellari D, Snyder RD, Gelber RD, Castiglione-Gertsch M, Coates AS, Goldhirsch A, et al. Patterns of recurrence and outcome according to breast cancer subtypes in lymph node-negative disease: results from international breast cancer study group trials VIII and IX. J Clin Oncol. 2013;31(25):3083–90.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  12. Sihto H, Lundin J, Lundin M, Lehtimaki T, Ristimaki A, Holli K, Sailas L, Kataja V, Turpeenniemi-Hujanen T, Isola J, et al. Breast cancer biological subtypes and protein expression predict for the preferential distant metastasis sites: a nationwide cohort study. Breast Cancer Res. 2011;13(5):R87.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  13. Smid M, Wang Y, Zhang Y, Sieuwerts AM, Yu J, Klijn JGM, Foekens JA, Martens JWM. Subtypes of breast cancer show preferential site of relapse. Cancer Res. 2008;68:3108–14.

    Article  CAS  PubMed  Google Scholar 

  14. Vaz-Luis I, Ottesen RA, Hughes ME, Marcom PK, Moy B, Rugo HS, Theriault RL, Wilson J, Niland JC, Weeks JC, et al. Impact of hormone receptor status on patterns of recurrence and clinical outcomes among patients with human epidermal growth factor-2-positive breast cancer in the National Comprehensive Cancer Network: a prospective cohort study. Breast Cancer Res. 2012;14(5):R129.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  15. Mittempergher L, Saghatchian M, Wolf DM, Michiels S, Canisius S, Dessen P, Delaloge S, Lazar V, Benz SC, Tursz T, et al. A gene signature for late distant metastasis in breast cancer identifies a potential mechanism of late recurrences. Mol Oncol. 2013;7:987–99.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  16. Sparano JA, Gray RJ, Makower DF, Pritchard KI, Albain KS, Hayes DF, Geyer Jr CE, Dees EC, Perez EA, Olson Jr JA, et al. Prospective validation of a 21-gene expression assay in breast cancer. N Engl J Med. 2015;373(21):2005–14.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  17. Cardoso F, van't Veer LJ, Bogaerts J, Slaets L, Viale G, Delaloge S, Pierga JY, Brain E, Causeret S, DeLorenzi M, et al. 70-Gene signature as an aid to treatment decisions in early-stage breast cancer. N Engl J Med. 2016;375(8):717–29.

    Article  CAS  PubMed  Google Scholar 

  18. Curtis C, Shah SP, Chin S-F, Turashvili G, Rueda OM, Dunning MJ, Speed D, Lynch AG, Samarajiwa S, Yuan Y et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature. 2012;486:346–52.

  19. Hoshino A, Costa-Silva B, Shen T-L, Rodrigues G, Hashimoto A, Tesic Mark M, Molina H, Kohsaka S, Di Giannatale A, Ceder S, et al. Tumour exosome integrins determine organotropic metastasis. Nature. 2015;527:329–35.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  20. Sinicropi D, Qu K, Collin F, Crager M, Liu ML, Pelham RJ, Pho M, Dei Rossi A, Jeong J, Scott A, et al. Whole transcriptome RNA-Seq analysis of breast cancer recurrence risk using formalin-fixed paraffin-embedded tumor tissue. PLoS One. 2012;7(7):e40092.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  21. April C, Klotzle B, Royce T, Wickham-Garcia E, Boyaniwsky T, Izzo J, Cox D, Jones W, Rubio R, Holton K, et al. Whole-genome gene expression profiling of formalin-fixed, paraffin-embedded tissue samples. PLoS One. 2009;4(12):e8162.

    Article  PubMed  PubMed Central  Google Scholar 

  22. Abramovitz M, Ordanic-Kodani M, Wang Y, Li Z, Catzavelos C, Bouzyk M, Sledge Jr GW, Moreno CS, Leyland-Jones B. Optimization of RNA extraction from FFPE tissues for expression profiling in the DASL assay. Biotechniques. 2008;44(3):417–23.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  23. Ravo M, Mutarelli M, Ferraro L, Grober OM, Paris O, Tarallo R, Vigilante A, Cimino D, De Bortoli M, Nola E, et al. Quantitative expression profiling of highly degraded RNA from formalin-fixed, paraffin-embedded breast tumor biopsies by oligonucleotide microarrays. Lab Invest. 2008;88(4):430–40.

    Article  CAS  PubMed  Google Scholar 

  24. Reinholz MM, Eckel-Passow JE, Anderson SK, Asmann YW, Zschunke MA, Oberg AL, McCullough AE, Dueck AC, Chen B, April CS, et al. Expression profiling of formalin-fixed paraffin-embedded primary breast tumors using cancer-specific and whole genome gene panels on the DASL(R) platform. BMC Med Genomics. 2010;3:60.

    Article  PubMed  PubMed Central  Google Scholar 

  25. Saleh A, Zain RB, Hussaini H, Ng F, Tanavde V, Hamid S, Chow AT, Lim GS, Abraham MT, Teo SH, et al. Transcriptional profiling of oral squamous cell carcinoma using formalin-fixed paraffin-embedded samples. Oral Oncol. 2010;46(5):379–86.

    Article  CAS  PubMed  Google Scholar 

  26. Waddell N, Cocciardi S, Johnson J, Healey S, Marsh A, Riley J, da Silva L, Vargas AC, Reid L. kConFab, et al. Gene expression profiling of formalin-fixed, paraffin-embedded familial breast tumours using the whole genome-DASL assay. J Pathol. 2010;221(4):452–61.

    CAS  PubMed  Google Scholar 

  27. Mittempergher L, de Ronde JJ, Nieuwland M, Kerkhoven RM, Simon I, Rutgers EJT, Wessels LFA, Van't Veer LJ. Gene expression profiles from formalin fixed paraffin embedded breast cancer tissue are largely comparable to fresh frozen matched tissue. PLoS One. 2011;6:e17163.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  28. Iddawela M, Rueda OM, Klarqvist M, Graf S, Earl HM, Caldas C. Reliable gene expression profiling of formalin-fixed paraffin-embedded breast cancer tissue (FFPE) using cDNA-mediated annealing, extension, selection, and ligation whole-genome (DASL WG) assay. BMC Med Genomics. 2016;9(1):54.

    Article  PubMed  PubMed Central  Google Scholar 

  29. Weigelt B, Mackay A, A'Hern R, Natrajan R, Tan DS, Dowsett M, Ashworth A, Reis-Filho JS. Breast cancer molecular profiling with single sample predictors: a retrospective analysis. Lancet Oncol. 2010;11(4):339–49.

    Article  CAS  PubMed  Google Scholar 

  30. Ali HR, Rueda OM, Chin SF, Curtis C, Dunning MJ, Aparicio SA, Caldas C. Genome-driven integrated classification of breast cancer validated in over 7,500 samples. Genome Biol. 2014;15(8):431.

    Article  PubMed  PubMed Central  Google Scholar 

  31. Jiao Y, Lawler K, Patel GS, Purushotham A, Jones AF, Grigoriadis A, Tutt A, Ng T, Teschendorff AE. DART: denoising algorithm based on relevance network topology improves molecular pathway activity inference. BMC Bioinformatics. 2011;12:403.

    Article  PubMed  PubMed Central  Google Scholar 

  32. Waggott D, Chu K, Yin S, Wouters BG, Liu F, Boutros PC. NanoStringNorm: an extensible R package for the pre-processing of NanoString mRNA and miRNA data. Bioinformatics. 2012;28(11):1546–48.

  33. Lips EH, Mulder L, De Ronde JJ, Mandjes IAM, Koolen BB, Wessels LF, Rodenhuis S, Wesseling J. Breast cancer subtyping by immunohistochemistry and histological grade outperforms breast cancer intrinsic subtypes in predicting neoadjuvant chemotherapy response. Breast Cancer Res Treat. 2013;140:63–71.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  34. Parker JS, Mullins M, Cheang MC, Leung S, Voduc D, Vickery T, Davies S, Fauron C, He X, Hu Z, et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol. 2009;27(8):1160–7.

    Article  PubMed  PubMed Central  Google Scholar 

  35. Ignatiadis M, Singhal SK, Desmedt C, Haibe-Kains B, Criscitiello C, Andre F, Loi S, Piccart M, Michiels S, Sotiriou C. Gene modules and response to neoadjuvant chemotherapy in breast cancer subtypes: a pooled analysis. J Clin Oncol. 2012;30(16):1996–2004.

    Article  CAS  PubMed  Google Scholar 

  36. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B (Stat Methodol). 1995;57:289–300.

    Google Scholar 

  37. Carstensen B, Plummer M, Laara E, Hills M. Epi: A package for statistical analysis in epidemiology. R package version 2.0. 2016.

  38. Nguyen DX, Bos PD, Massague J. Metastasis: from dissemination to organ-specific colonization. Nat Rev Cancer. 2009;9(4):274–84.

    Article  CAS  PubMed  Google Scholar 

  39. Vanharanta S, Massague J. Origins of metastatic traits. Cancer Cell. 2013;24(4):410–21.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  40. Stover DG, Coloff JL, Barry WT, Brugge JS, Winer EP, Selfors LM. The role of proliferation in determining response to neoadjuvant chemotherapy in breast cancer: a gene expression-based meta-analysis. Clin Cancer Res. 2016;22(24):6039–50.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  41. Weigelt B, Pusztai L, Ashworth A, Reis-Filho JS. Challenges translating breast cancer gene signatures into the clinic. Nat Rev Clin Oncol. 2012;9(1):58–64.

    Article  CAS  Google Scholar 

  42. Shipitsin M, Campbell LL, Argani P, Weremowicz S, Bloushtain-Qimron N, Yao J, Nikolskaya T, Serebryiskaya T, Beroukhim R, Hu M, et al. Molecular definition of breast tumor heterogeneity. Cancer Cell. 2007;11:259–73.

    Article  CAS  PubMed  Google Scholar 

  43. Lu X, Yan CH, Yuan M, Wei Y, Hu G, Kang Y. In vivo dynamics and distinct functions of hypoxia in primary tumor growth and organotropic metastasis of breast cancer. Cancer Res. 2010;70(10):3905–14.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  44. Soady KJ, Kendrick H, Gao Q, Tutt A, Zvelebil M, Ordonez LD, Quist J, Tan DW, Isacke CM, Grigoriadis A, et al. Mouse mammary stem cells express prognostic markers for triple-negative breast cancer. Breast Cancer Res. 2015;17:31.

    Article  PubMed  PubMed Central  Google Scholar 

  45. Lawson DA, Bhakta NR, Kessenbrock K, Prummel KD, Yu Y, Takai K, Zhou A, Eyob H, Balakrishnan S, Wang CY, et al. Single-cell analysis reveals a stem-cell program in human metastatic breast cancer cells. Nature. 2015;526(7571):131–5.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  46. Kimbung S, Johansson I, Danielsson A, Veerla S, Egyhazi Brage S, Frostvik Stolt M, Skoog L, Carlsson L, Einbeigi Z, Lidbrink E, et al. Transcriptional profiling of breast cancer metastases identifies liver metastasis-selective genes associated with adverse outcome in luminal A primary breast cancer. Clin Cancer Res. 2016;22(1):146–57.

    Article  CAS  PubMed  Google Scholar 

  47. Ho VK, Gijtenbeek JM, Brandsma D, Beerepoot LV, Sonke GS, van der Heiden-van der Loo M. Survival of breast cancer patients with synchronous or metachronous central nervous system metastases. Eur J Cancer. 2015;51(17):2508–16.

  48. Desmedt C, Haibe-Kains B, Wirapati P, Buyse M, Larsimont D, Bontempi G, Delorenzi M, Piccart M, Sotiriou C. Biological processes associated with breast cancer clinical outcome depend on the molecular subtypes. Clin Cancer Res. 2008;14(16):5158–65.

    Article  CAS  PubMed  Google Scholar 

  49. Haibe-Kains B, Desmedt C, Loi S, Culhane AC, Bontempi G, Quackenbush J, Sotiriou C. A three-gene model to robustly identify breast cancer molecular subtypes. J Natl Cancer Inst. 2012;104(4):311–25.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  50. Carter SL, Eklund AC, Kohane IS, Harris LN, Szallasi Z. A signature of chromosomal instability inferred from gene expression profiles predicts clinical outcome in multiple human cancers. Nat Genet. 2006;38:1043–8.

    Article  CAS  PubMed  Google Scholar 

  51. Yuen KW, Montpetit B, Hieter P. The kinetochore and cancer: what's the connection? Curr Opin Cell Biol. 2005;17(6):576–82.

    Article  CAS  PubMed  Google Scholar 

  52. Thiru P, Kern DM, McKinley KL, Monda JK, Rago F, Su KC, Tsinman T, Yarar D, Bell GW, Cheeseman IM. Kinetochore genes are coordinately up-regulated in human tumors as part of a FoxM1-related cell division program. Mol Biol Cell. 2014;25(13):1983–94.

    Article  PubMed  PubMed Central  Google Scholar 

  53. Yates LR, Gerstung M, Knappskog S, Desmedt C, Gundem G, Van Loo P, Aas T, Alexandrov LB, Larsimont D, Davies H, et al. Subclonal diversification of primary breast cancer revealed by multiregion sequencing. Nat Med. 2015;21(7):751–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  54. Sørlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A. 2001;98:10869–74.

    Article  PubMed  PubMed Central  Google Scholar 

  55. Sotiriou C, Wirapati P, Loi S, Harris A, Fox S, Smeds J, Nordgren H, Farmer P, Praz V, Haibe-Kains B, et al. Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. J Natl Cancer Inst. 2006;98(4):262–72.

    Article  CAS  PubMed  Google Scholar 

  56. Purushotham A, Shamil E, Cariati M, Agbaje O, Muhidin A, Gillett C, Mera A, Sivanadiyan K, Harries M, Sullivan R, et al. Age at diagnosis and distant metastasis in breast cancer–a surprising inverse relationship. Eur J Cancer. 2014;50(10):1697–705.

    Article  CAS  PubMed  Google Scholar 

  57. Cunha SI, Bocci M, Lovrot J, Eleftheriou N, Roswall P, Cordero E, Lindstrom L, Bartoschek M, Haller BK, Pearsall RS, et al. Endothelial ALK1 is a therapeutic target to block metastatic dissemination of breast cancer. Cancer Res. 2015;75(12):2445–56.

    Article  CAS  PubMed  Google Scholar 

  58. Perez JE, Machiavelli M, Leone BA, Romero A, Rabinovich MG, Vallejo CT, Bianco A, Rodriguez R, Cuevas MA, Alvarez LA. Bone-only versus visceral-only metastatic pattern in breast cancer: analysis of 150 patients. A GOCS study. Grupo Oncologico Cooperativo del Sur. Am J Clin Oncol. 1990;13(4):294–8.

    Article  CAS  PubMed  Google Scholar 

  59. Zardavas D, Maetens M, Irrthum A, Goulioti T, Engelen K, Fumagalli D, Salgado R, Aftimos P, Saini KS, Sotiriou C, et al. The AURORA initiative for metastatic breast cancer. Br J Cancer. 2014;111(10):1881–7.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  60. Bos PD, Zhang XH, Nadal C, Shu W, Gomis RR, Nguyen DX, Minn AJ, van de Vijver MJ, Gerald WL, Foekens JA, et al. Genes that mediate breast cancer metastasis to the brain. Nature. 2009;459(7249):1005–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  61. Brodsky AS, Fischer A, Miller DH, Vang S, MacLaughlan S, Wu HT, Yu J, Steinhoff M, Collins C, Smith PJ, et al. Expression profiling of primary and metastatic ovarian tumors reveals differences indicative of aggressive disease. PLoS One. 2014;9(4):e94476.

    Article  PubMed  PubMed Central  Google Scholar 

  62. Lu X, Kang Y. Organotropism of breast cancer metastasis. J Mammary Gland Biol Neoplasia. 2007;12:153–62.

    Article  PubMed  Google Scholar 

  63. Zhang L, Zhang S, Yao J, Lowery FJ, Zhang Q, Huang W-C, Li P, Li M, Wang X, Zhang C et al. Microenvironment-induced PTEN loss by exosomal microRNA primes brain metastasis outgrowth. Nature. 2015;527:100–4.

Download references


The authors thank Dr. O. Agbaje for assistance with case-control selection and description of the protocol, Dr. S. Irshad for helpful comments on clinical perspectives, and anonymous reviewers for their constructive and insightful comments.


Patient samples and data were provided by King’s Health Partners Cancer Biobank, which is supported by the Department of Health via the National Institute for Health Research (NIHR) comprehensive Biomedical Research Centre award and the Experimental Cancer Centre at King’s College London. KL was supported by the CRUK and EPSRC Comprehensive Cancer Imaging Centre at KCL and UCL (C1519/A10331 and C1519/A16463) and European Union Framework Programme 7 HEALTH-2010 grant entitled ‘Imagint’ (grant number 259881). AG and AT were supported by the Breast Cancer Now Research Unit (former Breakthrough Breast Cancer Research) funding at King's College London and by the National Institute for Health Research Biomedical Research Centre based at Guy's and St Thomas' NHS Foundation Trust and King's College London.

Availability of data and materials

Gene expression microarray data have been deposited to Array Express E-MTAB-4003.

Author information

Authors and Affiliations



KL participated in data analysis and interpretation and helped to draft the manuscript. EP performed the WG-DASL assays and participated in data analysis. CN participated in acquisition of RNA samples. AM participated in acquisition of follow-up data. KO participated in data analysis and archiving. AT participated in study concept and design and data interpretation. SK participated in design and data acquisition for the independent cohort. IH participated in design and data acquisition for the independent cohort. JZ participated in interpretation of the data. HZ participated in interpretation of the data. RB participated in NanoString data acquisition. MD participated in NanoString data acquisition and critical reading of the manuscript. TN participated in study concept and design and data interpretation. SEP participated in study concept and design, and acquisition of RNA samples. PP participated in study concept and design, and data interpretation. LH designed the case-control series, and participated in the design of data analyses and interpretation. CEG participated in study concept and design and acquisition of RNA samples and helped to draft the manuscript. AG participated in study concept and design and design of data analyses and wrote the manuscript. AP conceived the study, participated in study design and interpretation, and wrote the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Anita Grigoriadis or Arnie Purushotham.

Ethics declarations

Ethics approval and consent to participate

Ethical approval was obtained from King's Health Partners Cancer Biobank, London, UK, a HTA Licensed (12121) and NHS REC approved Tissue Bank (12/EE/0493). All patient samples were pseudoanonymised by the Biobank.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional files

Additional file 1:

Supplementary methods. Detailed description of tissue preparation including microtomy, de-waxing and staining, micro-dissection and preparation of materials, and gene expression processing and quality control. (PDF 243 kb)

Additional file 2:

Figures S1-S8.. (PDF 2279 kb)

Additional file 3:

Tables S1-S2 and S4-S5. (XLSX 365 kb)

Additional file 4: Table 3.

Estimated ORs for molecular subtype representation based on conditional logistic regression. (PDF 1 mb)

Additional file 5: Table 4.

Estimated ORs for molecular subtype representations based on logistic regression. (PDF 657 kb)

Additional file 6: Table S3.

Distribution of gene module scores for cases and controls in each case-control series, and estimated ORs using conditional logistic and logistic regression models. (XLSX 47 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and Permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Lawler, K., Papouli, E., Naceur-Lombardelli, C. et al. Gene expression modules in primary breast cancers as risk factors for organotropic patterns of first metastatic spread: a case control study. Breast Cancer Res 19, 113 (2017).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI:


  • Breast cancer
  • Metasynchronous metastases
  • Gene expression pattern