A BRCA1 deficient-like signature is enriched in breast cancer brain metastases and predicts DNA damage-induced poly (ADP-ribose) polymerase inhibitor sensitivity

Introduction: There is an unmet clinical need for biomarkers to identify breast cancer patients at an increased risk of developing brain metastases. The objective of this study was to identify gene signatures and biological pathways associated with human epidermal growth factor receptor 2-positive (HER2+) brain metastasis.
Methods: We combined laser capture microdissection and gene expression microarrays to compare malignant epithelium from HER2+ breast cancer brain metastases with that from HER2+ nonmetastatic primary tumors. Differential gene expression analysis was performed, including gene set enrichment analysis (GSEA) using publicly available breast cancer gene expression data sets.
Results: In a cohort of HER2+ breast cancer brain metastases, we identified a gene expression signature that anti-correlates with overexpression of BRCA1. Sequence analysis of the HER2+ brain metastases revealed no pathogenic mutations of BRCA1, and therefore the aforementioned signature was designated BRCA1 Deficient-Like (BD-L). Evaluation of an independent cohort of breast cancer metastases demonstrated that BD-L values are significantly higher in brain metastases as compared to other metastatic sites. Although the BD-L signature is present in all subtypes of breast cancer, it is significantly higher in BRCA1 mutant primary tumors as compared with sporadic breast tumors. Additionally, BD-L signature values are significantly higher in HER2-/ER- primary tumors as compared with HER2+/ER+ and HER2-/ER+ tumors. The BD-L signature correlates with breast cancer cell line pharmacologic response to a combination of poly (ADP-ribose) polymerase (PARP) inhibitor and temozolomide, and the signature outperformed four published gene signatures of BRCA1/2 deficiency.
Conclusions: A BD-L signature is enriched in HER2+ breast cancer brain metastases without pathogenic BRCA1 mutations. Unexpectedly, elevated BD-L values are found in a subset of primary tumors across all breast cancer subtypes. Evaluation of pharmacological sensitivity in breast cancer cell lines representing all breast cancer subtypes suggests that the BD-L signature may serve as a biomarker to identify sporadic breast cancer patients who might benefit from a therapeutic combination of PARP inhibitor and temozolomide, and that it may be indicative of a dysfunctional BRCA1-associated pathway.


Introduction
Central nervous system metastases are diagnosed in approximately 10% to 16% of women with advanced breast cancer [1,2]. The true incidence of brain metastases is potentially higher than currently reported statistics, as most brain metastases are diagnosed in response to clinical symptoms rather than by initial detection. Several risk factors have been associated with the development of brain lesions in patients with metastatic breast cancer (MBC), including younger age [3], having more than two metastatic sites at diagnosis [3], negative estrogen receptor (ER) status [1,4,5], human epidermal growth factor receptor 2-positive (HER2+) disease [1,4], and BRCA1/2 mutation [6-8]. Survival for breast cancer patients with brain metastases is poor, with a one-year survival probability of approximately 20% [2]. These statistics highlight the crucial need to develop biomarkers for the prediction of brain metastasis risk and to identify the underlying biological pathways that promote brain metastasis for the development of potential targeted therapeutics.
Patients with HER2+ MBC tumors are two to four times more likely to develop brain metastases than patients with HER2-negative disease [1,4]. While systemic trastuzumab has proven efficacious for treating aggressive HER2+ breast cancer, its use has been associated with the central nervous system as the first site of relapse [9]. Thus, there is an urgent clinical need for biomarkers to identify patients at higher risk of developing brain metastases, as well as to identify alternative therapeutic approaches. In this study, we aim to identify gene signatures associated with HER2+ brain metastases for potential biomarker development as well as to provide insight into the underlying associated biological pathways.

Patients and clinical samples
Patient and primary tumor characteristics are presented in Additional file 1. The HER2 status was assessed by HER2 immunohistochemistry (IHC) and/or gene amplification, and tumor grading was determined as described previously [10]. The breast cancer brain metastatic specimens consisted of fresh frozen biopsies obtained from the MD Anderson Cancer Center between 1998 and 2001; in all 19 cases the brain was the first site of relapse. As patient-matched primary breast tumor specimens were not available for these brain metastatic samples, we obtained HER2+ primary breast cancer specimens from Massachusetts General Hospital; these samples were obtained from patients with either no relapse or relapse to sites other than the central nervous system and consisted of fresh frozen biopsies obtained between 1998 and 2006. These breast cancer brain metastatic specimens and breast tumors were matched for patient age upon primary tumor detection and the ER status of the primary tumor. Patient consent was obtained for study participation and the study was approved by the human research committees of the MD Anderson Cancer Center and the Massachusetts General Hospital in accordance with the National Institutes of Health human research study guidelines.
Laser capture microdissection, RNA extraction, and microarray hybridization
RNA was isolated from a highly enriched population of 4,000 to 5,000 malignant epithelial cells procured by laser capture microdissection and was hybridized to Affymetrix X3P GeneChips (Affymetrix, Santa Clara, CA, USA) as previously described [11]. The data were deposited in the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) [12] and are accessible through GEO Series accession number GSE43837 [13].

Gene set enrichment analysis
Gene expression values were computed using the MAS5 algorithm as implemented in the call.exprs function in version 2.14.05 of the simpleaffy package of Bioconductor [14]. Gene set enrichment analysis (GSEA) was performed using version 2.0 of GSEA run on all the gene sets in version 2.5 of the Molecular Signatures Database (MSigDB) [15].

Calculation of BRCA1 Deficient-Like metagene value
All the genes in the BRCA1_OVEREXP_DN gene set, which was experimentally derived as described [16], in version 2.5 of the MSigDB [17] were mapped as described below to microarray identifiers. The gene expression values for all those identifiers were then averaged to form the BRCA1 Deficient-Like (BD-L) metagene. Specific probes measured are indicated in Additional file 1 for each figure.
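The averaging step can be illustrated with a minimal Python sketch. It assumes that each sample is held as a dictionary from probe identifier to expression value and that identifiers absent from the array are simply skipped; the function name, probe names, and values below are hypothetical and are not taken from the study data. Whether the averaged values are on a linear or logarithmic scale is not specified in the text, so the sketch averages whatever values it is given.

```python
from statistics import mean

def metagene_value(expression, probe_ids):
    """Average the expression values of the probes mapped to a gene set.

    expression: dict mapping probe identifier -> expression value (one sample).
    probe_ids:  identifiers mapped from the gene set; identifiers that are
                not on the array are skipped (an assumption of this sketch).
    """
    values = [expression[p] for p in probe_ids if p in expression]
    if not values:
        raise ValueError("no gene-set probes found in this sample")
    return mean(values)

# Hypothetical sample: probe names and values are made up for illustration.
sample = {"probe_a": 8.0, "probe_b": 10.0, "probe_c": 3.0, "probe_d": 6.0}
gene_set_probes = ["probe_a", "probe_b", "probe_d", "probe_missing"]

bdl = metagene_value(sample, gene_set_probes)  # mean of 8.0, 10.0 and 6.0
```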
Mapping gene symbols to microarray identifiers
Gene symbols were mapped to Entrez GeneIDs using the 2 February 2008 version of the gene information file from ftp.ncbi.nlm.nih.gov/gene/DATA. First the 'Symbol' column was searched and, if that failed, the 'Synonyms' column was searched. To map an Entrez GeneID to Affymetrix HG-U133A probe set identifiers, version na24 of the annotation file from the Affymetrix website was used. The 'Entrez Gene' column of that annotation file was augmented by trying to fill empty entries by using the corresponding entries in the 'UniGene ID' and 'Representative Public ID' columns to search the file Hs.data from build 209 of Unigene and the 2 February 2008 version of the gene2accession file from ftp.ncbi.nlm.nih.gov/gene/DATA. An Entrez GeneID was then mapped to every probe set identifier that had the Entrez GeneID in the augmented 'Entrez Gene' column. To map Entrez GeneIDs to Rosetta spot IDs, we used [18] (downloaded 2 February 2008), the file Hs.data from build 209 of Unigene and the 2 February 2008 version of the gene2accession file from ftp.ncbi.nlm.nih.gov/gene/DATA.
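The two-stage lookup described above (search 'Symbol' first, then 'Synonyms', then collect every probe set carrying the resulting GeneID) could be sketched in Python as follows. The sketch assumes that synonyms are pipe-separated and that multiple Entrez GeneIDs in one annotation entry are separated by '///'; the function names, the rows, and the probe set names are hypothetical toy data, and the UniGene-based augmentation of empty entries is omitted.

```python
def symbol_to_gene_id(symbol, gene_info_rows):
    """Return the GeneID for a symbol: search 'Symbol' first, then 'Synonyms'."""
    for row in gene_info_rows:
        if row["Symbol"] == symbol:
            return row["GeneID"]
    for row in gene_info_rows:
        # Assumption: synonyms are stored as a pipe-separated string.
        if symbol in row["Synonyms"].split("|"):
            return row["GeneID"]
    return None  # symbol could not be mapped

def gene_id_to_probe_sets(gene_id, annotation_rows):
    """Return every probe set whose 'Entrez Gene' entry contains gene_id."""
    probes = []
    for row in annotation_rows:
        # Assumption: multiple GeneIDs in one entry are separated by '///'.
        ids = [g.strip() for g in row["Entrez Gene"].split("///")]
        if gene_id in ids:
            probes.append(row["Probe Set ID"])
    return probes

# Toy rows for illustration only; probe set names are hypothetical.
gene_info = [
    {"GeneID": "672", "Symbol": "BRCA1", "Synonyms": "BRCC1|RNF53"},
    {"GeneID": "2064", "Symbol": "ERBB2", "Synonyms": "HER2|NEU"},
]
annotation = [
    {"Probe Set ID": "probe_1", "Entrez Gene": "672"},
    {"Probe Set ID": "probe_2", "Entrez Gene": "2064 /// 672"},
    {"Probe Set ID": "probe_3", "Entrez Gene": "---"},
]

her2_id = symbol_to_gene_id("HER2", gene_info)            # found via 'Synonyms'
brca1_probes = gene_id_to_probe_sets("672", annotation)   # two matching probe sets
```

Searching the primary symbol across all rows before any synonym avoids a synonym of one gene shadowing the official symbol of another.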

Sequencing of genomic BRCA1
All 22 coding exons of the BRCA1 (NM_007294.3) gene were amplified and sequenced in 33 fragments using tumor DNA as previously described [19]. Primers were designed using Primer3 software [20] to cover at least 20 base pairs on both the 5' and 3' sides of each exon. The amplified DNA fragments were sequenced using the BigDye Terminator Cycle Sequencing kit on an ABI 3500xl DNA Analyzer (Applied Biosystems, Foster City, CA, USA). Sequencing chromatograms generated by the analyzer were examined for variant detection using Mutation Surveyor software (SoftGenetics LLC, State College, PA, USA).

Statistical methods for correlative analyses
The P values quoted for Figures 1, 2, and 3A were obtained by applying the Wilcoxon test to all pairs within the figure and correcting the resulting P values for multiple hypothesis testing using the Holm method [16].
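As an illustration of the multiple-testing step, the following Python sketch enumerates all pairs of groups within a figure and applies the Holm step-down correction to their raw P values. The group names and raw P values are made up for the example; in practice each raw value would come from a Wilcoxon test on the corresponding pair of groups, which is not reimplemented here.

```python
from itertools import combinations

def holm_adjust(p_values):
    """Holm step-down adjustment; returns adjusted P values in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease along the order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical groups and raw P values, one per pair, in the order of `pairs`.
groups = ["brain", "bone", "lung"]
pairs = list(combinations(groups, 2))
raw_p = [0.01, 0.04, 0.03]

adjusted_p = holm_adjust(raw_p)  # approximately [0.03, 0.06, 0.06]
```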

Cell culture and pharmacologic inhibition assay
All cell lines were obtained and maintained as previously described [21]. Independent pharmacologic inhibition assays were conducted in triplicate for each cell line. Cells were seeded at 20,000 per well in a 24-well plate. After 24 hours, cells were treated in triplicate with indicated concentrations of: DMSO, temozolomide (T2577, Sigma-Aldrich, St Louis, MO, USA), or AZD-2281 (S1060, Selleck Chemicals, Houston, TX, USA). After five days of incubation, cells were fixed in 4% formaldehyde and stained with 1% Crystal Violet (C0775, Sigma-Aldrich) for 10 minutes at room temperature. Cells were then washed to remove unincorporated dye and plates were inverted to dry overnight. Incorporated dye was extracted with 10% acetic acid and OD595 measurements were obtained within a linear range. Treated cells were normalized to the vehicle-treated control to obtain mean percentage viability.
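The normalization to the vehicle-treated control can be sketched as below. This is a minimal illustration that assumes each treated well is divided by the mean of the vehicle (DMSO) wells and the resulting percentages are then averaged; the text does not state whether a blank was subtracted or in which order the averaging was done, and the OD595 readings shown are invented.

```python
from statistics import mean

def percent_viability(treated_od, vehicle_od):
    """Mean viability of treated wells as a percentage of the vehicle control.

    treated_od: OD595 readings of the replicate treated wells.
    vehicle_od: OD595 readings of the replicate vehicle-treated wells.
    """
    control = mean(vehicle_od)
    if control <= 0:
        raise ValueError("vehicle control signal must be positive")
    return mean(100.0 * od / control for od in treated_od)

# Hypothetical triplicate readings (not study data).
vehicle_wells = [1.0, 1.0, 1.0]
treated_wells = [0.5, 0.25, 0.75]

viability = percent_viability(treated_wells, vehicle_wells)  # 50% of control
```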

Results and discussion
To identify gene expression patterns that differentiate HER2+ breast cancer brain metastases from HER2+ primary tumors that did not metastasize to the brain, we performed a comparative gene expression analysis between 19 brain metastasis specimens from breast cancer patients with the brain as the first site of metastasis and 19 non-patient-matched primary breast tumor specimens from patients who either did not experience a relapse (with a minimum follow-up time of >6 years) or did not have a recurrence in the central nervous system. None of the patients with a brain metastasis received trastuzumab (Herceptin); four of the patients without a relapse received neoadjuvant or adjuvant trastuzumab. These specimens were matched based upon patient age at initial detection and the HER2 and ER status of the primary tumors (Additional file 1). Although patient-matched primary tumors were not available for comparative gene expression analysis, we hypothesized that the direct comparison of brain metastases to non-patient-matched primary tumors would provide insight into the key molecular pathways underlying outgrowth in the brain microenvironment.
To compare gene expression, RNA derived from microdissected tissue was hybridized to Affymetrix X3P GeneChips and the resulting data were subjected to bioinformatic analyses. Standard MAS5 pre-processing of the data with a t-test comparison and a false discovery rate set at 0.25 failed to identify any individual genes that were differentially expressed between the brain metastatic specimens and the non-patient-matched primary breast cancer specimens.
As no significant differential expression of individual genes was discovered, GSEA was conducted using version 2.5 of the Broad Institute MSigDB to determine whether there were modulations of gene sets that comprise annotated biochemical pathways [22]. The analysis yielded 22 enriched gene sets with a false discovery rate q value <0.25 (Additional file 1). The top gene set identified was BRCA1_OVEREXP_DN, which consists of probe sets that were downregulated two- to fourfold after inducible expression of BRCA1 in the BRCA1-low, ER+ EcR-293 human embryonal kidney epithelial cell line [23]. A significant correlation of the HER2+ breast cancer brain metastases with a BRCA1-related signature was unexpected, as several studies have reported a low frequency of HER2 expression in tumors of BRCA1 mutation carriers [24-26]; the correlation may therefore reflect the underlying biology of metastatic outgrowth in the brain rather than an aspect of HER2 signaling. Furthermore, sequencing analysis of the 17 of 19 HER2+ brain metastatic specimens for which sufficient residual tumor remained identified no previously known pathogenic or novel potentially pathogenic variants (frameshift insertion/deletion, nonsense, or essential splice site variants) as classified by International Agency for Research on Cancer (IARC) recommendations [27]. As our identified signature consisted of genes that were downregulated when BRCA1 was overexpressed, we hypothesized that the converse upregulation of these genes may indicate an underlying deficiency in the BRCA1 functional pathway, either directly through BRCA1 or indirectly through a cooperating factor. Since the BRCA1_OVEREXP_DN signature was enriched in HER2+ breast cancer brain metastases that did not have known or potentially novel BRCA1 pathogenic variants, we designated the BRCA1_OVEREXP_DN signature as the 'BRCA1 Deficient-Like' (BD-L) metagene.
BD-L metagene values were calculated for each specimen, and a significant association of the metagene with the brain metastatic samples was confirmed (Additional file 2, left panel; P value = 0.0082). Additionally, no significant difference in BRCA1 expression on the microarray between the primary tumors and brain metastases was observed for either of two probe sets (Additional file 2, middle and right panels). Because several of the brain metastasis samples were exhausted during the previous analyses, a direct characterization of BRCA1 mRNA and protein expression was not possible. However, no significant correlation between the two BRCA1 probe sets on the Affymetrix X3P GeneChip and the BD-L metagene was observed for the entire cohort, suggesting that the metagene value may not merely track BRCA1 mRNA expression (Additional file 3). While BRCA1 protein expression by IHC could not be examined, a previous study has suggested significant concordance between BRCA1 mRNA and protein expression in clinical specimens [28].
Although all patients in this cohort were confirmed to have clinical 3+ HER2+ breast cancer by IHC and/or fluorescence in situ hybridization (FISH), the possibility existed that some were misclassified. To confirm overexpression of HER2 across the cohort, the expression levels of all genes on the microarray were plotted as a histogram for each patient, with the indicated genes denoted by a red line (Additional file 4). The expression of HER2 showed a clear enrichment compared to PSA, which is not reported to be highly expressed in breast cancer, and to PRY, DAZ4, and CDY1, which are all located on chromosome Y and thus are not detected at high levels in female breast cancer samples. Thus, the consistently high level of HER2 expression across the cohort supports the clinical HER2+ diagnosis.
To validate our original observation that the BD-L metagene is enriched in breast cancer-derived brain metastases, gene expression data from an independent cohort consisting of 615 primary breast cancer specimens as well as breast cancer metastasis specimens from brain (n = 19), lung (n = 18), liver (n = 5), and bone (n = 15) were assessed for correlation with BD-L [29]. As demonstrated in Figure 1, a higher mean BD-L metagene value was observed in metastases to the brain as compared to primary tumors (P value = 0.0043), bone metastases (P value = 4 × 10^-6), and lung metastases (P value = 0.001), but not when compared to liver metastases (P value = 0.38). Limitations of this data set are the small number of metastatic samples in each group and the lack of annotation of ER and HER2 receptor status for the metastatic data points. Notwithstanding these limitations, the significant association observed may support the enrichment of the BD-L signature as a feature of brain metastases irrespective of receptor subtype.
Having confirmed an enrichment of BD-L metagene value in brain metastases compared to primary tumors, we then hypothesized that a metagene of BRCA1 deficiency would also demonstrate increased values in primary tumor specimens derived from BRCA1 mutation carriers compared to noncarriers. When a publicly available gene expression data set was interrogated [30], a significantly higher mean BD-L value was found in BRCA1-mutant tumors (P value = 0.033) when compared to sporadic tumors (Figure 2). While the difference in BD-L value between primary tumors of sporadic breast cancer patients and those of BRCA2 mutation carriers was not significant, the analysis has little power given the small sample size (n = 2). The BD-L values for sporadic primary tumors included a subset with elevated metagene values comparable to those of BRCA1 mutation carriers, which may be indicative of a subpopulation of sporadic tumors with characteristics similar to BRCA1-mutated tumors. The correlation of the BD-L signature with both brain metastases and BRCA1 mutation is consistent with the published literature, as BRCA1 mutation carriers are reported to have an increased prevalence of breast cancer brain relapse as compared to noncarriers [8,31].
To investigate the correlation of the BD-L metagene with important molecular markers of sporadic breast cancer subtypes, we next evaluated the distribution of BD-L value by HER2 and ER status in the NKI295 [32], EMC286 [33]/MSK82 [34], and EMC192 [35] cohorts of sporadic primary tumors (Figure 3A-C). As BRCA1-mutant tumors represent a subpopulation of triple-negative breast cancer, a significantly higher mean BD-L metagene value was, as expected, observed in ER-/HER2- primary tumors when compared to ER+/HER2+ tumors, with P values = 4.5 × 10^-6 (NKI295), 0.0025 (EMC286/MSK82), and 2.8 × 10^-5 (EMC192). Additionally, the mean BD-L value was significantly higher in ER-/HER2- tumors when compared to ER+/HER2- tumors, with P values = 1.1 × 10^-13 (NKI295), 4 × 10^-8 (EMC286/MSK82), and 8.7 × 10^-10 (EMC192). Although not consistently significant across the cohorts, a trend is observed when comparing ER-/HER2- tumors to ER-/HER2+ tumors, with P values = 0.0023 (NKI295), 0.097 (EMC286/MSK82), and 0.05 (EMC192). Despite the significant correlation with negative ER and/or HER2 receptor expression, it was notable that a small subpopulation of tumors with high BD-L values was present within the ER+ and HER2+ subtypes (Figure 3A-C dot plots), suggesting that the BD-L phenotype may extend beyond primary tumors of BRCA1 mutation carriers and the sporadic ER-/HER2- subtype. This is especially intriguing for primary ER+ tumors because the brain is not a prevalent metastatic site for the ER+ subtype [36]. Motivated by the possibility that the BD-L signature may extend across current breast cancer classifications of receptor expression or mutational status, we next sought to apply the BD-L signature to breast cancer cell lines independent of receptor and mutational status with the aim of identifying a phenotype of pharmacological sensitivity.
We hypothesize that the BD-L metagene may identify breast cancers that fall within a spectrum of dysfunction for a BRCA1 functional complex or regulated pathway, either directly through BRCA1 or indirectly through a cooperating factor. Having demonstrated that BD-L was enriched in BRCA1 mutation carriers, we hypothesized that breast cancer cell lines with elevated BD-L values may exhibit increased sensitivity to therapeutic agents that target a dysfunctional BRCA1-associated pathway. Poly (ADP-ribose) polymerase (PARP) inhibitors represent an exciting class of drugs that have demonstrated promise in clinical BRCA1/2-related cancers as single agents [37,38] and in preclinical studies as single agents and in combination with certain classes of DNA-damaging agents [39,40]. Additionally, preclinical testing has revealed that disruption of proteins that cooperate either directly or indirectly with BRCA1/2 proteins can increase PARP inhibitor sensitivity [41-43]. Because we hypothesize that the BD-L metagene may correlate with a spectrum of dysfunction, we chose to induce DNA damage to enhance the effectiveness of the PARP inhibitor. Therefore, we tested a panel of breast cancer cell lines for sensitivity to a combination treatment with olaparib (AZD-2281), an oral PARP inhibitor in clinical use that has shown evidence of crossing the blood-brain barrier [44], and the DNA alkylating/methylating agent temozolomide, a clinically utilized chemotherapeutic that can cross the blood-brain barrier and has demonstrated increased efficacy in combination with a PARP inhibitor [45-48]. Using a publicly available gene expression data set, we determined BD-L metagene values for 51 well-defined human breast cancer cell lines as described in Neve et al. (Additional file 1) [21]. We rank-ordered the lines by increasing metagene value and selected 12 cell lines predicted to be among either the most resistant or the most sensitive to pharmacologic inhibition (Table 1).
This panel included the BRCA1-deficient HCC1937 cell line, which the BD-L metagene predicts to exhibit low sensitivity. While this may appear paradoxical, clinical trials have demonstrated that not all BRCA1 mutation carriers are responsive to PARP inhibitors [37,38]. Additionally, Figure 2A demonstrated that although the BD-L metagene is enriched in BRCA1 mutation carriers compared to noncarriers, a subset of BRCA1 mutation carriers have low metagene values. Because we hypothesize that the BD-L metagene provides a measure of BRCA1-associated pathway function rather than of BRCA1 gene mutation or expression status, the metagene would also account for potential compensatory mechanisms.
Based upon known mechanisms of temozolomide-specific sensitivity and extensive in vitro pharmacological studies in cell lines [49], 100 μM was determined to be a physiologically relevant dose that does not significantly reduce viability across the breast cancer cell line panel (Figure 4A, top panel). Single treatment and combined treatment with temozolomide using increasing sub-physiological doses of olaparib identified significant inhibition upon combination treatment (Figure 4A, Additional file 5). As originally hypothesized, there is a highly significant correlation (R² = 0.77; P value = 0.00017) of the BD-L metagene with the pharmacological response of cell lines to the combined administration of olaparib and temozolomide (Figure 4A, lower panel). It is interesting to note that the metagene correctly predicted the response of the BRCA1-deficient HCC1937 cell line, suggesting that the BD-L metagene may be a better indicator of pharmacological response than BRCA1 gene status. To further support the correlation with sensitivity, BD-L metagene values were calculated for seven of the tested cell lines from an independent gene expression data set described in Garnett et al. [50] and were plotted against our experimentally derived pharmacological response data. While administration of either temozolomide or olaparib alone (Figure 4B, top and middle panels) did not produce a significant reduction in viability, a significant correlation (R² = 0.69; P value = 0.02) is observed upon dual administration and supports our original observation (Figure 4B, lower panel). Thus, using two independent gene expression data sets of cell lines derived from different microarray platforms, the BD-L metagene demonstrated a strong correlation with our experimentally derived DNA damage-induced PARP inhibitor sensitivity.
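For readers who wish to reproduce this type of correlation, the following Python sketch computes R² between metagene values and percentage viability. The text does not state how R² was obtained, so the sketch assumes a simple linear regression, for which R² equals the squared Pearson correlation coefficient; the function name and all numbers are hypothetical and are not the study data.

```python
def r_squared(x, y):
    """Coefficient of determination of a simple linear regression of y on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    # Squared Pearson correlation, identical to R^2 for a one-predictor fit.
    return (sxy * sxy) / (sxx * syy)

# Hypothetical metagene values and percentage viabilities for four cell lines.
metagene = [1.0, 2.0, 3.0, 4.0]
viability = [90.0, 70.0, 50.0, 30.0]

r2_perfect = r_squared(metagene, viability)          # perfectly linear toy data
r2_partial = r_squared([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
```

Because R² is unsigned, the direction of the relationship (here, viability decreasing as metagene value increases) has to be read from the slope or from the plot.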
To determine the robustness of the BD-L metagene in predicting sensitivity, we evaluated the performance of five published signatures [30,51-53] of BRCA1/2 deficiency and/or function in predicting our observed pharmacologic responses of the breast cancer cell line panel using the gene expression data from Neve et al. [21] (Figure 4C, Additional file 6A, C, E, G) and Garnett et al. [50] (Figure 4D, Additional file 6B, D, F, H). In contrast to the BD-L metagene (Figure 4A and B, bottom panels), all five BRCA1/2-related signatures failed to correlate with pharmacologic response (Figures 4C, D, Additional file 6A-H). The difference in predictive power is potentially due to the approach taken in discovering these signatures. The BD-L metagene was derived from changes in gene expression due to a modest overexpression of BRCA1 within a single cell line. This unbiased approach goes beyond indicating the BRCA1 mutational status or the acute response to a stimulus to provide a measure of BRCA1 pathway function that can include the contribution of BRCA1 and its interacting components. Alternatively, the genes that constitute the BD-L metagene may form functional networks that contribute to the observed PARP inhibitor sensitivity. Mapping of the 112 BD-L genes to functional networks using Ingenuity Pathway Analysis (IPA) identified a predominant network association with biological functions of proliferation, cell cycle control, and apoptosis (Additional file 1). While these functions have not previously been associated with response to PARP inhibitors, the potential for specific aspects of these functions to influence sensitivity provides possible avenues for future investigation. In conclusion, the BD-L metagene may provide a measure of BRCA1 pathway function as opposed to indicating BRCA1 mutational status, direct expression levels, or response to an acute stimulus.

Conclusions
In summary, we identified a BRCA1 Deficient-Like metagene that is enriched in HER2+ brain metastases when compared with HER2+ primary tumors, and in an independent data set confirmed the enrichment of the metagene in brain metastases as compared to bone metastases, lung metastases, and primary breast tumors. Furthermore, we demonstrated that a high BD-L metagene value is enriched in, but not limited to, primary tumors of BRCA1 mutation carriers and sporadic ER-/HER2- patients. When the BD-L signature is calculated for a breast cancer cell line panel using gene expression from two independent data sets, the BD-L metagene correlates with pharmacologic response to a combination treatment of olaparib and temozolomide. Lastly, we demonstrated that the BD-L metagene outperforms extant classifiers of BRCA1/2 status in predicting pharmacological response to the drug combination in the breast cancer cell line panel.
Since the clinical administration of PARP inhibitors is still in its infancy, there is a crucial need both to identify patients who will benefit from this class of drugs and to develop biomarkers that predict clinical response. Currently, BRCA1/2 status is the prevailing indicator of potential PARP inhibitor sensitivity, although not all BRCA1/2-mutant breast cancers respond and there is preclinical evidence to suggest that PARP inhibitors may benefit cancer populations beyond BRCA1/2 mutation carriers [54]. Herein, we provide evidence that the BD-L metagene may be enriched in clinically detectable breast cancer brain metastases and that the metagene may identify sporadic breast cancers, across the conventional receptor and mutational status classifications, that may benefit from a PARP inhibitor-based therapy, while also identifying triple-negative and BRCA1-mutant cancers that may prove refractory to treatment.