Identifying biomarkers of breast cancer micrometastatic disease in bone marrow using a patient-derived xenograft mouse model

Background Disseminated tumor cells (DTCs) found in the bone marrow (BM) of patients with breast cancer portend a poor prognosis and are thought to be intermediaries in the metastatic process. To assess the clinical relevance of a mouse model for identifying possible prognostic and predictive biomarkers of these cells, we have employed patient-derived xenografts (PDX) for propagating and molecularly profiling human DTCs. Methods Previously developed mouse xenografts from five breast cancer patients were further passaged by implantation into NOD/SCID mouse mammary fat pads. BM was collected from long bones at early, serial passages and analyzed for human-specific gene expression by qRT-PCR as a surrogate biomarker for the detection of DTCs. Microarray-based gene expression analyses were performed to compare expression profiles between primary xenografts, solid metastasis, and populations of BM DTCs. Differential patterns of gene expression were then compared to previously generated microarray data from primary human BM aspirates from patients with breast cancer and healthy volunteers. Results Human-specific gene expression of SNAI1, GSC, FOXC2, KRT19, and STAM2, presumably originating from DTCs, was detected in the BM of all xenograft mice that also developed metastatic tumors. Human-specific gene expression was undetectable in the BM of those xenograft lines with no evidence of distant metastases and in non-transplanted control mice. Comparative gene expression analysis of BM DTCs versus the primary tumor of one mouse line identified multiple gene transcripts associated with epithelial-mesenchymal transition, aggressive clinical phenotype, and metastatic disease development. Sixteen of the PDX BM associated genes also demonstrated a statistically significant difference in expression in the BM of healthy volunteers versus the BM of breast cancer patients with distant metastatic disease. 
Conclusion Unique and reproducible patterns of differential gene expression can be identified that presumably originate from BM DTCs in mouse PDX lines. Several of these identified genes are also detected in the BM of patients with breast cancer who develop early metastases, which suggests that they may be clinically relevant biomarkers. The PDX model may also provide a clinically relevant system for analyzing and targeting these intermediaries of metastases. Electronic supplementary material The online version of this article (doi:10.1186/s13058-017-0927-1) contains supplementary material, which is available to authorized users.


Background
Multiple prospective clinical trials have demonstrated that disseminated tumor cells (DTCs) found in the bone marrow (BM) of patients with early-stage breast cancer are highly correlated with early recurrent disease development and portend a poor prognosis [1,2], even many years after initial diagnosis [3]. BM DTCs are thought to be intermediaries in the metastatic process, transitioning in the BM, re-entering the circulation, and proliferating in distant organs with a favorable molecular microenvironment [4]. DTCs in the BM may be indicative of the systemic burden of micrometastatic disease in the patient [2]. Those patients with residual DTCs after chemotherapy are at very high risk of recurrence, indicating that those cells that survive chemotherapy have high metastatic potential [5]. Recent animal models suggest that early disseminated cells evolve in parallel to the primary tumor and have high metastatic potential [6,7]. To prevent the development of metastatic outgrowth, it is necessary to devise therapeutic strategies to target the intermediary cancer cells that evade conventional treatment.
To date, primary DTCs have been difficult to characterize. The rarity of these cells, the lack of uniform markers for detecting cells with metastatic potential, and the evolution of the cells while in a foreign micro-environment are the main constraints in identifying, isolating, and molecularly characterizing DTCs from patient BM specimens [8]. To address these limitations, we have investigated the use of patient-derived xenograft (PDX) mouse models, wherein primary human breast carcinomas are transplanted and propagated in the mammary fat pad of mice, as a continuous, reproducible source of disseminated tumor cells for molecular characterization. Multiple studies have documented that the molecular profile, histopathological characteristics, and therapeutic sensitivities of PDX tumors recapitulate those of their primary tumor counterparts, and that PDX tumors should therefore serve as an excellent model for tracking, studying, and testing interventions for metastatic disease development [9][10][11]. Evidence from other studies shows that primary, peripheral blood, circulating tumor cells (CTCs) from patients with breast cancer can also survive and propagate as mouse xenografts, again suggesting phenotypic parallels between PDX and human metastases [12,13]. Recently, detection of CTCs and DTCs has been reported in a PDX model [14].
In this report, using a PDX system established by transplanting primary tumors from pre-metastatic patients with breast cancer, we demonstrate that development of distant organ metastases correlates with the presence of BM DTCs. Comparative gene expression analysis of BM from these animals has allowed the identification of novel gene expression patterns associated with DTC colonization of BM and further supports the concept that DTCs present in the BM undergo epithelial to mesenchymal transition (EMT). Moreover, the expression of many of the genes identified in this PDX model distinguishes BM from patients with breast cancer who develop early metastatic relapse from that of healthy female volunteers, suggesting potential value as prognostic and predictive biomarkers. We believe that the PDX model is an effective tool to identify and study the molecular characteristics of BM DTCs and their role in the metastatic process, and should allow for the development of new therapies to target these cells.

Patient population and establishment of PDX lines
After patients gave informed consent, human breast adenocarcinomas were prospectively collected using a protocol approved by the Institutional Review Board at Washington University in St. Louis, and transplanted into mice. All animal procedures were reviewed and approved by the Institutional Animal Care and Use Committee at Washington University in St. Louis. Briefly, after 3-week-old female NOD-SCID mice were anesthetized, an inverted Y-shaped incision was made to expose the mammary glands. Using a dissecting microscope, the lymph node and the vessel in the fat bridge between the fourth and fifth mammary fat pads were cauterized. The breast epithelium in this area was then excised to create the "cleared fat pad" into which human breast tissues were implanted without interference from the host's mammary epithelium. At 2 weeks post clearance, 500,000 immortalized green fluorescent protein (GFP)-labeled human breast fibroblasts were injected into each cleared fat pad. After an additional 2 weeks, the humanized fat pads received tumor implants. Breast biopsies were prepared for engraftment by placing tissue in ice-chilled high glucose DMEM, immediately transporting it to the laboratory, and mincing into 1-2-mm pieces for implantation in up to five mice. Further details of the development and maintenance of xenograft lines have been previously described [9][10][11]. Individual animals are designated using the labeling convention: primary tumor line-passage-unique animal ID (e.g. 17-B-1141). Clinical features and pathological characteristics of the five patient tumor xenograft lines that were used for the present studies are listed in Table 1.

BM and RNA isolation
Mice were killed when the primary xenograft tumor reached approximately 1.5 cm in size (approximately 6-8 weeks after implanting the tumor tissues). The femur and tibia were dissected from surrounding tissue, avoiding potential contamination, and flushed with cold PBS to isolate BM cells. Normal mouse BM samples were collected from non-tumor-bearing NOD-SCID mice, both with and without transplanted human fibroblasts. BM from the four long bones of each animal was pooled and cells were pelleted for RNA extraction. Total RNA was isolated from samples using Trizol reagent (Invitrogen) according to the manufacturer's protocol. The extracted RNA was quantified and qualitatively assessed using an Agilent Bioanalyzer.

qRT-PCR
One microgram of RNA was used for synthesis of first-strand complementary DNA (cDNA) using the Retroscript (Ambion) kit with random hexamers. Resulting cDNA was diluted to an equivalent of 10 ng/μL of input RNA. qRT-PCR of the indicated genes was performed as described previously [8]. Human-specific primer/probe sets for the genes tested were purchased from Applied Biosystems and the assay IDs of the probes used are given in Additional file 1: Table S1. Each reaction consisted of 2 μL of cDNA, TaqMan Master Mix (Applied Biosystems) and primer/probe set in a total volume of 20 μL. For each transcript/sample, triplicate reactions were run in an ABI 7500 FAST Sequence Detection System. If a transcript was not detected in at least two replicates by cycle 40, it was considered absent in that sample and excluded from further analysis. Reactions with a cycle threshold (C_T) value difference >1.5 for the same probe were also excluded from further analysis. The C_T values of each gene were normalized to mouse glyceraldehyde-3-phosphate dehydrogenase (GAPDH) C_T values for the same sample. These delta C_T (dC_T) values were then normalized to corresponding dC_T values of the non-tumor-bearing mice for the same transcript and fold change was calculated using the ddC_T method. Transcripts that did not reach C_T in non-tumor-bearing control mouse samples after 40 cycles were assigned a C_T value of 40 for calculation purposes.
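The normalization described above can be illustrated with a minimal sketch. It assumes the standard 2^(-ddC_T) form of the ddC_T method, which presumes roughly 100% amplification efficiency; the function and variable names are illustrative and are not taken from the analysis software used in the study.

```python
# Minimal sketch of the ddC_T fold-change calculation (names are hypothetical).
UNDETECTED_CT = 40.0  # C_T assigned when a transcript is not detected in control BM


def delta_ct(target_ct, reference_ct):
    """dC_T: target C_T normalized to the reference gene (here, mouse GAPDH)."""
    return target_ct - reference_ct


def fold_change(sample_target_ct, sample_ref_ct, control_target_ct, control_ref_ct):
    """Relative expression, 2^(-ddC_T), of a tumor-bearing sample versus control."""
    if control_target_ct is None:  # transcript did not reach threshold by cycle 40
        control_target_ct = UNDETECTED_CT
    ddct = (delta_ct(sample_target_ct, sample_ref_ct)
            - delta_ct(control_target_ct, control_ref_ct))
    return 2.0 ** (-ddct)


# Made-up values: target C_T 30 in tumor-bearing BM, undetected in control BM,
# mouse GAPDH C_T 20 in both samples.
example = fold_change(30.0, 20.0, None, 20.0)
```

Because undetected control transcripts are assigned a C_T of 40, the resulting fold changes are lower bounds on relative expression rather than exact ratios.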

Microarray analysis
Gene expression profiling was performed as previously described [8]. Total RNA was used for two-cycle biotinylated cRNA target synthesis (Affymetrix). Resulting biotinylated cRNA was quantified and samples that yielded >15 μg of cRNA were used for GeneChip microarray hybridization. Fragmented, biotinylated cRNAs were hybridized to Affymetrix Human Gene 1.0 ST microarrays following standard protocols. Arrays were hybridized, washed, and scanned following the manufacturer's protocol. GeneChip CEL files were processed with the RMA algorithm and normalized using Partek Genomics Suite software. Differential patterns of gene expression were identified from annotated, normalized microarray data as detailed in the "Results" section. All data filtering, visualization, and analysis of variance (ANOVA) were performed using Partek Genomics Suite software. A schematic of the data sets utilized and the analysis workflows is presented in Fig. 1. Gene expression data are available at Gene Expression Omnibus [GEO:GSE57947].

Development of metastatic tumors correlates with presence of human cells in mouse BM
To investigate the clinical relevance of PDX models for studying BM DTCs in patients with breast cancer, we utilized a set of previously characterized PDX mouse lines [9,11,15]. BM was collected from a total of 18 animals, spanning five different passages and representing initial implants from five different patients with a variety of molecular phenotypes (Table 1). All but one patient (7192) developed distant, clinical metastatic disease.
To allow for multiple molecular analyses with limited amounts of BM, we employed a molecular screen to detect BM DTCs in each animal, based on detection of human-specific GAPDH (hGAPDH) transcript. As shown in Fig. 2a, 10 of 18 (56%) animals analyzed had detectable expression of hGAPDH in their BM.
hGAPDH expression clearly emanated from BM DTCs, as other humanized xenograft animals and non-grafted control animals with humanized fat pads had no detectable expression of hGAPDH (data not shown). Furthermore, all BM samples were assayed for GFP gene expression and found to be negative (data not shown), suggesting that hGAPDH expression emanated from actual DTCs and not from the GFP-labelled human fibroblasts that were implanted and that may have migrated from the site of fat pad implantation. Although the actual number of human DTCs present in the BM of each mouse could not be calculated based on qRT-PCR data, assuming that hGAPDH expression levels per input mass of total RNA are proportional to DTC cell numbers, it is clear that WHIM17 mice maintained a much higher tumor burden in their BM than those from the WHIM12 line (Fig. 2a).
Among the DTC-positive mice, seven originated from the WHIM17 line and three originated from the WHIM12 line. These ten animals all developed distant solid metastases, primarily to the lung and liver (Table 1). In contrast, eight other animals, originating from lines WHIM11, WHIM12, WHIM13, and WHIM23 had neither detectable expression of hGAPDH in their BM nor any evidence of distant solid metastases, even after primary tumor growth had progressed to 1.5 cm at their greatest diameter at the time of sacrifice. The presence of BM DTCs and the development of distant metastases were highly correlated (p < 0.0001, Fisher's exact test), consistent with clinical observations of DTCs and metastatic disease development in patients [1,8]. In fact, although only 18 animals were analyzed in this study, detection of hGAPDH expression in BM was 100% specific and 100% sensitive for predicting metastatic spread of the xenograft tumor.
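The association reported above can be checked from the counts given in the text (10 hGAPDH-positive animals, all with metastases; 8 hGAPDH-negative animals, none with metastases). The following is a minimal sketch of a two-sided Fisher's exact test and of the sensitivity and specificity calculation; it is written from the hypergeometric definition for illustration and is not the software used in the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at least as extreme (as improbable) as the observed one
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Rows: hGAPDH-positive / hGAPDH-negative BM; columns: metastasis / no metastasis
tp, fp, fn, tn = 10, 0, 0, 8
p_value = fisher_exact_two_sided(tp, fp, fn, tn)  # 1/43758, about 2.3e-5
sensitivity = tp / (tp + fn)  # 1.0
specificity = tn / (tn + fp)  # 1.0
```

With only 18 animals, the perfect separation yields p < 0.0001, but the confidence intervals around the sensitivity and specificity estimates remain wide.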

Human DTCs in mouse BM express markers of epithelial-mesenchymal transition (EMT)
Data suggest that only those cancer cells that undergo extensive molecular and phenotypic adaptations, such as EMT, will successfully survive and proliferate in a foreign micro-environment [16]. We therefore examined the BM of the PDX mice for the expression of genes associated with both epithelial cell lineage and EMT using directed qRT-PCR analyses for human-specific gene expression. EMT-associated transcripts included Snail1 (SNAI1), Goosecoid (GSC), and FOXC2. As expected, in control and non-metastatic mice without hGAPDH expression in BM, none of the epithelial and EMT marker genes were detected. In eight of the ten hGAPDH-positive mice, epithelial marker genes often used for DTC detection in human studies, i.e., keratin 17 (KRT17), mammaglobin (SCGB2A2), and EpCAM (TACSTD1), were also not detected. Keratin 19 (KRT19) expression was detected in only one animal derived from the WHIM17 line (i.e., 17-B-1141). In contrast, expression of three EMT marker transcripts, Snail1 (SNAI1), Goosecoid (GSC), and FOXC2, was detected in many, albeit not all, of the seven WHIM17-derived animals (Fig. 2b) but in none of the WHIM12 mice. Since hGAPDH expression in the WHIM12 animals was also lower, this may simply reflect lower tumor (DTC) burden in the BM of these animals.

Comparative molecular profiles of DTCs and their corresponding primary and metastatic tumors
To better understand the molecular evolution of tumor metastasis, we utilized one PDX line (WHIM17) to compare patterns of gene expression between primary xenograft tumor, BM DTC populations, and distant solid organ metastasis. Of specific interest were patterns of gene expression that were unique to DTC populations, and groups of genes that were common to both DTCs and solid organ metastasis, but distinct from those of the primary tumor itself.
[Figure legend: Expression of each transcript in the BM of tumor-bearing mice is represented relative to that in non-tumor-bearing humanized NOD-SCID mice (control), using the ddC_T method. Since human-specific transcripts were not detected in the BM of control mice, for calculation purposes, a C_T value of 40 was assigned. *Animals that developed metastatic tumors. The association between metastatic outcome and gene expression was statistically significant for hGAPDH (p < 0.0001), STAM2 (p = 0.004), DSCR3 (p = 0.004) and FOXC2 (p = 0.036), analyzed by the Fisher exact test.]
Gene expression microarray analysis was performed on the WHIM17 primary xenograft tumor (17-B-29), a splenic metastasis that developed in that animal (17-B-29), and one DTC-positive BM sample from the same animal and six from different passages from the same line. Although human-specific microarrays were used for this analysis, it was expected that some patterns of gene expression in the mouse BM samples could originate from cross-hybridization of transcripts from murine BM cells. Therefore, we also profiled two BM samples from both control mice and non-engrafted mice with humanized mammary fat pads. Expression data from these animals were used as a baseline to identify human DTC-specific expression in the BM of each of the WHIM17 animals. To validate this in silico approach, we selected six transcripts of genes previously implicated in tumorigenesis and metastasis [17][18][19][20][21][22][23][24][25][26], the expression of which was elevated at least threefold in all seven WHIM17 BM samples, as compared to control BM, and confirmed expression using human-specific primers and qRT-PCR (Fig. 3, Additional file 2: Table S2).
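The baseline comparison described above amounts to a simple filter, which can be sketched as follows. The sketch assumes log2-scale expression values (as produced by RMA) and uses the mean of the control samples as the baseline; the function name, the transcript identifiers, and the choice of the control mean are illustrative assumptions, not details reported for the Partek workflow.

```python
from math import log2
from statistics import mean


def elevated_in_all(sample_log2, control_log2, min_fold=3.0):
    """Return transcripts whose log2 signal exceeds the control mean by at
    least log2(min_fold) in every tumor-bearing BM sample."""
    threshold = log2(min_fold)
    selected = []
    for transcript, values in sample_log2.items():
        baseline = mean(control_log2[transcript])
        if all(v - baseline >= threshold for v in values):
            selected.append(transcript)
    return selected


# Hypothetical log2 intensities for two transcripts in two BM samples
samples = {"T1": [9.0, 9.5], "T2": [9.0, 8.0]}
controls = {"T1": [7.0, 7.0], "T2": [7.0, 7.0]}
candidates = elevated_in_all(samples, controls)  # only T1 passes in every sample
```

Requiring the threshold in every sample, rather than on average, is the stricter reading of "elevated at least threefold in all seven" BM samples.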
GLN3, ITGB3BP, MALAT1, and ITGB1BP1 were detected in the BM specimens from all WHIM17-derived mice, but not in the BM of mice derived from any other line. CD44 and ALCAM expression was detected in both mice that developed (WHIM17 and WHIM12) and mice that did not develop (WHIM23) metastatic disease. None of the non-tumor-bearing control mice with humanized mammary fat pads demonstrated expression of any of these genes (Fig. 3).
As shown in Fig. 4, global gene expression analysis of seven WHIM17 BM samples and the corresponding tumor and metastatic lesion from the 17-B-29 animal showed specific clusters of genes with upregulated expression in both the primary tumor and the metastatic lesion. Surprisingly, there was considerable variability in gene expression among BM samples from the WHIM17 animals that appeared independent of tumor passage, pattern of metastatic spread, RNA quality, or other technical parameters. Not surprisingly, the 17-B-29 BM showed the greatest resemblance to the primary tumor and metastatic lesion from the same animal, while four other BM samples shared a unique profile with a large number of transcripts that were overrepresented compared to primary tumor, metastasis, or other hGAPDH-positive BM samples.
We focused on clusters of 1979 unique "BM-specific" and 394 unique "metastasis-specific" transcripts to identify those that may be most relevant as biomarkers for the presence of DTCs. Additional file 3: Table S3 provides a complete list of those transcripts with significantly different expression between DTCs and primary tumors, and metastasis and primary tumor, while Tables 2 and 3 provide further filtered lists of those transcripts. SLPI appears in both gene sets, indicating its enhanced expression in DTCs as well as in metastatic tumor.

The molecular profiles of PDX DTCs are also found in BM from patients with breast cancer
Since gene expression patterns strongly suggested that cells in PDX mouse BM are derived from their primary xenograft tumor, and that there is a robust association between their presence and metastatic outcome, we next investigated whether gene expression in the BM of mouse PDX models could also be detected in the BM of patients with breast cancer, prior to the development of overt metastatic disease. Using previously generated BM gene expression microarray data from a cohort of treatment-naïve, clinical stage II/III patients with breast cancer and healthy female controls [8], we examined the expression of 420 unique genes from the PDX BM data set to determine whether they could detect differences between these populations. From the original set of 1979 "BM-specific" transcripts identified in the xenograft model, we derived a set of 420 transcripts that both could be mapped to human microarray expression data probe sets and that were annotated in PubMed citations with the key words "metastasis", "invasion", and "epithelial mesenchymal transition" (Fig. 1). Globally (Fig. 5), the WHIM17 BM gene set did not distinguish healthy female BM from BM of patients with breast cancer and had no ability to classify those patients who did or did not experience a distant metastatic event. However, we did identify a subset of 17 genes with expression that, after correcting for false discovery, could distinguish between healthy BM and BM from patients with breast cancer (Table 4, Fig. 1).
Given that these genes (1) are frequently associated with biological processes such as tumor cell proliferation, invasion, and metastasis; (2) are detectable only in BM from PDX mice that develop DTCs in their BM and distant organ metastases; and (3) are differentially expressed in the BM of patients with breast cancer, as compared to healthy human BM, we propose that they are excellent biomarker candidates for future prospective studies to evaluate whether they can stratify patients with breast cancer for risk of recurrent disease based upon detection and classification of BM DTCs.
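The text does not state which false discovery correction was applied to the patient BM comparison. As an illustration only, the sketch below implements the Benjamini-Hochberg step-up procedure, a common choice for microarray gene lists; its use here is an assumption, and the p-values shown are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values (healthy BM versus patient BM)
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q <= 0.05]
```

Applied to a 420-gene set, a procedure of this kind controls the expected proportion of false positives among the genes declared different, which is why only a small subset survives the correction.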

Discussion
The presence of DTCs in the BM of patients with early-stage breast cancer identifies patients at high risk of recurrence [1]. Clinically, it is not clear whether the DTCs detected in the BM are the sole population of cells that later develop into metastatic foci, whether they represent the systemic burden of micrometastatic disease, or a combination of both. Regardless, clinical data demonstrate that DTCs can persist through chemotherapy and their presence after chemotherapy identifies a patient population at very high risk of recurrence, relative to those patients who clear their DTCs with chemotherapy [5,27,28]. In spite of these findings, DTC detection has not become a routine part of breast cancer patient management, primarily due to the limited number of targetable biomarkers and lack of a cost-effective, robust assay for detection which yields molecular information [8]. If the DTC phenotype is representative of the micrometastatic disease that will eventually become metastatic foci, then eliminating DTCs with targeted therapeutic approaches could prevent recurrent disease and result in a survival benefit to patients with breast cancer. Characterizing these cells is essential as our understanding of the parallel progression of the primary tumor and DTCs improves [6,29].
To date, it has been difficult to study the role of DTCs in the metastatic cascade and to perform molecular characterization due to their rarity. However, several studies have shown that PDX models recapitulate the molecular phenotype and biological behavior, and are predictive of clinical response, of primary human tumors [9][10][11][30]. Several studies have characterized CTCs using PDX models in breast cancer to better understand the biology of this process [12,13] and recently, DTCs have been reported using PDX models [14]. In the report by Giuliano et al. [14], DTCs were detected in 62% of PDX mouse BM.
[Figure legend fragment: ... Table 1) and gene expression clusters specifically upregulated in primary tumor, metastasis, and a subset of BM samples are highlighted.]
In this study, we have focused on the combined use of PDX and patient BM samples to identify unique sets of gene transcripts that can both detect and classify breast cancer BM DTCs. A persistent limitation of this approach is the small number of stable PDX lines that can be created by implanting primary breast tumor tissues and, subsequently, the number that demonstrate metastatic behavior. Of the five lines investigated in this study, only two (WHIM12 and WHIM17) developed solid organ metastases. Importantly, these were also the only two lines in which human-specific gene expression (presumably emanating from DTCs) could be detected in BM, strongly supporting the idea that DTC establishment is causal to or at least associated with distant organ metastasis in PDX mouse models. Furthermore, only one PDX line (WHIM17), derived from a patient with triple-negative breast cancer, consistently showed patterns of human gene expression that were reminiscent of a "mesenchymal-like" phenotype in multiple animals across multiple passages. It is curious that WHIM17 was the only tumor xenograft line to persistently propagate BM micrometastasis, and it is recognized that the conclusions on gene expression biomarkers in human breast cancer BM samples may be necessarily constrained by this. Recently, Huang et al. [31] reported that later passages of the WHIM17 tumor resembled a lymphoproliferative malignancy and not breast adenocarcinoma, based on RNA sequencing and phosphoproteomic studies. Such evolution of genomic features of PDX tumors when the tumor is propagated in mice has been reported recently [32]. Nevertheless, data from Li et al., who also performed molecular profiling of earlier passages of this tumor [11] similar to the specimens used in the current study, suggest that, at least initially, WHIM17 is molecularly characteristic of other "basal-like" breast cancers.
Using the WHIM17 line, array-based gene expression profiling was performed to compare primary tumor tissue and a solitary metastatic lesion to multiple BM samples among different animals and different passages of the WHIM17 line. By using a human-specific, short-oligonucleotide array platform (i.e. Affymetrix GeneChips) and comparing WHIM17 BM samples with those of control mice, it was inferred that the majority of the 13,000+ transcripts identified (Additional file 3: Table S3) originated from human xenograft-derived tumor cells. Secondarily, by identifying those transcripts that were differentially expressed between the primary xenograft tumor and multiple WHIM17 BM samples, a list of candidate biomarkers of cells with high metastatic potential was created, which was further filtered and curated based upon association with published manuscripts related to metastasis biology (Fig. 1).
Among individual transcripts identified by this analysis were several genes known to be associated with the metastatic process (ALCAM, MALAT1), EMT (SNAI1, GSC, FOXC2, SIP1), and breast cancer stem cells (CD44). Of equal importance, the absence of epithelial-specific transcripts was a conspicuous feature of these analyses. While expression of human epithelial genes such as cytokeratins in the BM has been the cornerstone for the identification of DTCs as tumor-derived cells, the absence of these genes is not surprising. Tumor cells undergo significant transformation in their morphology and molecular profiles during each stage of the metastatic process [33]. Only those tumor cells that successfully adapt to the unfamiliar molecular environments after being released from the primary tumor into circulation can survive and grow at a different anatomical location [34]. The BM environment has been shown to enhance this process through the action of various stromal cell populations [35]. Therefore, the loss of epithelial and breast-tissue-specific features in these cells can be attributed to the possible molecular transition which enables these cells to migrate to and survive in the BM matrix.
The clinical relevance of the gene expression patterns identified in WHIM mouse BM was evaluated in the BM of treatment-naïve patients with breast cancer as well. Although the xenograft used to create the WHIM17 line was a "triple negative" tumor, given the small number of BM gene expression data sets available, we considered BM samples from all patients, regardless of primary tumor molecular phenotype, as a single group. A small number of gene transcripts (Table 4) were differentially detected in patient BM as compared to healthy female controls, and the biological role and potential drug targetability of many of these genes (such as CD44, CD33, GLIPR1, and HEPB1) have been previously demonstrated. Although we were not able to determine the number of human DTCs present in the xenograft animals in the current study, previous studies have demonstrated that gene-expression-based detection of DTCs in human bone marrow samples using qRT-PCR can detect as few as 1 in 1 × 10^6 cells when analyzing 10^7-10^8 nucleated cells from a 3-mL BM aspirate, depending upon the specific gene transcript analyzed [36]. Whether one or more of these transcripts can be routinely detected above background expression in normal human BM, and whether expression of these gene(s) is actually prognostic for metastatic recurrence, are questions currently being addressed. Although we were unable to analyze the expression of these genes in the peripheral blood of xenograft mice, it will also be interesting and clinically relevant to determine whether the expression of these genes can be detected in peripheral blood of patients with breast cancer, possibly providing a less invasive assay to predict early metastatic recurrence.
Our data would argue that DTCs derived from the primary tumor are mesenchymal-like and express genes associated with the metastatic process in animals and in humans. Presence of these cells was limited strictly to the BM of mice with metastatic tumor development and they were present in BM at a pre-metastatic time point. It has been suggested that the parallel progression of primary and metastatic tumors can occur simultaneously with early dissemination of cancer cells from the primary tumor [6,7,37]. Since the molecular features of the tumor cells remain consistent across passages, if DTCs were a general occurrence rather than associated with metastatic potential, we would expect to find DTCs in all animals from the same line irrespective of their metastatic outcome.
Our results support the use of PDX mice as a clinically relevant model to examine the molecular features of DTCs, their alteration over time, and whether elimination of BM DTCs using targeted therapies will result in abrogation of metastatic disease development.