Gene expression signatures of morphologically normal breast tissue identify basal-like tumors
© Finak et al.; licensee BioMed Central Ltd. 2006
Received: 17 July 2006
Accepted: 20 October 2006
Published: 20 October 2006
The role of the cellular microenvironment in breast tumorigenesis has become an important research area. However, little is known about gene expression in histologically normal tissue adjacent to breast tumor, if this is influenced by the tumor, and how this compares with non-tumor-bearing breast tissue.
To address this, we have generated gene expression profiles of morphologically normal epithelial and stromal tissue, isolated using laser capture microdissection, from patients with breast cancer or undergoing breast reduction mammoplasty (n = 44).
Based on these data, we determined that morphologically normal epithelium and stroma exhibited distinct expression profiles, but molecular signatures that distinguished breast reduction tissue from tumor-adjacent normal tissue were absent. Stroma isolated from morphologically normal ducts adjacent to tumor tissue contained two distinct expression profiles that correlated with stromal cellularity, and shared similarities with soft tissue tumors with favorable outcome. Adjacent normal epithelium and stroma from breast cancer patients showed no significant association between expression profiles and standard clinical characteristics, but did cluster ER/PR/HER2-negative breast cancers with basal-like subtype expression profiles with poor prognosis.
Our data reveal that morphologically normal tissue adjacent to breast carcinomas has not undergone significant gene expression changes when compared to breast reduction tissue, and provide an important gene expression dataset for comparative studies of tumor expression profiles.
Despite significant advances in breast cancer treatment, 26% of patients with early disease develop metastasis and succumb to the disease . None of the current prognostic indicators can reliably predict the outcome for such patients [2–6]. Microarrays have been widely used for expression profiling of breast cancer and other malignancies and, because of their genome-wide nature, they allow for the identification of gene expression changes that have occurred between normal and tumor breast tissues. Using these approaches, several studies have successfully identified breast cancer subtypes and prognostic markers; however, the utility of such markers in the clinic remains open [7–11].
The majority of studies focusing on breast have used heterogeneous material from whole tissue sections with a few exceptions where epithelial cells have been specifically isolated . The presence of loss of heterozygosity in normal stromal breast tissue adjacent to, and distant from, the tumor site has been demonstrated, suggesting that changes in stroma may have occurred . Since surgery is the standard of care, normal cells harboring alterations that may be relevant to cancer progression may remain and, thus, could have important clinical implications.
The normal human breast consists of ductal epithelium and surrounding stroma. The stroma consists of two compartments (intralobular stroma and extralobular stroma), accounts for more than 80% of the breast volume, and provides nutrition and structural support for the normal epithelium. Carcinoma of the breast, as well as benign hyperplastic conditions, are thought to originate from epithelial cells or progenitor epithelial cells of the terminal duct-lobular unit . However, growing evidence indicates that stroma may play an important role in cancer initiation and progression [15–17]. Little is known regarding gene expression profiles in morphologically normal breast stroma or epithelium adjacent to breast tumor tissue.
At the clinical level, normal tissue is defined as morphologically normal. Laser capture microdissection (LCM) allows one to isolate nearly pure cell populations from a heterogeneous environment, and the material is suitable for microarray gene expression analysis [12, 18, 19]. This approach has allowed the comparison of gene expression profiles between normal human breast epithelium and tumor tissue . Epithelium derived from regions of the breast adjacent to tumor, considered normal by all histological and clinical standards, has been shown to have a distinct gene expression profile from tumor tissue . However, in these cases sample sizes have been small when comparing reduction and adjacent tissue (n = 3 reduction samples) and, furthermore, stroma was not considered . Thus, knowledge of gene expression patterns in normal tissue would be invaluable to improve the precision of gene expression signatures for poor or good prognosis.
In the present study, LCM was used to dissect normal epithelium and normal stroma derived from patients undergoing breast reduction mammoplasty or surgical treatment of breast cancer. Gene expression profiles reveal that morphologically normal stroma and epithelium from breast cancer patients are not statistically distinct from epithelium and stroma isolated from reduction mammoplasties and do not possess gene expression changes associated with standard clinical characteristics.
Materials and methods
Clinical data were collected for the samples from the Breast Cancer Functional Genomics Group clinical database. Cellular and fibrotic stroma were identified by visual inspection of hematoxylin and eosin stained tissue sections under a microscope. Cellular stroma was defined as tissue with more than 1,000 stroma cells uniformly distributed throughout the field of view (4× magnification), while fibrotic stroma was defined as tissue with fewer than 800 stroma cells in the field of view (4× magnification), concentrated primarily around the ducts.
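The count thresholds above can be written as a small rule. This is a minimal sketch only: the function name `classify_stroma` and the "unassigned" label are hypothetical, and the sketch ignores the spatial criteria (uniform distribution versus concentration around the ducts), which were judged visually.

```python
def classify_stroma(cell_count):
    """Label a 4x field of view by stromal cell count.

    Thresholds come from the text; the spatial distribution of the
    cells, which is part of the definition, is not modeled here.
    """
    if cell_count > 1000:
        return "cellular"
    if cell_count < 800:
        return "fibrotic"
    # Counts from 800 to 1,000 are not covered by the stated definitions.
    return "unassigned"
```

Fields with 800 to 1,000 cells fall between the two definitions, so the sketch returns a separate label rather than forcing a class.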
Tissue collection and staining procedures
All tissue specimens and associated clinical data were collected at McGill University Health Center (Montreal, Canada) between 2000 and 2004 in accordance with the protocols approved by the research ethics committee. Patient consent was obtained on an individual basis for all patients participating in this study. Of 44 patients selected for the study, 34 patients had invasive ductal carcinoma and 10 were healthy donors undergoing reduction mammoplasty. Tissue samples were collected within 30 minutes after surgery, embedded in TissueTek OCT (Somagen, Edmonton, Alberta, Canada) and stored in liquid nitrogen until use. Frozen specimens were cryosectioned in 10-micron slices, stained using a hematoxylin and eosin staining protocol and dehydrated in ethanol and xylene as recommended by the LCM manufacturer (Arcturus, Mountain View, CA, USA). Following dehydration, the slides were air dried for 20 minutes and subjected to LCM. All normal tissues adjacent to tumor were microdissected from regions at least 2 mm away from tumor margins. Normal and adjacent stroma were sampled exclusively from the extralobular stromal compartment.
LCM, RNA extraction and linear amplification
All tissues included in this study were re-examined by a clinical pathologist dedicated to the project. Tissue specimens were microdissected into epithelium and stroma using a PixCell IIe LCM system (Arcturus). All microdissections were performed within three hours following tissue staining. Total RNA was extracted from each population of microdissected cells using a GITC (guanidinium isothiocyanate) extraction protocol. Briefly, LCM caps were incubated for 5 minutes (room temperature) in 200 μl GITC extraction buffer (4 M GITC, 25 mM sodium citrate pH 7.0, 0.1 M β-mercaptoethanol, 0.5% N-lauroylsarcosine) supplemented with 1.6 μl β-mercaptoethanol. Subsequently, 20 μl of 2 M NaOAc, pH 4.0, 220 μl of water-saturated phenol and 60 μl of chloroform-isoamyl alcohol (23:1) were added to the extraction buffer. Following 15 minutes incubation on ice and centrifugation (12,000 rpm, 15 minutes) the aqueous phase was removed and RNA was precipitated with 2 μl glycogen (GenHunter, Nashville, Tennessee, USA) and 200 μl isopropanol. Samples were placed at -80°C for 30 minutes and centrifuged at 4°C (12,000 rpm) for 30 minutes to pellet RNA. Pellets were washed with 70% ethanol, air dried and subjected to DNAseI treatment (Roche, Basel, Switzerland). DNAseI treatment was performed in the presence of an RNase inhibitor (Invitrogen, Carlsbad, California, USA). Subsequently, samples were re-extracted as described above and re-suspended in 10 μl of diethylpyrocarbonate-treated water. RNA was quantified using a RiboGreen assay (Molecular Probes, Carlsbad, California, USA). Subsequently, 2 to 4 ng of total RNA was subjected to two rounds of T7 linear amplification using Ambion Amino Allyl MessageAmp kit (Ambion, Austin, Texas, USA) and labeled with Cy3 and Cy5 dyes according to the manufacturer's procedure. 
Prior to microarray hybridizations, amplified products were quantified using a spectrophotometer (Nanodrop, Wilmington, Delaware, USA) and subjected to BioAnalyzer to assay for quality (Agilent Technologies, Santa Clara, California, USA).
Whole Human Genome 44 K arrays (Agilent Technologies, product G4112A) were used for all experiments. RNA samples (500 ng) were subjected to fragmentation followed by 18 h hybridization, washing, and scanning (Agilent Technologies, model G2505B) according to the manufacturer's protocol (manual ID #G4140-90030). Samples were hybridized against Universal Human Reference RNA (Stratagene, ID #740000, La Jolla, California, USA). Duplicate hybridizations were performed for all samples using reverse-dye labeling.
Candidate tissue markers were validated by immunohistochemistry. Frozen tissue sections (10 μm thick) were defrosted at room temperature for 30 s, fixed in acetone (room temperature, 10 minutes) and air dried for 2 minutes. Subsequently, tissue sections were blocked with Peroxidase Blocking Reagent (DakoCytomation, Glostrup, Denmark). Primary antibodies were diluted at 1:50 and 1:15 for anti-c-kit (polyclonal rabbit anti-human CD117, DakoCytomation), and anti-CD31 (polyclonal mouse anti-human, DakoCytomation) and applied to the tissue sections for 45 and 15 minutes, respectively. Following a brief wash with TBS-T (tris-buffered saline tween-20), secondary antibodies were applied for 30 and 20 minutes, respectively. Labeled polymer-HRP anti-rabbit (EnVision+ System HRP(DAB), DakoCytomation) was used as a secondary antibody for c-kit staining and labeled polymer-HRP anti-mouse (EnVision+ System HRP(DAB), DakoCytomation) for CD31 staining. After a short wash with TBS-T, DAB Substrat-Chromogen Solution (EnVision+® System HRP(DAB) DakoCytomation) was applied for up to 5 minutes for color development.
Data preprocessing, normalization, and quality control
Microarray data were feature extracted using Feature Extraction Software (v. 7.11) from Agilent with the default parameters. Raw data were uploaded to the NCBI Gene Expression Omnibus database (GEO) and are accessible as data series GSE4823. Outlier features on arrays were flagged by the software. Arrays were required to have an average raw signal intensity of 1,000 in each channel, and a signal to noise ratio above 16 per channel. MvA plots were examined for signs of hybridization or labeling problems. Replicate arrays were required to have a concordance above 0.944. This level was established empirically using sets of known good replicate arrays in our database.
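The quality control thresholds can be illustrated with a short sketch. Two assumptions are made here that the text does not state: concordance is computed as a Pearson correlation between replicate log-ratios, and dye-swapped replicates have already had their sign flipped. The function names are hypothetical.

```python
from statistics import mean

MIN_MEAN_INTENSITY = 1000  # average raw signal per channel
MIN_SNR = 16               # signal to noise ratio per channel
MIN_CONCORDANCE = 0.944    # replicate concordance


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def array_passes_qc(channel_means, channel_snrs):
    """Check the per-channel intensity and signal to noise thresholds."""
    intensity_ok = all(m >= MIN_MEAN_INTENSITY for m in channel_means)
    snr_ok = all(s > MIN_SNR for s in channel_snrs)
    return intensity_ok and snr_ok


def replicates_concordant(log_ratios_a, log_ratios_b):
    """Assumed concordance measure: Pearson correlation above the cutoff."""
    return pearson(log_ratios_a, log_ratios_b) > MIN_CONCORDANCE
```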
Data preprocessing and normalization were automated using the BIAS system . Raw feature intensities were background corrected using the RMA background correction algorithm [21, 22]. Resulting expression estimates were converted to log2-ratios. Within array normalization was performed using spatial and intensity-dependent loess . Median absolute deviation scale normalization was used to normalize between arrays .
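The log2-ratio conversion and the between-array scale step can be sketched as follows. This is not the BIAS implementation. It assumes the median absolute deviation is taken about each array's median and that arrays are rescaled to the geometric mean of the per-array deviations. Background correction and loess normalization are omitted.

```python
import math
from statistics import median


def log2_ratios(sample, reference):
    """Log2 ratio of sample to reference intensity for each feature."""
    return [math.log2(s / r) for s, r in zip(sample, reference)]


def mad(values):
    """Median absolute deviation about the median."""
    med = median(values)
    return median(abs(v - med) for v in values)


def scale_normalize(arrays):
    """Rescale each array of log-ratios so that all arrays share one MAD.

    The common target is the geometric mean of the per-array MADs
    (an assumption). An array with zero MAD would need special handling.
    """
    mads = [mad(a) for a in arrays]
    target = math.exp(sum(math.log(m) for m in mads) / len(mads))
    return [[v * target / m for v in a] for a, m in zip(arrays, mads)]
```

After scaling, every array has the same spread of log-ratios, so no single hybridization dominates distance-based clustering.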
Using class discovery under correlation distance and Euclidean distance metrics, 10,000 bootstrap iterations were performed to assess the significance of the observed clusters using the pvclust package for R. Multidimensional scaling was applied to reduce the dimensionality of the data and permit visualization. Chi-square tests and logistic regression were applied to discrete and continuous variables, respectively, to test for association with data partitions (clusters). The variables tested included estrogen receptor (ER) status, progesterone receptor (PR) status, lymph node (LN) status, HER2 receptor status, menopause status, age, grade, tumor size, and recurrence.
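For a binary clinical variable and two clusters, the chi-square test of association reduces to a 2 × 2 table. The sketch below computes the Pearson statistic and its p value with one degree of freedom, without a continuity correction (R's `chisq.test` applies one by default for 2 × 2 tables, so values may differ). The function name is hypothetical, and any counts passed to it in this illustration are made up rather than taken from the study.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p value (1 df) for a 2x2 table.

    table[i][j] is the number of samples in cluster i with status j.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```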
Both the linear models for microarray analysis (LIMMA) and significance analysis of microarrays (SAM) algorithms were used to identify differentially expressed gene sets from which to build class predictors [26–29]. Genes from LIMMA were filtered for significance (false discovery rate adjusted p value ≤ 0.01), fold change (≥2.0), and intensity above background (A > 6.0), while genes identified by SAM were filtered by significance (q ≤ 0.3), fold change (≥2.0), and intensity (A > 6.0).
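The LIMMA filtering step can be sketched as below. The text does not name the false discovery rate procedure, so Benjamini-Hochberg adjustment is assumed. The record keys (`name`, `p`, `log2fc`, `A`) are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def filter_limma_genes(genes, max_fdr=0.01, min_fold=2.0, min_a=6.0):
    """Keep genes passing the significance, fold change and intensity cutoffs.

    genes: list of dicts with keys 'name', 'p', 'log2fc' and 'A'.
    """
    fdr = benjamini_hochberg([g["p"] for g in genes])
    log2_cut = math.log2(min_fold)
    return [
        g["name"]
        for g, q in zip(genes, fdr)
        if q <= max_fdr and abs(g["log2fc"]) >= log2_cut and g["A"] > min_a
    ]
```

The same filter applies to SAM output by substituting the q value for the adjusted p value and raising the cutoff to 0.3.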
The prediction around medoids (PAM) algorithm was used to build predictors based on the filtered gene sets . Cross validation was used to test the predictors. This procedure included independent selection of candidate gene sets for each cross validation step. Differentially expressed genes were mapped onto Gene Ontology (GO), and GO terms were tested for overrepresentation using the hypergeometric distribution .
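The overrepresentation test for a single GO term can be sketched with the hypergeometric upper tail. The function and argument names are hypothetical. The sketch uses exact integer arithmetic and requires Python 3.8 or later for `math.comb`.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for a hypergeometric draw.

    N: genes in the background, K: background genes annotated with the term,
    n: differentially expressed genes, k: of those, genes carrying the term.
    """
    upper = min(K, n)
    total = comb(N, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total
```

A small p value means the term appears among the differentially expressed genes more often than random sampling from the background would produce. Testing many terms in this way calls for a multiple testing correction.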
Assessing patient specific gene expression effects
We wanted to assess the relative contribution of different factors to the overall variability of gene expression observed in our data. Principal component analysis allows one to succinctly summarize data in a reduced number of dimensions (principal components) . The principal components are ordered by the amount of variation (or signal) in the data that they explain. We performed principal component analysis on the patient matched adjacent stroma and epithelial data. Consecutive sequences of the first 10 principal components were tested for association with clinical characteristics using multivariate analysis of variance (MANOVA). Bonferroni multiple testing correction was applied to the resulting p values .
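Two small pieces of this procedure are easy to make concrete: the share of variation explained by each principal component, and the Bonferroni adjustment. The sketch assumes the eigenvalues of the covariance matrix are already available. The decomposition itself and the MANOVA are not shown, and the function names are hypothetical.

```python
def explained_variance_ratio(eigenvalues):
    """Fraction of total variation carried by each principal component."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


def bonferroni(p_values):
    """Bonferroni-adjusted p values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```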
Identification of tissue markers
LIMMA was used to identify differentially expressed genes between tissues in individual patients and obtain expression estimates for the matched data [28, 32]. Genes not exhibiting differential expression in at least 50% of samples were excluded from further analysis (B-statistic > 0). A paired t-test was used to identify genes whose patient-matched LIMMA expression estimates were significantly different from zero over the panel of patients (false discovery rate adjusted p value < 1e-5).
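The two marker criteria can be sketched per gene as below. This is only an illustration: the function names are hypothetical, the inputs are assumed to be the per-patient B-statistics and per-patient stroma-versus-epithelium log-ratio estimates, and the conversion of the t statistic to a p value (followed by false discovery rate adjustment) is omitted.

```python
import math
from statistics import mean, stdev


def passes_half_rule(b_statistics):
    """True if the gene has B > 0 in at least 50% of patients."""
    hits = sum(1 for b in b_statistics if b > 0)
    return hits >= 0.5 * len(b_statistics)


def paired_t_statistic(differences):
    """t statistic for the null hypothesis that the mean matched difference is zero.

    Has len(differences) - 1 degrees of freedom.
    """
    n = len(differences)
    return mean(differences) / (stdev(differences) / math.sqrt(n))
```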
Comparison with publicly available cancer datasets
The expression of gene signatures from a number of publicly available datasets was examined in normal tissue.
The stroma-specific and epithelium-specific gene lists identified by Allinen and colleagues  contained 231 and 97 unique genes, respectively, of which 189 and 89 were located (mapped) successfully on the Agilent chip. The activated and inactivated core serum response (CSR) genes from Chang and colleagues  contained 228 and 233 genes, respectively, of which 209 and 211 were mapped to the Agilent array. The intrinsic breast cancer gene list of Sorlie and colleagues  contained 553 genes, of which 473 were mapped to the Agilent array. The desmoid type fibromatosis (DTF) and solitary fibrous tumor (SFT) specific gene lists from West and colleagues  contained 493 and 293 genes, respectively, of which 415 and 238 were mapped to the Agilent array. Genes that were likely to be expressed in normal breast tissue were selected from these gene sets by selecting genes with variance >1 in the normal tissue data; 7.3% of genes in the normal dataset have variance >1, and enrichment for high variance genes in the various gene sets was measured by a χ2 goodness of fit test.
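The enrichment test for high variance genes can be sketched as a two-category χ2 goodness of fit test against the 7.3% background rate. The function name is hypothetical, and the counts used to call it would come from each mapped gene set; none are supplied here.

```python
import math


def high_variance_enrichment(k, n, background=0.073):
    """Chi-square goodness of fit (1 df) for k high variance genes among n.

    The expected counts assume the gene set behaves like the whole normal
    tissue dataset, where 7.3% of genes have variance > 1.
    """
    expected_high = n * background
    expected_low = n - expected_high
    stat = ((k - expected_high) ** 2 / expected_high
            + ((n - k) - expected_low) ** 2 / expected_low)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

The test is two-sided, so a significant result should be read together with the direction of the difference between observed and expected counts.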
Genes from the Agilent whole genome arrays were mapped to the Agilent 24 K arrays used in the Netherlands cancer dataset . The 24 K arrays used by Van de Vijver and colleagues  contained 24,498 features. Approximately 10,000 contigs on the 24 K array could not be mapped to GenBank identifiers. Of the remaining 14,339 identifiers, 12,112 were mapped to features on the 44 K Agilent array. Expression of the genes from the normal tissue signature was then examined in the 295 breast cancer samples from the Netherlands cancer dataset .
The GEO accession number of the array data series is GSE4823.
Identification of stroma- and epithelium-specific gene expression profiles
Summary of clinical characteristics of patients sampled for this study
Lymph node status
Age (mean ± SD): 52.18 ± 12.54
Tumor size (mean ± SD): 24.76 ± 14.06
To identify the genes responsible for the tissue-specific clustering observed in Figure 2a, class distinction was applied to identify all genes differentially expressed between tissues. Markers were defined based on patient matched stromal and epithelial samples (22 patients and 44 samples; see Materials and methods; Table 1). In total, 883 markers were identified that showed differential expression between matched epithelium and stroma in at least 50% of individual samples (LIMMA log odds >0), as well as differential expression between pooled epithelium and stroma samples (false discovery rate adjusted p value < 1e-5; Additional file 8). Using these markers, hierarchical clustering was applied to the complete sample set (44 patients, 66 samples), and resolved the samples into epithelial and stromal clusters, including correct classification of the three outlier samples (Figure 2b). These genes define a normal tissue gene expression signature.
Selected tissue markers identified for normal stroma and normal epithelium
Secreted frizzled-related protein 4
Amine oxidase, copper containing 3 (vascular adhesion protein 1)
Prostaglandin I2 synthase
TEK tyrosine kinase, endothelial (venous malformations, multiple cutaneous and mucosal)
Insulin-like growth factor binding protein 7
Collagen, type I, alpha 2
WNT1 inducible signaling pathway protein 2
CD36 antigen (collagen type I receptor, thrombospondin receptor)
Protein phosphatase 1, catalytic subunit, beta isoform
Human melanoma-associated antigen p97
TP53 apoptosis effector
Discoidin domain receptor family, member 1
Cadherin 1, type 1, E-cadherin (epithelial)
F11 receptor, junctional adhesion molecule 1
v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog
Keratinocyte associated protein 3
E74-like factor 5 (epithelium-specific Ets transcription factor 2)
Normal stroma and epithelial specific gene sets are not predictive of clinical characteristics
To determine whether gene expression patterns in normal breast epithelium or stroma derived from breast cancer patients can predict clinical or pathological features of the corresponding cancers, we applied a class prediction approach and constructed tissue specific predictors for ER, PR, HER2, grade, tumor size, age, menopause status, recurrence, and lymph node status (Additional files 2 and 3). We used cross validation at every step of predictor construction, including the initial step of candidate gene selection. None of the predictors had low prediction error or low variance, with an average 50% mean prediction error by cross validation (Additional files 2 and 3). This analysis demonstrated that any gene expression differences detected in normal epithelium and stroma were neither associated with, nor predictive of, the clinical characteristics of the primary tumors.
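Repeating gene selection inside every cross validation step matters because selecting genes on the full dataset first would leak information from the held-out sample into the predictor. The sketch below illustrates the idea with leave-one-out cross validation. It substitutes a plain nearest-centroid classifier for the PAM predictor and ranks genes by the absolute difference in class means, so it is not the method used in the study. It assumes two classes, each with at least two samples, and all function names are hypothetical.

```python
from statistics import mean


def select_genes(X, y, n_genes):
    """Rank genes by absolute difference in class means (training data only)."""
    class_a, class_b = sorted(set(y))
    scores = []
    for g in range(len(X[0])):
        mean_a = mean(row[g] for row, lab in zip(X, y) if lab == class_a)
        mean_b = mean(row[g] for row, lab in zip(X, y) if lab == class_b)
        scores.append((abs(mean_a - mean_b), g))
    scores.sort(reverse=True)
    return [g for _, g in scores[:n_genes]]


def centroids(X, y, genes):
    """Mean expression of the selected genes within each class."""
    return {
        label: [mean(row[g] for row, lab in zip(X, y) if lab == label) for g in genes]
        for label in set(y)
    }


def predict(sample, cents, genes):
    """Assign the class whose centroid is nearest in squared Euclidean distance."""
    def dist(centroid):
        return sum((sample[g] - v) ** 2 for g, v in zip(genes, centroid))
    return min(cents, key=lambda label: dist(cents[label]))


def loocv_error(X, y, n_genes):
    """Leave-one-out error with gene selection repeated inside every fold."""
    errors = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        genes = select_genes(train_X, train_y, n_genes)  # no peeking at sample i
        cents = centroids(train_X, train_y, genes)
        if predict(X[i], cents, genes) != y[i]:
            errors += 1
    return errors / len(X)
```

For a two-class problem, an error rate near 50% from this procedure means the predictor does no better than chance.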
Morphologically normal samples from different individuals are expected to show variations in gene expression due to a number of factors, including noise, differences in tissues, inter-individual variation, potential clinical differences, and the simple fact that different genes are expressed at different levels. Our goal was to identify the relative contribution of each of these sources of variation to our data (Additional file 12, panel A). Principal component analysis and multivariate analysis of variance revealed that the primary sources of variation in the data could be attributed to differences between tissues (Bonferroni corrected p = 7.9e-16, principal components 2 and 3), representing 3.98% of the variation between genes (Additional file 12, panel B), and differences between individuals (Bonferroni corrected p = 4.9e-6, principal components 3 through 8), representing 3.58% of the variation between genes. The majority of the variation in the data (84.58%) could be attributed to variations in expression between genes within a single sample. The strong correlation between arrays introduced by the common reference design of our experiment caused this variation to be common across all arrays (Additional file 13). Together, these effects accounted for 92.13% of the observed variation in the data. The remainder of the variation in gene expression was not associated with any known factors.
The normal epithelium and stroma expression set identify subtypes of breast carcinoma
The identification of gene expression profiles for morphologically normal stroma and epithelium provides unique datasets that can be used to investigate breast cancer datasets for similarity to the normal tissue profile in order to gain a better understanding of breast cancer expression profiles. When our stroma and epithelium profile was compared to a dataset established by a serial analysis of gene expression (SAGE) approach from dispersed cells from one reduction mammoplasty sample, we observed a minimal overlap. Our normal stroma signature (562 unique genes) showed only a 25 gene overlap with that generated by SAGE for a mixture of fibroblast, endothelial and myofibroblast cells (mapped 189 unique genes), and a 2 gene overlap with the epithelium signature (mapped 89 unique genes), whilst our normal epithelium signature (321 unique genes) overlapped by 12 genes with the epithelium signature identified by SAGE. Although the overlaps are statistically significant (p = 1.33e-15 and p = 9.07e-12, respectively, hypergeometric test), the relatively low overlap between the signatures may be due to use of only a single patient in the SAGE data when compared to 44 patients in our dataset and our filtering criteria. However, the fact that no genes are in common between the epithelial gene set and that of the fibroblast data obtained by SAGE supports the purity of both cell populations in these studies.
The smaller of the two clusters consisted of 38 samples, which were identified as ER negative, HER2 negative, and PR negative (Figure 9). This ER/HER2/PR negative cluster was found to express many normal and basal subtype specific genes as defined by Sorlie and colleagues , including keratin-5, keratin-17, and gamma-glutamyl hydrolase (GGH). Based on expression of these markers, we identified the samples in this cluster as consisting of basal-like and normal-like cancer subtypes as defined previously . The remaining ER negative samples in the cancer dataset were HER2 positive and were located in the larger sample cluster. Notably, the cluster of basal-like and normal-like samples remained when the data was clustered using only our normal epithelium-specific gene set, whereas the cluster was not observed when normal stroma-specific genes were used in clustering (data not shown). This indicated that the basal subtype-specific patient cluster was enriched in genes expressed in normal epithelium when compared with other tumor subtypes.
Normal stroma is similar to DTF tumors and fibroblasts with an inactivated core serum response
Few datasets have been generated for stroma, and this is the first extensive dataset to be generated from normal stroma. To determine whether our normal stroma data set resembled other gene expression profiles for fibroblasts, a core set of genes shown to be differentially regulated when fibroblasts are stimulated with serum was examined. We identified genes from the CSR profiles that were expressed in normal tissue (Additional files 6 (panel D) and 7) using a variance filtering criterion (see Materials and methods). Of the unstimulated fibroblast genes expressed in normal tissues, 84% were expressed in stroma and 16% in epithelium, whereas the majority of genes activated in wounding were not expressed in either tissue (Additional file 6, panel C). These results indicate that both normal adjacent stroma and normal reduction stroma have expression profiles more similar to unstimulated fibroblasts.
To investigate the similarity of our normal stromal profile to that of fibroblastic tumors, normal stroma and epithelium expression profiles were compared to the gene signatures of DTF and SFTs . Normal stroma samples expressed significantly more DTF-specific genes than expected by chance (p ≤ 2e-16, χ2 goodness of fit test), while the number of SFT-specific genes was marginally significant (p = 0.038, χ2 goodness of fit test) (Additional files 6 (panel A), and 7). Interestingly, normal stroma showed a statistically significant enrichment for expression of DTF-specific genes (p = 2.48e-5) (Additional file 6, panel B).
Knowledge of the normal breast microenvironment in which a cancer develops is important in understanding cancer biology. However, gene expression patterns of normal stroma and epithelium in human breast cancers have not been extensively studied. Although several studies have identified loss of heterozygosity in morphologically normal breast epithelium [45–47] and stroma [42, 48] derived from breast cancer patients, other studies have proposed that these changes were distinct from the co-existing cancer . Hence, it is unclear whether genomic alterations observed in morphologically normal breast tissues represent early precursors of breast cancer, markers of increased risk, or population based polymorphisms. In this paper, we present the most complete study to date of gene expression in normal breast tissues. Using LCM and whole genome microarray analysis we have characterized tissue-specific gene expression and identified markers of normal epithelium and stroma.
A primary goal of our study was to establish if a cancer-associated expression signature could be detected in morphologically normal breast tissues obtained from patients with breast cancer. Several approaches were used to address this question. First, we compared gene expression in morphologically normal tissue derived from breast cancer patients to that of healthy individuals undergoing breast reduction surgery. Second, we investigated if the pattern of gene expression in normal breast tissues derived from breast cancer patients was associated with clinical or pathological features of the corresponding cancer. A combination of class discovery, class distinction and class prediction approaches was used to analyze gene expression in microdissected epithelial and stroma samples (Figure 1). The results of this analysis demonstrate that microdissected samples clustered according to tissue type, and not according to the clinical or individual characteristics of the patients (Figures 2, 3 and 6). Moreover, our inability to identify statistically or biologically relevant predictors of the adjacent and reduction classes (Additional files 2 and 3) demonstrates that cancer-adjacent and breast reduction normal tissues have essentially homogeneous expression profiles. Furthermore, variations in gene expression between groups of samples are not associated with clinical characteristics but can be explained by tissue- and patient-specific variability. These data are in agreement with a previous study  that demonstrated a lack of significant differences between breast reduction and cancer-adjacent epithelium (three samples) using cDNA microarrays. In addition, our study now demonstrates a lack of significant differences between breast reduction and cancer adjacent stroma.
Notably, ER status, which is often the most important classifier of tumors, both clinically and at the molecular level [4, 10], did not associate with any clusters observed in normal stroma or epithelium, nor were we able to identify any predictors for this clinical category. Identical approaches of class distinction, class prediction, and class discovery failed to identify biologically relevant or statistically significant predictors, or clusters associated with any of the other clinical characteristics tested (Additional files 2 and 3). These results suggest that, at the level of global gene expression, there is no significant cancer-associated expression signature detectable in normal breast tissues. We cannot, however, completely rule out the possibility that some subtle changes are present but are obscured by other effects, such as patient variability, or technical limitations.
While we were unable to identify predictors of clinical characteristics, there were genes differentially expressed between some of these clinical characteristics. In most cases the functional categories that were overrepresented consisted mostly of metabolic pathways and processes. Class discovery in normal adjacent stroma revealed two statistically significant clusters associated with stromal cellularity. While we were unable to identify a predictor of stromal cellularity, the differentially expressed genes identified in the class distinction were overrepresented in a number of interesting functional categories, including branching morphogenesis, endocytosis, neurogenesis, and patterning of blood vessels. For example, NOTCH4, a receptor for the Notch pathway that has been shown to inhibit angiogenesis , was elevated in the pauci cellular fibrotic stroma cluster when compared to the higher cellularity stroma, while JAG1, a Notch ligand shown to induce angiogenesis in some head and neck tumors , was elevated in highly cellular stroma compared to pauci cellular fibrotic stroma. Since we have been careful to sample stroma from the extralobular compartment, it is unlikely that these differences represent extralobular and intralobular stroma. However, we cannot rule out that these may be differences between stromal compartments that have previously not been identified based on morphology.
Comparison of our data to published data sets reveals the similarity of normal stroma and epithelium expression signatures with previously published gene expression profiles of epithelium and collective fibroblasts, endothelium, and myofibroblasts isolated from reduction mammoplasty samples. Previous studies have examined the gene expression of cultured fibroblasts in response to serum and demonstrated that this expression program resembled that of a wound response as well as expression profiles from tumors with fibroblastic features. The serum/wound response expression profile was predictive of metastasis and progression in several carcinomas. Our normal breast stroma profile exhibits an expression pattern similar to unstimulated fibroblasts [44, 52] and resembles the DTF tumor signature more closely than the SFT signature. Since a DTF tumor profile has been shown to be associated with favorable outcome in breast tumors, the enrichment for DTF genes in our normal stroma profile is consistent with this finding.
Notably, clustering of a large breast cancer dataset with the normal stroma and epithelium profile identified two significant clusters of samples (Figure 8). The smaller of the two clusters consisted of 38 samples, which were all identified as ER negative, HER2 negative, and PR negative. This cluster expressed genes specific to basal-like and normal-like cancer subtypes, including keratin-5, keratin-17, and GGH. The remaining ER negative samples were contained within the larger cluster of 257 samples. This cluster was composed of ER negative/HER2 positive, and ER positive/HER2 negative samples, which are characteristic of HER2 positive and luminal cancer subtypes, respectively [11, 35]. Clustering of the cancer data using only epithelium specific genes led to repeated observation of a distinct basal-like cluster, whereas clustering using only stroma-specific genes led to co-clustering of the basal-like, ER positive, and HER2 positive tumors. This is in contrast to a recent report showing successful prognostic prediction in breast tumor microarray data using, amongst others, a stroma based signature. The stroma based predictor used in that study was the wound response signature (similar to the CSR response signature), which we have shown is not expressed in normal stroma. Consequently, the predictive genes of the CSR (and wounding) signature are not selected as part of the intrinsic normal stroma signature, and thus we do not see association with prognosis when clustering using the intrinsic normal stroma genes.
This study provides the first in-depth analysis of gene expression in morphologically normal epithelium and stroma adjacent to breast cancers as well as from reduction mammoplasty specimens. Analysis of the gene expression profiles revealed no significant differences between morphologically normal tissue from tumor-bearing breasts and tissue from reduction mammoplasties. The analysis of these expression profiles in other breast cancer datasets identifies a distinct HER2/ER/PR-negative subcluster that corresponds to a mixture of basal-like and normal-like cancer subtypes, and reveals molecular similarities between normal breast epithelium and basal-like breast tumors with poor outcome. Moreover, the lack of any cancer-associated patterns of gene expression in morphologically normal breast tissues will enhance our understanding of early changes involved in cancer initiation. Furthermore, these data provide a basis for the interpretation of breast cancer molecular profiling experiments and for the discovery of novel prognostic markers.
CSR, core serum response
DTF, desmoid type fibromatosis
LCM, laser capture microdissection
LIMMA, linear models for microarray analysis
PAM, prediction around medoids
SAGE, serial analysis of gene expression
SAM, significance analysis of microarrays
SFT, solitary fibrous tumor
TBST, tris-buffered saline tween-20.
We are grateful to D Cernea and N Bertos for comments on the manuscript, to H Chen and S Dumont for expert technical assistance and to Drs R Michel and D Haegert as well as D Hori, T Vilhena, L Pasyuk, and C Palko-Condron. This work was supported by operating grants from the Quebec Breast Cancer Foundation to MP and SM. GF was supported by a studentship from the CIHR McGill University Cancer Consortium Training Award in Cancer Research. SS was funded in part with a fellowship from the Cedars Foundation, and MP is a Canadian Institutes of Health Research Senior Scientist.
- Edwards BK, Brown ML, Wingo PA, Howe HL, Ward E, Ries LA, Schrag D, Jamison PM, Jemal A, Wu XC, et al: Annual report to the nation on the status of cancer, 1975–2002, featuring population-based trends in cancer treatment. J Natl Cancer Inst. 2005, 97: 1407-1427.
- Elston CW, Ellis IO: Pathological prognostic factors in breast cancer. I. The value of histological grade in breast cancer: experience from a large study with long-term follow-up. Histopathology. 1991, 19: 403-410.
- Henson DE, Ries L, Freedman LS, Carriaga M: Relationship among outcome, stage of disease, and histologic grade for 22,616 cases of breast cancer. The basis for a prognostic index. Cancer. 1991, 68: 2142-2149. 10.1002/1097-0142(19911115)68:10<2142::AID-CNCR2820681010>3.0.CO;2-D.
- Shek LL, Godolphin W, Spinelli JJ: Oestrogen receptors, nodes and stage as predictors of post-recurrence survival in 457 breast cancer patients. Br J Cancer. 1987, 56: 825-829.
- Torregrosa D, Bolufer P, Lluch A, Lopez JA, Barragan E, Ruiz A, Guillem V, Munarriz B, Garcia Conde J: Prognostic significance of c-erbB-2/neu amplification and epidermal growth factor receptor (EGFR) in primary breast cancer and their relation to estradiol receptor (ER) status. Clin Chim Acta. 1997, 262: 99-119. 10.1016/S0009-8981(97)06542-X.
- Brenton JD, Carey LA, Ahmed AA, Caldas C: Molecular classification and molecular forecasting of breast cancer: ready for clinical application?. J Clin Oncol. 2005, 23: 7350-7360. 10.1200/JCO.2005.03.3845.
- van 't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, et al: Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002, 415: 530-536. 10.1038/415530a.
- van de Vijver MJ, He YD, van't Veer LJ, Dai H, Hart AA, Voskuil DW, Schreiber GJ, Peterse JL, Roberts C, Marton MJ, et al: A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med. 2002, 347: 1999-2009. 10.1056/NEJMoa021967.
- Ma XJ, Wang Z, Ryan PD, Isakoff SJ, Barmettler A, Fuller A, Muir B, Mohapatra G, Salunga R, Tuggle JT, et al: A two-gene expression ratio predicts clinical outcome in breast cancer patients treated with tamoxifen. Cancer Cell. 2004, 5: 607-616. 10.1016/j.ccr.2004.05.015.
- Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, Pollack JR, Ross DT, Johnsen H, Akslen LA, et al: Molecular portraits of human breast tumors. Nature. 2000, 406: 747-752. 10.1038/35021093.
- Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, et al: Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci USA. 2001, 98: 10869-10874. 10.1073/pnas.191367098.
- Ma X-J, Salunga R, Tuggle JT, Gaudet J, Enright E, McQuary P, Payette T, Pistone M, Stecker K, Zhang BM, et al: Gene expression profiles of human breast cancer progression. Proc Natl Acad Sci USA. 2003, 100: 5974-5979. 10.1073/pnas.0931261100.
- Lakhani SR, Chaggar R, Davies S, Jones C, Collins N, Odel C, Stratton MR, O'Hare MJ: Genetic alterations in 'normal' luminal and myoepithelial cells of the breast. J Pathol. 1999, 189: 496-503. 10.1002/(SICI)1096-9896(199912)189:4<496::AID-PATH485>3.0.CO;2-D.
- Bissell MJ, Radisky D: Putting tumors in context. Nat Rev Cancer. 2001, 1: 46-54. 10.1038/35094059.
- McCawley LJ, Matrisian LM: Tumor progression: defining the soil round the tumor seed. Curr Biol. 2001, 11: R25-R27. 10.1016/S0960-9822(00)00038-5.
- Wiseman BS, Werb Z: Stromal effects on mammary gland development and breast cancer. Science. 2002, 296: 1046-1049. 10.1126/science.1067431.
- Wernert N: The multiple roles of tumor stroma. Virchows Arch. 1997, 430: 433-443. 10.1007/s004280050053.
- Sgroi DC, Teng S, Robinson G, LeVangie R, Hudson JR, Elkahloun AG: In vivo gene expression profile analysis of human breast cancer progression. Cancer Res. 1999, 59: 5656-5661.
- Luo L, Salunga RC, Guo H, Bittner A, Joy KC, Galindo JE, Xiao H, Rogers KE, Wan JS, Jackson MR, et al: Gene expression profiles of laser-captured adjacent neuronal subtypes. Nat Med. 1999, 5: 117-122. 10.1038/4806.
- Finak G, Godin N, Hallett M, Pepin F, Rajabi Z, Srivastava V, Tang Z: BIAS: Bioinformatics Integrated Application Software. Bioinformatics. 2005, 21: 1745-1746. 10.1093/bioinformatics/bti170.
- Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP: Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res. 2003, 31: e15. 10.1093/nar/gng015.
- Irizarry RA, Hobbs B, Collin F, Barclay YDB, Antonellis KJ, Scherf U, Speed TP: Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003, 4: 249-264. 10.1093/biostatistics/4.2.249.
- Smyth GK, Speed T: Normalization of cDNA microarray data. Methods. 2003, 31: 265-273. 10.1016/S1046-2023(03)00155-5.
- Yang YH, Buckley MJ, Speed TP: Analysis of cDNA microarray images. Brief Bioinform. 2001, 2: 341-349. 10.1093/bib/2.4.341.
- Suzuki R, Shimodaira H: Pvclust: an R package for assessing the uncertainty in hierarchical clustering. Bioinformatics. 2006, 22: 1540-1542. 10.1093/bioinformatics/btl117.
- Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, et al: Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004, 5: R80. 10.1186/gb-2004-5-10-r80.
- Ihaka RG: R: A Language for Data Analysis and Graphics. J Comput Graph Stat. 1996, 5: 299-314. 10.2307/1390807.
- Smyth GK: Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol. 2004, 3: Article3.
- Tusher VG, Tibshirani R, Chu G: Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA. 2001, 98: 5116-5121. 10.1073/pnas.091062498.
- Tibshirani R, Hastie T, Narasimhan B, Chu G: Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci USA. 2002, 99: 6567-6572. 10.1073/pnas.082099299.
- Raychaudhuri S, Stuart JM, Altman RB: Principal components analysis to summarize microarray experiments: application to sporulation time series. Pac Symp Biocomput. 2000, 5: 455-466.
- Smyth GK, Michaud J, Scott HS: Use of within-array replicate spots for assessing differential expression in microarray experiments. Bioinformatics. 2005, 21: 2067-2075. 10.1093/bioinformatics/bti270.
- Allinen M, Beroukhim R, Cai L, Brennan C, Lahti-Domenici J, Huang H, Porter D, Hu M, Chin L, Richardson A, et al: Molecular characterization of the tumor microenvironment in breast cancer. Cancer Cell. 2004, 6: 17-32. 10.1016/j.ccr.2004.06.010.
- Chang HY, Sneddon JB, Alizadeh AA, Sood R, West RB, Montgomery K, Chi JT, van de Rijn M, Botstein D, Brown PO: Gene expression signature of fibroblast serum response predicts human cancer progression: similarities between tumors and wounds. PLoS Biol. 2004, 2: E7. 10.1371/journal.pbio.0020007.
- Sorlie T, Tibshirani R, Parker J, Hastie T, Marron JS, Nobel A, Deng S, Johnsen H, Pesich R, Geisler S, et al: Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci USA. 2003, 100: 8418-8423. 10.1073/pnas.0932692100.
- West RB, Nuyten DSA, Subramanian S, Nielsen TO, Corless CL, Rubin BP, Montgomery K, Zhu S, Patel R, Boussard TH, et al: Determination of stromal signatures in breast carcinoma. PLoS Biol. 2005, 3: e187. 10.1371/journal.pbio.0030187.
- Luzzi V, Mahadevappa M, Raja R, Warrington JA, Watson MA: Accurate and reproducible gene expression profiles from laser capture microdissection, transcript amplification, and high density oligonucleotide microarray analysis. J Mol Diagn. 2003, 5: 9-14.
- Patel OV, Suchyta SP, Sipkovsky SS, Yao J, Ireland JJ, Coussens PM, Smith GW: Validation and application of a high fidelity mRNA linear amplification procedure for profiling gene expression. Vet Immunol Immunopathol. 2005, 105: 331-342. 10.1016/j.vetimm.2005.02.018.
- Rudnicki M, Eder S, Schratzberger G, Mayer B, Meyer TW, Tonko M, Mayer G: Reliability of t7-based mRNA linear amplification validated by gene expression analysis of human kidney cells using cDNA microarrays. Nephron Exp Nephrol. 2004, 97: e86-e95. 10.1159/000078642.
- Schneider J, Buness A, Huber W, Volz J, Kioschis P, Hafner M, Poustka A, Sultmann H: Systematic analysis of T7 RNA polymerase based in vitro linear RNA amplification for use in microarray experiments. BMC Genomics. 2004, 5: 29. 10.1186/1471-2164-5-29.
- de Bruin EC, van de Pas S, Lips EH, van Eijk R, van der Zee MM, Lombaerts M, van Wezel T, Marijnen CA, van Krieken JH, Medema JP, et al: Macrodissection versus microdissection of rectal carcinoma: minor influence of stroma cells to tumor cell gene expression profiles. BMC Genomics. 2005, 6: 142. 10.1186/1471-2164-6-142.
- Moinfar F, Man YG, Arnould L, Bratthauer GL, Ratschek M, Tavassoli FA: Concurrent and independent genetic alterations in the stromal and epithelial cells of mammary carcinoma: implications for tumorigenesis. Cancer Res. 2000, 60: 2562-2566.
- Simon R, Radmacher MD, Dobbin K, McShane LM: Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification. J Natl Cancer Inst. 2003, 95: 14-18.
- Chang HY, Sneddon JB, Alizadeh AA, Sood R, West RB, Montgomery K, Chi J-T, van de Rijn M, Botstein D, Brown PO: Gene expression signature of fibroblast serum response predicts human cancer progression: similarities between tumors and wounds. PLoS Biol. 2004, 2: E7. 10.1371/journal.pbio.0020007.
- Deng G, Lu Y, Zlotnikov G, Thor AD, Smith HS: Loss of heterozygosity in normal tissue adjacent to breast carcinomas. Science. 1996, 274: 2057-2059. 10.1126/science.274.5295.2057.
- Forsti A, Louhelainen J, Soderberg M, Wijkstrom H, Hemminki K: Loss of heterozygosity in tumor-adjacent normal tissue of breast and bladder cancer. Eur J Cancer. 2001, 37: 1372-1380. 10.1016/S0959-8049(01)00118-6.
- Ellsworth DL, Ellsworth RE, Liebman MN, Hooke JA, Shriver CD: Genomic instability in histologically normal breast tissues: implications for carcinogenesis. Lancet Oncol. 2004, 5: 753-758. 10.1016/S1470-2045(04)01653-5.
- Kurose K, Hoshaw-Woodard S, Adeyinka A, Lemeshow S, Watson PH, Eng C: Genetic model of multi-step breast carcinogenesis involving the epithelium and stroma: clues to tumor-microenvironment interactions. Hum Mol Genet. 2001, 10: 1907-1913. 10.1093/hmg/10.18.1907.
- Larson PS, de las Morenas A, Bennett SR, Cupples LA, Rosenberg CL: Loss of heterozygosity or allele imbalance in histologically normal breast epithelium is distinct from loss of heterozygosity or allele imbalance in co-existing carcinomas. Am J Pathol. 2002, 161: 283-290.
- Leong KG, Hu X, Li L, Noseda M, Larrivee B, Hull C, Hood L, Wong F, Karsan A: Activated Notch4 inhibits angiogenesis: role of beta 1-integrin activation. Mol Cell Biol. 2002, 22: 2830-2841. 10.1128/MCB.22.8.2830-2841.2002.
- Zeng Q, Li S, Chepeha DB, Giordano TJ, Li J, Zhang H, Polverini PJ, Nor J, Kitajewski J, Wang CY: Crosstalk between tumor and endothelial cells promotes tumor angiogenesis by MAPK activation of Notch signaling. Cancer Cell. 2005, 8: 13-23. 10.1016/j.ccr.2005.06.004.
- Chang HY, Nuyten DSA, Sneddon JB, Hastie T, Tibshirani R, Sorlie T, Dai H, He YD, van't Veer LJ, Bartelink H, et al: Robustness, scalability, and integration of a wound-response gene expression signature in predicting breast cancer survival. Proc Natl Acad Sci USA. 2005, 102: 3738-3743. 10.1073/pnas.0409462102.
- Fan C, Oh DS, Wessels L, Weigelt B, Nuyten DS, Nobel AB, van't Veer LJ, Perou CM: Concordance among gene-expression-based predictors for breast cancer. N Engl J Med. 2006, 355: 560-569. 10.1056/NEJMoa052933.
- Clarke RB, Howell A, Potten CS, Anderson E: Dissociation between steroid receptor expression and cell proliferation in the human breast. Cancer Res. 1997, 57: 4987-4991.
- Petersen OW, Hoyer PE, van Deurs B: Frequency and distribution of estrogen receptor-positive cells in normal, nonlactating human breast tissue. Cancer Res. 1987, 47: 5748-5751.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.