The CD44+/CD24- phenotype is enriched in basal-like breast tumors

Introduction Human breast tumors are heterogeneous and consist of phenotypically diverse cells. Breast cancer cells with a CD44+/CD24- phenotype have been suggested to have tumor-initiating properties with stem cell-like and invasive features, although it is unclear whether their presence within a tumor has clinical implications. There is also a large heterogeneity between tumors, illustrated by reproducible stratification into various subtypes based on gene expression profiles or histopathological features. We have explored the prevalence of cells with different CD44/CD24 phenotypes within breast cancer subtypes. Methods Double-staining immunohistochemistry was used to quantify CD44 and CD24 expression in 240 human breast tumors for which information on other tumor markers and clinical characteristics was available. Gene expression data were also accessible for a cohort of the material. Results A considerable heterogeneity in CD44 and CD24 expression was seen both between and within tumors. A complete lack of both proteins was evident in 35% of the tumors, while 13% contained cells of more than one of the CD44+/CD24-, CD44-/CD24+ and CD44+/CD24+ phenotypes. CD44+/CD24- cells were detected in 31% of the tumors, ranging in proportion from only a few to close to 100% of tumor cells. The CD44+/CD24- phenotype was most common in the basal-like subgroup – characterized as negative for the estrogen and progesterone receptors as well as for HER2, and as positive for cytokeratin 5/14 and/or epidermal growth factor receptor, and particularly common in BRCA1 hereditary tumors, of which 94% contained CD44+/CD24- cells. The CD44+/CD24- phenotype was surprisingly scarce in HER2+ tumors, which had a predominantly CD24+ status. A CD44+/CD24- gene expression signature was generated, which included CD44 and α6-integrin (CD49f) among the top-ranked overexpressed genes. 
Conclusion We demonstrate an association between basal-like and particularly BRCA1 hereditary breast cancer and the presence of CD44+/CD24- cells. Not all basal-like tumors and very few HER2+ tumors, however, contain CD44+/CD24- cells, emphasizing that a putative tumorigenic ability may not be confined to cells of this phenotype and that other breast cancer stem cell markers remain to be identified.


Introduction
Human breast cancer is a complex disease with large inter-tumoral and intra-tumoral heterogeneity, resulting in highly variable clinical behavior and response to therapy. How the heterogeneity of cells within a tumor is maintained is not fully understood. Possibly, every cell within a tumor has the capacity to proliferate and form new tumors, although the likelihood for each cell is very low. Alternatively, only a small subset of cells with distinct characteristics has the capacity to maintain tumor growth.
Abbreviations: CK = cytokeratin; DAB = diaminobenzidine; EGFR = epidermal growth factor receptor; ER = estrogen receptor; PgR = progesterone receptor; SR = steroid receptor; TMA = tissue microarray.
A population of CD44+/CD24-/low cells has been demonstrated to have tumor-initiating properties in breast cancer [1]. This tumorigenic phenotype has been associated with stem cell-like characteristics [2], with enhanced invasive properties [3], with radiation resistance [4] and with distinct genetic profiles suggesting correlation to adverse prognosis [5]. The prevalence of CD44+/CD24-/low cells within breast tumors, however, has not been significantly associated with clinical characteristics, although tumors with a higher fraction of CD44+/CD24- cells were more commonly found in patients diagnosed with distant metastases [6].
Breast cancers have been classified based on their gene expression profiles into luminal A and B, basal-like, HER2+ and normal breast-like subtypes [7][8][9][10]. These subtypes have been associated with diverse tumor characteristics and clinical outcome. The luminal subtypes are associated with expression of the estrogen receptor (ER), while basal-like and normal-like tumors are essentially all ER-negative, as are the majority of HER2+ tumors. Multiple studies have demonstrated basal-like tumors to have a particularly poor prognosis [8,9,11], although it is unclear whether basal-like tumors have a significantly worse clinical outcome than other ER-negative tumors [12]. Immunohistochemical features have been used to characterize basal-like tumors as typically negative for ER, for the progesterone receptor (PgR) and for HER2 but positive for basal cytokeratins (CK5/6/14/17), for epidermal growth factor receptor (EGFR) and/or for c-kit [13][14][15].
A correlation of the CD44+/CD24-/low phenotype to specific breast cancer subtypes has not yet been reported in human breast tumors. In the present article we have determined the expression of CD44 and CD24 in human breast tumors using double-staining immunohistochemistry and have correlated the presence of CD44+/CD24- cells to subgroups of breast cancer, classified using the expression of ER, PgR, HER2, CK5/14 and EGFR, as well as by mRNA expression profiles. We demonstrate an association of the CD44+/CD24- phenotype to basal-like and BRCA1 hereditary breast cancer.

Patients
We studied 240 tumors from a cohort of 445 patients surgically treated for stage II breast cancer (age 31 to 81 years), diagnosed in the South Swedish Health Care Region between 1985 and 1994 and originally participating in two randomized clinical trials [16,17]. All patients received 2 years of adjuvant tamoxifen treatment, without stratification according to ER status. The median follow-up time for patients alive and free from metastasis at the last follow-up visit was 5.3 years. The current study was approved by the Lund University Medical Ethics Committee.

Tumor characteristics
Fresh-frozen tumor tissue was used for routine determination of cytosolic ER and PgR, as well as the S-phase fraction, using an enzyme immunoassay and DNA flow cytometry, respectively, as described earlier [18,19]. Cores of 0.6 mm diameter formalin-fixed, paraffin-embedded tumor tissue were used to generate tissue microarrays (TMAs) for the 445 cases. Three cores from every individual tumor were arrayed. These TMAs have been used for immunohistochemical staining of CK5, CK14, EGFR and cytokeratin clone AE1/AE3 as described previously [12,20]. A pathologist re-evaluated the histological type on whole formalin-fixed paraffin-embedded sections [21].

BRCA1 tissue microarray
An additional TMA consisting of tumors from 23 BRCA1 germline mutation carriers, diagnosed in Sweden between 1980 and 2001, was used for evaluation purposes. This TMA was generated as described above and included two or three cores from each tumor.

Immunohistochemical staining
Sections (4 μm) of the TMA blocks were mounted on Dako REAL™ Capillary Gap Microscope Slides (DAKO, Glostrup, Denmark), were deparaffinized in xylene and were rehydrated in ethanol. Antigen retrieval was achieved either by placing slides in Tris-ethylenediamine tetraacetic acid buffer (pH 9.0) at 125°C in a 2100 Retriever (PickCell Laboratories, Amsterdam, the Netherlands) for 5 minutes (CD44/CD24), or by microwaving the slides in Tris-ethylenediamine tetraacetic acid buffer (pH 9.0) for 7 minutes at 800 W followed by 15 minutes at 350 W (HER2).
HER2 was detected with a rabbit polyclonal primary antibody (A0485, 1:1,000; DAKO) followed by EnVision™ on a Tech-Mate™ (DAKO). All slides were counterstained with hematoxylin for the identification of nuclei.

Immunohistochemical evaluation
The scoring was performed twice by one person in a blinded fashion. All unclear cases were discussed with a pathologist. In case of discrepant staining between the three cores from the same patient, an average was used.
HER2 scoring was carried out according to the standard procedure (DAKO): 0, < 10% of the tumor cells showed membranous staining; 1, > 10% of the tumor cells were positive but not circumferential; 2, weak staining around the whole membrane in > 10% of the tumor cells; and 3, strong staining around the membrane in > 10% of the tumor cells. CD44 staining was detected mainly in the membrane and the scoring was as follows: 0, 0% positive tumor cells; 1, 1% to 10% positive cells; 2, 11% to 50% positive cells; 3, 51% to 75% positive cells; and 4, 76% to 100% positive cells. CD24 staining was detected mainly in the cytoplasm and the scoring was performed as described for CD44.
The proportion of CD44+/CD24- tumor cells was determined as the percentage of cells positive for Permanent Red staining but negative for DAB staining. The frequencies of CD44-/CD24+ cells and of CD44+/CD24+ cells were determined in a similar fashion.
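The five-level CD44/CD24 scoring scale can be written out as a small lookup. The following Python sketch is illustrative only and is not the procedure used for the manual scoring: the function names are hypothetical, percentages are assumed to be rounded to whole numbers before binning, and values that round to 0% are assumed to score 0.

```python
def percent_to_score(percent_positive: float) -> int:
    """Map the percentage of positive tumor cells to the 0-4 scale in the text.

    Assumption: the percentage is rounded to a whole number first, so that
    the bins 1-10, 11-50, 51-75 and 76-100 cover every possible input.
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    pct = round(percent_positive)
    if pct == 0:
        return 0
    if pct <= 10:
        return 1
    if pct <= 50:
        return 2
    if pct <= 75:
        return 3
    return 4


def is_marker_positive(percent_positive: float) -> bool:
    """A tumor counts as positive for a marker when its score is at least 1."""
    return percent_to_score(percent_positive) >= 1


# Hypothetical example values, not data from the study.
for pct in (0, 5, 30, 60, 90):
    print(pct, percent_to_score(pct), is_marker_positive(pct))
```

The same function applies to CD44 and CD24, since both markers were scored on the same scale.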

Statistical analysis
Associations between the presence of CD44, CD24 or different CD44/CD24 phenotypes and clinical variables as well as breast cancer subgroups were assessed by Fisher's exact test, except for age where the Mann-Whitney U test was used. The Kaplan-Meier method was used to estimate distant disease-free survival, and the log-rank test to compare survival between two strata. All tests were two-sided and P < 0.05 was considered significant. Statistical analyses were carried out using Stata 10.0 software (Stata Corporation, College Station, TX, USA).
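For readers unfamiliar with Fisher's exact test, the two-sided P value for a 2 x 2 table can be computed directly from the hypergeometric distribution. The sketch below is a minimal standard-library illustration, not the Stata routine used in the study; the function name and the example counts are hypothetical, and the two-sided P value is taken as the sum over all tables that are no more probable than the observed one, which is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum all tables at most as likely as the observed one; the small
    # relative tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Hypothetical counts: rows = phenotype present/absent, columns = marker +/-.
print(fisher_exact_two_sided(3, 1, 1, 3))
```

In practice an established statistics package would be preferred; the sketch only makes the calculation behind the reported P values explicit.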

Microarrays and data analysis
For 168 of the 445 tumors in our cohort, mRNA expression analysis has previously been performed using cDNA microarrays with 27,648 reporters [22,23]. The microarray data for these 168 tumors are available through the Gene Expression Omnibus database (accession numbers GSE6577 and GSE5325). Data pre-processing and filtering for the selected 168 tumors were performed using the BioArray Software Environment [24] as previously described [23], leaving 15,040 reporters that were used for all subsequent analyses.
Three independent sets of gene signatures were used to further characterize the tumors. Genes were matched across datasets based on gene symbols, and matched genes were centralized in our dataset across the 168 tumors. Nearest centroid classifiers were used for the Sørlie and colleagues' [9] and the Hu and colleagues' [10] subtype classifications, with each tumor assigned to the centroid with which it had the highest Pearson correlation. The average expression level for the genes in a signature was used to characterize tumors for the Shipitsin and colleagues' gene signatures [5].
Hierarchical clustering was performed using MeV in the TM4 system [25] with complete linkage, Pearson correlation distance and gene centralization. Genes with different expression levels in tumors containing CD44+/CD24- cells and tumors lacking cells with this phenotype were identified using a two-sided t test. To account for multiple comparisons, the false discovery rate was calculated for gene lists. Sixty-nine of the 168 tumors with gene expression profiles were among the 240 tumors for which CD24 and CD44 immunohistochemical stainings were obtained.
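The two classification steps described above, nearest-centroid assignment by Pearson correlation and a signature score taken as the mean expression of the signature genes, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the analysis code: the function names, the gene symbols G1 to G4 and all expression values are hypothetical, and the profile is assumed to be already centered across tumors.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy))
    return numerator / denominator


def nearest_centroid(profile, centroids):
    """Return (subtype, r) for the centroid most correlated with the profile.

    Only gene symbols present in both the profile and the centroid are used.
    """
    best_subtype, best_r = None, float("-inf")
    for subtype, centroid in centroids.items():
        genes = sorted(set(profile) & set(centroid))
        r = pearson([profile[g] for g in genes], [centroid[g] for g in genes])
        if r > best_r:
            best_subtype, best_r = subtype, r
    return best_subtype, best_r


def signature_score(profile, signature_genes):
    """Average expression of the signature genes found in the profile."""
    values = [profile[g] for g in signature_genes if g in profile]
    return sum(values) / len(values)


# Hypothetical toy data; real centroids hold hundreds of intrinsic genes.
centroids = {
    "basal-like": {"G1": 1.0, "G2": -1.0, "G3": 0.5},
    "luminal A": {"G1": -1.0, "G2": 1.0, "G3": -0.5},
}
profile = {"G1": 0.8, "G2": -0.6, "G3": 0.1, "G4": 2.0}

print(nearest_centroid(profile, centroids))
print(signature_score(profile, ["G1", "G2"]))
```

A constant profile or centroid would give a zero denominator in `pearson`; the sketch does not guard against that case.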
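The text does not state how the false discovery rate was estimated. As one possible illustration, the Python sketch below applies the Benjamini-Hochberg adjustment to a list of per-gene P values; this choice of method is an assumption and may differ from the procedure actually used, and the function name and P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values (q values), in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Hypothetical P values from per-gene two-sided t tests.
p_values = [0.01, 0.04, 0.03, 0.5]
print(benjamini_hochberg(p_values))
```

Under this convention, the false discovery rate of a ranked gene list is read off as the largest q value among the genes included in the list.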

Immunohistochemical expression of CD44 and CD24
We analyzed the presence of CD44 and CD24 antigens on human breast cancer tissues using double-staining immunohistochemistry. The CD44 and CD24 expression was successfully determined in 240 cases after excluding tumors with scarce tissue on the TMA. These 240 tumors did not significantly differ from the excluded 205 tumors in regard to tumor size, lymph node status, S-phase fraction, ER status or disease-free survival. The median age was slightly higher in the included patients (64 years versus 62 years for excluded patients), although not reaching significance. (See Table 1.) CD44 showed mainly membranous staining (Permanent Red) with only six tumors displaying cytoplasmic staining, four of which also had membrane staining. Thirty-two percent (77/240) of the tumors had ≥ 1% cells with membranous and/or cytoplasmic CD44 expression and were considered CD44+. CD24 was almost exclusively detected in the cytoplasm, with only four tumors displaying membrane DAB staining (all four with positive cytoplasmic staining as well). Forty-six percent (110/240) of the tumors had ≥ 1% cells with CD24 staining and were considered CD24+.
The presence of CD44-/CD24+ tumor cells was solely associated with strong HER2 staining (P = 0.002) and not with any other tumor characteristics. The presence of double-positive (CD44+/CD24+, n = 15) tumor cells was not associated with any tumor features, although an increase of tumors of medullary type was indicated compared with tumors lacking cells with this phenotype (23% versus 2%).

CD44+/CD24- status and survival
We did not see any correlation between CD44+/CD24- status and distant disease-free survival, nor between CD44+/CD24- status and site of distant recurrence. Factors significantly correlated to favorable distant disease-free survival at 5-year follow-up (in this cohort of patients receiving adjuvant tamoxifen therapy) included positive ER and PgR status (P = 0.005 and P = 0.037, respectively) and negative CK5/14 and EGFR status (P = 0.035 and P = 0.005, respectively), but not HER2 and lymph node status, while tumor size and a high S-phase fraction reached marginal significance. Lymph node status, however, was correlated to distant disease-free survival in the entire cohort of 445 tumors (P = 0.005).

Definition of breast cancer subgroups by tumor markers and correlation with gene expression subtypes
The CD44+/CD24- phenotype was clearly related to certain tumor biological characteristics. To study this relation in more depth, we used five protein markers available for 232 out of the 240 cases with CD44/CD24 data to define five tumor subgroups. Tumors positive for ER and/or PgR were designated steroid receptor positive (SR+). The SR and HER2 status was used to broadly divide tumors into four subgroups: SR+HER2- (n = 150), SR+HER2+ (n = 14), SR-HER2+ (n = 24), and SR-HER2- (triple negative, n = 44). The latter group was further subdivided into a subgroup expressing basal CK5/14 and/or EGFR (SR-HER2- basal-like, n = 30), and a subgroup negative for all five markers (SR-HER2- nonbasal, n = 14).
Gene expression data were available for 69 of the 232 tumors, which allowed us to correlate our five subgroups, as defined by the five tumor markers, with breast cancer subtypes defined by gene expression profiling and intrinsic gene lists [9,10]. As seen in Figure 2, SR+HER2- tumors were clearly enriched for luminal A tumors, while SR-HER2- basal-like tumors corresponded prominently to the basal-like subtype defined by gene expression profiling. As expected, SR-HER2+ tumors correlated well with the HER2+ subtype. The SR-HER2- nonbasal subgroup showed no clear association with any tumor subtype, and only two tumors were SR+HER2+, making it difficult to draw any conclusions for this subgroup. We therefore conclude that tumor classification based on a combination of five commonly used tumor protein markers is biologically relevant for subgroup analysis in our tumor material.
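The subgroup definitions above amount to a short decision rule, sketched here in Python. The function name is hypothetical, and each marker is assumed to have already been dichotomized into positive or negative; the cutoffs used for that dichotomization are not part of the sketch.

```python
def classify_subgroup(er: bool, pgr: bool, her2: bool,
                      ck5_14: bool, egfr: bool) -> str:
    """Assign one of the five marker-defined subgroups described in the text."""
    # Steroid receptor positive (SR+) means ER and/or PgR positive.
    sr_positive = er or pgr
    if sr_positive:
        return "SR+HER2+" if her2 else "SR+HER2-"
    if her2:
        return "SR-HER2+"
    # Triple-negative tumors are split by basal marker expression.
    if ck5_14 or egfr:
        return "SR-HER2- basal-like"
    return "SR-HER2- nonbasal"


# Hypothetical marker profiles, not data from the study.
print(classify_subgroup(er=True, pgr=False, her2=False, ck5_14=False, egfr=False))
print(classify_subgroup(er=False, pgr=False, her2=False, ck5_14=True, egfr=False))
```

Note that basal markers are only consulted for tumors negative for ER, PgR and HER2, so an SR+ tumor expressing CK5/14 still falls into an SR+ subgroup.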

CD44/CD24 status in different breast cancer subgroups
The expression of CD44 and CD24 differed significantly between the subgroups (P = 0.001 and P = 0.035, respectively) (Table 3). CD44 was frequently expressed in the SR-HER2- basal-like subgroup, with 63% of the tumors being positive compared with 32% for the entire cohort, and was rarely expressed in the HER2+ groups, with only 14% and 17% of the tumors being positive in the SR+HER2+ and SR-HER2+ groups, respectively. CD24 was frequently expressed in the SR-HER2+ group (75% compared with 47% for the entire cohort).
The frequency of tumors positive for the CD44+/CD24- phenotype varied significantly between the different subgroups (P < 0.001, Table 3). While close to two-thirds (63%) of tumors resembling the basal-like subtype (SR-HER2- basal-like) contained CD44+/CD24- cells, this phenotype was rare in HER2+ tumors, regardless of SR status: 8% in SR-HER2+ tumors and 14% in SR+HER2+ tumors (Table 3). The divergence between subgroups remained when the cutoff level for CD44+/CD24- was raised to 10% or 50% (P = 0.003 and P = 0.033, respectively). The frequency of CD44+/CD24- cells within tumors positive for this phenotype was also higher in basal-like tumors, as indicated in Figure 2 for the 69 tumors with gene expression data.

CD44+/CD24- status in BRCA1-defective tumors
Since there was a clear correlation between the CD44+/CD24- status and the basal-like tumor subtype, we extended the immunohistochemical analysis to an additional TMA including tumors from BRCA1 hereditary breast cancer patients, known to be of predominantly basal-like phenotype [9]. A basal-like status was verified for nine of our BRCA1 hereditary tumors for which gene expression data were available.
Seventeen of the 23 BRCA1-defective tumors were successfully stained for CD44 and CD24 expression. The frequency of tumors with different proportions of CD44+/CD24- cells is presented in Table 4. Ninety-four percent (16/17) of the tumors were positive for this phenotype, thus corroborating the finding of a high proportion of CD44+/CD24- cells in the SR-HER2- basal-like subgroup (Table 4). BRCA1 germline mutant tumors, however, also had a high proportion (70%) of cells positive for the CD44-/CD24+ phenotype, considerably higher than the 40% seen among SR-HER2- basal-like tumors.

Correlation to published prognostic gene signatures
We correlated the presence of CD44+/CD24- cells to two gene signatures with prognostic value specific for either CD44+ or CD24+ breast cancer cells published by Shipitsin and colleagues [5]. Signatures A and B are associated with shorter and longer distant recurrence-free survival time, respectively. A positive correlation of CD44+/CD24- cells to signature A was seen if the cutoff level for CD44+/CD24- was 50% or 75% positive cells (P = 0.05 and P = 0.008, respectively) but not when using lower cutoff levels. A negative correlation to signature B was seen when using cutoff levels of 0%, 10% or 75% positive cells (P < 0.001, P < 0.001 and P = 0.01, respectively).

Gene expression signature of the CD44+/CD24- phenotype
To identify a gene expression signature for the CD44+/CD24- phenotype, we used the 69 tumors with gene expression data and looked for genes differentially expressed between tumors containing CD44+/CD24- cells and tumors lacking such cells. The top 20 genes are displayed in Table 5. Interestingly, CD44 emerged as the second-ranked gene, demonstrating a good correspondence between protein and mRNA levels for this gene, although the top 20 genes collectively had a relatively high false discovery rate of 28%.

Discussion
The concept of cancer stem cells relies on the presence of a subpopulation of cells within tumors that drives tumorigenesis, as well as giving rise to a large population of differentiated progeny that constitute the bulk of the tumor but lack tumorigenic potential [26]. Multiple studies indicate that CD44+/CD24- breast cancer cells have tumor-initiating properties [1][2][3].
In the present study, we have explored the dual expression of CD44 and CD24 in a sample of 240 stage II breast tumors with specific regard to breast cancer subtypes. The CD44 staining was almost exclusively membranous, which is concordant with prior literature [27], while CD24 predominantly stained the cytoplasm. Earlier publications have shown either membranous and/or cytoplasmic CD24 staining [28,29]. Bircan and colleagues observed a cytoplasmic CD24 staining pattern in neoplastic breast tissue while it was mainly detected in the cell membrane in normal breast [30]. Intracytoplasmic CD24 expression has been suggested to reflect overexpression of the protein or disturbance of the protein distribution or degradation in neoplastic cells [31]. It is reasonable to suspect that cells with a cytoplasmic CD24 pattern also express CD24 protein on the cell surface. The specificity of the CD24 antibody has earlier been ascertained by flow cytometric analysis [32].
Overall, we saw a large heterogeneity of CD44 and CD24 expression between tumors, but also within tumors where the proportion of positive cells varied considerably. Interestingly, tumor cells were mostly positive for either CD44 or CD24 and rather few tumors contained double-positive cells, although it was quite common that individual tumors contained both CD44+ and CD24+ cells. A recent study by Shipitsin and coworkers indicated that CD24+ and CD44+ cells within breast carcinomas represent defined cell populations with distinct genetic profiles [5]. They showed that CD24+ cells were more differentiated while CD44+ cells had more progenitor-like features, suggesting that CD24+ cells might be derived from CD44+ cells. Findings from our study could support this hypothesis, in that the variable presence of tumor cells that are largely either CD44+ or CD24+ may reflect the current state of a tumor undergoing constant cell renewal, differentiation and death at a pace defined by their intrinsic machinery and interaction with surrounding stroma.
In contrast to the results of Al-Hajj and colleagues, who demonstrated CD44+/CD24- cells in all their breast cancer samples [1], we only detected cells with this phenotype in 31% of our tumors. This discordance could depend on their study involving mainly metastatic tissues, including only one primary tumor. Metastatic tumor cells may have a more stem cell-like
We found that the CD44+/CD24- status was associated with low/negative HER2 expression and with elevated expression of CK5/14 and EGFR, as well as with medullary histological type, all known characteristics of the basal-like subtype of breast cancer. This motivated further analysis of the prevalence of CD44+/CD24- cells in different tumor subtypes.
Since previous studies have demonstrated a good resemblance of subgroups defined by common tumor markers to molecular subtypes defined by mRNA expression patterns [13,14], we used cytosolic protein levels of ER and PgR and used TMA immunostaining of HER2, CK5/14 and EGFR to classify the material into five tumor subgroups. Using available gene expression data for 69 of the tumors and using published intrinsic gene lists [9,10], we could demonstrate a reasonable correlation to molecular subtypes for three of our five subgroups, justifying a subgroup analysis of CD44+/CD24- expression. We could thereby demonstrate an association between the presence of CD44+/CD24- tumor cells and a basal-like subgroup of breast cancer. This finding is consistent with a recent publication where Sheridan and colleagues observed a correlation between breast cancer cell lines with a basal/myoepithelial origin and CD44+/CD24- expression [3].
Moreover, we observed that basal-like tumors often had a higher proportion of CD44+/CD24- cells, while tumors of other subtypes that contained CD44+/CD24- cells generally had a lower number of cells with this phenotype. This observation corroborates prior work indicating that basal-like tumors have a greater stem cell-like phenotype [35]. These tumors may originate from the most primitive ER-negative stem/progenitor cells, suggesting a block in differentiation upstream of ER-positive progenitor cells [36]. Tumors developing in BRCA1 germline mutation carriers are typically of basal-like subtype, possibly due to the critical role BRCA1 plays in the differentiation of ER-negative stem/progenitor cells to ER-positive luminal cells [37]. Accordingly, we found that all but one of the 17 BRCA1 tumors contained CD44+/CD24- cells, further illustrating the correlation between CD44+ status and basal-like/BRCA1-like tumors.
Interestingly, the HER2+ tumor subtype, generally considered an aggressive form of breast cancer, displayed a low frequency of tumors containing CD44+/CD24- cells and was in fact often positive for the CD44-/CD24+ phenotype. CD24 expression has been shown to contribute to the more differentiated state of committed cells [38], but has also been associated with rapid cell spreading, increased cell motility and invasion [39]. In normal breast epithelium, basal/myoepithelial cells but not luminal epithelial cells express CD44 [27] while CD24 is highly expressed in luminal cells [40]. The predominantly CD24+ phenotype of HER2+ tumors may reflect the origin for at least some of these tumors from a CD24+ luminal epithelial cell type [40].
We did not see any association between the CD44+/CD24- status and markers known to be important for the clinical outcome, including tumor size, nodal status or S-phase fraction. These results are partly in accordance with previous observations by Abraham and colleagues [6], although these authors reported that tumors with a higher fraction of CD44+/CD24- cells were more commonly found in patients diagnosed with distant metastases. We saw no such trend in our material, regardless of whether analyzing the whole material or different subgroups separately.
Since all patients in our study received adjuvant tamoxifen therapy, the lack of correlation to survival should, however, be interpreted cautiously. We therefore evaluated prognostic gene signatures specific for either CD44+ cells (signature A) or CD24+ cells (signature B) published by Shipitsin and colleagues [5], shown to be associated with short and long distant recurrence-free survival, respectively, in patients not undergoing adjuvant systemic therapy. Interestingly, we saw a significant positive correlation of tumors with a high (> 50%) proportion of CD44+/CD24- cells to signature A, and a significant negative correlation between the presence of CD44+/CD24- cells and signature B. This observation indicates that a high proportion of CD44+/CD24- cells could be a marker of aggressive phenotype also in our material, although we saw no correlation with prognosis.
We detected a gene expression signal for tumors containing CD44+/CD24- cells. Although the false discovery rate was high, the signature is strengthened by the fact that CD44 occurs as the second-ranked gene. Interestingly, α6-integrin also appears as one of the top overexpressed genes. This gene (also known as CD49f) has previously been used to identify mammary epithelial stem cells [41] and was recently demonstrated to be necessary for tumorigenicity of MCF7 breast cancer cells [42].

Conclusion
Our results demonstrate a clear variation in the prevalence of CD44+/CD24- tumor cells between breast tumors of different subtypes. The occurrence of this phenotype is high in basal-like tumors, and especially in BRCA1 hereditary tumors, is lower in tumors of luminal type and is particularly low in the HER2+ tumors, irrespective of ER status. These results emphasize the biological heterogeneity of breast cancer and an enrichment of putative tumor-initiating cells in the aggressive basal-like tumor subtype. Far from all basal-like tumors contain CD44+/CD24- cells, however, and their scarcity in HER2+ tumors suggests that tumorigenicity may not be confined to cells of this phenotype and that other markers remain to be identified. Moreover, the obvious heterogeneity of cells with various CD44/CD24 expression within individual tumors may be indicative of a cancer stem cell subpopulation giving rise to more differentiated and committed cell populations. This by no means excludes the coexistence of cancer cell clones of independent origin, evolution and tumorigenic ability.