

  • Poster Presentation
  • Open Access

Subclassification and molecular characterization of early stage breast carcinomas using Applied Biosystems Human Genome Survey Microarrays

Breast Cancer Research 2005, 7(Suppl 2):P4.27



  • Array Platform
  • Predictor Gene
  • Basal Subtype
  • Intrinsic Gene
  • Supervise Analysis


Gene expression profiling has been used to define molecular phenotypes of complex diseases such as breast cancer. Of the five breast tumor subtypes originally identified by Perou and colleagues [1], Luminal A and Basal have been repeatedly identified and validated as the two main subtypes. These two subtypes also differ significantly in clinical outcome: patients with Luminal A tumors had significantly longer overall survival and lived considerably longer before relapse or metastatic disease, whereas patients with Basal tumors had the shortest overall survival and much shorter disease-free intervals [2, 3]. To further substantiate the prognostic value of such expression-based phenotypes in the diagnosis and prognosis of breast cancer, we report here an extended study on the identification and molecular characterization of clinically relevant subtypes in early stage breast carcinomas.


In this study, we profiled 20 biopsy tissues from early stage breast carcinomas using Applied Biosystems Human Genome Survey Microarrays, a relatively new array platform containing 31,700 60-mer oligonucleotide probes that represent 27,868 individual human genes, with single-color chemiluminescence detection. To identify the subtypes in these tumors, we first carried out a centroid correlation analysis coupled with unsupervised hierarchical clustering. We used the 'intrinsic' gene list of 534 genes that was used to define the five subtypes of breast tumors and their core expression centroids in 122 previously published breast tumor samples [3]. Using the 526 intrinsic genes that could be mapped, we computed Pearson's correlation coefficient between each sample in this study and each of the five centroids, and assigned each sample to the subtype with which it showed the highest correlation. As a second approach, we applied a supervised analysis using the 'Nearest Shrunken Centroid' classifier implemented in the PAM software [4]. We took the 122 previously published Norway/Stanford tumor samples and the 526 mapped intrinsic genes as the training set to identify the predictor genes for the five subtypes. We then used this classifier to predict the subtype of each of the 20 early stage carcinomas analyzed in this study. The same analyses were applied to parallel datasets generated on Stanford cDNA Arrays and Agilent Human Whole Genome Arrays. Welch ANOVA with Benjamini and Hochberg false discovery rate multiple-testing correction was performed to identify the 'signature' genes that are most differentially expressed between the subtypes. PANTHER™ protein classification analysis (Applied Biosystems, Foster City, CA, USA) [5, 6] and PathArt™ (Jubilant Biosys Ltd) pathway analysis were carried out to identify the molecular mechanisms underlying these signature genes. A minimal set of genes that best discriminated the two identified subtypes was determined using PAM analysis on the combined datasets generated on the three array platforms.
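The centroid correlation step described above can be illustrated with a minimal sketch. It is not the code used in the study: the function names, the toy data, and the noise level are assumptions made for illustration, and the five subtype labels follow the nomenclature of the cited papers [2, 3] rather than anything listed in this abstract. Each sample is compared with every subtype centroid over the shared gene set and assigned to the subtype with the highest Pearson correlation.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def assign_subtype(sample, centroids):
    """Return the best-matching subtype and all correlations.

    sample    -- expression values over the shared gene set
    centroids -- dict mapping subtype name to its centroid over the same genes
    """
    correlations = {name: pearson(sample, c) for name, c in centroids.items()}
    best = max(correlations, key=correlations.get)
    return best, correlations


# Toy data (hypothetical): random centroids over 526 genes, plus one
# simulated sample built as the Basal centroid with added noise.
rng = random.Random(0)
n_genes = 526
subtype_names = ["Luminal A", "Luminal B", "ERBB2+", "Basal", "Normal-like"]
centroids = {
    name: [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
    for name in subtype_names
}
sample = [value + rng.gauss(0.0, 0.3) for value in centroids["Basal"]]

best, correlations = assign_subtype(sample, centroids)
print(best)
for name in subtype_names:
    print(f"{name}: r = {correlations[name]:.3f}")
```

With real data, the centroids would come from the published training set and the sample vectors from the arrays, both restricted to the same mapped genes and in the same gene order.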


Both unsupervised and supervised analyses identified the two main clinically relevant subtypes of breast cancer, Luminal A (correlated with a relatively good outcome) and Basal-like (correlated with the poorest outcome). The identification of the Luminal A and Basal subtypes in these early stage breast carcinomas was further validated by parallel data generated on Stanford cDNA Arrays and Agilent Human Whole Genome Arrays. Statistical analysis identified 1210 signature genes characterizing the two subtypes of breast cancer. Protein function and biological pathway analysis of these signature genes revealed different molecular mechanisms for the two expression-based subtypes: genes involved in fatty acid metabolism and steroid hormone-mediated signaling pathways, in particular estrogen receptor signaling, were over-represented among the signature genes of the Luminal A subtype, while genes involved in cell proliferation and differentiation, the p21-mediated pathway, and the G1-S checkpoint of cell cycle signaling pathways were over-represented among the signature genes of the Basal-like subtype. Finally, using PAM analysis on the combined data from the three array platforms, we identified a minimal set of 59 predictor genes that best discriminate and characterize the Luminal A and Basal subtypes. These predictor genes were further verified by TaqMan® expression assays.
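The Benjamini and Hochberg correction used to select signature genes can be sketched as follows. This is an illustrative implementation, not the software used in the study. It assumes the per-gene p-values from the Welch test have already been computed elsewhere, and the 0.05 cutoff in the example is a hypothetical choice, since the abstract does not state the threshold that was applied.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Gene indices sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values from a two-group Welch test.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)

fdr_cutoff = 0.05  # assumed threshold, not taken from the abstract
signature = [i for i, q in enumerate(adj_p) if q <= fdr_cutoff]
print(adj_p)
print(signature)
```

Genes whose adjusted p-value falls at or below the chosen cutoff would form the signature list under this scheme.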


We have identified and validated the two previously defined clinically relevant subtypes, Luminal A and Basal, in early stage breast carcinomas. This finding further substantiates the prognostic value of such expression-defined phenotypes in breast cancer at an earlier stage. The signature genes characterizing these two subtypes also revealed that distinct molecular mechanisms are already preprogrammed at an early stage in the different subtypes of the disease. Our results provide further evidence that these breast tumor subtypes represent biologically distinct disease entities and may require different therapeutic strategies. Finally, the set of 59 predictor genes identified in this study, validated on multiple gene expression platforms, defines potential prognostic molecular markers for breast cancer.

Authors’ Affiliations

Applied Biosystems, Foster City, California, USA
Department of Genetics, The Norwegian Radium Hospital, Montebello, Oslo, Norway
Celera Genomics, Rockville, Maryland, USA


  1. Perou CM, Sørlie T, Eisen MB, et al: Nature. 2000, 406: 747-752. 10.1038/35021093.
  2. Sørlie T, Perou CM, Tibshirani R, et al: Proc Natl Acad Sci USA. 2001, 98: 10869-10874. 10.1073/pnas.191367098.
  3. Sørlie T, Tibshirani R, Parker J, et al: Proc Natl Acad Sci USA. 2003, 100: 8418-8423. 10.1073/pnas.0932692100.
  4. Tibshirani R, Hastie T, Narasimhan B, et al: Proc Natl Acad Sci USA. 2002, 99: 6567-6572. 10.1073/pnas.082099299.
  5. PANTHER™ Classification System. []
  6. Thomas PD, Kejariwal A, Campbell MJ, et al: Nucleic Acids Res. 2003, 31: 334-341. 10.1093/nar/gkg115.


© BioMed Central 2005