Human basal-like breast cancer is represented by one of the two mammary tumor subtypes in dogs

Background
About 20% of breast cancers in humans are basal-like, a subtype that is often triple-negative and difficult to treat. An effective translational model for basal-like breast cancer (BLBC) is currently lacking and urgently needed. To determine whether spontaneous mammary tumors in pet dogs could meet this need, we subtyped canine mammary tumors and evaluated the dog–human molecular homology at the subtype level.

Methods
We subtyped 236 canine mammary tumors from 3 studies by applying various subtyping strategies to their RNA-seq data. We then performed PAM50 classification with canine tumors alone, as well as with canine tumors combined with human breast tumors. We identified feature genes for human BLBC and luminal A subtypes via machine learning and used these genes to repeat canine-alone and cross-species tumor classifications. We investigated differential gene expression, signature gene set enrichment, expression association, mutational landscape, and other features for dog–human subtype comparison.

Results
Our independent genome-wide subtyping consistently identified two molecularly distinct subtypes among the canine tumors. One subtype is mostly basal-like and clusters with human BLBC in cross-species PAM50 and feature gene classifications, while the other subtype does not cluster with any human breast cancer subtype. Furthermore, the canine basal-like subtype recaptures key molecular features (e.g., cell cycle gene upregulation, TP53 mutation) and gene expression patterns that characterize human BLBC. It is enriched in histological subtypes that match human breast cancer, unlike the other canine subtype. However, about 33% of canine basal-like tumors are estrogen receptor negative (ER−) and progesterone receptor positive (PR+), which is rare in human breast cancer. Further analysis reveals that these ER−PR+ canine tumors harbor additional basal-like features, including upregulation of genes in the interferon-γ response and the Wnt-pluripotency pathway. Interestingly, we observed an association of PGR expression with gene silencing in all canine tumors and with the expression of T cell exhaustion markers (e.g., PDCD1) in ER−PR+ canine tumors.

Conclusions
We identify a canine mammary tumor subtype that molecularly resembles human BLBC overall and thus could serve as a vital translational model of this devastating breast cancer subtype. Our study also sheds light on the dog–human difference in the mammary tumor histology and the hormonal cycle.

Supplementary Information
The online version contains supplementary material available at 10.1186/s13058-023-01705-5.

A-C. K-means (A), consensus clustering (ConsensusClusterPlus) (B), and permutation-based hierarchical clustering (pvclust) (C) were applied using the top 10% most variably expressed genes in samples of the discovery set (top plots) or the validation set (bottom plots). These approaches all yield two subtypes, the same as the NMF strategy shown in Fig. 1.
D. The proportions of samples that each approach in A-C assigned to the same subtype as the NMF strategy shown in Fig. 1.
E-F. Heatmaps of the discovery (E) and validation (F) sets as presented in Fig. 1, with NMF clustering conducted on the top 2,000 most variably expressed genes within each cohort.
I. An example of cross-species PAM50 classification using canine and human microarray data. Human tumors (30 tumors per PAM50 subtype) were randomly sampled from a microarray data study [3] as described in the Methods section. The figure is presented as described for Fig. 2B.
J. Principal component analysis (PCA) plots using machine learning (ML)-selected feature gene sets that separate hBLBC from hLumA, LumB, and HER2 (top three plots, from top to bottom, respectively) breast cancer samples from TCGA. The bottom PCA plot uses all 143 subtyped canine mammary tumors with the union of feature genes indicated in the top three plots (115 genes in total), with the number of feature genes specified by "n".
LumB, and HER2 (top three plots, from top to bottom, respectively) breast cancer samples from TCGA. The bottom PCA plot uses all 143 subtyped canine mammary tumors with the union of feature genes indicated in the top three plots (n = 130).
L-O. PCA plots of all 143 subtyped canine mammary tumors using different combinations of ML-selected feature gene sets, including hBLBC and hLumA feature genes combined (L), hBLBC-unique feature genes (M), hLumA-unique feature genes (N), and feature genes shared by both hBLBC and hLumA (O).
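The variable-gene filter and the agreement measure described for Fig. S2 (panels A-D) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the dictionary input format, and the use of population variance for ranking are assumptions made for illustration, and the agreement function assumes both clusterings use the labels 0 and 1.

```python
from statistics import pvariance


def top_variable_genes(expr, fraction=0.10):
    """Return the most variably expressed genes.

    expr: dict mapping gene name -> list of expression values across samples
          (hypothetical input format).
    fraction: share of genes to keep (0.10 mirrors the "top 10%" in the caption).
    """
    ranked = sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)
    keep = max(1, round(len(ranked) * fraction))
    return ranked[:keep]


def two_cluster_agreement(labels_a, labels_b):
    """Proportion of samples placed in the same subtype by two 2-cluster results.

    Both inputs must be lists of 0/1 labels of equal length. Cluster labels are
    arbitrary, so the better of the two possible label matchings is reported.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    same = sum(a == b for a, b in zip(labels_a, labels_b))
    # Swapping 0 and 1 in one labeling turns every disagreement into agreement.
    return max(same, n - same) / n
```

With two clusters, checking both label matchings is enough; comparing results with more than two clusters would need a general label-matching step or a measure such as the adjusted Rand index.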

Fig. S4. Differentially expressed (DE) gene analysis indicates enrichment of
hBLBC signatures in cBLMT of the validation set; related to Fig. 3 and Table S3.
A. Heatmap of the row scaled log2(TPM) values of the 761 DE genes between cBLMT and cNBLMT of the validation set, presented as described for Fig. 3A.
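As an illustration of what "row scaled log2(TPM)" typically means for such a heatmap, the sketch below log-transforms one gene's TPM values and converts them to z-scores across samples. The pseudocount of 1 and the use of the sample standard deviation are assumptions, not details given in the caption, and the function name is hypothetical.

```python
import math
from statistics import mean, stdev


def row_scaled_log2_tpm(tpm_row, pseudocount=1.0):
    """Z-score one gene's log2-transformed TPM values across samples.

    tpm_row: list of TPM values for a single gene (at least two samples).
    pseudocount: added before the log to avoid log2(0); assumed, not from the source.
    """
    logged = [math.log2(v + pseudocount) for v in tpm_row]
    mu = mean(logged)
    sd = stdev(logged)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        # A gene with identical values in every sample carries no contrast.
        return [0.0] * len(logged)
    return [(x - mu) / sd for x in logged]
```

Scaling each row this way puts genes with very different absolute expression on a common color scale, so the heatmap shows relative differences between samples rather than overall abundance.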
A-B. Distributions of ssGSEA enrichment scores for metabolic pathways using signature genes identified for each pathway as described [6],  S6.
A-B. Venn diagrams of genes positively (A) or negatively (B) correlated with PGR, ESR1, and/or PRLR, using the same thresholds as described for Fig. 7
Fig. S2. Validation of canine mammary tumor subtyping results shown in Fig. 1
[Plot labels: human Basal-like and canine Basal-like; human LumA and canine non Basal-like; axis: Number of random samples]
Fig. S3. Dog-alone and cross-species PAM50 classification using canFam4 and

Fig. S6. Purine de novo synthesis and serine synthesis are more activated in cBLMTs and hBLBCs, compared to