Age-associated genes in human mammary gland drive human breast cancer progression

Aging is a comorbidity of breast cancer suggesting that aging-associated transcriptome changes may promote breast cancer progression. However, the mechanism underlying the age effect on breast cancer remains poorly understood. We analyzed transcriptomics of the matched normal breast tissues from the 82 breast cancer patients in The Cancer Genome Atlas (TCGA) dataset with linear regression for genes with age-associated expression that are not associated with menopause. We also analyzed differentially expressed genes between the paired tumor and non-tumor breast tissues in TCGA for the identification of age and breast cancer (ABC)-associated genes. A few of these genes were selected for further investigation of their malignancy-regulating activities with in vitro and in vivo assays. We identified 148 upregulated and 189 downregulated genes during aging. Overlapping of tumor-associated genes between normal and tumor tissues with age-dependent genes resulted in 14 upregulated and 24 downregulated genes that were both age and breast cancer associated. These genes are predictive in relapse-free survival, indicative of their potential tumor promoting or suppressive functions, respectively. Knockdown of two upregulated genes (DYNLT3 and P4HA3) or overexpression of the downregulated ALX4 significantly reduced breast cancer cell proliferation, migration, and clonogenicity. Moreover, knockdown of P4HA3 reduced growth and metastasis whereas overexpression of ALX4 inhibited the growth of xenografted breast cancer cells in mice. Our study suggests that transcriptome alterations during aging may contribute to breast tumorigenesis. DYNLT3, P4HA3, and ALX4 play significant roles in breast cancer progression.


Background
In addition to being a major risk for breast cancer, age was also shown to be associated with increased mortality in women with breast cancer [1]. The positive correlation between age and breast cancer incidence and mortality is believed to be due to the progressive accumulation of genetic and epigenetic alterations in breast epithelial cells as well as epigenetic changes in their microenvironment, which lead to changes of gene expression. However, the mechanism underlying the age effect on breast cancer remains poorly understood.
The genomic and epigenetic data of human adjacent normal tissues and tumor tissues, from public databases including The Cancer Genome Atlas (TCGA), have facilitated numerous findings in various cancers. Because age is a major risk for cancer and epigenetic alteration is one of hallmarks for both cancer and aging [2][3][4], it appears logical that if the expression of a gene is upregulated during aging and higher in tumor tissues than in their matched adjacent normal tissues, it is likely oncogenic. Conversely, if its expression is downregulated during aging and lower in tumors than their matched adjacent normal tissues, it is likely tumor-suppressive. Following this logic, we should be able to identify the genes associated with both age and breast cancer.
However, whether these in silico-identified age-and cancer-associated genes contribute to cancer development and progression, or whether they are no more than accidental correlation as passengers instead of drivers, remains inconclusive from statistical data mining. To answer these questions, experiments, where potential tumor-promoting genes are knocked down and potential tumor-suppressive genes are overexpressed in various in vitro and in vivo models, are needed to validate their direct effect on cancer progression.
In this study, we identified 38 novel age-and breast cancer-associated genes, which we call them ABC genes, by mining TCGA breast cancer data and validating using GTEx normal breast tissue data. We selected two upregulated and two downregulated ABC genes with significant prognostic values for various subtypes of breast cancer and further demonstrated that the upregulated ABC genes, DYNLT3 and P4HA3, have a tumorpromoting function whereas the downregulated gene ALX4, but not WDR86, showed tumor-suppressive function by in vitro and in vivo experiments. As such, our study provides novel molecular links between aging and breast cancer development and progression.

Processing of TCGA RNAseq data
The RSEM estimated raw read counts data were downloaded from TCGA Data Portal on July 25, 2014, and normalized by library sizes, log2 transformed, and further normalized by quantile normalization. Removal of 25% of genes with the lowest mean expressions resulted in subsequent analyses on 15,398 genes. In identifying genes associated with age (birth_days_to), the genes whose expression were considered to be correlated with age were selected by three criteria: (1) R 2 value among the highest 5%, (2) absolute value of slope for age among the highest 5%, and (3) adjusted P value for the slope among the lowest 5%. In identifying genes differentially expressed comparing post-menopausal normal against pre-menopausal normal, and comparing matched tumors against match normal samples, a combination of methods using moderate hierarchical t test (R/limma) and moderate fitting based on negative binomial distribution (R/DESeq2), with and without surrogate variable analysis for potential confounders (R/SVA) with cutoff as FDR < 0.05, were implemented. The common overlapping genes from four methods were determined as the final list for differentially expressed genes.

Correlation between gene expression and subtypes of breast cancer
Two metrics were implemented to test the correlation between gene expression pattern of the 38 ABC genes and subtypes of breast cancer. The first metric was the sum of gene expression levels of 14 upregulated genes minus the sum of gene expression of 24 downregulated genes.
Score ¼ X g in genes ExpressionðgÞ Â DirectionðgÞ The second metric was Pearson's correlation coefficient between expression levels of 38 genes with their expression direction, 1 for upregulated and − 1 for downregulated genes. Score = Pearson Correlation (Expressions, Directions) Both metrics were calculated for each patient then both metrics from all patients were examined against their subtypes.

Gene set enrichment analysis
Normalized log2 expression data from TCGA breast cancer cohort were rank-ordered by fold change between any two subtypes. The signature comprised two sets, one for 14 upregulated genes and the other one for 24 downregulated genes. Genes in the expression data were not collapsed but directly compared with the signature. Gene signatures were tested using default enrichment GSEA statistics with 1000 random permutations.

Survival analyses
Survival analyses, including disease/progression-free survival, were performed with "survival" package from R for patients in TCGA cohort. The split ratio to group patients into high and low groups based on gene expression was searched from 30% quantile to 70% quantile, and the best quantile which yields the lowest Cox regression P value was used as the best split ratio. To determine the prognostic value, Cox regression adjusted by diagnosis age was performed on two groups determined by the best split ratio.
For relapse-free survival on patients with aggregated microarray data, kmplot.com was used [5]. "Auto select best cutoff" was checked, and "only JetSet best probe set" was used to choose probeset for each queried gene. For the analysis on panel of genes, the weighted mean expression of all probeset was used to split patients.
For survival analysis of each individual gene, analyses were conducted on all patients and patients stratified by subtypes, in both TCGA RNASeq cohort and kmplot.com microarray cohort. For analysis on a panel of genes, analyses were conducted on all patients.

Validation on GTEx dataset
The age information in unit of year of all donors in the GTEx database (www.gtexportal.org/home/) was applied and downloaded via dbGaP-controlled access. Among them, RNAseq gene counts data from 90 donors with normal breast tissue donation were extracted. This expression data was then normalized by library sizes, log2 transformed, and further normalized by quantile normalization. The 38 ABC genes identified from TCGA dataset were then queried in the normalized GTEx data, by studying the correlation between the gene expression and age. Multi-testing was addressed by Benjamini-Hochberg correction.

Cell lines and culture conditions
The human breast cancer cell lines (MDA-MB-231, MDA-MB-468, ZR-75-1, SK-BR-3, Hs 578T, and BT-474) were purchased from the American Type Culture Collection (ATCC, Manassas, VA, USA), and MCF-7 cells were obtained from the Michigan Cancer Foundation in 1994. They were cultured in McCoy's 5A medium supplemented with sodium pyruvate, amino acids, vitamins, penicillin, streptomycin, sodium bicarbonate, and 10% fetal bovine serum (FBS) in a humidified atmosphere containing 5% CO 2 at 37°C as previously described. These cells were authenticated in late 2016 with short tandem repeat assay by the DNA Lab in our university. They were routinely tested to be mycoplasma-free at least once a year with a luminescence-based mycoplasma detection kit from Lonza (Cat. # LT07-118).

Transfection of siRNAs and screening
Three siRNAs per gene were purchased from Sigma-Aldrich. A scrambled siRNA (Sigma-Aldrich) was used as a negative control. Dried siRNAs were reconstituted with DEPC-treated water (10 μM Stock). Lipofectamine RNAiMAX reagent (Invitrogen) was diluted by adding 0.3 to 25 μl basic medium in each well. siRNAs (0.75 μl at 10 μM) were diluted in 25 μl basic medium and then mixed with suspended cells to reach 50 nM working concentration. Diluted RNAiMAX and siRNAs were incubated for 20 min at 37°C. Cell suspension in 100 μl containing 3000 MDA-MB-231 cells, 3000 MDA-MB-468 cells, 7000 MCF7 cells, 5000 ZR-75-1 cells, 7000 SK-BR-3 cells, or 9000 Hs 578T cells in culture medium with 15% FBS were added to pre-incubated transfection reagent. Cell proliferation was monitored, and confluency was analyzed every 3 h in IncuCyte for 1 week.

Molecular cloning and lentiviral vectors
MISSION® TRC2 pLKO.5-puro plasmids containing shRNAs against DYNLT3 and P4HA3 and MISSION® TRC2 pLKO.5-puro Empty Vector Control Plasmid were purchased from Sigma-Aldrich. MISSION® TRC2 pLKO.5-puro Empty Vector Control Plasmid was used as a control for these knockdowns. Plasmids harboring full-length CDS of ALX4 (HsCD00294887) and WDR86 (HsCD00436836) were purchased from DNASU Plasmid Repository (dnasu.org). Full-length CDS of ALX4 and WDR86 were then cloned by PCR with primers. The plasmid pWPI (addgene: 12254) was inserted at PmeI site with linker harboring BamHI, MluI, and NdeI sites, denoted as pLenti-Control. The cloned CDS of ALX4 and WDR86 were then inserted into plasmid pLenti-Control linearized by BamHI and PmeI, to construct pLenti-ALX4 and pLenti-WDR86 correspondingly. The plasmid pLenti-Control was used as a control for the overexpression.

Viral production and infection of cell lines
293T packaging cells were transfected lentiviral vectors and packing vectors to generate lentiviral shRNAs (DYNLT3, P4HA3, and control) or CDS (ALX4, WDR86, and control) to infect MDA-MB-231 and BT-474 cells. Knockdown and overexpression were then examined with quantitative RT-PCR and Western blot analyses.

RNA isolation and quantitative RT-PCR
Cells were lysed with QiaGen RLT Lysis buffer, and total RNAs were extracted using RNeasy Mini Kit (QiaGen: Cat No./ID: 74104) following the manufacturer's protocol. cDNA was generated using random primers and M-MLV reverse transcriptase from Invitrogen (Grand Island, NY), and real-time PCR was performed using SYBR reagent (Invitrogen) as described before [6].

Cell viability assay
Knockdown of DYNLT3 and P4HA3 or overexpression of ALX4 and WDR86 in MDA-MB-231 and BT-474, and corresponding knockdown or overexpression control cells were plated in at least triplicate wells at 2000 MDA-MB-231 cells or 5000 BT-474 cells per well into 96-well plates. In the following 1 week, MTT solution (50 μl, 2 mg/ml in PBS) was added to each well and allowed to incubate for 2 to 4 h in a cell culture incubator. After removing the culture medium, 100 μl DMSO was then added into each well with the plates gently agitated on a shaker. The absorbance was measured at 595 nm with a microplate reader (BioTek Instrument, Winooski, VT), which is proportional to the number of viable cells.

Transwell migration assay
Cells were seeded at a density of 20,000 cells per well for MDA-MB-231 and 80,000 cells per well for BT-474 in serum-free medium in the upper 24-well Boyden chambers with 8-μm-size inserts (BD Biosciences, Durham, NC, USA). A serum-containing medium was added to the lower chamber. After allowing the cells to migrate for 16 to 18 h in cell culture incubator at 37°C, the cells on the top of the membrane were removed and the migrated cells were fixed and stained with HEMA 3 STAIN SET from the Fisher Scientific Company. Picture of the stained cells on the insert was captured under a microscope at × 4 magnification for counting [7].

Soft agar assay
Soft agar assays were carried out as described previously [7]. Growth of cells in soft agar was determined by plating 10,000 cells in triplicate in 0.3% Noble agar in 6-well tissue culture plates. Three weeks after plating, soft agar plates were stained and counted as described previously [7].

Xenograft experiment in mice
Animal experiments were conducted following appropriate guidelines. They were approved by the Institutional Animal Care and Use Committee and monitored by the Department of Laboratory Animal Resources at the University of Texas Health Science Center at San Antonio. Five million exponentially growing luciferase-and GFPexpressing BT-474 control or DYNLT3 knockdown, or P4HA3 knockdown cells in 0.1 ml volume were injected into the left inguinal mammary fat pad of ten 6-weekold female NOD scid gamma (NSG) mice purchased from The Jackson Laboratory. Similarly, nine 6-week-old female NSG mice were inoculated with 5 × 10 6 exponentially growing luciferase-and GFP-expressing BT-474 control or ALX4 OE, or WDR86 OE cells. About 10 days after inoculation, tumor size was recorded twice a week. Tumor volumes were calculated as tumor volume (mm 3 ) = length × width × width/2. For the detection and quantification of lung metastasis biofluorescence imaging with IVIS, mice were euthanized at the termination of the experiment. D-luciferin (200 μl at 15 mg/ml) was injected intraperitoneally. Ten minutes later, the lung tissues from each mouse were excised and transferred to wells of a 24-well plate and were immediately imaged to obtain photo flux. Living image software was used for analysis. Tumor tissues were then excised, weighted, and cut into pieces to be frozen at − 80°C or fixed in 10% formalin for various analyses.

Statistics and reproducibility
Results are represented as described in the figure legends, with mean ± sem (standard error of mean). For the experiments with two groups and normally distributed data, the difference of means was tested for significance with two-tailed unpaired Student's t test with the assumption of unequal variance. For multiple independent groups, one-way ANOVA was used. All experiments were repeated at least three times, and representative images and figures were shown.

Identification of ABC genes
To identify genes with age-dependent expression, we extracted whole transcriptome profiling data of tumormatched adjacent normal tissues from 82 female patients with age at diagnosis and menopausal status available in TCGA cohort (Supplementary Figure 1A). We applied simple linear regression to study the association between the gene expression level and age at diagnosis on the matched adjacent normal samples from all 82 patients, and from 54 post-menopausal patients to reduce the leverage effect of pre-menopausal patients with limited age range. We identified 210 upregulated and 98 downregulated genes in all patients (Supplementary Figure 1B), as well as 103 upregulated and 164 downregulated genes in post-menopausal patients only with the selection criteria defined in the "Materials and methods" section (Supplementary Figure 1C). The unions of these genes (258 upregulated (Supplementary Figure 1D) and 240 downregulated (Supplementary Figure 1E)) were considered as genes affected by age. Many of these genes were also influenced by menopausal status. Therefore, by comparing post-menopausal to pre-menopausal patients, we identified 493 upregulated (Supplementary Figure 1F Supplementary Table 1.

Characteristics of ABC genes
These 38 ABC genes clearly showed an age-dependent expression pattern in the 82 adjacent normal samples (Fig. 1a), while such pattern was not observed in those corresponding matched tumor samples (Supplementary Figure 2). Interestingly, most of these 38 genes were not altered in other cancers. Only CTHRC1 and LPAR5 (Fig. 1b) were upregulated and MAGI1 (Fig. 1c) was downregulated in lung adenocarcinoma, thyroid carcinoma, and kidney and renal papillary cell carcinoma in addition to invasive breast cancer when these genes were queried from TCGA database. Importantly, both the higher average expression of the 14 upregulated ABC genes (Fig. 1d) and the lower average expression of the 24 downregulated ABC genes (Fig. 1e) were significantly predictive for poor relapse-free survival in all breast cancer patients from KM plotter dataset [5].
Because the expression data in the TCGA dataset were derived from bulk tissue samples and are not cell typespecific, we queried the 38 ABC genes in The Human Protein Atlas (https://www.proteinatlas.org/) for their cell type-specific expression information in breast tissue based on immunohistochemistry (IHC). Interestingly, while eight genes were not detectable by IHC and ten genes have no IHC data, the expression of the remaining twenty genes is either only detectable or higher in breast epithelial cells in comparison with adipocytes suggesting that the 38 ABC genes are mostly expressed by the breast epithelial cells. We next examined the expression pattern of the 38 ABC genes among different subtypes of breast cancer. We developed two metrics: (1) the sum of these 38 ABC gene expression values (Fig. 1f) and (2) the Pearson correlation (Fig. 1g) of these 38 ABC gene expression values based on whether they were up-or downregulated. Both metrics indicated that luminal A and B and Her2 subtypes had higher expressions of the 14 upregulated genes and lower expressions of the 24 downregulated genes compared to triple-negative and normal-like subtypes. This observation is also reflected in the heatmap shown in Fig. 1h. Consistently, the signatures of 14 upregulated genes and 24 downregulated genes were more enriched in luminal A and B and Her2 than basal and normal-like subtypes (Fig. 1I). Strikingly, the majority of these 38 genes did not exhibit an increased number of genetic alterations except CTHRC1 and ETV3L (Supplementary Figure 3, generated from cBioPortal).
To study whether these ABC genes are promoting or suppressing cancer progression, we developed a scheme to prioritize these genes to a few for experimental validation. Firstly, we conducted a large-scale siRNA screening on the 14 upregulated genes. For each upregulated gene, three siRNAs were transfected into six breast cancer cell lines representing all subtypes of breast cancer (basal, MDA-MB-468; luminal A, MCF7; luminal B, ZR-75-1; Her2, SK-BR-3; claudin-low, Hs578T). From this screening, we found that knockdowns of CLEC3A, CTHRC1, RNASE2, LPAR5, and LRRC15 inhibited the proliferation of all tested cell lines by at least one siRNA (Supplementary Figure 4). Knockdown of DYNLT3 was more effective in inhibiting cell proliferation of MCF7 and SK-BR-3 than ZR-75-1 and Hs578T with little effect on MDA-MB-468. Knockdown of P4HA3 showed a more suppressive effect on MDA-MB-468 and SK-BR-3 than on ZR-75-1 and Hs 578T with a limited effect on MCF7. Knockdown of other upregulated ABC genes including PRR13, EGLN3, HSD11B2, and MATN3 also inhibited the proliferation of various breast cancer cell lines (data not shown). Secondly, to validate the ageassociation of these genes, we conducted correlation tests between these gene expression and age using 90 samples from the GTEx cohort. We found that 22, 16, and 4 out of the 38 ABC genes showed FDR below 0.5, 0.3, and 0.1, respectively (Supplementary Table 2). The four genes with FDR below 0.1 were CLEC3A, DLK1, DYNLT3, and ETV3L. Because of the large variation in human data and our limited number of samples, we decided to use the cutoff of FDR below 0.5 in combination with other filters discussed here for selecting ABC genes for further study. Thirdly, we examined the predictive prognostic power of individual genes in relapse-free survival of all breast cancer together as well as each subtype separately in a large cohort of breast cancer patients [5]. We found that RNASE2, ARRDC2, PRX, KCNQ4, and DLK1 were significantly predictive in relapse-free survival of all breast cancer patients and in all subtypes whereas EGLN3, DYNLT3, TUBGCP6, HOXA6, and DCHS2 were predictive in all patients, and all subtypes except for those of HER2 subtype, and ENPP6, WDR86, ALX4, CLEC3A, P4HA3, PRR13, CTHRC1, and OTUD7A were predictive in all patients and in at least one subtype of breast cancer patients (data not shown). Using the above filters and reviewing the literature, we selected two upregulated (DYNLT3 and P4HA3, Fig. 2a) and two downregulated (ALX4 and WDR86, Fig. 3a) ABC genes for further functional studies because (1) their functional role in cancer was largely unknown while their expression is significantly upregulated in tumors (Fig. 2c, g) or downregulated (Fig. 3c, g); (2) knockdown of DYNLT3 and P4HA3 by siRNAs inhibited breast cancer cell proliferation (Supplementary Figure 4 and data below); (3) their age-dependent expression pattern was validated in the GTEx dataset (Figs. 2b, f and 3b, f); and (4) they were significantly predictive for relapse-free survival in all breast cancer patients or in patients with certain subtypes of breast cancer (Figs. 2b, h and 3d, h).

Knockdown of DYNLT3 and P4HA3 reduced breast cancer cell malignancy
To test the potential tumor-promoting roles of the two upregulated ABC genes DYNLT3 and P4HA3, we knocked down their expression in the basal-like breast cancer MDA-MB-231 cells and the luminal B breast cancer BT-474 cells. We chose the BT-474 cell line to represent luminal B subtype model because the 38 ABC genes were significantly enriched in the luminal B subtype. We also included MDA-MB-231 to contrast with BT-474. Both cell lines have tumorigenesis capacity in immune-deficient mice, which allows us to study their potential tumor-promoting role in vivo. Firstly, we confirmed the knockdown of these two genes in the two cell lines by Western blot (Supplementary Figure 5A-D). Next, we examined how their knockdown affected the malignant properties of the two cancer cell lines in vitro. We found that knockdown of either DYNLT3 or P4HA3 significantly reduced the tumor malignant properties in both cell lines, in terms of cell proliferation (Fig. 4a, b), migration (Fig. 4c, d), and anchorage-independent colony formation efficiency in soft agar (Fig. 4e, f). These data indicate that these two upregulated ABC genes are necessary for the malignant properties of the breast cancer cells. It is notable that the extent to which the knockdown of either gene inhibited proliferation of MDA-MB-231 was more moderate than that of BT-474. This could possibly be due to MDA-MB-231 being a faster-growing cell line than BT-474. Therefore, the moderate inhibition in MDA-MB-231 upon knockdown of either DYNLT3 or P4HA3 could be as large as in BT-474. Clearly, further studies are needed to elucidate the

Knockdown of P4HA3 but not DYNLT3 reduced malignancy in vivo
We next examined the roles of DYNLT3 and P4HA3 in tumorigenesis in vivo. The control and DYNLT3 knockdown BT-474 cells expressing ectopic luciferase (Luc) and green fluorescent protein (GFP) were injected subcutaneously into female NOD scid gamma (NSG) mice bilaterally in the flank with 10 mice in each group. In contrast to the significant reduction of the malignant properties in vitro, we observed reduced tumor growth of DYNLT3 knockdown cells but with no statistical significance in comparison with the tumors formed the control cells in terms of tumor volume (Fig. 5a) and tumor weight (Fig. 5b). Consistently, the knockdown of DYNLT3 showed no effect on the extent of lung metastasis as measured with bioluminescence imaging and expressed as the total light flux of luciferase activity in the excised whole lung (Fig. 5c, Supplementary Figure 6). We confirmed the constitutive knockdown of DYNLT3 in the xenograft tumors (Supplementary Figure 7). This appears to indicate that the reduced malignancy due to DYNLT3 knockdown in vitro was somehow compensated under the in vivo condition. Next, to examine whether the reduced in vitro malignancy due to P4HA3 knockdown can be similarly compensated in vivo, we injected the control and P4HA3  (Fig. 5d), tumor weight (Fig. 5e), and total flux of luciferase activity in the lungs (Fig. 5f, Supplementary Figure 8). Similarly, we confirmed the constitutive knockdown of P4HA3 in the xenograft tumors (Supplementary Figure 9). Thus, the deficiency of P4HA3, which functions mainly as a mediator of collagen synthesis [8], consistently reduced tumor malignancy both in vitro and in vivo, suggesting that collagen synthesis pathways and extracellular matrix modification is an important regulator of tumor progression.

Overexpression of ALX4, but not WDR86, reduced breast cancer cell malignancy
To test the function of the two downregulated ABC genes, ALX4 and WDR86, we ectopically introduced their cDNA via a lentiviral vector to increase their expression in the breast cancer cell lines. We found that overexpression of ALX4 moderately, but significantly, Data are presented as the mean ± sem of three measurements. Two-sample t tests were used to compare the means of the control and each knockdown group. *P < 0.05; ***P < 0.0005; ****P < 0.0001 inhibited the growth of BT-474 cells (Fig. 6a), but not the growth of MDA-MB-231 cells (Fig. 6b). Nevertheless, the overexpression of ALX4 was found to significantly inhibit migration (Fig. 6c, d) and anchorageindependent colony-forming capacity (Fig. 6e, f) Figure 10). On the other side, overexpression of WDR86 had no effect on these malignant properties of either cell line (Fig. 6a-f). These results suggest that ALX4, but not WDR86, plays a tumor-suppressive role in breast cancer cells, consistent with the reports that ALX4 is downregulated in various cancers [9][10][11].
To confirm our findings from the in vitro assays, we next examined the effect of overexpression of ALX4 or WDR86 on breast cancer cell growth in a xenograft model in vivo. The control, ALX4-overexpressing, and WDR86-overexpressing BT-474/Luc-GFP cells were injected into NSG mice bilaterally with 9 mice in each group. We found that overexpression of ALX4 significantly inhibited the growth of tumors formed by BT-474 in vivo in terms of both tumor volume (Fig. 6g) and , and total flux from luciferase activity in the lungs (c, f) between control and DYNLT3 or P4HA3 knockdown BT-474 cell-inoculated mice. Data are presented as the mean ± sem of ten tumors. Two-sample t tests were used to analyze the data.*P < 0.05; **P < 0.005 weight (Fig. 6h), supporting our conclusion that ALX4 is tumor-suppressive. In contrast, overexpression of WDR86 did not inhibit tumor progression (Fig. 6g, h), which is consistent with the in vitro data.

Discussion
Age is a well-known risk factor for breast cancer, and published studies have implicated potential relation between age and breast cancer [12,13]. Similar to our  (g, h). Two-sample t tests were used to compare the means of the control and each knockdown group.*P < 0.05; **P < 0.005 study, Pirone et al. collected 96 tissue specimens from patients with reduction mammoplasty with age ranging from 14 to 70 years old and studied the transcriptome profile by microarray [14]. The age-associated 802 probes (481 increased, 321 decreased with increasing age) they identified were enriched in biological processes including cancer, immune response, and cellular proliferation. However, only 31 upregulated and 16 downregulated genes overlap between our study and that of Pirone et al. (Supplementary Figure 11 A, B). The small number of overlapping genes could be due to the difference between tumor-adjacent normal tissue and mammoplasty reduction tissue. The latter likely contained more adipose tissue. In addition, the transcriptome profile of the tumor-matched adjacent normal samples in TCGA dataset may be influenced by the presence of cancer [15], which may contribute to the difference between the genes we identified and the age-associated genes by others. Nonetheless, while the age-associated genes we identified may be confounded by the cancer field effect, our main focus was to identify age-associated genes that contribute to tumorigenesis. Thus, in the last step of our transcriptome analysis, we contrasted the age-associated genes with genes differentially expressed in tumor versus their matched adjacent normal tissues resulting in a list of novel ABC genes with predictive power for the disease outcome as well as other unique properties discussed below.
In order to remove the menopause effect in agerelated genes, we identified and removed menopauseassociated differentially expressed genes from the ageassociated genes. However, due to the wide range of age in the post-menopausal patients in the TCGA cohort, such comparison of gene expression between pre-and post-menopausal cohorts may be confounded by age. Therefore, we may lose some important targets in the list of "pure" age-dependent genes (148 upregulated and 189 downregulated genes in Supplementary Figure 1A), resulting in a higher false-negative rate. Menopause has been shown to elevate inflammation response [16,17]. Aging is associated with chronic inflammation which is also associated with the early development of breast cancer [18]. Upon menopause, systemic hormone level alterations, especially estrogen, lead to the upregulation of estrogen receptor whose activity is critical in breast cancer. Thus, the delineation of the relationship among menopause, aging, and breast cancer development remains an important and challenging topic for further study.
In this study, we identified 38 genes that are both age and breast tumorigenesis related. Among them, fourteen genes are potential tumor-promoting genes that are upregulated during aging while twenty-four of them are potential tumor-suppressive genes that are downregulated during aging. While most of the upregulated genes have previously been shown to be involved in carcinogenesis, most of the downregulated genes have either not been studied for their biological functions or not been implicated in carcinogenesis. For example, the upregulated CLEC3A has been shown highly expressed on the tumor cell surface, and the cleavage of CLEC3A by MMP7 was reported to weaken tumor cell adhesion and migration [19]. The frequently amplified gene CTHRC1 has been reported to be involved in cell proliferation, migration, invasion, and metastasis by regulating multiple signaling pathways including the TGFβ pathway and collagen synthesis pathway [20][21][22][23]. The gene HSD11B2 regulated by progesterone, also known as hydroxysteroid (11-beta) dehydrogenase 2, has been frequently studied as a tumor promoter which promotes cancer progression by regulating interconversion of cortisol and cortisone [24][25][26]. Among the downregulated ABC genes, DLK1 has been shown in many recent studies to function as a tumor suppressor [27,28]. On the other hand, it was recently shown as non-canonical Notch ligand to promote tumorigenesis by promoting epithelial-mesenchymal transition in ovarian cancer [29]. The gene MAGI1 is short for membraneassociated guanylate kinase inverted 1. Previous studies have suggested its potential tumor suppressor role by interacting with beta-catenin to regulate cell-cell adhesion [30,31] and by regulating PTEN to inhibit cancer cell migration and invasion [32]. Among the ABC genes that have not been studied in breast cancer, we used the selection criteria described above and selected four novel genes, DYNLT3, P4HA3, ALX4, and WDR86, for further investigation of their functional role in regulating the malignant properties of breast cancer cells for the first time.
With in vitro experiments, we showed that DYNLT3 and P4HA3 promoted breast cancer malignancy. DYNLT3 is short for dynein light chain Tctex-type 3, which is a member of a subclass of dynein light chains. DYNLT3 protein has been reported to form the light chain component of the cytoplasmic dynein motor protein complex [33,34]. Its paralog, DYNLT1, has been found to promote tumor progression [35][36][37]. Thus, our current study and the published studies indicate that cytoplasmic dynein motor proteins play a critical role in tumor progression. P4HA3 is short for prolyl 4hydroxylase subunit alpha 3, which is a component of prolyl 4-hydroxylase involved in collagen synthesis. Its paralog, P4HA1, has been found to promote tumor progression [38,39]. Alterations in collagen synthesis pathways have been shown in many cancer progressions [40][41][42][43]. A recent study showed that P4HA3 appears to be a target effector of the TGFβ pathway mediating the tumor/migration-promoting activity of TGFβ [44]. Consistently, TGFβ1 was found in our study to be significantly upregulated in post-menopausal patients in comparison with pre-menopausal patients in the TCGA dataset (data not shown). On the other side, we showed that overexpression of ALX4, a potential tumor suppressor, significantly inhibited the malignant properties of breast cancer cells. ALX4 is short for aristaless-like homeobox 4, which is a homeobox domain transcription factor. It has been shown in lung cancer [9] and colorectal neoplasia [10] that ALX4 is epigenetically silenced via hypermethylation. Moreover, its expression was previously reported to be significantly downregulated in breast cancer [11]. As we were completing our experiments involving ALX4, Yang and co-workers also reported the tumor-suppressive function of ALX4 in breast cancer cells [45]. These published studies together with ours support the conclusion that ALX4 acts as a putative tumor suppressor in subsets of breast cancer. Lastly, we also studied the extent to which WDR86 could play as a tumor suppressor. WDR86 is a novel gene with little known function. WDR86 is short for WD domain repeat 86. It has been characterized with 4 validated transcription isoforms and 11 predicted isoforms [46]. It is proposed that one of the paralogs of WDR86 is TRAF7, which has been implicated as a tumor suppressor by inducing apoptosis [47,48]. Therefore, WDR86 may function as a potential tumor suppressor although we did not observe a malignancyinhibitory activity of WDR86 with the assays we used. Further studies are needed to ascertain the role of WDR86 in breast carcinogenesis. In summary, our current study provides evidence demonstrating tumorregulatory functions for the three of the four ABC genes. It is possible that some ABC genes such as these three genes may act in concert to regulate breast cancer progression because we found that in about 25% of the 82 cases, they appear co-regulated with a reasonable concurrence score (Supplementary Table 3). However, whether the expression alterations of these genes during aging truly contribute to mammary tumor initiation and progression requires further in vivo experiments using animal models.
One of the potential limitations of our study is that the genes we identified could be limited to the features of 82 patients in TCGA database. Among these 82 patients, there were 45 luminal A subtypes, 17 luminal B subtypes, 13 basal subtypes, 6 Her2 subtypes, and 1 unknown as of the date of analysis. Because of a relatively large variation in human samples, it was impossible to use data of only 45 luminal A subtypes to identify menopause-independent age-associated genes with enough power. Thus, we included all 82 patients for identifying age-associated genes, which could result in a bias toward more luminal-like samples. This may explain why the 38 ABC genes were more positively enriched in luminal A and B and Her2 subtypes of tumors compared to basal and normal-like subtypes. Thus, while the upor downregulated ABC genes are interestingly associated with poor or favorable patient outcomes, respectively, because of high variation in human samples, more data and studies are needed to confirm the findings in our study.

Conclusion
Our in silico analyses identified age-associated transcriptome alterations that are involved in tumorigenesis. Specifically, the upregulation of DYNLT3 and P4HA3 as well as the downregulation of ALX4 during aging promote breast cancer progression. These results suggest that transcriptome alterations during aging may contribute to breast tumorigenesis.