This prospective metabolomic analysis is among the largest conducted to date, both in the size of the study population and in the number of metabolites queried. Eleven metabolites were associated with breast cancer risk at FDR < 0.05, and an additional 50 metabolites were associated at a relaxed threshold of FDR < 0.20. These results replicated findings from some previous studies and identified several novel associations.
Table 3 summarizes the metabolites associated with breast cancer risk that either replicate previous results or are novel findings. The previously reported metabolites associated with an increased risk of breast cancer were three androgenic steroids derived from DHEA [6, 10]. Two of these three steroids, androstenediol (3β,17β) disulfate and 16α-hydroxy DHEA 3-sulfate, were associated with an increased risk of breast cancer in CPS-3. Four additional steroids, DHEA-S, androsteroid monosulfate, androstenediol (3β,17β) disulfate, and androstenediol (3β,17β) monosulfate, were also associated with an increased risk of breast cancer in CPS-3. These results, as well as the finding that the associations were limited to postmenopausal breast cancer, are consistent with findings from other studies of circulating steroids [23, 24, 25]. Most studies of steroid metabolites in breast cancer have focused on androgens such as DHEA as the key metabolites influencing estrogen metabolism. However, the correlated group of steroid metabolites we identified included two metabolites of pregnenolone (21-hydroxypregnenolone and pregnenolone sulfate), which is a precursor to the androgenic steroids. This suggests that alterations in the rate of formation of pregnenolone from cholesterol, which is a highly regulated reaction and the rate-limiting step in steroid hormone biosynthesis, may play a role in breast cancer etiology.
One other metabolite whose association potentially replicates previous studies [9, 10] is the plasmalogen phosphatidylcholine (PC) (O-16:0/18:2). Our findings for this metabolite directly replicate the finding from the CPS-II study. In the European Prospective Investigation into Cancer (EPIC) study, which used the targeted Biocrates metabolomics platform, PC (O-16:0/18:2) was not specifically measured. However, all PC plasmalogens with 34 carbons and two double bonds, which include PC (O-16:0/18:2), were associated with breast cancer risk. Overall, the glycerophospholipids and sphingolipids we found to be associated with breast cancer clustered into two correlated blocks and included three lipids [PC (18:0/18:2), lyso-phosphatidylethanolamine (PE) (O-18:0) and lysoPC (18:2)] that replicated, for the first time, findings from previous studies [9, 10]. It is not clear why elevated levels of these lipids would be associated with reduced breast cancer risk. However, they are all common components of cellular membranes, and their altered levels could reflect the perturbation of pathways for membrane synthesis.
We found that glutamine was associated with a reduced risk of breast cancer, but previous studies have reported conflicting results. Glutamine was associated with increased risk in the Supplémentation en Vitamines et Minéraux Antioxydants (SU.VI.MAX) cohort, where it was reported as glutamine/isoglutamine, and in the Etude Epidémiologique auprès de femmes de la MGEN (Mutuelle Générale de l’Education Nationale) (E3N) cohort, where the association was limited to premenopausal women. Glutamine was associated with a reduced risk of breast cancer in studies that included both pre- and postmenopausal women, namely EPIC and our study. Additional studies are needed to confirm the association of glutamine with breast cancer risk. However, the finding of an inverse association for asparagine, which is synthesized from glutamine, both here and in the EPIC study supports an inverse association for glutamine, as higher levels of one of these amino acids should result in higher levels of the other. Neither of the studies that found a positive association for glutamine included asparagine among the metabolites analyzed.
We found associations between breast cancer risk and several metabolites that had not been included in previous studies. These metabolites are listed as novel associations in Table 3. Two metabolites, both dicarboxylic fatty acids (octadecadienoate and 2-hydroxysebacate), were associated with decreased risk, while the other three were associated with increased risk of breast cancer. One of these three, syringol sulfate, is a metabolite of syringol, which is a biomarker of smoked meat consumption. A recent meta-analysis found that higher consumption of either red or processed meat was associated with a greater risk of breast cancer but did not examine smoked meat consumption specifically. Our findings for syringol sulfate suggest that this question should be investigated further.
The other two novel associations we observed were for the xenobiotics catechol glucuronide and 3-hydroxypyridine glucuronide, which were highly correlated (r = 0.76) and are metabolites of catechol and pyridine, respectively. While both compounds occur naturally at low levels, they are produced synthetically in large amounts. About half of the catechol and pyridine produced is used to make pesticides, while smaller amounts are used for pharmaceuticals and flavoring agents [30, 31]. Pyridine is also used in organic chemistry and in dyes, and both compounds have been found in cigarette smoke. The International Agency for Research on Cancer (IARC) evaluated the carcinogenicity of catechol in 1999 and of pyridine in 2019, using primarily animal data, and classified both as Group 2B, possibly carcinogenic to humans. Our findings suggest that further investigation into the carcinogenicity of these compounds is warranted.
In addition to those of several steroid metabolites, the associations of two other metabolites [sphinganine and octadecadienoate (C18:2-DC)*] differed significantly (p < 0.05) between pre- and postmenopausal women. The associations of two metabolites (androstenediol (3β,17β) disulfate and catechol glucuronide) differed significantly between women with ER+ and ER− breast cancer. It is unclear why these associations differ by menopausal or ER status. These findings may be due to chance and require replication in future analyses.
A substantial portion of the cases in this study were diagnosed with breast cancer within a few years after blood collection, while others were diagnosed later in follow-up, allowing us to explore whether associations varied by time between blood collection and diagnosis. Only two metabolites had associations that varied significantly (p < 0.05) by time between blood draw and diagnosis, which limits any conclusions as to whether any of the metabolite levels might be affected by reverse causation.
Although all the risk estimates remained similar, adjustment for BMI attenuated the associations of all the metabolites with breast cancer. This could indicate that BMI is a mediator of the associations. If so, then adjustment for BMI may be inappropriate. This possibility should be investigated further in future analyses.
Strengths of this study are the large study population and the large number of identified metabolites measured. These factors likely contributed to our finding of 11 metabolites associated with breast cancer risk at FDR < 0.05, which is more than in previous studies, which identified at most one or two metabolites at this significance level [9, 10]. Limitations of our study include the fact that the results were based on a single blood sample for each study participant. However, evidence suggests that levels of most circulating metabolites are relatively stable for up to 2 years [33, 34], suggesting that a single sample may be sufficient. Other limitations include the small numbers in the subgroups used in the stratified analyses. Finally, although Black and Hispanic women were included in our study, there were too few to determine whether associations differed by race and/or ethnicity. Thus, our findings may not be generalizable to all groups.