A genome-wide gene-environment interaction study of breast cancer risk for women of European ancestry
Breast Cancer Research volume 25, Article number: 93 (2023)
Genome-wide studies of gene–environment interactions (G×E) may identify variants associated with disease risk in conjunction with lifestyle/environmental exposures. We conducted a genome-wide G×E analysis of ~ 7.6 million common variants and seven lifestyle/environmental risk factors for breast cancer risk overall and for estrogen receptor positive (ER +) breast cancer.
Analyses were conducted using 72,285 breast cancer cases and 80,354 controls of European ancestry from the Breast Cancer Association Consortium. Gene–environment interactions were evaluated using standard unconditional logistic regression models and likelihood ratio tests for breast cancer risk overall and for ER + breast cancer. Bayesian False Discovery Probability was employed to assess the noteworthiness of each SNP-risk factor pair.
Assuming a 1 × 10⁻⁵ prior probability of a true association for each SNP-risk factor pair and a Bayesian False Discovery Probability < 15%, we identified two independent SNP-risk factor pairs: rs80018847(9p13)-LINGO2 and adult height in association with overall breast cancer risk (ORint = 0.94, 95% CI 0.92–0.96), and rs4770552(13q12)-SPATA13 and age at menarche for ER + breast cancer risk (ORint = 0.91, 95% CI 0.88–0.94).
Overall, the contribution of G×E interactions to the heritability of breast cancer is very small. At the population level, multiplicative G×E interactions do not make an important contribution to risk prediction in breast cancer.
Breast cancer is a complex disease involving interplay between lifestyle/environmental and genetic risk factors. Risk factors such as parity, breastfeeding, age at menarche, age at first full-term pregnancy, body mass index (BMI), height, mammographic density, exogenous hormonal use, and alcohol consumption are well-established [1,2,3,4,5,6,7]. Through continued collaborative efforts such as the Collaborative Oncological Gene-environment Study (COGS) and the OncoArray project, more than 200 common single nucleotide polymorphisms (SNPs) associated with risk of breast cancer have been identified [9,10,11].
Traditional genome-wide association study (GWAS) analyses assess the marginal effects of variants and might miss variants which only show an effect within certain strata in the population. These potential gene–environment interactions where SNPs are associated with disease risk in conjunction with lifestyle/environmental risk factors can be investigated through genome-wide gene-environment interaction studies (GEWIS) [12,13,14,15].
Very few genome-wide studies of gene-environment (G×E) interactions in breast cancer have been conducted to date, and three focused on the use of menopausal hormonal therapy as the single environmental risk factor [16,17,18]. An exploratory analysis of G×E interactions examined ten environmental risk factors and 71,527 SNPs selected from prior evidence, using data from approximately 35,000 cases and controls in the Breast Cancer Association Consortium (BCAC). That study identified two potential G×E interactions associated with breast cancer risk. In the present study, we performed a comprehensive genome-wide analysis of gene–environment interactions for risk of overall breast cancer, as well as estrogen receptor positive (ER +) breast cancer using data from 72,285 cases and 80,354 controls participating in the BCAC.
Analyses were conducted using data from 46 studies (16 prospective cohorts, 14 population-based case–control studies, and 16 non-population based studies) participating in the BCAC. We excluded participants if they were genotypically male, of non-European descent, or had a breast tumor of unknown invasiveness or in-situ breast cancer. Women with prevalent breast cancer at the time of recruitment or with unknown reference age (defined as age at diagnosis for cases and age at interview for controls) were also excluded from the analyses. Further, studies with fewer than 150 cases and 150 controls for the risk factor under evaluation were excluded from those analyses. Each participating study obtained informed consent from the participants and was approved by their local ethics committee.
Risk factor data
Risk factor data from individual studies were checked for quality using a multi-step harmonization process based on a common data dictionary. Time-dependent risk factor variables were derived with respect to the reference date, defined as date of diagnosis for cases and date of interview for controls. Analyses were conducted with the following risk factors among all women: age at menarche (per 2 years), parity (per 1 birth), adult height (per 5 cm), ever use of oral contraceptives (yes/no), and current smoking (yes/no). The analysis of age at first full-term pregnancy (per 5 years) was conducted among parous women only, and that of body mass index (BMI, per 5 kg/m²) was conducted among postmenopausal women only. Menopausal status was either self-reported or assigned as postmenopausal if the reference age was greater than 54 years.
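The coding described above can be illustrated with a minimal Python sketch. The per-unit scales and the age cut-off of 54 years come from the text; the dictionary keys, the function names, and the rule that a self-report takes precedence over the age-based assignment are assumptions made for illustration only.

```python
from typing import Optional

# Modelling units quoted in the text (keys are hypothetical variable names).
UNITS = {
    "age_at_menarche": 2.0,    # per 2 years
    "parity": 1.0,             # per 1 birth
    "adult_height": 5.0,       # per 5 cm
    "age_at_first_ftp": 5.0,   # per 5 years, parous women only
    "bmi": 5.0,                # per 5 kg/m^2, postmenopausal women only
}


def scale_exposure(name: str, value: float) -> float:
    """Express a raw exposure value in the unit used for modelling."""
    return value / UNITS[name]


def menopausal_status(self_reported: Optional[str],
                      reference_age: float) -> Optional[str]:
    """Return 'pre', 'post', or None when the status stays unknown.

    Assumption: a self-report is used when present; otherwise women with a
    reference age greater than 54 years are assigned as postmenopausal.
    """
    if self_reported in ("pre", "post"):
        return self_reported
    if reference_age > 54:
        return "post"
    return None
```

With this coding, an interaction odds ratio for adult height refers to a 5 cm difference, so a woman of 165 cm has a scaled value of 33 units.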
All samples were genotyped using either the iCOGS [20, 21] or the OncoArray [9, 10, 22] array. Briefly, iCOGS is a customized iSelect SNP genotyping array, consisting of ~ 211,000 SNPs [20, 21], whereas OncoArray includes ~ 533,000 SNPs of which nearly 260,000 were selected as a GWAS backbone (Illumina HumanCore). Detailed information is provided elsewhere [9, 10, 20,21,22]. Data were imputed to the 1000 Genomes Reference Panel (phase 3 version 5). Overall, 28,176 cases and 32,209 controls of European ancestry who were genotyped using the iCOGS array, and 44,109 cases and 48,145 controls who were genotyped using the OncoArray were included in this analysis.
Genetic variants with imputation quality score < 0.5 in iCOGS or < 0.8 in OncoArray, or with minor allele frequency < 0.01, were excluded from the analyses. Variants in known breast cancer regions were also excluded from the analysis since interactions between known susceptibility variants and risk factors have been explored previously [23, 24]. After applying all exclusions, 7,672,870 genetic variants (SNPs and indels) were included in the analysis.
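As a rough illustration of these exclusion rules, the following Python sketch filters a small list of made-up variants. The three thresholds are those quoted above; the `Variant` record, its field names, the use of a single minor allele frequency per variant, and the example values are hypothetical.

```python
from dataclasses import dataclass

# Thresholds quoted in the text.
MIN_INFO_ICOGS = 0.5
MIN_INFO_ONCOARRAY = 0.8
MIN_MAF = 0.01


@dataclass
class Variant:
    variant_id: str          # hypothetical identifier
    info_icogs: float        # imputation quality score in iCOGS
    info_oncoarray: float    # imputation quality score in OncoArray
    maf: float               # minor allele frequency (one value assumed)
    in_known_region: bool    # lies in a known breast cancer region


def keep_variant(v: Variant) -> bool:
    """Apply the quality, frequency, and known-region exclusions."""
    if v.info_icogs < MIN_INFO_ICOGS or v.info_oncoarray < MIN_INFO_ONCOARRAY:
        return False
    if v.maf < MIN_MAF:
        return False
    return not v.in_known_region


# Made-up example records, one per exclusion reason plus one that passes.
variants = [
    Variant("var_a", 0.95, 0.97, 0.20, False),   # passes all filters
    Variant("var_b", 0.45, 0.90, 0.20, False),   # low iCOGS quality
    Variant("var_c", 0.95, 0.97, 0.005, False),  # rare variant
    Variant("var_d", 0.95, 0.97, 0.20, True),    # known region
]
kept = [v.variant_id for v in variants if keep_variant(v)]
```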
Unconditional logistic regression was employed to assess the associations of SNPs and risk factors with breast cancer risk. Genotypes were assessed using the expected number of copies of the alternative allele (‘dosage’) as the covariate under a log-additive model. Interactions between genetic variants and risk factors were tested by comparing the fit of logistic regression models with and without an interaction term using likelihood ratio tests. All models were adjusted for reference age, study, and ten ancestry-informative principal components. To account for potential differential main effects of risk factors by study design, all models included an interaction term between risk factor and an indicator variable for study design (population-based vs. non-population based). Analyses with current smoking were further adjusted for former smoking.
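The interaction test described above can be sketched in plain Python as follows. This is a simplified illustration, not the analysis code used in the study (which was run in R): the adjustment covariates (reference age, study, principal components, study design term) are omitted, the data are simulated, and all function names are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_logistic(rows, y, iters=25):
    """Newton-Raphson fit; returns (coefficients, maximised log-likelihood)."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, yi in zip(rows, y):
            eta = sum(b * xi for b, xi in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(k):
                grad[i] += (yi - p) * x[i]
                for j in range(k):
                    hess[i][j] += p * (1.0 - p) * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    loglik = 0.0
    for x, yi in zip(rows, y):
        eta = sum(b * xi for b, xi in zip(beta, x))
        loglik += yi * eta - math.log(1.0 + math.exp(eta))
    return beta, loglik


def interaction_lrt(dosage, exposure, y):
    """Compare models with and without the G x E term (1 degree of freedom)."""
    base = [[1.0, g, e] for g, e in zip(dosage, exposure)]
    full = [[1.0, g, e, g * e] for g, e in zip(dosage, exposure)]
    _, ll0 = fit_logistic(base, y)
    beta, ll1 = fit_logistic(full, y)
    stat = max(0.0, 2.0 * (ll1 - ll0))
    p = math.erfc(math.sqrt(stat / 2.0))  # chi-square upper tail, 1 df
    return math.exp(beta[3]), stat, p


# Simulated toy data with an exaggerated interaction (log OR = -0.5).
random.seed(1)
n = 2000
dosage = [random.choice([0.0, 1.0, 2.0]) for _ in range(n)]
exposure = [random.gauss(0.0, 1.0) for _ in range(n)]
y = []
for g, e in zip(dosage, exposure):
    eta = -0.2 + 0.1 * g + 0.2 * e - 0.5 * g * e
    y.append(1.0 if random.random() < 1.0 / (1.0 + math.exp(-eta)) else 0.0)

or_int, lrt_stat, p_value = interaction_lrt(dosage, exposure, y)
```

The exponentiated coefficient of the product term plays the role of ORint in the text, and the likelihood ratio statistic is referred to a chi-square distribution with one degree of freedom.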
Analyses were performed separately for overall and ER + breast cancer risk, and also separately by genotyping array. Array-specific results were combined using METAL. Quantile–quantile (Q-Q) plots were assessed to examine the consistency of the distribution of P values with the null distribution. An interaction P value less than 5E-07 was considered suggestive evidence of interaction. We also calculated Bayesian False Discovery Probabilities (BFDP) for all suggestive interactions, assuming a 1 × 10⁻⁵ prior probability of a true association for each SNP-risk factor pair. G×E interactions with BFDP < 15% were considered noteworthy. For noteworthy SNP-risk factor pairs, we also evaluated the G×E interaction for ER-negative breast cancer risk and conducted stratified analyses by categories of the risk factor. All analyses were conducted using R version 3.5.1.
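The combination of array-specific estimates can be illustrated with a short Python sketch. The text does not state which METAL weighting scheme was used, so the inverse-variance fixed-effects scheme shown here is an assumption, as are the function names and the example inputs; only the suggestive threshold of 5E-07 is taken from the text.

```python
import math

SUGGESTIVE_P = 5e-7  # threshold for suggestive evidence, from the text


def inverse_variance_meta(estimates):
    """Combine (beta, se) pairs on the log-odds scale, one pair per array.

    Returns the pooled beta, its standard error, the z score, and the
    two-sided P value from the standard normal distribution.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


def is_suggestive(p: float) -> bool:
    """Flag an interaction P value below the suggestive threshold."""
    return p < SUGGESTIVE_P


# Hypothetical interaction log odds ratios from the two arrays.
pooled = inverse_variance_meta([(-0.07, 0.018), (-0.055, 0.014)])
```

Under this scheme the more precise array receives the larger weight, and the pooled standard error is smaller than either input.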
We estimated the overall genome-wide contribution of G×E associations for each risk factor to the familial relative risk of breast cancer using LD score regression. The analysis used the G×E interaction summary statistics and was restricted to HapMap3 SNPs with MAF > 5% in the European population from the 1000 Genomes Project. Under the log-additive model, the G×E heritability on the frailty scale can be estimated by $h_f^2 = h_{obs}^2 \times \mathrm{var}(X) / [P(1-P)]$, where $h_{obs}^2$ is the observed heritability given by LD score regression, $\mathrm{var}(X)$ is the variance of the risk factor under evaluation, and $P$ is the proportion of cases in the sample. The proportion of the familial relative risk (FRR) of breast cancer due to G×E interactions is then given by $h_f^2 / (2\log\lambda)$, where $\lambda$ is the familial relative risk to first degree relatives of cases (assumed to be 2).
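The two conversions above amount to simple arithmetic, sketched here in Python. The function names and the example inputs are hypothetical; only the formulas and the assumed familial relative risk of 2 come from the text, and the logarithm is taken to be the natural logarithm.

```python
import math


def frailty_scale_h2(h2_obs: float, var_x: float, case_fraction: float) -> float:
    """Convert observed-scale G x E heritability to the frailty scale."""
    return h2_obs * var_x / (case_fraction * (1.0 - case_fraction))


def frr_proportion(h2_frailty: float, lam: float = 2.0) -> float:
    """Proportion of the familial relative risk explained: h_f^2 / (2 log lambda)."""
    return h2_frailty / (2.0 * math.log(lam))


# Hypothetical inputs: observed heritability from LD score regression, a risk
# factor scaled to unit variance, and a sample in which 47% are cases.
h2_f = frailty_scale_h2(h2_obs=0.002, var_x=1.0, case_fraction=0.47)
share = frr_proportion(h2_f)
```

With a familial relative risk of 2, the denominator 2 log(2) is about 1.39, so a frailty-scale heritability of that size would account for all of the familial relative risk.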
Studies included in the analysis are summarized in Additional file 1: Table S1. The number of cases and controls in each analysis varied from 61,617 cases and 74,698 controls for parity to 48,276 cases and 60,587 controls for current smoking (Additional file 1: Table S2). Consistent with the literature, increasing age at first full-term pregnancy, higher adult height, ever use of oral contraceptives, and current smoking were associated with increased overall breast cancer risk, whereas increasing age at menarche, being parous, increasing number of full-term pregnancies, and breast feeding were associated with decreased breast cancer risk (Additional file 1: Table S3).
The genome-wide analysis of interactions with seven environmental risk factors yielded two SNP-risk factor pairs at BFDP < 15%, one for risk of overall breast cancer and one for ER + breast cancer risk (Table 1, Figs. 1 and 2, Additional file 1: Figure S1A-S1B). No inflation in the test statistics was observed for any of the environmental risk factors. The heritability on the frailty scale of breast cancer risk explained by G×E interaction is shown in Additional file 1: Figure S2. The estimated proportion of the frailty scale heritability explained by G×E interactions was very low for all factors, being highest for age at first full-term pregnancy (~ 1.5% for both overall and ER + breast cancer risk), age at menarche and post-menopausal BMI.
For overall breast cancer risk, there was evidence of interaction between SNP rs80018847 and adult height (ORint = 0.94, 95% CI 0.92–0.96, Pint = 4.34E−08, BFDP = 11%) without an apparent marginal effect of the rs80018847 variant (ORmarg = 1.00, 95% CI 0.98–1.03, Pmarg = 0.88). By categories of adult height defined a priori, the estimated per allele ORmeta of rs80018847-G was 1.03 (95% CI 0.94–1.13, Pmeta = 0.53) for women shorter than 158 cm, 1.13 (95% CI 1.02–1.25) for women 158–162 cm in height, and 1.01 (95% CI 0.93–1.09, Pmeta = 0.88) for women who were 168 cm or taller (Additional file 1: Table S4). Therefore, there is no linear relationship between the SNP and categories of adult height. The interaction with height was also observed for ER + breast cancer (ORint = 0.95, 95% CI 0.93–0.97, Pint = 5.62E-06) but not for ER-negative (ER-) breast cancer risk (ORint = 0.98, 95% CI 0.93–1.03, Pint = 0.77). The regional plot for overall breast cancer shows another SNP (rs1360506) at this locus in high linkage disequilibrium (LD) (r2 = 0.81) with rs80018847 (Additional file 1: Figure S3).
For risk of ER + breast cancer, a statistically significant interaction was observed between SNP rs4770552 and age at menarche (ORint = 0.91, 95% CI 0.88–0.94, Pint = 4.62E−08, BFDP = 11%). There was weak evidence for a marginal association between the rs4770552-T allele and ER + breast cancer (ORmarg = 1.02, 95% CI 1.00–1.05, Pmarg = 0.10). The per allele ORmeta appeared to decrease with increasing age at menarche, from 1.07 (95% CI 1.00–1.15, Pmeta = 0.04) for age at menarche less than 13 years to 0.92 (95% CI 0.77–1.09, Pmeta = 0.33) for age at menarche greater than 15 years (Additional file 1: Table S4). There was weaker evidence of interaction between SNP rs4770552 and age at menarche for overall breast cancer risk (ORint = 0.93, 95% CI 0.90–0.96, Pint = 5.47E−06), but no interaction for ER- breast cancer risk (ORint = 0.98, 95% CI 0.89–1.08, Pint = 0.66). At this locus, we found suggestive evidence of interactions between a further 13 SNPs and age at menarche for ER + breast cancer risk. However, these 13 SNPs are in high LD (r2 = 0.8–1.0) with SNP rs4770552 (Additional file 1: Figure S4).
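To make the interpretation of an interaction odds ratio concrete, the following Python sketch shows how the per-allele odds ratio is expected to change with the exposure under a log-additive interaction model, and how an approximate standard error can be recovered from a reported 95% confidence interval. The function names are hypothetical, and the sketch describes the fitted linear model only: the stratum-specific ORmeta values in Table S4 were estimated separately and need not follow it exactly.

```python
import math


def or_ratio_between(or_int: float, unit: float,
                     exposure_a: float, exposure_b: float) -> float:
    """Ratio of per-allele ORs at exposure_b versus exposure_a.

    Under logit(p) = b0 + bG*G + bE*E + bGE*G*E, the per-allele OR at
    exposure E is exp(bG + bGE*E), so the ratio depends only on bGE.
    `unit` is the exposure increment that or_int refers to.
    """
    return or_int ** ((exposure_b - exposure_a) / unit)


def se_from_ci(lower: float, upper: float) -> float:
    """Approximate standard error of the log OR from a 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2.0 * 1.96)


# ORint = 0.91 per 2 years of age at menarche (95% CI 0.88-0.94), from the text.
ratio_4_years = or_ratio_between(0.91, 2.0, 12.0, 16.0)  # equals 0.91 squared
se_log_or = se_from_ci(0.88, 0.94)
```

For example, a four-year difference in age at menarche corresponds to two modelling units, so the per-allele odds ratio is multiplied by 0.91 squared.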
This is the largest genome-wide gene-environment interaction study for breast cancer to date. We found evidence of one novel susceptibility locus interacting with adult height associated with increased breast cancer risk overall, and one interaction for increased risk of ER + breast cancer with age at menarche. It is important to note, however, that while these associations reached conventional levels of genome-wide statistical significance, they may still represent chance associations. Based on the assumed prior distribution of effect sizes, the BFDP for both loci was 11%, considered noteworthy. Nevertheless, studies with an even larger sample size are required to confirm or refute these associations.
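The BFDP can be illustrated with a short Python sketch based on the approximate Bayes factor from Wakefield's method. The prior probability of 1 × 10⁻⁵ is taken from the text; the prior variance of the log odds ratio is not reported here, so the value used below (95% prior mass for odds ratios between 1/1.2 and 1.2) is purely an assumption, as are the function name and the example inputs.

```python
import math


def bfdp(beta: float, se: float, prior_var: float, prior_prob: float) -> float:
    """Bayesian False Discovery Probability from an approximate Bayes factor.

    beta, se   : estimated log odds ratio and its standard error
    prior_var  : prior variance W of the log odds ratio under the alternative
    prior_prob : prior probability of a true association
    """
    v = se ** 2
    z2 = (beta / se) ** 2
    # Approximate Bayes factor in favour of the null hypothesis.
    abf = math.sqrt((v + prior_var) / v) * math.exp(
        -z2 * prior_var / (2.0 * (v + prior_var))
    )
    prior_odds_null = (1.0 - prior_prob) / prior_prob
    return abf * prior_odds_null / (1.0 + abf * prior_odds_null)


# Hypothetical prior variance: 95% of prior mass for ORs within 1/1.2 to 1.2.
w = (math.log(1.2) / 1.96) ** 2
strong = bfdp(beta=-0.062, se=0.011, prior_var=w, prior_prob=1e-5)
weak = bfdp(beta=-0.030, se=0.011, prior_var=w, prior_prob=1e-5)
```

Because the prior odds against a true association are about 100,000 to 1, only estimates with very strong statistical support fall below a 15% threshold, which is why a noteworthy finding can still be a chance association.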
Many observational studies have shown an association between increasing adult height and increased breast cancer risk, in both premenopausal and postmenopausal women [7, 29, 30]. A meta-analysis estimated that each 10 cm increment in height was associated with a 17% increase in breast cancer risk. The biological link between height and breast cancer is poorly understood, but some studies have suggested that increased height corresponds to more stem cells at risk of acquiring driver mutations. Another hypothesis is that adult height could be a surrogate for nutritional intake, potentially implying a role for insulin-like growth factor 1 (IGF1). The functional basis of the potential interaction between adult height and the SNP rs80018847 is unclear. This SNP is in an intronic region of the leucine rich repeat and Ig domain containing 2 gene (LINGO2) on the short arm of chromosome 9 (9p13). This gene encodes a transmembrane protein belonging to the LINGO/LERN protein family. Studies in mouse embryos have shown expression of LINGO2 specifically in the central nervous system, but it has not been implicated in breast cancer to date.
Early age at menarche is known to be associated with elevated risk of breast cancer. There is an approximate 5% decrease in risk with each year delay in the initiation of menstruation. It has been postulated that younger age at menarche corresponds to longer cumulative hormonal exposure and therefore elevated levels of estradiol [3, 36]. SNP rs4770552 is an intronic variant within the spermatogenesis associated 13 gene (SPATA13) at 13q12. SPATA13 encodes a guanine nucleotide exchange factor (GEF) for RhoA, Rac1 and CDC42 GTPases [37, 38]. Although the role of this gene in breast cancer is still unclear, there could be an indirect link via the role of RhoA GTPases in breast tumorigenesis. Rho GTPase signaling is altered in human breast cancers, and dysregulation of Rho GTPase may have differential effects on the development of breast tumors depending on the stage and subtype. Activation of RhoA results in release of megakaryoblastic leukemia 1 (MKL1), which in turn has been observed to alter the transcriptional activity of ERα, known to play a critical role in breast tumors. Therefore, SNP rs4770552 may potentially indirectly interact with the regulatory region of SPATA13 and affect the breast tumorigenesis process via activation of RhoA GTPases.
Given that the marginal effects of the common genetic variants are small and the associations of environmental risk factors with breast cancer are modest, interactions are also expected to be weak (Additional file 1: Figure S5). Although this is the largest breast cancer dataset available to date with more than 60,000 cases and 70,000 controls, the study is underpowered to detect weak interactions. Also, this study included only women of European ancestry and the findings may not be generalizable to women of other ancestries.
Using LD score regression, we estimated the overall heritability due to G×E for each of the risk factors. The estimated frailty scale heritability (≤ 0.015) can be compared with the corresponding heritability for the SNP main effects (about 0.47) or the overall heritability based on the familial risk (~ 1.4) [28, 41]. The implication is that G×E interactions make very little contribution to the heritability of breast cancer, at least for the known risk factors and common genetic variants that can be evaluated using genome-wide arrays, and hence do not make an important contribution to risk prediction at the population level. This is consistent with the fact that detection of G×E interactions is rare. This does not rule out the possibility that G×E interactions could be identified in additional large studies or that such interactions may provide important clues to mechanisms.
In conclusion, we identified two novel genome-wide gene–environment interactions for overall and ER + breast cancer risk for women of European ancestry. These results contribute to our global body of knowledge on genetic susceptibility for breast cancer by generating plausible biological hypotheses, but they require replication and further functional studies.
Availability of data and materials
The datasets analyzed during the current study are not publicly available but are available upon request and approval of BCAC Data Access Co-ordinating Committee.
Abbreviations
BMI: Body mass index
COGS: Collaborative Oncological Gene-Environment Study
SNPs: Single nucleotide polymorphisms
GWAS: Genome-wide association study
GEWIS: Genome-wide gene-environment interaction study
BCAC: Breast Cancer Association Consortium
ER +: Estrogen receptor positive
ER-: Estrogen receptor negative
BFDP: Bayesian false discovery probability
MAF: Minor allele frequency
ORint: Interaction odds ratio
ORmeta: Meta-analyzed odds ratio
IGF1: Insulin-like growth factor 1
LINGO2: Leucine rich repeat and Ig domain containing 2
SPATA13: Spermatogenesis associated 13
MKL1: Megakaryoblastic leukemia 1
LDSC: Linkage disequilibrium score regression
Baer HJ, et al. Adult height, age at attained height, and incidence of breast cancer in premenopausal women. Int J Cancer. 2006;119:2231–5.
Collaborative Group on Hormonal Factors in Breast Cancer. Breast cancer and breastfeeding: collaborative reanalysis of individual data from 47 epidemiological studies in 30 countries, including 50 302 women with breast cancer and 96 973 women without the disease. Lancet. 2002;360:187–95.
Collaborative Group on Hormonal Factors in Breast Cancer. Menarche, menopause, and breast cancer risk: individual participant meta-analysis, including 118 964 women with breast cancer from 117 epidemiological studies. Lancet Oncol. 2012;13:1141–51.
Hunter DJ, et al. Oral contraceptive use and breast cancer: a prospective study of young women. Cancer Epidemiol Biomark Prev. 2010;19:2496–502.
Jung S, et al. Alcohol consumption and breast cancer risk by estrogen receptor status: in a pooled analysis of 20 studies. Int J Epidemiol. 2016;45:916–28.
Wang K, et al. Change in risk of breast cancer after receiving hormone replacement therapy by considering effect-modifiers: a systematic review and dose-response meta-analysis of prospective studies. Oncotarget. 2017;8:81109–24.
World Cancer Research Fund International/American Institute for Cancer Research. Diet, nutrition, physical activity and breast cancer 2017. 2017;120.
Amos CI, et al. The OncoArray Consortium: a network for understanding the genetic architecture of common cancers. Cancer Epidemiol Biomark Prev. 2017;26:126–35.
Michailidou K, et al. Association analysis identifies 65 new breast cancer risk loci. Nature. 2017;551:92–4.
Milne RL, et al. Identification of ten variants associated with risk of estrogen-receptor-negative breast cancer. Nat Genet. 2017;49:1767–78.
Zhang H, Ahearn TU, Lecarpentier J, et al. Genome-wide association study identifies 32 novel breast cancer susceptibility loci from overall and subtype-specific analyses. Nat Genet.
Fletcher O, Dudbridge F. Candidate gene-environment interactions in breast cancer. BMC Med. 2014;12:195.
Hutter CM, Mechanic LE, Chatterjee N, Kraft P, Gillanders EM. Gene-environment interactions in cancer epidemiology: a National Cancer Institute Think Tank report. Genet Epidemiol. 2013;37:643–57.
Rudolph A, Chang-Claude J, Schmidt MK. Gene-environment interaction and risk of breast cancer. Br J Cancer. 2016;114:125–33.
Simonds NI, et al. Review of the gene-environment interaction literature in cancer: What do we know? Genet Epidemiol. 2016;40:356–65.
Hein R, et al. A genome-wide association study to identify genetic susceptibility loci that modify ductal and lobular postmenopausal breast cancer risk associated with menopausal hormone therapy use: a two-stage design with replication. Breast Cancer Res Treat. 2013;138:529–42.
Rudolph A, et al. Genetic modifiers of menopausal hormone replacement therapy and breast cancer risk: a genome-wide interaction study. Endocr Relat Cancer. 2013;20:875–87.
Wang X, et al. Genome-wide interaction analysis of menopausal hormone therapy use and breast cancer risk among 62,370 women. Sci Rep. 2022;12:6199.
Schoeps A, et al. Identification of new genetic susceptibility loci for breast cancer through consideration of gene-environment interactions. Genet Epidemiol. 2014;38:84–93.
Michailidou K, et al. Large-scale genotyping identifies 41 new loci associated with breast cancer risk. Nat Genet. 2013;45:353-361e2.
Michailidou K, et al. Genome-wide association analysis of more than 120,000 individuals identifies 15 new susceptibility loci for breast cancer. Nat Genet. 2015;47:373–80.
Amos CI, et al. The OncoArray Consortium: a network for understanding the genetic architecture of common cancers. Cancer Epidemiol Biomark Prev. 2017;26:126–35.
Fachal L, et al. Fine-mapping of 150 breast cancer risk regions identifies 191 likely target genes. Nat Genet. 2020;52:56–73.
Kapoor PM, et al. Assessment of interactions between 205 breast cancer susceptibility loci and 13 established risk factors in relation to breast cancer risk, in the Breast Cancer Association Consortium. Int J Epidemiol. 2020;49:216–32.
Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–1.
Wakefield J. A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am J Hum Genet. 2007;81:208–27.
Bulik-Sullivan BK, et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet. 2015;47:291–5.
Michailidou K, et al. Association analysis identifies 65 new breast cancer risk loci. Nature.
Friedenreich CM. Review of anthropometric factors and breast cancer risk. Eur J Cancer Prev. 2001;10:15–32.
Gunnell D, et al. Height, leg length, and cancer risk: a systematic review. Epidemiol Rev. 2001;23:313–42.
Zhang B, et al. Height and breast cancer risk: evidence from prospective studies and Mendelian randomization. JNCI J Natl Cancer Inst. 2015;107:djv219.
Nunney L. Size matters: height, cell number and a person’s risk of cancer. Proc R Soc B Biol Sci. 2018;285:20181743.
Renehan AG. Height and cancer: consistent links, but mechanisms unclear. Lancet Oncol. 2011;12:716–7.
Haines BP, Rigby PWJ. Expression of the Lingo/LERN gene family during mouse embryogenesis. Gene Expr Patterns. 2008;8:79–86.
Fortner R, Hankinson SE 4. Reproductive and hormonal factors and breast cancer. 2012. https://doi.org/10.1210/TEAM.9781936704064.CH4.
Endogenous Hormones and Breast Cancer Collaborative Group, et al. Sex hormones and risk of breast cancer in premenopausal women: a collaborative reanalysis of individual participant data from seven prospective studies. Lancet Oncol. 2013;14:1009–19.
Kawasaki Y, et al. Identification and characterization of Asef2, a guanine-nucleotide exchange factor specific for Rac1 and Cdc42. Oncogene. 2007;26:7620–7.
Bristow JM, et al. The Rho-family GEF Asef2 activates Rac to modulate adhesion and actin dynamics and thereby regulate cell migration. J Cell Sci. 2009;122:4535–46.
McHenry PR, Vargo-Gogola T. Pleiotropic functions of Rho GTPase signaling: A Trojan horse or Achilles’ heel for breast cancer treatment? Curr Drug Targets. 2010;11:1043–58.
Huet G, et al. Repression of the estrogen receptor-α transcriptional activity by the Rho/megakaryoblastic leukemia 1 signaling pathway. J Biol Chem. 2009;284:33729–39.
Pharoah PDP, et al. Polygenic susceptibility to breast cancer and implications for prevention. Nat Genet. 2002;31:33–6.
We thank all the individuals who took part in these studies and all the researchers, clinicians, technicians and administrative staff who have enabled this work to be carried out. ABCFS thank Maggie Angelakos, Judi Maskiell, Gillian Dite. ABCS thanks the Blood bank Sanquin, The Netherlands. ABCTB Investigators: Christine Clarke, Deborah Marsh, Rodney Scott, Robert Baxter, Desmond Yip, Jane Carpenter, Alison Davis, Nirmala Pathmanathan, Peter Simpson, J. Dinny Graham, Mythily Sachchithananthan. Samples are made available to researchers on a non-exclusive basis. BCEES thanks Allyson Thomson, Christobel Saunders, Terry Slevin, BreastScreen Western Australia, Elizabeth Wylie, Rachel Lloyd. The BCINIS study would not have been possible without the major contribution of Ms. H. Rennert, and the contributions of Dr. M. Pinchev, Dr. O. Barnet, Dr. N. Gronich, Dr. K. Landsman, Dr. A. Flugelman, Dr. W. Saliba, Dr. E. Liani, Dr. I. Cohen, Dr. S. Kalet, Dr. V. Friedman of the NICCC in Haifa, and all the contributing family medicine, surgery, pathology and oncology teams in all medical institutes in Northern Israel. The BREOGAN study would not have been possible without the contributions of the following: Manuela Gago-Dominguez, Jose Esteban Castelao, Angel Carracedo, Victor Muñoz Garzón, Alejandro Novo Domínguez, Maria Elena Martinez, Sara Miranda Ponte, Carmen Redondo Marey, Maite Peña Fernández, Manuel Enguix Castelo, Maria Torres, Manuel Calaza (BREOGAN), José Antúnez, Máximo Fraga and the staff of the Department of Pathology and Biobank of the University Hospital Complex of Santiago-CHUS, Instituto de Investigación Sanitaria de Santiago, IDIS, Xerencia de Xestion Integrada de Santiago-SERGAS; Joaquín González-Carreró and the staff of the Department of Pathology and Biobank of University Hospital Complex of Vigo, Instituto de Investigacion Biomedica Galicia Sur, SERGAS, Vigo, Spain. 
CBCS thanks study participants, co-investigators, collaborators and staff of the Canadian Breast Cancer Study, and project coordinators Agnes Lai and Celine Morissette. CGPS thanks staff and participants of the Copenhagen General Population Study. For the excellent technical assistance: Dorthe Uldall Andersen, Maria Birna Arnadottir, Anne Bank, Dorthe Kjeldgård Hansen. The Danish Cancer Biobank is acknowledged for providing infrastructure for the collection of blood samples for the cases. CNIO-BCS thanks Guillermo Pita, Charo Alonso, Nuria Álvarez, Pilar Zamora, Primitiva Menendez, the Human Genotyping-CEGEN Unit (CNIO). Investigators from the CPS-II cohort thank the participants and Study Management Group for their invaluable contributions to this research. They also acknowledge the contribution to this study from central cancer registries supported through the Centers for Disease Control and Prevention National Program of Cancer Registries, as well as cancer registries supported by the National Cancer Institute Surveillance Epidemiology and End Results program. The authors would like to thank the California Teachers Study Steering Committee that is responsible for the formation and maintenance of the Study within which this research was conducted. A full list of California Teachers Study (CTS) team members is available at https://www.calteachersstudy.org/team. We thank the participants and the investigators of EPIC (European Prospective Investigation into Cancer and Nutrition). ESTHER thanks Hartwig Ziegler, Sonja Wolf, Volker Hermann, Christa Stegmaier, Katja Butterbach. PROCAS thanks NIHR for funding.
The GENICA Network: Dr. Margarete Fischer-Bosch-Institute of Clinical Pharmacology, Stuttgart, and University of Tübingen, Germany [HB, RH, Wing-Yee Lo], Department of Internal Medicine, Johanniter GmbH Bonn, Johanniter Krankenhaus, Bonn, Germany [YDK, Christian Baisch], Institute of Pathology, University of Bonn, Germany [Hans-Peter Fischer], Molecular Genetics of Breast Cancer, Deutsches Krebsforschungszentrum (DKFZ), Heidelberg, Germany [UH], Institute for Prevention and Occupational Medicine of the German Social Accident Insurance, Institute of the Ruhr University Bochum (IPA), Bochum, Germany [Thomas Brüning, Beate Pesch, Sylvia Rabstein, Anne Lotz]; and Institute of Occupational Medicine and Maritime Medicine, University Medical Center Hamburg-Eppendorf, Germany [Volker Harth]. KARMA and SASBAC thank the Swedish Medical Research Council. kConFab/AOCS wish to thank Heather Thorne, Eveline Niedermayr, all the kConFab research nurses and staff, the heads and staff of the Family Cancer Clinics, and the Clinical Follow Up Study (which has received funding from the NHMRC, the National Breast Cancer Foundation, Cancer Australia, and the National Institute of Health (USA)) for their contributions to this resource, and the many families who contribute to kConFab. LMBC thanks Gilian Peuteman, Thomas Van Brussel, Evy Vanderheyden and Kathleen Corthouts. MARIE thanks Petra Seibold, Nadia Obi, Ursula Eilber and Muhabbet Celik. The MCCS was made possible by the contribution of many people, including the original investigators, the teams that recruited the participants and continue working on follow-up, and the many thousands of Melbourne residents who continue to participate in the study. We thank the coordinators, the research staff and especially the MMHS participants for their continued collaboration on research studies in breast cancer. The MISS study group acknowledges the former Principal Investigator, professor Håkan Olsson.
NBHS and SBCGS thank study participants and research staff for their contributions and commitment to the studies. For NHS and NHS2 the study protocol was approved by the institutional review boards of the Brigham and Women’s Hospital and Harvard T.H. Chan School of Public Health, and those of participating registries as required. We would like to thank the participants and staff of the NHS and NHS2 for their valuable contributions as well as the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, WY. The authors assume full responsibility for analyses and interpretation of these data. OBCS thanks Arja Jukkola-Vuorinen, Mervi Grip, Saila Kauppila, Meeri Otsukka, Leena Keskitalo and Kari Mononen for their contributions to this study. The OFBCR thanks Teresa Selander, Nayana Weerasooriya and Steve Gallinger. ORIGO thanks E. Krol-Warmerdam, and J. Blom for patient accrual, administering questionnaires, and managing clinical information. PBCS thanks Louise Brinton, Mark Sherman, Neonila Szeszenia-Dabrowska, Beata Peplonska, Witold Zatonski, Pei Chao, Michael Stagner. We thank the SEARCH and EPIC teams. UCIBCS thanks Irene Masunaka. UKBGS thanks Breast Cancer Now and the Institute of Cancer Research for support and funding of the Generations Study, and the study participants, study staff, and the doctors, nurses and other health care providers and health information sources who have contributed to the study. We acknowledge NHS funding to the Royal Marsden/ICR NIHR Biomedical Research Centre. The authors thank the WHI investigators and staff for their dedication and the study participants for making the program possible.
Open Access funding enabled and organized by Projekt DEAL. BCAC is funded by the European Union's Horizon 2020 Research and Innovation Programme (grant numbers 634935 and 633784 for BRIDGES and B-CAST respectively), and the PERSPECTIVE I&I project, funded by the Government of Canada through Genome Canada and the Canadian Institutes of Health Research, the Ministère de l’Économie et de l'Innovation du Québec through Genome Québec, the Quebec Breast Cancer Foundation. The EU Horizon 2020 Research and Innovation Programme funding source had no role in study design, data collection, data analysis, data interpretation or writing of the report. Additional funding for BCAC is provided via the Confluence project which is funded with intramural funds from the National Cancer Institute Intramural Research Program, National Institutes of Health. Genotyping of the OncoArray was funded by the NIH Grant U19 CA148065, and Cancer Research UK Grant C1287/A16563 and the PERSPECTIVE project supported by the Government of Canada through Genome Canada and the Canadian Institutes of Health Research (grant GPH-129344) and, the Ministère de l’Économie, Science et Innovation du Québec through Genome Québec and the PSRSIIRI-701 grant, and the Quebec Breast Cancer Foundation. Funding for iCOGS came from: the European Community's Seventh Framework Programme under grant agreement n° 223175 (HEALTH-F2-2009–223175) (COGS), Cancer Research UK (C1287/A10118, C1287/A10710, C12292/A11174, C1281/A12014, C5047/A8384, C5047/A15007, C5047/A10692, C8197/A16565), the National Institutes of Health (CA128978) and Post-Cancer GWAS initiative (1U19 CA148537, 1U19 CA148065 and 1U19 CA148112—the GAME-ON initiative), the Department of Defence (W81XWH-10–1-0341), the Canadian Institutes of Health Research (CIHR) for the CIHR Team in Familial Risks of Breast Cancer, and Komen Foundation for the Cure, the Breast Cancer Research Foundation, and the Ovarian Cancer Research Fund. 
The BRIDGES panel sequencing was supported by the European Union Horizon 2020 research and innovation program BRIDGES (grant number, 634935) and the Wellcome Trust (v203477/Z/16/Z). The Australian Breast Cancer Family Study (ABCFS) was supported by grant UM1 CA164920 from the National Cancer Institute (USA). The content of this manuscript does not necessarily reflect the views or policies of the National Cancer Institute or any of the collaborating centers in the Breast Cancer Family Registry (BCFR), nor does mention of trade names, commercial products, or organizations imply endorsement by the USA Government or the BCFR. The ABCFS was also supported by the National Health and Medical Research Council of Australia, the New South Wales Cancer Council, the Victorian Health Promotion Foundation (Australia) and the Victorian Breast Cancer Research Consortium. J.L.H. is a National Health and Medical Research Council (NHMRC) Senior Principal Research Fellow. M.C.S. is a NHMRC Senior Research Fellow. The ABCS study was supported by the Dutch Cancer Society [grants NKI 2007–3839; 2009 4363]. The Australian Breast Cancer Tissue Bank (ABCTB) was supported by the National Health and Medical Research Council of Australia, The Cancer Institute NSW and the National Breast Cancer Foundation. The AHS study is supported by the intramural research program of the National Institutes of Health, the National Cancer Institute (grant number Z01-CP010119), and the National Institute of Environmental Health Sciences (grant number Z01-ES049030). The work of the BBCC was partly funded by ELAN-Fond of the University Hospital of Erlangen. The BCEES was funded by the National Health and Medical Research Council, Australia and the Cancer Council Western Australia and acknowledges funding from the National Breast Cancer Foundation (JS). For the BCFR-NY, BCFR-PA, BCFR-UT this work was supported by grant UM1 CA164920 from the National Cancer Institute. 
The content of this manuscript does not necessarily reflect the views or policies of the National Cancer Institute or any of the collaborating centers in the Breast Cancer Family Registry (BCFR), nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government or the BCFR. The BCINIS study is supported in part by the Breast Cancer Research Foundation (BCRF). The BREast Oncology GAlician Network (BREOGAN) is funded by Acción Estratégica de Salud del Instituto de Salud Carlos III FIS PI12/02125/Cofinanciado and FEDER PI17/00918/Cofinanciado FEDER; Acción Estratégica de Salud del Instituto de Salud Carlos III FIS Intrasalud (PI13/01136); Programa Grupos Emergentes, Cancer Genetics Unit, Instituto de Investigacion Biomedica Galicia Sur. Xerencia de Xestion Integrada de Vigo-SERGAS, Instituto de Salud Carlos III, Spain; Grant 10CSA012E, Consellería de Industria Programa Sectorial de Investigación Aplicada, PEME I + D e I + D Suma del Plan Gallego de Investigación, Desarrollo e Innovación Tecnológica de la Consellería de Industria de la Xunta de Galicia, Spain; Grant EC11-192. Fomento de la Investigación Clínica Independiente, Ministerio de Sanidad, Servicios Sociales e Igualdad, Spain; and Grant FEDER-Innterconecta. Ministerio de Economia y Competitividad, Xunta de Galicia, Spain. CBCS is funded by the Canadian Cancer Society (grant # 313404) and the Canadian Institutes of Health Research. CCGP is supported by funding from the University of Crete. The CECILE study was supported by Fondation de France, Institut National du Cancer (INCa), Ligue Nationale contre le Cancer, Agence Nationale de Sécurité Sanitaire, de l'Alimentation, de l'Environnement et du Travail (ANSES), Agence Nationale de la Recherche (ANR). The CGPS was supported by the Chief Physician Johan Boserup and Lise Boserup Fund, the Danish Medical Research Council, and Herlev and Gentofte Hospital. 
The CNIO-BCS was supported by the Instituto de Salud Carlos III, the Red Temática de Investigación Cooperativa en Cáncer and grants from the Asociación Española Contra el Cáncer and the Fondo de Investigación Sanitario (PI11/00923 and PI12/00070). The American Cancer Society funds the creation, maintenance, and updating of the CPS-II cohort. The California Teachers Study (CTS) and the research reported in this publication were supported by the National Cancer Institute of the National Institutes of Health under award number U01-CA199277; P30-CA033572; P30-CA023100; UM1-CA164917; and R01-CA077398. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health. The collection of cancer incidence data used in the California Teachers Study was supported by the California Department of Public Health pursuant to California Health and Safety Code Sect. 103885; Centers for Disease Control and Prevention’s National Program of Cancer Registries, under cooperative agreement 5NU58DP006344; the National Cancer Institute’s Surveillance, Epidemiology and End Results Program under contract HHSN261201800032I awarded to the University of California, San Francisco, contract HHSN261201800015I awarded to the University of Southern California, and contract HHSN261201800009I awarded to the Public Health Institute. The opinions, findings, and conclusions expressed herein are those of the author(s) and do not necessarily reflect the official views of the State of California, Department of Public Health, the National Cancer Institute, the National Institutes of Health, the Centers for Disease Control and Prevention or their Contractors and Subcontractors, or the Regents of the University of California, or any of its programs. The coordination of EPIC is financially supported by the European Commission (DG-SANCO) and the International Agency for Research on Cancer. 
The national cohorts are supported by: Ligue Contre le Cancer, Institut Gustave Roussy, Mutuelle Générale de l’Education Nationale, Institut National de la Santé et de la Recherche Médicale (INSERM) (France); German Cancer Aid, German Cancer Research Center (DKFZ), Federal Ministry of Education and Research (BMBF) (Germany); the Hellenic Health Foundation, the Stavros Niarchos Foundation (Greece); Associazione Italiana per la Ricerca sul Cancro-AIRC-Italy and National Research Council (Italy); Dutch Ministry of Public Health, Welfare and Sports (VWS), Netherlands Cancer Registry (NKR), LK Research Funds, Dutch Prevention Funds, Dutch ZON (Zorg Onderzoek Nederland), World Cancer Research Fund (WCRF), Statistics Netherlands (The Netherlands); Health Research Fund (FIS), PI13/00061 to Granada, PI13/01162 to EPIC-Murcia, Regional Governments of Andalucía, Asturias, Basque Country, Murcia and Navarra, ISCIII RETIC (RD06/0020) (Spain); Cancer Research UK (14136 to EPIC-Norfolk; C570/A16491 and C8221/A19170 to EPIC-Oxford), Medical Research Council (1000143 to EPIC-Norfolk, MR/M012190/1 to EPIC-Oxford) (United Kingdom). The ESTHER study was supported by a grant from the Baden Württemberg Ministry of Science, Research and Arts. Additional cases were recruited in the context of the VERDI study, which was supported by a grant from the German Cancer Aid (Deutsche Krebshilfe). PROCAS thank NIHR for funding. The GENICA was funded by the Federal Ministry of Education and Research (BMBF) Germany grants 01KW9975/5, 01KW9976/8, 01KW9977/0 and 01KW0114, the Robert Bosch Foundation, Stuttgart, Deutsches Krebsforschungszentrum (DKFZ), Heidelberg, the Institute for Prevention and Occupational Medicine of the German Social Accident Insurance, Institute of the Ruhr University Bochum (IPA), Bochum, as well as the Department of Internal Medicine, Johanniter GmbH Bonn, Johanniter Krankenhaus, Bonn, Germany. The GESBC was supported by the Deutsche Krebshilfe e.V. and the German Cancer Research Center (DKFZ).
The KARMA study was supported by Märit and Hans Rausings Initiative Against Breast Cancer. The KBCP was financially supported by the special Government Funding (VTR) of Kuopio University Hospital grants, Cancer Fund of North Savo, the Finnish Cancer Organizations, and by the strategic funding of the University of Eastern Finland. kConFab is supported by a grant from the National Breast Cancer Foundation, and previously by the National Health and Medical Research Council (NHMRC), the Queensland Cancer Fund, the Cancer Councils of New South Wales, Victoria, Tasmania and South Australia, and the Cancer Foundation of Western Australia. Financial support for the AOCS was provided by the United States Army Medical Research and Materiel Command [DAMD17-01–1-0729], Cancer Council Victoria, Queensland Cancer Fund, Cancer Council New South Wales, Cancer Council South Australia, The Cancer Foundation of Western Australia, Cancer Council Tasmania and the National Health and Medical Research Council of Australia (NHMRC; 400413, 400281, 199600). G.C.T. and P.W. are supported by the NHMRC. RB was a Cancer Institute NSW Clinical Research Fellow. LMBC is supported by the 'Stichting tegen Kanker'. DL is supported by the FWO. The MARIE study was supported by the Deutsche Krebshilfe e.V. [70–2892-BR I, 106332, 108253, 108419, 110826, 110828], the Hamburg Cancer Society, the German Cancer Research Center (DKFZ) and the Federal Ministry of Education and Research (BMBF) Germany [01KH0402]. The MCBCS was supported by the NIH grants R35CA253187, R01CA192393, R01CA116167, R01CA176785, an NIH Specialized Program of Research Excellence (SPORE) in Breast Cancer [P50CA116201], and the Breast Cancer Research Foundation. The Melbourne Collaborative Cohort Study (MCCS) cohort recruitment was funded by VicHealth and Cancer Council Victoria.
The MCCS was further augmented by Australian National Health and Medical Research Council grants 209057, 396414 and 1074383 and by infrastructure provided by Cancer Council Victoria. Cases and their vital status were ascertained through the Victorian Cancer Registry and the Australian Institute of Health and Welfare, including the National Death Index and the Australian Cancer Database. The MEC was supported by NIH Grants CA63464, CA54281, CA098758, CA132839 and CA164973. The MISS study was supported by funding from ERC-2011-294576 Advanced grant, Swedish Cancer Society CAN 2018/675, Swedish Research Council, Local hospital funds, Berta Kamprad Foundation FBKS 2021–19, Gunnar Nilsson. The MMHS study was supported by NIH grants CA97396, CA128931, CA116201, CA140286 and CA177150. MSKCC is supported by grants from the Breast Cancer Research Foundation and Robert and Kate Niehaus Clinical Cancer Genetics Initiative. The NBHS was supported by NIH grant R01CA100374. Biological sample preparation was conducted by the Survey and Biospecimen Shared Resource, which is supported by P30 CA68485. The Northern California Breast Cancer Family Registry (NC-BCFR) and Ontario Familial Breast Cancer Registry (OFBCR) were supported by grant U01CA164920 from the USA National Cancer Institute of the National Institutes of Health. The content of this manuscript does not necessarily reflect the views or policies of the National Cancer Institute or any of the collaborating centers in the Breast Cancer Family Registry (BCFR), nor does mention of trade names, commercial products, or organizations imply endorsement by the USA Government or the BCFR. The Carolina Breast Cancer Study (NCBCS) was funded by Komen Foundation, the National Cancer Institute (P50 CA058223, U54 CA156733, U01 CA179715), and the North Carolina University Cancer Research Fund. The NHS was supported by NIH grants P01 CA87969, UM1 CA186107, and U19 CA148065. The NHS2 was supported by NIH grants UM1 CA176726 and U19 CA148065.
The PBCS was funded by Intramural Research Funds of the National Cancer Institute, Department of Health and Human Services, USA. Genotyping for PLCO was supported by the Intramural Research Program of the National Institutes of Health, NCI, Division of Cancer Epidemiology and Genetics. The PLCO is supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics and supported by contracts from the Division of Cancer Prevention, National Cancer Institute, National Institutes of Health. The SASBAC study was supported by funding from the Agency for Science, Technology and Research of Singapore (A*STAR), the US National Institutes of Health (NIH) and the Susan G. Komen Breast Cancer Foundation. The SBCS was supported by Sheffield Experimental Cancer Medicine Centre and Breast Cancer Now Tissue Bank. SEARCH is funded by Cancer Research UK [C490/A10124, C490/A16561] and supported by the UK National Institute for Health Research Biomedical Research Centre at the University of Cambridge. The University of Cambridge has received salary support for PDPP from the NHS in the East of England through the Clinical Academic Reserve. The Sister Study (SISTER) is supported by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences (Z01-ES044005 and Z01-ES049033). The SMC is funded by the Swedish Cancer Foundation and the Swedish Research Council (VR 2017–00644) grant for the Swedish Infrastructure for Medical Population-based Life-course Environmental Research (SIMPLER). The USRT Study was funded by Intramural Research Funds of the National Cancer Institute, Department of Health and Human Services, USA. The WHI program is funded by the National Heart, Lung, and Blood Institute, the US National Institutes of Health and the US Department of Health and Human Services (HHSN268201100046C, HHSN268201100001C, HHSN268201100002C, HHSN268201100003C, HHSN268201100004C and HHSN271201100004C).
This work was also funded by NCI U19 CA148065-01.
Ethics approval and consent to participate
Each participating study obtained informed consent from the participants and was approved by their local ethics committee.
Consent for publication
Each participating study obtained informed consent from the participants to publish and was approved by their local ethics committee.
The authors do not have any conflict of interest.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
A genome-wide gene-environment interaction study of breast cancer risk for women of European ancestry. Supplementary Table 1: Participating studies with number of total cases and controls per study. Supplementary Table 2: Detailed information of the characteristics of the study population by study design and case-control status. Supplementary Table 3: Associations of epidemiological risk factors for overall and ER-specific subtype breast cancer risk in population-based and cohort studies. Supplementary Table 4: Stratified analysis results for genome-wide significant interaction results by categories of risk factors. Supplementary Figure 1: Quantile-Quantile (Q-Q) plots of genome-wide interaction of A) Adult height on overall breast cancer risk and B) Age at menarche on ER+ breast cancer risk. Supplementary Figure 2: Frailty-scale heritability explained by G×E interaction on overall and estrogen receptor positive breast cancer risk. Supplementary Figure 3: Regional association plot for the interaction analyses between SNP rs80018847 and adult height for overall breast cancer risk. Supplementary Figure 4: Regional association plot for the interaction analyses between SNP rs4770552 and age at menarche for ER+ breast cancer risk. Supplementary Figure 5: Power (x-axis) to detect gene-environment interaction odds ratio (y-axis) at different minor allele frequencies (0.01 to 0.5: legend below) for a 1:1 unmatched case-control study at different sample sizes (N = 40,000 to 120,000 in increments of 10,000). Power calculations were performed with Quanto 1.2.4, assuming a log-additive model with a SNP marginal effect estimate of 1.10, a marginal effect estimate of the environmental risk factor of 1.20, and a two-sided alpha of 5 × 10⁻⁸. We also assumed a 15% prevalence of the environmental risk factor and a 1% prevalence of the disease.
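To make the Supplementary Figure 5 settings concrete, the following is a minimal sketch of an approximate power calculation for a G×E term under those assumptions. It is not Quanto's algorithm: it uses a Wald-test approximation based on the expected Fisher information of a logistic model in a case-control sample. The function names `invert` and `gxe_power` are hypothetical, and the sketch additionally assumes a binary exposure, Hardy-Weinberg genotype frequencies, gene-environment independence in the population, and that the odds ratios of 1.10 and 1.20 act as main effects in the joint model rather than as marginal effects.

```python
import math
from statistics import NormalDist


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def gxe_power(n_cases, n_controls, maf, or_int, or_g=1.10, or_e=1.20,
              p_e=0.15, prevalence=0.01, alpha=5e-8):
    """Approximate Wald-test power for the G×E term (illustrative assumptions)."""
    b_g, b_e, b_ge = math.log(or_g), math.log(or_e), math.log(or_int)
    # Population cells: Hardy-Weinberg genotypes, exposure independent of genotype.
    geno = {0: (1 - maf) ** 2, 1: 2 * maf * (1 - maf), 2: maf ** 2}
    expo = {0: 1 - p_e, 1: p_e}
    cells = [(g, e, pg * pe) for g, pg in geno.items() for e, pe in expo.items()]

    def risk(a, g, e):
        return 1 / (1 + math.exp(-(a + b_g * g + b_e * e + b_ge * g * e)))

    # Bisection for the intercept that reproduces the assumed disease prevalence.
    lo, hi = -30.0, 10.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if sum(w * risk(mid, g, e) for g, e, w in cells) < prevalence:
            lo = mid
        else:
            hi = mid
    a = (lo + hi) / 2

    # Case-control sampling only shifts the intercept of the logistic model.
    shift = math.log(n_cases / n_controls * (1 - prevalence) / prevalence)
    info = [[0.0] * 4 for _ in range(4)]
    for g, e, w in cells:
        r = risk(a, g, e)
        # Expected number of sampled subjects (cases plus controls) in this cell.
        m = (n_cases * w * r / prevalence
             + n_controls * w * (1 - r) / (1 - prevalence))
        p = risk(a + shift, g, e)  # P(case | g, e) within the sample
        x = (1.0, g, e, g * e)
        for i in range(4):
            for j in range(4):
                info[i][j] += m * p * (1 - p) * x[i] * x[j]

    se = math.sqrt(invert(info)[3][3])  # standard error of the interaction term
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    ncp = abs(b_ge) / se
    return nd.cdf(ncp - z) + nd.cdf(-ncp - z)


if __name__ == "__main__":
    for n in (40_000, 80_000, 120_000):  # total sample size, 1:1 cases to controls
        print(n, round(gxe_power(n // 2, n // 2, maf=0.3, or_int=1.2), 3))
```

Because the approximation treats the exposure as binary and the effects as conditional, its output should be read as an order-of-magnitude guide and will not reproduce the Quanto curves exactly.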
Cite this article
Middha, P., Wang, X., Behrens, S. et al. A genome-wide gene-environment interaction study of breast cancer risk for women of European ancestry. Breast Cancer Res 25, 93 (2023). https://doi.org/10.1186/s13058-023-01691-8
- Breast cancer
- Gene-environment interactions
- Genetic epidemiology
- European ancestry