The EPIC study is an ongoing multi-center prospective cohort study investigating the association of diet, lifestyle, genetic and environmental factors with the development of cancer and other chronic diseases. It consists of 521,448 men and women, followed up for cancer incidence and cause-specific mortality for several decades. There are 23 EPIC centers in 10 European countries: Denmark, France, Germany, Greece, Italy, the Netherlands, Norway, Spain, Sweden, and the United Kingdom. Details have been described elsewhere. At enrollment between 1992 and 2000, information on habitual diet in the preceding year was collected through a questionnaire in most countries. Lifestyle questionnaires were used to collect information on education, reproductive history, use of oral contraceptives and hormone therapy, family history, medical history, physical activity, and history of alcohol and tobacco consumption.
This study pertains to female participants of the EPIC cohort who were aged 25 to 70 years at recruitment. We excluded participants with a prior history of cancer, incomplete dietary or non-dietary information, or poorly completed questionnaires, as judged by their ratio of energy intake to energy expenditure (bottom 1% or top 1% of the cohort), leaving 335,060 women.
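The exclusion of extreme energy intake to expenditure ratios can be illustrated with a short sketch. The function name `trim_extreme_ratios` is hypothetical, and the percentile interpolation rule (`method="inclusive"`) is an assumption; the text above does not state how the 1% cut-offs were computed.

```python
from statistics import quantiles

def trim_extreme_ratios(ratios, lower_pct=1, upper_pct=99):
    """Return indices of participants whose ratio lies within the
    [lower_pct, upper_pct] percentile range (assumed interpolation rule)."""
    # quantiles(..., n=100) returns the 99 percentile cut points.
    cuts = quantiles(ratios, n=100, method="inclusive")
    low_cut = cuts[lower_pct - 1]
    high_cut = cuts[upper_pct - 1]
    # Keep participants inside the cut-offs; the extremes are dropped.
    return [i for i, r in enumerate(ratios) if low_cut <= r <= high_cut]

# Illustrative data only: 1,000 made-up ratios from 0.01 to 10.00.
example_ratios = [i / 100 for i in range(1, 1001)]
kept = trim_extreme_ratios(example_ratios)
```

With these made-up ratios, the ten lowest and ten highest values fall outside the cut-offs and are dropped.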
All participants provided written informed consent. The study was approved by the ethical review committee of the International Agency for Research on Cancer (IARC) and by the respective local ethical committees.
Diet was assessed using country-specific questionnaires, namely self-administered semi-quantitative food-frequency questionnaires (approximately 260 food items), dietary history questionnaires (>600 food items) administered by interviewers, and semi-quantitative food-frequency questionnaires combined with a food record. Further details on the questionnaires and their validation are described elsewhere.
As the exact structure of the questions varied by center and questionnaire, complete information on caffeinated and decaffeinated coffee intake was available only in Germany, Greece, Italy (except Ragusa and Naples), the Netherlands, Norway, Spain, Sweden (except Umea), and the United Kingdom. Analyses of caffeinated and decaffeinated coffee consumption only included women with complete information on type of coffee intake, that is, those whose different types of coffee intakes added up to their total coffee intake. For caffeinated coffee consumption, 226,368 participants were included. Since none of the participants in Norway and Sweden consumed decaffeinated coffee, they were excluded from analysis for decaffeinated coffee consumption, leaving 176,373 participants. Information on tea intake was not available in Norway, leaving 299,890 participants.
As the cohort consists of multiple populations that vary widely in the volume and concentration of coffee and tea consumed, country-specific quartiles for these beverages were estimated from the distribution of intake within each country, after excluding non-consumers. This yielded the following intake categories for total coffee, caffeinated coffee and tea: none, low, moderately low, moderately high, and high. As decaffeinated coffee intake was less common, we used tertiles of intake among consumers, giving the categories none, low, medium, and high.
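As a minimal sketch of this categorization, the following assigns each participant to a category using quartile cut points computed among consumers within her own country. The function name, the record layout, and the default interpolation rule of `statistics.quantiles` are assumptions made for illustration, not the study's actual procedure.

```python
from statistics import quantiles

LABELS = ["low", "moderately low", "moderately high", "high"]

def categorize_by_country(records):
    """records: list of (country, intake_ml_per_day) tuples.
    Returns one category label per record, in the same order."""
    # Collect intakes of consumers (intake > 0) separately for each country.
    consumers = {}
    for country, intake in records:
        if intake > 0:
            consumers.setdefault(country, []).append(intake)
    # Three cut points per country define the four consumer quartiles.
    cuts = {c: quantiles(values, n=4) for c, values in consumers.items()}
    labels = []
    for country, intake in records:
        if intake <= 0:
            labels.append("none")
        else:
            # The number of cut points exceeded gives the quartile index (0-3).
            index = sum(intake > cut for cut in cuts[country])
            labels.append(LABELS[index])
    return labels

# Illustrative data only: country "B" drinks far smaller volumes than "A".
example = [("A", 0), ("A", 100), ("A", 200), ("A", 300), ("A", 400),
           ("B", 10), ("B", 20), ("B", 30), ("B", 40)]
categories = categorize_by_country(example)
```

In this made-up example, 40 ml/day is "high" in country B while 100 ml/day is "low" in country A, which is the point of ranking intake within each country. The tertiles used for decaffeinated coffee would follow the same pattern with `n=3` and three labels.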
Ascertainment of breast cancer cases
The outcome of interest was first incident primary invasive breast cancer (coded using the International Classification of Diseases for Oncology, Second Edition, as C50.0-C50.9). As data on menopausal status at diagnosis were lacking, breast cancers diagnosed before the age of 50 years, the median menopausal age, were considered premenopausal, whereas those diagnosed at 50 years or older were considered postmenopausal. Information on hormone receptor status was provided by each center based on pathology reports. This information was routinely available for tumors diagnosed after a start year that ranged from 1997 to 2006, depending on the center.
Follow-up was based on linkage with population cancer registries in Denmark, Italy, the Netherlands, Norway, Spain, Sweden and the United Kingdom. In France, Germany and Greece, combined methods including health insurance records, cancer and pathology registries, and active follow-up were used. Censoring dates for most centers depended on the dates at which cancer registries were considered complete (varying from December 2004 in Spain to December 2008 in Italy). In Germany, Greece and France, where active follow-up was undertaken, dates of censoring were up to March 2010, December 2009, and July 2005, respectively. Loss to follow-up was less than 4%.
Multivariable Cox regression was used to examine the association between coffee or tea consumption and risk of breast cancer. Time at entry was age at recruitment, and exit time was age at diagnosis with breast cancer as first tumor, death, emigration, loss to follow-up, or end of follow-up. The near-zero slope of the scaled Schoenfeld residuals on the time function suggested that the proportional hazards assumption was met. All analyses were stratified by age at recruitment in one-year categories and by center to control for differences in recruitment or follow-up procedures and questionnaire design. We studied consumption of beverages as both categorical and continuous (increment of 100 ml/day) variables. Non-consumers of coffee comprised a relatively small group (<10%) and seemed to have some unique health behaviors: they were less likely to have ever smoked, to consume alcohol, or to have ever used oral contraceptives, and they were more likely to be physically inactive than the rest of the study population. We therefore used the low coffee consumers as the reference group in the categorical analysis. To test for linear trends, the categories were entered as a continuous term (score variable: 0, 1, 2, 3, 4) in the Cox model. Since most coffee consumers tend to consume caffeinated as well as decaffeinated coffee, we additionally cross-classified coffee intakes in relation to breast cancer. This yielded eight categories, of which (any) decaffeinated coffee consumers with low caffeinated coffee intake comprised the largest group, which was hence chosen as the reference for reasons of statistical robustness.
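For orientation, the model described above can be written out as follows. The notation is an assumption introduced here for illustration (attained age $a$ as the time scale, strata $s$ formed by center and one-year category of age at recruitment, intake category indicators $x_k$, covariates $z$); it is a sketch of a standard stratified Cox model, not a formula taken from the study.

```latex
% Stratified Cox model on the attained-age time scale (assumed notation).
\[
  h_s(a \mid x, z) = h_{0s}(a)\,
  \exp\!\Big(\sum_{k} \beta_k x_k + \gamma^{\top} z\Big),
  \qquad
  \mathrm{HR}_k = \exp(\beta_k) \ \text{relative to the reference category.}
\]
% Trend test: the indicators x_k are replaced by one score c in {0,1,2,3,4}.
\[
  h_s(a \mid c, z) = h_{0s}(a)\,\exp\!\big(\theta c + \gamma^{\top} z\big),
  \qquad H_0\colon \theta = 0 .
\]
% Continuous analysis: intake v is expressed in units of 100 ml/day.
\[
  h_s(a \mid v, z) = h_{0s}(a)\,\exp\!\big(\delta v + \gamma^{\top} z\big),
  \qquad
  \mathrm{HR}_{\text{per 100 ml/day}} = \exp(\delta).
\]
```

Each stratum has its own baseline hazard $h_{0s}(a)$, so differences between centers and recruitment ages are absorbed without being estimated as coefficients.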
Two separate Cox models were fitted for pre- and postmenopausal breast cancers (Additional file 1). Both models were adjusted for age at menarche (categorical: <12, 12 to 14, >15 years), ever use of oral contraceptives (yes/no), age at first delivery (categorical: nulliparous, <20, 20 to 29, 30 to 39, ≥40 years), ever breastfeeding (yes/no), smoking status (categorical: never, past, current), educational level (categorical: none, primary school, technical/professional school, secondary school, university), physical activity level based on the Cambridge Physical Activity Index (categorical: inactive, moderately inactive, moderately active, active), alcohol intake (continuous), height (continuous), weight (continuous), energy intake from fat sources (continuous), energy intake from non-fat sources (continuous), total saturated fat intake (continuous), and total fiber intake (continuous). The model for postmenopausal breast cancer was additionally adjusted for ever use of postmenopausal hormones (yes/no). Importantly, coffee and tea intake were mutually adjusted for one another, and the models for caffeinated and decaffeinated coffee were also mutually adjusted.
As data on hormone receptor status were available only for breast cancer cases diagnosed after a start year that ranged from 1997 to 2006, depending on the center, receptor status was available for most postmenopausal cases (77.2%) but for only about half of the premenopausal cases (58.6%). Hormone receptor-defined analyses were therefore restricted to postmenopausal breast cancers and were not possible for premenopausal breast cancers.
As a sensitivity analysis, we also analyzed beverage intake using categories based on the overall cohort instead of country-specific intake.
To improve comparability across centers, dietary intake was calibrated by a 24-hour dietary recall method common to all centers, in a random sub-sample of 8% of the cohort at baseline (Additional file 1) [24,25].
Heterogeneity of the association according to hormone receptor status was assessed using a data-augmentation method described by Lunn and McNeil. Effect modification by body mass index and by country was assessed, and several sensitivity analyses were conducted (Additional file 1).
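The data-augmentation step can be sketched as follows, under the usual reading of the Lunn and McNeil approach: each participant contributes one row per tumor subtype, and the event indicator is set only in the row matching the subtype she actually developed. The field names, the two subtype labels, and the interaction column are hypothetical, and the subsequent Cox fit on the augmented data is not shown.

```python
def augment_for_competing_subtypes(subjects, subtypes=("ER+", "ER-")):
    """Duplicate each subject once per subtype (data-augmentation sketch).

    subjects: list of dicts with keys 'id', 'time', 'event', 'exposure',
    where 'event' is a subtype label or None if censored (assumed layout).
    """
    rows = []
    for subject in subjects:
        for k, subtype in enumerate(subtypes):
            rows.append({
                "id": subject["id"],
                "time": subject["time"],
                "subtype": subtype,
                # Indicator for the second subtype (0 for the first).
                "subtype_indicator": k,
                # Event only in the row matching the observed subtype.
                "status": 1 if subject["event"] == subtype else 0,
                "exposure": subject["exposure"],
                # Interaction term; its coefficient captures how the
                # exposure effect differs between the subtypes.
                "exposure_x_subtype": subject["exposure"] * k,
            })
    return rows

# Illustrative data only.
example_subjects = [
    {"id": 1, "time": 61.2, "event": "ER+", "exposure": 3},
    {"id": 2, "time": 70.0, "event": None, "exposure": 1},
    {"id": 3, "time": 58.4, "event": "ER-", "exposure": 2},
]
augmented = augment_for_competing_subtypes(example_subjects)
```

A test that the interaction coefficient equals zero in a Cox model fitted to the augmented rows then serves as the heterogeneity test between subtypes.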
Two-tailed P-values <0.05 and 95% confidence intervals (CI) for hazard ratios (HRs) not including 1 were considered statistically significant. All analyses were performed using SAS version 9.1 (SAS Institute Inc, Cary, NC, USA).
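The significance criterion can be illustrated with a small sketch that turns a log hazard ratio and its standard error into a Wald confidence interval and a two-tailed P-value. The function names and the numbers are hypothetical, and the Wald construction is an assumption; the analyses themselves were run in SAS.

```python
from math import exp
from statistics import NormalDist

def hazard_ratio_ci(log_hr, se, level=0.95):
    """Return (HR, lower, upper) for a Wald interval on the log scale."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return exp(log_hr), exp(log_hr - z * se), exp(log_hr + z * se)

def two_tailed_p(log_hr, se):
    """Two-tailed Wald P-value for the null hypothesis HR = 1."""
    return 2 * (1 - NormalDist().cdf(abs(log_hr / se)))

def excludes_one(lower, upper):
    """True when the interval does not contain the null value HR = 1."""
    return lower > 1 or upper < 1

# Hypothetical estimate, not a study result.
hr, lower, upper = hazard_ratio_ci(log_hr=-0.2, se=0.05)
p_value = two_tailed_p(log_hr=-0.2, se=0.05)
significant = excludes_one(lower, upper) and p_value < 0.05
```

For a Wald test and interval built from the same estimate, the two criteria agree: the 95% interval excludes 1 exactly when the two-tailed P-value is below 0.05.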