Cigarette smoking, cytochrome P4501A1 polymorphisms, and breast cancer among African-American and white women

Introduction Previous epidemiologic studies suggest that women with variant cytochrome P4501A1 (CYP1A1) genotypes who smoke cigarettes are at increased risk for breast cancer. Methods We evaluated the association of breast cancer with CYP1A1 polymorphisms and cigarette smoking in a population-based, case–control study of invasive breast cancer in North Carolina. The study population consisted of 688 cases (271 African Americans and 417 whites) and 702 controls (285 African Americans and 417 whites). Four polymorphisms in CYP1A1 were genotyped using PCR/restriction fragment length polymorphism analysis: M1 (also known as CYP1A1*2A), M2 (CYP1A1*2C), M3 (CYP1A1*3), and M4 (CYP1A1*4). Results No associations were observed for CYP1A1 variant alleles and breast cancer, ignoring smoking. Among women who smoked for longer than 20 years, a modest positive association was found among women with one or more M1 alleles (odds ratio [OR] = 2.1, 95% confidence interval [CI] = 1.2–3.5) but not among women with non-M1 alleles (OR = 1.2, 95% CI = 0.9–1.6). Odds ratios for smoking longer than 20 years were higher among African-American women with one or more M3 alleles (OR = 2.5, 95% CI = 0.9–7.1) compared with women with non-M3 alleles (OR = 1.3, 95% CI = 0.8–2.2). ORs for smoking in white women did not differ appreciably based upon M2 or M4 genotypes. Conclusions Cigarette smoking increases breast cancer risk in women with CYP1A1 M1 variant genotypes and in African-American women with CYP1A1 M3 variant genotypes, but the modifying effects of the CYP1A1 genotype are quite weak.


Introduction
Cigarette smoking is a major route of exposure to many potential human carcinogens. Cigarette smoking has been associated with increased risk of breast cancer in some epidemiologic studies, but many studies showed no effect or an inverse association [1,2]. The most consistent finding appears to be a weak positive association following a long duration of smoking [3]. In an attempt to resolve these inconsistencies, recent epidemiologic studies have focused on interactions between cigarette smoking and genetic factors involved in the metabolism of tobacco-related carcinogens [4]. If interactions are observed between smoking and genes involved in the metabolism of specific compounds, a stronger case can be made that associations between smoking and breast cancer are causal and not due entirely to chance or to bias.
The cytochrome P4501A1 (CYP1A1) gene encodes an enzyme with aryl hydrocarbon hydroxylase activity. Formation of aryl epoxides by aryl hydrocarbon hydroxylase is the first step in the metabolism of polycyclic aromatic hydrocarbons from cigarette smoke [5]. The activity of aryl hydrocarbon hydroxylase encoded by the CYP1A1 gene has been observed in both normal and neoplastic human breast epithelium [6,7]. Some studies suggest that heterocyclic amines are activated by CYP1A1 via N-hydroxylation in breast tissue [8]. Four common polymorphisms of the CYP1A1 gene have been identified: M1, a T → C substitution at nucleotide 3801, giving rise to a MspI restriction site in the 3'-noncoding region [9]; M2, nucleotide 2455 A → G, resulting in an amino acid change at codon 462 of isoleucine to valine within the heme-binding domain of exon 7 [10]; M3, nucleotide 3205 T → C, creating a MspI restriction fragment length polymorphism (RFLP) in the 3'-noncoding region [11]; and M4, nucleotide 2453 C → A, resulting in an amino acid substitution at codon 461 of threonine to asparagine [12].
Abbreviations: bp = base pairs; CBCS = Carolina Breast Cancer Study; CI = confidence interval; CYP1A1 = cytochrome P4501A1; ETS = environmental tobacco smoke; ICR = interaction contrast ratio; OR = odds ratio; PCR = polymerase chain reaction; RFLP = restriction fragment length polymorphism.
The functional significance of variant CYP1A1 genotypes is unclear. Studies of CYP1A1 in cultured human lymphocytes showed significantly elevated levels of inducible enzyme activity among M2 genotypes compared with the wild-type genotype [13-15]. Crofts and colleagues [15] reported that M2 alleles appeared to be associated with CYP1A1 inducibility at the level of transcription, followed by a threefold elevation in aryl hydrocarbon hydroxylase enzyme activity. The M1 allele was also reported to be more readily inducible than the CYP1A1 wild-type allele [14,16,17].
Several epidemiologic studies evaluated the relationship between cigarette smoking, CYP1A1 polymorphisms, and breast cancer risk [18][19][20][21][22]. Some studies reported that M1 and M2 variants increase risk of breast cancer [18,21], while other studies did not observe main effects for CYP1A1 variants [19,21]. Joint effects of smoking and CYP1A1 variants on breast cancer risk have been reported in two studies [18,22].
In a population-based, case-control study of African-American and white women in North Carolina, we examined the association of the CYP1A1 genotypes and cigarette smoking and breast cancer. We hypothesized that women with CYP1A1 variant alleles and high levels of smoking exposure (longer duration, higher dose, earlier age at initiation) as well as exposure to environmental tobacco smoke (ETS) would be at increased risk for breast cancer.

Study design and participants
The Carolina Breast Cancer Study (CBCS) is a population-based, case-control study of breast cancer in African-American and white women in North Carolina [23,24]. The cases were women with a first diagnosis of invasive breast cancer identified through a rapid ascertainment system implemented in cooperation with the North Carolina Central Cancer Registry. The coverage rate of the Central Cancer Registry for incident breast cancer cases was 97% [23]. Controls were selected from lists provided by the North Carolina Division of Motor Vehicles for women younger than 65 years old, and from records of the US Health Care Financing Administration for women 65-74 years of age. Coverage rates for the underlying North Carolina population were 96% for the Division of Motor Vehicles list and 93% for the Health Care Financing Administration list [23].
The CBCS was conducted in two phases: phase 1 (1993-1996) and phase 2 (1996-2001). The present analysis is based upon phase 1 participants, where the response rates were 76% for cases and 55% for controls. Detailed methods have been reported previously [23]. Interview data included reproductive history, lifestyle factors, a detailed family history, medical history, and occupational history. Approximately 98% of participants who were interviewed agreed to give a 30 ml blood sample at the time of interview. Informed consent to obtain genomic DNA from the blood was sought using a form approved by the Institutional Review Board of the University of North Carolina, School of Medicine and in compliance with the Helsinki Declaration. For the present study, we genotyped CYP1A1 on the first 688 breast cancer cases and 702 controls enrolled in phase 1 of the CBCS. Due to financial constraints, we were unable to genotype all study participants. The genotyped participants correspond to 80% of the 861 total cases and 89% of the 790 total controls enrolled in phase 1 of the CBCS.

Laboratory methods
Genomic DNA was extracted from peripheral blood leukocytes using an automated DNA extractor (Applied Biosystems, Foster City, CA, USA) and was stored at 4°C. PCR-RFLP assays were designed to detect each of the variant CYP1A1 alleles. Several systematic nomenclatures for CYP1A1 have been proposed [25]. Wild-type CYP1A1 has been referred to as CYP1A1*1 or CYP1A1*1A, M1 as CYP1A1*2 or CYP1A1*2A, M2 as CYP1A1*2B (for the combination of M1 + M2) or CYP1A1*2C (for M2 only), M3 as CYP1A1*3, and M4 as CYP1A1*4. To avoid confusion, we have retained the M1-M4 designations, as recently adopted by Bartsch and colleagues [25]. The variant alleles M2 and M4 lose BsaI and BsrDI restriction sites at nucleotides 4889 and 4887, respectively. An internal control for the completeness of digestion was created by introducing a restriction-enzyme site in the PCR primers designed for genotyping of M2 and M4 alleles. The variant alleles M1 and M3 contain MspI restriction sites at nucleotides 6235 and 5639, respectively. These restriction sites do not exist on the wild-type allele.
PCR products of length 214 bp were produced after amplification and 4 µl PCR products were subjected to BsrDI (for M2) or BsaI (for M4) digestion (New England Biolabs, Beverly, MA, USA) at 37°C in a total volume of 13 µl. PCR products from the wild-type allele were digested and separated on a 15% polyacrylamide gel with bands of 149 bp and 55 bp fragments. M2 or M4 variant alleles were undigested at polymorphic sites, and appeared as a band of 206 bp on the electrophoresis. The fragment of 214 bp represented incompletely digested PCR products, which could be distinguished from the band for M2 or M4 variant alleles on a 15% polyacrylamide gel.
PCR primers used to determine M1 and M3 alleles included the forward primer 5'-GGCTGAGCAATCTGACCCTA-3' and the reverse primer 5'-GGCCCCAACTACTCAGAGGCT-3'. Reaction components were the same as for the PCR for genotyping of M2 and M4 alleles. Amplification conditions included: 95°C for 5 min, 64°C for 2 min, and 75°C for 2 min for the first cycle; 95°C for 1 min, 64°C for 2 min, and 75°C for 2 min for the following 33 cycles; and 95°C for 1 min and 72°C for 10 min for the final cycle. PCR products of length 739 bp were produced and 8 µl PCR products were subjected to MspI/SphI digestion in a total volume of 25 µl. After 15% polyacrylamide gel electrophoresis, the separated bands of size 408 bp and 362 bp represented the wild-type and M1 alleles, respectively; and the bands of size 331 bp and 226 bp were determined as the wild-type and M3 alleles, respectively.
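The reading of M1 and M3 gel patterns described above can be sketched as a simple lookup. This is an illustrative sketch only: the function name `call_genotype`, the genotype labels, and the `tolerance` parameter are assumptions made for the example and are not part of the study protocol. Only the diagnostic band sizes (408/362 bp for M1 and 331/226 bp for M3) come from the text.

```python
# Diagnostic band sizes (bp) as stated in the text; all names are hypothetical.
DIAGNOSTIC_BANDS = {
    "M1": {"wt": 408, "variant": 362},
    "M3": {"wt": 331, "variant": 226},
}


def call_genotype(locus, observed_bands, tolerance=5):
    """Return 'wt/wt', 'wt/<locus>' or '<locus>/<locus>' from observed bands.

    observed_bands: iterable of band sizes (bp) read from the gel.
    tolerance: assumed sizing error (bp) allowed when matching a band.
    Returns None when neither diagnostic band is present (no call).
    """
    expected = DIAGNOSTIC_BANDS[locus]
    has_wt = any(abs(b - expected["wt"]) <= tolerance for b in observed_bands)
    has_variant = any(
        abs(b - expected["variant"]) <= tolerance for b in observed_bands
    )
    if has_wt and has_variant:
        return f"wt/{locus}"        # heterozygote: both bands visible
    if has_wt:
        return "wt/wt"              # only the wild-type band
    if has_variant:
        return f"{locus}/{locus}"   # only the variant band
    return None                     # failed digest or unreadable lane


# Example: a lane showing the 408 bp and 362 bp bands is read as wt/M1.
print(call_genotype("M1", [408, 362, 331]))
```

In practice such a call would still be checked against the internal digestion control and by two independent readers, as the following paragraph describes.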
The introduced restriction-enzyme sites within the primers were designed as an internal control in order to assess the completeness of enzymatic digestion. Genotyping results were determined by two independent readers. Readers and laboratory personnel were blinded to the case-control status and other participant characteristics. When interpreting the results, the two readers were unaware of each other's interpretations. All discrepancies in genotyping results between readers were then resolved through group discussion, and agreement was achieved on all samples with discordant genotyping results. Repeat genotyping for 5% of samples (70 subjects) randomly selected from study subjects was performed to evaluate reproducibility. The results were 100% concordant for the repeat samples. Positive controls were also used for genotyping at each CYP1A1 locus.

Statistical analysis
Genotype frequencies were compared in cases versus controls using a chi-square test. When cell sizes were less than five, Fisher's exact test was used. In order to partially address the potential for linkage disequilibrium among CYP1A1 alleles, we cross-classified participants according to combinations of wild-type and variant alleles at each locus. Haplotypes (specific chromosomal combinations of CYP1A1 alleles) cannot be identified directly using PCR-RFLP techniques. For example, gel patterns cannot distinguish cis versus trans relationships for M1 and M3 alleles. CYP1A1 haplotype frequencies were thus estimated using the EH algorithm [26]. Haplotype frequencies in cases and controls were compared using a likelihood ratio test, as implemented in the EH algorithm.
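The comparison of genotype frequencies can be illustrated for a 2 × 2 table (for example, variant carriers versus non-carriers among cases and controls). The sketch below uses only the Python standard library. The function names are hypothetical, and the rule "any observed cell below five" is our reading of the text; the study's own software and exact decision rule are not specified beyond what is stated above.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 df) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value (sum of tables no more likely than observed)."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # small slack for rounding
    )


def compare_genotype_frequencies(a, b, c, d):
    """Use Fisher's exact test if any cell is below five, else chi-square."""
    if min(a, b, c, d) < 5:
        return "fisher", fisher_exact_2x2(a, b, c, d)
    return "chi-square", chi_square_2x2(a, b, c, d)[1]


# Hypothetical counts, not study data.
print(compare_genotype_frequencies(10, 20, 20, 10))
```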
Information concerning exposure to tobacco smoke was obtained from in-person interviews. The reference date was the date of diagnosis for cases and the date of selection for controls. To calculate the odds ratios (ORs) for active smoking, women who smoked fewer than 100 cigarettes in their lifetime constituted the referent group (nonsmokers). Women who never actively smoked but who had lived with a smoker after the age of 18 years were classified as exposed to ETS. The duration of active smoking was calculated by asking 'Keeping in mind that you may have stopped and started several times, overall how many years did you smoke regularly?' The dose of smoking was calculated in packs per day (20 cigarettes to a pack).
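The exposure definitions above amount to a small set of rules, sketched here for concreteness. The function names and category labels are hypothetical; only the thresholds (100 lifetime cigarettes, household exposure after age 18, 20 cigarettes per pack) are taken from the text.

```python
def classify_smoking(lifetime_cigarettes, lived_with_smoker_after_18):
    """Assign an exposure category using the definitions given in the text."""
    if lifetime_cigarettes >= 100:
        return "active smoker"
    # Fewer than 100 lifetime cigarettes: nonsmoker, possibly ETS-exposed.
    if lived_with_smoker_after_18:
        return "nonsmoker, ETS exposed"
    return "nonsmoker, unexposed"


def packs_per_day(cigarettes_per_day):
    """Convert cigarettes per day to packs per day (20 cigarettes to a pack)."""
    return cigarettes_per_day / 20


print(classify_smoking(50, True))   # nonsmoker, ETS exposed
print(packs_per_day(30))            # 1.5
```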
The ORs for active smoking and the CYP1A1 genotype did not differ when women with exposure to ETS were excluded from the referent group. Thus, to increase sample size, ORs for active smoking were calculated including women with ETS exposure in the referent group. For analyses of smoking and the CYP1A1 genotype on a multiplicative scale, analyses of ETS were conducted among women who never smoked themselves, using women exposed to neither active smoking nor ETS as a referent group [24].
Smoking information was missing for 10 cases and for seven controls.
The ORs and 95% confidence intervals (CIs) were used to measure associations between CYP1A1 genotypes and breast cancer and between smoking status and breast cancer, using unconditional logistic regression models. PROC GENMOD of the software package SAS (version 8.1; SAS Institute, Cary, NC, USA) was used to incorporate offsets derived from sampling probabilities used to identify eligible participants. Covariates included age, race (white/African American), age at menarche (≥ 12 years or < 12 years), parity (nulliparous, 1 and ≥ 2), family history of breast cancer (yes/no), benign breast biopsy (yes/no), and alcohol consumption (yes/no). ORs did not differ after adjusting for additional covariates, so results are presented adjusting for sampling fractions, age, and race (when appropriate).
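For readers less familiar with these measures, the following sketch computes a crude (unadjusted) OR with a Wald-type 95% CI from a 2 × 2 table. It is a simplification and not the method used here: the study's estimates come from unconditional logistic regression with sampling offsets and covariate adjustment, which this example does not reproduce. The function name and the counts are hypothetical.

```python
import math


def crude_odds_ratio(exposed_cases, unexposed_cases,
                     exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio with a Wald (Woolf) confidence interval.

    Returns (odds_ratio, lower, upper). All four counts must be positive.
    """
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not study data.
print(crude_odds_ratio(20, 10, 10, 20))
```

An interval that includes 1.0 is compatible with no association at the conventional 0.05 level.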
Analyses of smoking effects according to CYP1A1 genotypes were conducted using the categories smoking status (current, former), duration, dose, time since cessation, and age at initiation. These categories and cut points were created in previous analyses that ignored the CYP1A1 genotype [24], and they represent aspects of smoking exposure that showed the strongest associations with breast cancer in the entire dataset. Stratified analyses were performed based on menopausal status, since previous analyses showed stronger effects for smoking in postmenopausal compared with premenopausal women [24]. Women were classified as postmenopausal if they had undergone natural menopause, bilateral oophorectomy, or irradiation to the ovaries. Natural menopause was defined as the cessation of regular (or approximately monthly) menstrual cycles. Women in the transition (perimenopausal) period were classified as postmenopausal. For women aged 50 years or older, postmenopausal status was assigned to those who had not stopped cycling but were taking hormone replacement therapy.
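The menopausal status rules above can be written as a single decision function. This is a sketch of the stated rules only; the function and parameter names are hypothetical, and how each item was ascertained at interview is not described in this section.

```python
def is_postmenopausal(natural_menopause=False, bilateral_oophorectomy=False,
                      ovarian_irradiation=False, perimenopausal=False,
                      age=None, still_cycling=True, on_hrt=False):
    """Apply the classification rules described in the text."""
    # Natural menopause, bilateral oophorectomy, or ovarian irradiation.
    if natural_menopause or bilateral_oophorectomy or ovarian_irradiation:
        return True
    # Women in the perimenopausal transition are grouped with postmenopausal.
    if perimenopausal:
        return True
    # Women aged 50 or older who are still cycling but take hormone
    # replacement therapy are also classified as postmenopausal.
    if age is not None and age >= 50 and still_cycling and on_hrt:
        return True
    return False


print(is_postmenopausal(age=52, still_cycling=True, on_hrt=True))   # True
print(is_postmenopausal(age=45, still_cycling=True, on_hrt=True))   # False
```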
To assess interaction on a multiplicative scale, ORs for smoking were estimated across strata of CYP1A1 genotypes, and separate logistic models with interaction terms between smoking and the CYP1A1 genotypes were analyzed. To estimate independent and joint effects of cigarette smoking and the CYP1A1 genotype on an additive scale, indicator variables were created for each category of joint exposure of smoking variable and the CYP1A1 genotype (variant allele/variant allele or variant allele/wild-type allele). Women with the homozygous wild-type allele genotypes (wild-type allele/wild-type allele) and the lowest level of exposure to smoking variable were used as a common reference group.
Interaction contrast ratios (ICR) and CIs were calculated as described by Hosmer and Lemeshow [27] for the joint effects of the M1 locus and smoking. Data were too sparse to calculate ICRs for the remaining CYP1A1 loci. The ICR was calculated using the following formula [27]: $\mathrm{ICR} = \mathrm{OR}_{11} - \mathrm{OR}_{10} - \mathrm{OR}_{01} + 1$, where $\mathrm{OR}_{11}$ is the odds ratio for participants with smoking exposure and variant-containing CYP1A1 genotype, $\mathrm{OR}_{10}$ is the odds ratio for the variant CYP1A1 genotype among those unexposed to smoking, and $\mathrm{OR}_{01}$ is the odds ratio for smoking exposure among those with the nonvariant CYP1A1 genotype. ICRs greater than zero imply greater than additive effects of smoking and the CYP1A1 genotype (synergy), ICRs of zero imply additive effects (no interaction on an additive scale), and ICRs less than zero imply less than additive effects (antagonism). The 95% CIs for ICRs that do not contain zero can be interpreted as statistically significant at an alpha level of 0.05.
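A short numerical sketch of the ICR may help. The point estimate follows the formula above directly. The confidence interval uses a generic delta-method approximation, assuming the three odds ratios come from indicator variables in one logistic model and that the covariance matrix of their log odds ratios is available; this is offered in the spirit of [27] and is not claimed to be the exact procedure used in the study. All names and numbers are hypothetical.

```python
import math


def interaction_contrast_ratio(or11, or10, or01):
    """ICR = OR11 - OR10 - OR01 + 1."""
    return or11 - or10 - or01 + 1


def icr_confidence_interval(or11, or10, or01, cov, z=1.96):
    """Delta-method CI for the ICR.

    cov: 3x3 covariance matrix of the log odds ratios, ordered (11, 10, 01).
    """
    icr = interaction_contrast_ratio(or11, or10, or01)
    # Gradient of the ICR with respect to each log odds ratio.
    gradient = (or11, -or10, -or01)
    variance = sum(
        gradient[i] * cov[i][j] * gradient[j]
        for i in range(3) for j in range(3)
    )
    half_width = z * math.sqrt(variance)
    return icr - half_width, icr + half_width


# Hypothetical odds ratios and covariance matrix, not study data.
or11, or10, or01 = 3.0, 1.5, 2.0
cov = [[0.04, 0.0, 0.0], [0.0, 0.04, 0.0], [0.0, 0.0, 0.04]]
print(interaction_contrast_ratio(or11, or10, or01))  # 0.5
print(icr_confidence_interval(or11, or10, or01, cov))
```

In this made-up example the ICR is positive, but the interval spans zero, so the departure from additivity would not be statistically significant at the 0.05 level.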

Results
Characteristics of breast cancer cases and controls, and ORs for breast cancer and smoking as well as other risk factors, have been published previously for the CBCS [24]. Briefly, no association was observed between current smoking and breast cancer. Smoking of 20 years or longer duration and cessation of smoking within 3 years of the reference date among former smokers showed modest associations with breast cancer [24]. These associations were stronger among postmenopausal compared with premenopausal women. The ORs for smoking were similar in African-American and white women.
The ORs for CYP1A1 genotypes and breast cancer are presented in Table 2 for African-American and white women. Although a slight positive association was observed among African-American women for the M1/M1 genotype and breast cancer, and among white women for the M2/M2 genotype and breast cancer, the results are not statistically significant. The remaining ORs were close to 1.0, with the exception of an inverse association for M4 alleles in African-American women that was very imprecise due to small numbers. The ORs for CYP1A1 genotypes in premenopausal and postmenopausal women are presented in Table 3 and were also close to 1.0. ORs were unchanged after adjustment for family history, reproductive history, alcohol, smoking, or other breast cancer risk factors (data not shown).
The ORs for smoking and breast cancer in relation to CYP1A1 M1 genotypes are presented in Table 4. Among current smokers, the ORs were close to 1.0 for women with M1 (wt/M1 or M1/M1) as well as wild-type genotypes. ORs for former smoking, for smoking for longer than 20 years, for cessation of smoking within 10 years prior to the reference date, for initiation of smoking before age 18, and for exposure to ETS were higher among women with M1 genotypes compared with wild-type genotypes. The ORs were elevated further among carriers of one or more copies of the M1 allele for former smoking and for smoking cessation within 10 years in postmenopausal women, and for initiation of smoking prior to age 18 and for ETS exposure in premenopausal women. The ORs for the number of cigarettes per day did not show a dose-response relationship. The ORs for smoking and breast cancer according to the M3 genotype in African-American women only are presented in Table 6, due to the low frequency of the M3 allele in white women. Cessation of smoking within 10 years was more strongly associated with breast cancer in women with M3 genotypes than with wild-type genotypes. ORs for smoking for longer than 20 years and for initiation of smoking before age 18 were higher in women with M3 genotypes than with wild-type genotypes. The ORs for the number of cigarettes smoked per day did not differ according to M3 genotype. Results stratified on menopausal status were very imprecise. In premenopausal women, the OR for initiation of smoking before age 18 was 7.3 (95% CI = 0.7-73.0) for women with M3 genotypes and was 1.6 (95% CI = 0.7-3.5) for wild-type genotypes. In postmenopausal women, the OR for smoking for longer than 20 years was 3.0 (95% CI = 0.9-10.7) for women with M3 genotypes and was 1.3 (95% CI = 0.7-2.5) for wild-type genotypes.
The ORs for smoking and breast cancer according to the M4 genotype in white women are presented in Table 7, due to the lower frequency of the M4 allele in African-American women. The ORs were imprecise due to the low frequency of the M4 allele, and did not differ appreciably by M4 genotype. A suggestion of an inverse association with long duration of smoking was observed for women with M4 genotypes while a weak positive association was observed for wild-type genotypes.
Joint effects of smoking and the M1 genotype are presented on an additive scale in Table 8. An indication of greater than additive joint effects was observed for M1 genotypes and former smoking, number of cigarettes smoked, duration of smoking, cessation of smoking within 10 years, and initiation of smoking before age 18. The ICRs were highest for early age at initiation of smoking in premenopausal women, and for cessation of smoking within 10 years in both premenopausal and postmenopausal women.

Discussion
We conducted a population-based, case-control study of invasive breast cancer in relation to cigarette smoking and CYP1A1 polymorphisms in African-American and white women in North Carolina. ORs for all four CYP1A1 variants (M1, M2, M3 and M4) were close to the null value in each subgroup examined (African-American and white women, premenopausal and postmenopausal women). Two previous studies reported positive associations between CYP1A1 variants and breast cancer. Taioli and colleagues [21] found a moderate to strong association for M1 genotypes among African-American women. A weak positive association for M2 genotypes among Caucasians was reported by Ambrosone and colleagues [18]. Two other studies, by Ishibe and colleagues [22] and by Bailey and colleagues [20], did not find an association between CYP1A1 variants and breast cancer. The ORs for CYP1A1 genotypes in the latter two studies were similar to those of the CBCS (ignoring smoking). The previous studies [18,20-22] were smaller than the CBCS.
Most previous studies of CYP1A1 polymorphisms and breast cancer categorized smoking as ever or never, and did not investigate the effects of dose, duration, age at initiation, time since cessation, or exposure to ETS. The study by Ishibe and colleagues was the only one to evaluate the interaction between former smoking and M1 genotype, and the authors observed no difference in breast cancer risk for former smoking according to the M1 genotype [22]. Our results suggest that the CYP1A1 M1 genotype modifies the association between cigarette smoking and breast cancer risk among former smokers. Joint effects of the M1 genotype and former smoking were most evident among those who quit within 10 years of the reference date. ORs for the duration, age at initiation of smoking, and ETS exposure were also stronger in women with M1 genotypes, especially postmenopausal women. Ambrosone and colleagues [18] showed that the CYP1A1 M1 genotype was associated with an increased risk among lighter smokers, but the analysis was based only upon pack-years and did not address dose, duration, or age at initiation. We observed a stronger association among women with M1 alleles who started smoking at an early age (< 18 years), in contrast to Ishibe and colleagues [22] who found no interaction between M1 genotypes and age at initiation of smoking.
We did not observe strong modification of the ORs for smoking according to M2 genotypes. Ishibe and colleagues [22] found that women with the M2 genotypes showed a stronger association for early age at initiation of smoking compared with women with wild-type genotypes.
The study by Bailey and colleagues [20] is the only one to evaluate the effect of the M3 genotype on ORs for smoking and breast cancer. In a study that included African-American women, the authors [20] did not observe modification of the ORs for smoking by M3 genotypes, but smoking was categorized as ever versus never. For African-American women with M3 genotypes, we observed a stronger association for cessation of smoking within 10 years of the reference date, for smoking for longer than 20 years, and for initiation of smoking before age 18. These OR estimates are imprecise, and additional studies of M3 genotype and breast cancer in African-American women are needed.
There are several limitations to our study. The participation rate in controls was low (55%) and could have led to biased estimates of effect. In a previous publication [28], we addressed the potential for selection bias using information from partial interviews conducted on persons who refused to participate in the CBCS. Eligible participants who refused were similar to full participants for most breast cancer risk factors [28], and the prevalence of smoking among controls in the CBCS was similar to previous surveys of the North Carolina population [24]. Participation in the CBCS is unlikely to be related to the CYP1A1 genotype, and therefore any bias in ORs for the CYP1A1 genotype, smoking, or the joint effects of these exposures is likely to be towards the null. Due to financial constraints, we were only able to genotype 80% of phase 1 CBCS cases and 89% of controls. ORs for smoking and other exposures, as well as distributions of these variables, did not differ between study participants with CYP1A1 genotype information and those without (data not shown); the results presented here are therefore likely to be representative of the entire CBCS.
As in previous studies of CYP1A1, RFLP-based laboratory assays do not identify the phase of specific chromosomal combinations of alleles within the CYP1A1 locus (haplotypes). We thus estimated haplotypes using statistical methods, an indirect measurement. Based upon estimated haplotypes, it appears that the effects of the M1 and M3 alleles can be estimated independently in African-American women, since few participants (< 0.01%) appeared to carry the M1 + M3 haplotype. Among white women, some of the effects of M1 may be due to M2, since a small number (4%) of participants carried the M1 + M2 haplotype. Some of the effects of M2 may be due to M4, and vice versa, since the two loci are in close physical proximity. In our study population, the majority of participants with M4 were wild-type at M2, and neither allele was associated with breast cancer risk.
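To make the idea of statistical haplotype estimation concrete, the sketch below applies a generic expectation-maximization (EM) procedure to two biallelic loci. It is not the EH program [26] and is not claimed to match its implementation; it only illustrates why unphased genotypes leave double heterozygotes ambiguous (cis versus trans) and how that ambiguity can be resolved probabilistically. The function name and the genotype counts are hypothetical.

```python
def estimate_haplotype_frequencies(genotype_counts, iterations=200):
    """EM estimate of two-locus haplotype frequencies from unphased genotypes.

    genotype_counts[i][j] is the number of people carrying i variant alleles
    at locus A and j variant alleles at locus B (i, j in 0..2).
    Returns frequencies keyed by (a, b), where 1 marks the variant allele.
    Assumes at least one person is present in the table.
    """
    n_people = sum(sum(row) for row in genotype_counts)
    freqs = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}  # flat start
    for _ in range(iterations):
        counts = {h: 0.0 for h in freqs}
        for i in range(3):
            for j in range(3):
                n = genotype_counts[i][j]
                if n == 0:
                    continue
                if i == 1 and j == 1:
                    # Double heterozygotes: cis (11 + 00) or trans (10 + 01).
                    cis = freqs[(1, 1)] * freqs[(0, 0)]
                    trans = freqs[(1, 0)] * freqs[(0, 1)]
                    total = cis + trans
                    share_cis = cis / total if total > 0 else 0.5
                    counts[(1, 1)] += n * share_cis
                    counts[(0, 0)] += n * share_cis
                    counts[(1, 0)] += n * (1 - share_cis)
                    counts[(0, 1)] += n * (1 - share_cis)
                else:
                    # At least one locus is homozygous, so phase is known.
                    a_alleles = [1] * i + [0] * (2 - i)
                    b_alleles = [1] * j + [0] * (2 - j)
                    for a, b in zip(a_alleles, b_alleles):
                        counts[(a, b)] += n
        freqs = {h: c / (2 * n_people) for h, c in counts.items()}
    return freqs


# Hypothetical genotype table with no double heterozygotes.
table = [[50, 20, 0], [20, 0, 0], [10, 0, 0]]
print(estimate_haplotype_frequencies(table))
```

When no participant is heterozygous at both loci, as in this example, the estimate reduces to direct haplotype counting; the EM step matters only for the phase-ambiguous genotypes.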
We addressed confounding by most known risk factors for breast cancer, but confounding by unmeasured factors cannot be ruled out. Misclassification of self-reported smoking status is possible. However, it is unlikely that residual confounding or misclassification of smoking would be differential by the CYP1A1 genotype. Information on ETS exposure was limited to exposure at home without measurements of workplace or leisure activity. Failure to fully measure ETS and to remove women with such exposure from the referent group would lead to underestimates (rather than overestimates) of smoking effects.
An additional limitation to our study is the problem of multiple comparisons. We estimated ORs for many aspects of smoking in premenopausal and postmenopausal women, and some or all of the observed associations could be due to chance. We did not base our analysis upon P values and thus did not adjust for multiple comparisons, but instead compared the magnitude of ORs across categories of smoking exposure and genotype. Our results appear to be consistent with some previous epidemiologic studies as well as current knowledge about the biologic effects of CYP1A1 alleles. Although our study is the largest to date among African-American women, many of the OR estimates were imprecise, and none of the ICRs were statistically significant. Additional studies with a larger sample size as well as data pooling across studies are needed to determine whether some or all of the ORs and interactions in our study and previous studies may be due to chance.

Conclusions
Our results suggest that CYP1A1 M1-containing and M3-containing genotypes increase the risk of breast cancer associated with a long duration (> 20 years) of cigarette smoking, but the effects of the CYP1A1 genotype appear to be quite weak. Additional information on the functional characteristics of CYP1A1 alleles is needed, especially within breast tissue, to address the biologic plausibility of our findings. Since CYP1A1 is involved in activation of polycyclic aromatic hydrocarbons, our results lend support to the hypothesis that polycyclic aromatic hydrocarbon exposure is associated with increased risk of breast cancer. Polycyclic aromatic hydrocarbon-DNA adducts are formed within breast tissue and have been associated with increased breast cancer risk [29].
Future studies of smoking and breast cancer need to address the role of a variety of genetic polymorphisms involved in the metabolism of polycyclic aromatic hydrocarbons, heterocyclic amines and other compounds found in tobacco smoke. Large studies and data pooling will be required to disentangle the complex effects of smoking. Such studies are important since smoking may represent a modifiable risk factor for breast cancer.