The Breast Cancer Family Registry: an infrastructure for cooperative multinational, interdisciplinary and translational studies of the genetic epidemiology of breast cancer

Introduction: The etiology of familial breast cancer is complex and involves genetic and environmental factors such as hormonal and lifestyle factors. Understanding familial aggregation is a key to understanding the causes of breast cancer and to facilitating the development of effective prevention and therapy. To address urgent research questions and to expedite the translation of research results to the clinical setting, the National Cancer Institute (USA) supported in 1995 the establishment of a novel research infrastructure, the Breast Cancer Family Registry, a collaboration of six academic and research institutions and their medical affiliates in the USA, Canada, and Australia.
Methods: The sites have developed core family history and epidemiology questionnaires, data dictionaries, and common protocols for biospecimen collection and processing and pathology review. An Informatics Center has been established to collate, manage, and distribute core data.
Results: As of September 2003, 9116 population-based and 2834 clinic-based families have been enrolled, including 2346 families from minority populations. Epidemiology questionnaire data are available for 6779 affected probands (with a personal history of breast cancer), 4116 unaffected probands, and 16,526 relatives with or without a personal history of breast or ovarian cancer. The biospecimen repository contains blood or mouthwash samples for 6316 affected probands, 2966 unaffected probands, and 10,763 relatives, and tumor tissue samples for 4293 individuals.
Conclusion: This resource is available to internal and external researchers for collaborative, interdisciplinary, and translational studies of the genetic epidemiology of breast cancer. Detailed information can be found on the Breast Cancer Family Registry website.


Introduction
Breast cancer is known to be 'familial', both from the clinical perspective of observing more families with multiple cases than would be expected by chance, and from the population perspective of an increased risk of breast cancer among women with a family history of the disease. The twofold to threefold increased risk to first-degree relatives of affected women is likely to be due to underlying familial factors, both genetic and environmental (for example, hormonal and lifestyle factors), and the risk gradient across these factors must be 20-100-fold or more [1,2]. Understanding the interplay of genetic and environmental causes of familial aggregation is therefore a key to understanding the causes of breast cancer and to facilitating the development of effective prevention and therapy.
Much is yet to be learned about the causes of familial aggregation of breast cancer. Pathogenic mutations in the genes BRCA1 [3] and BRCA2 [4] are associated with large individual increased risks, on the order of 10-20-fold, but, being rare, they explain less than 20% of the increased risk associated with having an affected first-degree relative [1,5]. Less than half the families with three or more affected members in the Breast Cancer Linkage Consortium have segregating deleterious mutations in BRCA1 or BRCA2 [6]. Mutations in TP53 [7], and possibly in the ATM [8] and CHK2 [9] genes, seem to confer moderately increased risks of breast cancer but might explain only a very small proportion of familial aggregation. Breast cancer risk might also be affected by multiple variants in genes involved with hormonal or other etiological pathways, and the variants might be common and have small or modest effects on individual risk. Lifestyle and other known risk factors are unlikely to explain much familial aggregation [2]. The effects of some established lifestyle factors might vary in relation to family history [10] or BRCA1 or BRCA2 mutation status [11]. Thus, familial aspects of breast cancer are complex, potentially involving multiple genes, multiple environmental exposures, and 'gene-environment interactions' [12].
To address many unanswered research questions regarding the etiology of breast cancer and to expedite the translation of research results to affected and at-risk populations, the National Cancer Institute of the USA supported the establishment of a novel international research infrastructure for interdisciplinary and translational studies of the genetic epidemiology of breast cancer. The Breast Cancer Family Registry is a collaboration of six academic and research institutions and their medical affiliates located in the USA, Canada, and Australia. This paper describes the development of the Breast Cancer Family Registry research infrastructure, the resources available to the research community as of September 2003, and many of the possible studies using this resource.

Structure of the Breast Cancer Family Registry
The Breast Cancer Family Registry was established in 1995, with six participating sites from the USA, Canada, and Australia ascertaining families either from cancer registries (identifying population-based families) or seen in clinical and community settings (identifying clinic-based families) (Fig. 1). Population-based families were recruited from the Greater San Francisco Bay area, California, USA, by the Northern California Cancer Center; from the province of Ontario, Canada, by Cancer Care Ontario; and from the metropolitan areas of Melbourne and Sydney, Australia, by the University of Melbourne and the New South Wales Cancer Council. Clinic-based families, including those of Ashkenazi Jewish ancestry, were recruited from their local populations in the USA by Columbia University in New York City, New York, the Fox Chase Cancer Center in Philadelphia, Pennsylvania, and Huntsman Cancer Institute at the University of Utah in Salt Lake City, Utah; and in Australia by the University of Melbourne and New South Wales Cancer Council in Melbourne and Sydney, Australia. In Ontario, Canada, recruitment of clinic-based families was limited to Ashkenazi Jewish families.
The Breast Cancer Family Registry investigators include epidemiologists, molecular biologists, molecular geneticists, clinicians, geneticists, genetic counselors, statisticians, pathologists and behavioral scientists. The participating sites are supported through Cooperative Agreements; thus, the leadership and scientific conduct of the Breast Cancer Family Registry are a combined effort of the six principal investigators and their teams, with substantial involvement of the Program Officer and other representatives of the National Cancer Institute.

Policy and governance
The six sites have collaborated to develop and maintain the resources, to conduct interdisciplinary research, and to establish collaborations with external investigators. An organizational chart for the Breast Cancer Family Registry is provided in Fig. 2. Detailed information on the governance and policy can be found at http://www.cfr.epi.uci.edu/nci/access_manual_05-29-02.htm. Research proposals from both internal and external investigators requesting access to the Breast Cancer Family Registry resources are evaluated by an Advisory Committee for scientific merit and the appropriate use of resources on the basis of criteria established by the Steering Committee of the Breast Cancer Family Registry. The decisions of the Advisory Committee are then reviewed and ratified by the Steering Committee.

Informatics Center
In 1998 an Informatics Center was established at the University of California, Irvine. The Informatics Center has developed a flexible and evolving informatics model, which maintains an Oracle relational database using a 'minimal data set' to track subjects. The Informatics Center receives standard data on several modules including family history, epidemiologic risk factors, diet, biospecimen tracking, genotyping, pathology, and follow-up data. The Informatics Center collates, manages, and distributes core data, in collaboration with each of the six local informatics units. Data from each site are submitted in batches to the Central Informatics System with the use of a secure access procedure. A quality assurance system ensures the reliability, validity, and completeness of the database. Up-to-date extract files are created from the relational database and are distributed to investigators. The Informatics Center also maintains the website for the Breast Cancer Family Registry (http://www.cfr.epi.uci.edu), and coordinates teleconferences and other activities between the sites, the Advisory Committee, the Steering Committee, the National Cancer Institute, and the Informatics Center.

Ascertainment of probands and family members
Most families were enrolled in the Breast Cancer Family Registry from 1996 to 2000. During the period 2001-2005, several sites are continuing to recruit the following: (1) families known to segregate BRCA1 or BRCA2 mutations; (2) families with multiple cases of breast or ovarian cancer; (3) selected additional relatives of previously enrolled families; (4) families of Ashkenazi Jewish ancestry; and (5) families from specific racial and ethnic groups.

Clinic-based and community-based recruitment
Four sites enrolled families with multiple or early-onset cases of breast or ovarian cancer identified through community contacts and clinical settings including screening centers, family cancer clinics, surgical and medical oncology offices, and the Australian twin registry. Probands were defined as the first family member enrolled in the Breast Cancer Family Registry and may or may not have had a personal history of breast or ovarian cancer. Eligibility was based on one or more of the following criteria: two or more relatives with a personal history of breast or ovarian cancer; a woman diagnosed with breast or ovarian cancer at a young age; a woman with a history of both breast and ovarian cancer; an affected male; or known BRCA1 or BRCA2 mutation carriers. The Australian site also enrolled twin pairs in which one or both members had a personal history of breast cancer. Table 1 shows the eligibility criteria and family characteristics of clinic-based probands for each site.

Population-based recruitment
Three sites enrolled families through females with incident breast cancer identified through population-based cancer registries in defined geographic areas. Two sites also enrolled families through male breast cancer cases. Case probands, defined as the affected persons ascertained from a cancer registry, were sampled according to one or more criteria, including age at diagnosis, gender, race/ethnicity, and family history. Table 2 provides an overview of the sampling strategies for the population-based case probands. Control probands were randomly sampled from the general population living in the relevant catchment area of each of the regional cancer registries, using random-digit dialing (San Francisco), lists of randomly selected residential telephone numbers (Ontario), and electoral rolls (Melbourne and Sydney). At all six sites, permission and assistance were sought from the proband to contact eligible relatives.

Special recruitment initiatives
Ashkenazi families
After the discovery that three specific mutations in BRCA1 and BRCA2 are relatively common among people of Ashkenazi Jewish ancestry [13], four sites (New York, Philadelphia, Ontario, and Melbourne and Sydney) were funded between 1996 and 2000 to recruit Ashkenazi Jewish families through their local communities and cancer family clinics, in addition to those being recruited through the 'core' recruitment activities described above. Individuals of Ashkenazi Jewish ancestry were also identified through the epidemiology or family history questionnaires.

Racial and ethnic minorities
To increase the racial and ethnic diversity of the resource, special efforts were undertaken at several sites in the USA to enroll African-American, Asian, and Hispanic families, either through oversampling of probands (for example, in San Francisco) or community outreach (for example, in New York). Recruitment of African-American, Asian, and Hispanic families is continuing, and in California it has been expanded to include families from Orange County enrolled by the University of California, Irvine.

Protocols and procedures
Initially six Working Groups were established to develop uniform procedures and questionnaires for data and biospecimen collection and processing: Family History, Epidemiology, Biospecimens, Pathology, Database, and Informed Consent. These Working Groups developed the instruments for data collection, the protocols for biospecimen collection, processing, and distribution, and the data dictionaries to be used at the Informatics Center. An early challenge was to recognize and respect the geographic differences in social, cultural and health care structures and legislation, while finding common principles, issues, and language for the questionnaires and informed consent forms. Core epidemiology and treatment questionnaires were developed, and common language was incorporated into each site-specific consent form to address issues faced by all sites [14].
All sites collected the following: family history data from probands; epidemiological and dietary data, and blood samples (or mouthwash samples if venipuncture was declined) from probands and selected relatives; and clinical and treatment data, tumor blocks and pathology reports for probands and relatives with a personal history of breast or ovarian cancer. All data and biospecimens were stored without personal identifiers.

Questionnaires
Family history questionnaire
Information was sought, at minimum, about previous cancer diagnoses in the proband and the proband's parents, siblings, and children. Similar information for more distant relatives was also sought depending on site protocols. All cancers, except non-melanoma skin cancers and cervical carcinoma in situ, were recorded. Dates of all cancer diagnoses and deaths were requested.

Epidemiology questionnaire
This instrument obtained information on demographics, race/ethnicity, religion, personal history of cancer, breast and ovarian surgeries, radiation exposure, smoking and alcohol consumption, menstrual and pregnancy history, breast-feeding, hormone use, weight, height, and physical activity. Some sites used a short proxy version to collect limited information on deceased relatives and selected living relatives.

Dietary questionnaires
A self-administered food frequency questionnaire developed by the University of Hawaii [15] for multi-ethnic cohort studies was used by the five North American sites. The Melbourne and Sydney site used a locally validated dietary questionnaire developed for a cohort study of Greek, Italian, and Australian-born inhabitants of Melbourne [16]. Both instruments collected information on frequency of food consumption and portion size, using photographs to help in assigning portion sizes.

Treatment questionnaire
Self-reported information was sought on aspects of treatment for breast or ovarian cancer and for any recurrences. Information from medical records was also collected by some sites.

Biospecimen collection and processing
A 30 ml sample of blood was requested from probands and selected relatives, and paraffin blocks or unstained sections of the paraffin blocks were requested for individuals with a history of breast or ovarian cancer. From participants who declined venipuncture, a mouthwash sample was collected at some sites in accordance with the protocol of Lum and Le Marchand [17]. The New York site also collected urine samples from selected participants for estrogen metabolite analyses.

Blood and mouthwash samples
Biospecimen samples were processed at each site, or at a collaborating laboratory, in accordance with a common standardized protocol. A quality control program was developed to allow validation of the methods and their application at each site. From three tubes of blood, one tube was used for direct DNA isolation, a second was used for the preparation of blood spots and plasma, and a third was collected for the isolation and cryopreservation of lymphocytes for future transformation or DNA preparation. To provide an unlimited source for nucleic acids, Epstein-Barr virus-transformed lymphoblastoid cell lines from probands and selected relatives were established [18]. Biospecimens were stored at either the participating academic institutions or, for some sites, at the Coriell Institute for Medical Research. Biospecimen collection, processing, annotation, storage, and distribution for the Breast Cancer Family Registry were evaluated recently in a report prepared for the National Cancer Institute and the National Dialogue on Cancer [19] and it was noted that many of the 'best practices' suggested for biospecimen repositories are currently in use within the Breast Cancer Family Registry.

Pathology specimens
For individuals with a personal history of breast or ovarian cancer, histological slides and/or paraffin tumor blocks were requested from the treating institution. Sections were cut from each block, stained with hematoxylin and eosin, and reviewed by the site pathologist(s). Representative blocks of tumors and also of associated benign lesions were selected by the pathologists for retention in the tissue repository. All other slides and blocks were returned to the treating institution. If permission was not obtained to retain the representative blocks in the repository, sections were cut in accordance with a standard cutting protocol. This included cutting 10-20 sections at 4 µm thickness for future immunohistochemical studies and an additional 10-20 sections at 10 µm for future DNA extraction. Control sections, for staining with hematoxylin and eosin, were taken at the beginning, middle and end of the cutting protocol as a quality control measure. The specific number of sections taken from a block depended on the amount of tumor present in the block. The slides sectioned for future immunohistochemical studies were stored either in refrigerators at +4°C or in freezers at -20°C or -80°C. This was to minimize the loss of antigenicity that is known to occur if unstained sections are stored at room temperature. If permission had been obtained to retain tumor blocks in the repository, further permission was sought to construct tissue microarrays from these blocks. Tissue microarrays have been constructed at the Ontario site and are soon to be constructed at other sites. All pathology reviews were entered into a database and submitted to the Informatics Center at the University of California at Irvine.

Validation of breast and ovarian cancer diagnoses
Verification was sought for all reported breast and ovarian cancers, and at some sites for all reported cancers. Because the population-based sites ascertained case probands through cancer registries, verification was necessary only for cancers reported for relatives. The level of confidence regarding a cancer diagnosis was classified into one of six categories, in decreasing order: (1) review of slides by Breast Cancer Family Registry pathologist, (2) pathology report, (3) cancer registry report or medical records indicating treatment for the specific type of cancer, (4) report on a death certificate, (5) self-report, and (6) report by a relative.
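The six-level ordering of diagnostic evidence described above lends itself to a simple ordered data type. The following is a minimal sketch only, not the Registry's actual data dictionary: the class name, member names, and the helper best_evidence are hypothetical, and the only thing taken from the text is the ranking of the six categories.

```python
from enum import IntEnum
from typing import Iterable, Optional


class DiagnosisEvidence(IntEnum):
    """Hypothetical encoding: a lower value means higher confidence,
    following the six categories listed in the text."""
    PATHOLOGIST_SLIDE_REVIEW = 1    # slides reviewed by a Registry pathologist
    PATHOLOGY_REPORT = 2
    REGISTRY_OR_MEDICAL_RECORD = 3  # registry report or treatment records
    DEATH_CERTIFICATE = 4
    SELF_REPORT = 5
    RELATIVE_REPORT = 6


def best_evidence(sources: Iterable[DiagnosisEvidence]) -> Optional[DiagnosisEvidence]:
    """Return the highest-confidence source available, or None if there is none."""
    return min(sources, default=None)


if __name__ == "__main__":
    reported = [DiagnosisEvidence.SELF_REPORT, DiagnosisEvidence.PATHOLOGY_REPORT]
    print(best_evidence(reported).name)
```

Under this assumed encoding, a diagnosis supported by both a self-report and a pathology report would be classified at the pathology-report level, because the smaller category number wins.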

Follow-up
Selected participants are being followed to obtain updated information on cancer and vital status of family members. For the clinic-based families in the USA, at least one participant from each family is contacted annually to update personal and family cancer histories and deaths, as well as some exposures addressed in the core epidemiology questionnaire. In Ontario, an annual mailed follow-up questionnaire to case probands seeks to update births, deaths, and new cancer diagnoses of case probands and family members. The San Francisco site contacts case probands annually by telephone to update information on cancer and vital status of the proband and family members. In Australia, passive record linking to state cancer registries and death certificates is being conducted, and the medical records of case probands have been followed up for recurrence and death [20]. A systematic registry-wide follow-up of all enrolled probands and relatives is being developed by the Follow-up Working Group, and is currently being pilot tested in Australia, Ontario, and San Francisco.

Mutational analysis of BRCA1 and BRCA2
Substantial mutational analyses of BRCA1 and BRCA2 have been undertaken by site laboratories and, more recently, by Myriad Genetics using full sequence analysis [21], funded by multiple sources. A validation study was conducted for five of the methods used between 1997 and 2000, including four DNA-based methods (namely two-dimensional gene scanning, denaturing high-performance liquid chromatography, enzymatic mutation detection, and single-strand conformation polymorphism analysis) and an RNA/DNA-based method (a protein truncation test) [22]. Single-strand conformation polymorphism analysis was less sensitive than the other methods and is no longer being used. The specificity and sensitivity of the other four methods for protein-truncating mutations were comparable to those of full sequencing ('gold standard'). Ashkenazi Jewish participants have been screened for the three founder mutations, 185delAG and 5382insC in BRCA1 and 6174delT in BRCA2.

Collection of data and biospecimens
As of September 2003, the six sites had enrolled a total of 11,950 families in the Breast Cancer Family Registry (Table 3). They included 6126 population-based case families and 2990 population-based control families, and 1647 clinic-based families with an affected proband and 1187 clinic-based families with an unaffected proband. Affected probands included 7111 females with a first primary breast cancer, 538 females with a second breast cancer, and 124 males with breast cancer. The enrolled families included 2346 minority families, residing mostly in the USA.
The epidemiology questionnaire was completed by 27,421 participants (10,895 probands and 16,526 relatives), and the short proxy epidemiology questionnaire was completed for 20,003 relatives (Table 4). Blood or mouthwash samples were collected from 20,045 individuals (9282 probands and 10,763 relatives), and tumor tissue was obtained for 4293 individuals with a history of breast and/or ovarian cancer (3322 probands and 971 relatives).
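The enrollment totals quoted in this section can be cross-checked with simple arithmetic. The sketch below uses only counts stated in the text; the variable names are illustrative and do not correspond to any Registry data file.

```python
# Family counts as reported in the text (September 2003).
population_case = 6126
population_control = 2990
clinic_affected = 1647
clinic_unaffected = 1187

population_based = population_case + population_control
clinic_based = clinic_affected + clinic_unaffected
total_families = population_based + clinic_based

# Affected probands: first primary breast cancer, second breast cancer, males.
affected_probands = 7111 + 538 + 124

# Participants by data type: probands plus relatives.
questionnaire_participants = 10_895 + 16_526
biospecimen_donors = 9_282 + 10_763
tumor_tissue_individuals = 3_322 + 971

print(population_based, clinic_based, total_families)
print(affected_probands, population_case + clinic_affected)
print(questionnaire_participants, biospecimen_donors, tumor_tissue_individuals)
```

The sums reproduce the reported totals of 9116 population-based families, 2834 clinic-based families, and 11,950 families overall, and the number of affected probands (7773) equals the number of population-based case families plus clinic-based families with an affected proband.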

Population-based families
Because recruitment of families is continuing in San Francisco and Ontario, the participation rates below refer to enrollment from 1996 to 2000. At each site, before contact was made with incident breast cancer cases identified from the regional cancer registry, the case's physician was contacted. Physician consent was obtained to contact the great majority of case probands (98% in San Francisco, 92% in Ontario, and 90% in Melbourne and Sydney). In Ontario and in Melbourne and Sydney, 2% and 3%, respectively, of the case probands were deceased and were therefore not studied at those sites. In San Francisco, family history and epidemiology data for deceased case probands were collected from proxy respondents.

Eligibility of case probands
To determine eligibility for sampling as a case proband, information on family history of breast or ovarian cancer was first obtained through a telephone interview in San Francisco (84% response rate), and by a mailed questionnaire in Ontario (65% response rate). In Melbourne and Sydney, all newly diagnosed breast cancer cases were eligible, regardless of family history of breast cancer.

Case probands
Of the eligible case probands, 6126 completed the family history questionnaire, including 104 males (Table 3), and 5250 completed the epidemiology and treatment questionnaires (76% in San Francisco, 72% in Ontario, and 75% in Melbourne and Sydney for both the family history and epidemiology questionnaires) (Table 4). An analysis at the Melbourne and Sydney site showed that there was high agreement between self-reported treatment data and medical records (Phillips KA, Milne RL, Buys S, Friedlander ML, ...). Blood or mouthwash samples were collected from 4786 case probands (70% in San Francisco, 62% in Ontario, and 71% in Melbourne and Sydney) (Table 4). An analysis of participants at the Ontario site showed that proband non-response at all stages (namely family history, epidemiology questionnaire, and biospecimen collection) was not associated with family history of breast or ovarian cancer [23,24]. Lymphoblastoid cell lines were established for 1723 case probands, and tumor tissue samples were obtained for 2675.

Relatives
A total of 22,857 relatives of case probands have been enrolled (Table 4). The epidemiology questionnaire was completed by 10,535 relatives. The short proxy epidemiology questionnaire was completed for 11,155 relatives from the Australian site. Blood or mouthwash samples were collected for 6776 relatives. Lymphoblastoid cell lines were established for 450 relatives, and tumor tissue was obtained for 437 affected relatives. Collection of tumor tissue is still continuing in Ontario and in Melbourne and Sydney.

Control probands
Among women selected as control probands, 2990 completed the family history questionnaire, 2979 completed the epidemiology questionnaire, and 1855 provided a blood or mouthwash sample. Response to the epidemiology questionnaire was 60% in San Francisco, 64% in Ontario, and 68% in Melbourne and Sydney. Participation in biospecimen collection was 56% in San Francisco and 55% in Melbourne and Sydney. Biospecimen collection is continuing in Ontario and is expected to be completed for 45% of control probands.
The same ascertainment protocol was also used in Melbourne and Sydney to recruit case and control probands before the establishment of the Breast Cancer Family Registry, including 467 case probands diagnosed between 1992 and 1995 with breast cancer before the age of 40 years, and 408 control probands frequency-matched to case probands on age. Relevant data are stored at the Informatics Center and, together with material collected from family members, are available from the site investigators to be used in conjunction with the Breast Cancer Family Registry resources.

Clinic-based families
A total of 2834 probands (1647 affected, 1187 unaffected) have been enrolled (Table 4). Of these, 2666 completed the epidemiology questionnaire and 2641 provided a blood or mouthwash sample (1530 affected, 1111 unaffected). Lymphoblastoid cell lines were established for 738 (343 affected, 395 unaffected). Tumor blocks were obtained for 647 probands.
A total of 8264 relatives have been enrolled, with an average of three members per family. Of these, 4604 completed the epidemiology questionnaire, and 3973 provided a blood or mouthwash sample. The short proxy epidemiology questionnaire was completed for 3006 relatives. Lymphoblastoid cell lines were established for 1014 relatives. Tumor blocks for breast or ovarian cancer were obtained for 533 relatives.
In addition to families presented in Table 4, more than 500 multiple-case breast cancer families have been recruited in Australia as part of the Kathleen Cuningham Consortium for Familial Breast Cancer (kConFab), which administered the same epidemiology and family history questionnaires and used the same blood collection protocol as the Breast Cancer Family Registry. Funds for BRCA1 and BRCA2 mutation testing have been provided by the National Cancer Institute, and these families are available to be used in conjunction with the Breast Cancer Family Registry resources through application to kConFab (http://www.kconfab.org).

Proband and family characteristics
The Breast Cancer Family Registry contains various subgroups of probands and families with specific characteristics (Table 5). Among the 6779 probands with a history of breast or ovarian cancer and a completed epidemiology questionnaire (5250 from population-based families and 1529 from clinic-based families) there are 124 male probands, 1526 (23%) with a diagnosis before age 40 years, 1748 (26%) from minority populations, 1040 (15%) of Ashkenazi Jewish ancestry, 494 (7%) with a history of two breast cancer diagnoses, 65 (1%) with a history of both breast and ovarian cancer, 2332 (34%) with at least one first-degree relative with breast cancer, and 61 from participating twin pairs. The relatively high proportion (23%) of probands diagnosed before the age of 40 years reflects both the designs used by the population-based sampling to increase the number of case probands with a genetic etiology and the age at diagnosis distribution of the multiple-case families. The relatively large proportion (26%) of minority probands largely reflects the oversampling of these families at the San Francisco site. Among the 2834 clinic-based families enrolled so far, 204 (7%) include three or more first-degree relatives with breast or ovarian cancer (Table 6). Among the 6126 population-based case families this percentage is 4%. Blood or mouthwash samples are available for 1863 sibships from population-based case families and 701 sibships from clinic-based families with one or more affected sisters and one or more unaffected sisters (Table 7).
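The subgroup percentages quoted above follow directly from the counts and the denominator of 6779 affected probands with a completed epidemiology questionnaire. The sketch below recomputes them; the dictionary keys and the helper percent_of are illustrative names, and rounding to the nearest whole percent is an assumption about how the reported figures were derived.

```python
def percent_of(count: int, total: int) -> int:
    """Percentage of total, rounded to the nearest whole number."""
    return round(100 * count / total)


AFFECTED_PROBANDS = 6779  # 5250 population-based + 1529 clinic-based

# Subgroup counts as reported in the text.
subgroups = {
    "diagnosed_before_40": 1526,
    "minority_populations": 1748,
    "ashkenazi_jewish": 1040,
    "two_breast_cancers": 494,
    "breast_and_ovarian": 65,
    "first_degree_relative_affected": 2332,
}

percentages = {name: percent_of(n, AFFECTED_PROBANDS) for name, n in subgroups.items()}

# Clinic-based families with three or more affected first-degree relatives.
clinic_multiple_case = percent_of(204, 2834)

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
print(f"clinic_multiple_case: {clinic_multiple_case}%")
```

With this rounding rule the recomputed values agree with the percentages given in the text (23%, 26%, 15%, 7%, 1%, and 34%, and 7% for the clinic-based multiple-case families).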

Mutational analysis of BRCA1 and BRCA2
Testing for mutations in BRCA1 has been conducted for 5656 females and 612 males, and in BRCA2 for 5497 females and 524 males. There are a total of 230 population-based female mutation carriers, making this the largest collection of population-based carriers yet established. At some sites, BRCA1 and BRCA2 mutation testing is continuing; the number of available mutation carriers will therefore increase.

Discussion
The Breast Cancer Family Registry has enrolled nearly 12,000 families containing individuals with a wide range of familial risks of breast cancer. This novel research infrastructure has many strengths, including the following: its focus on data and biospecimen collection from both population-based and clinic-based families, including a large number of minority families and Ashkenazi Jewish families from three countries; attention to quality, comparability, and comprehensiveness of data and biospecimen collection; continuing molecular characterization; establishment of Epstein-Barr virus-immortalized lymphoblastoid cell lines; and broad-based research and clinical expertise in breast cancer represented among the six participating sites and their international collaborators. Extended families with multiple cases are a proven means of discovering genes that when mutated convey a high risk of disease [25], and are an efficient sampling design for identifying mutation carriers for studies of genetic and environmental risk modifiers. Sisters concordant for disease are useful for gene discovery and for disentangling gene-gene interactions, and sisters discordant for disease are useful for case-control studies of putative genetic and environmental risk factors. Population-based case families can be used to characterize susceptibility genes by providing estimates of penetrance and prevalence applicable to the population groups from which they are sampled, and, when combined with controls and control families, can provide multiple designs for addressing issues in the genetic epidemiology of breast cancer [26][27][28][29]. The availability of population-based and family-based controls within the registry, combined with the ethnic diversity of the families, will permit questions related to population stratification to be addressed.
Populations carrying founder or ancestral mutations in susceptibility genes can facilitate the characterization of specific mutations and their impact on communities, whereas twin studies can provide new insights into the relative roles of genetic and environmental factors [30,31]. Population-based case probands and both related and unrelated controls are important for assessing the effects of measured genetic variants or haplotypes in candidate genes [32]. Each such variant or haplotype might confer only a low individual risk, but in combination they might account for a high population attributable risk, with substantial relevance to public health.
The different designs and common resources available through the Breast Cancer Family Registry can be used for a multitude of collaborative studies; see, for example, Whittemore and Nelson [33] for a discussion of designs, strengths, and weaknesses. Some of the studies and initiatives currently under way or in development include searching for novel breast cancer susceptibility loci, testing for association and/or linkage with variants in known candidate genes, estimating the penetrance and detecting modifiers of penetrance associated with variants in different genes (including the examination of genetic and environmental modifiers of risks associated with these variants, often referred to as gene-gene and gene-environment interactions), and developing and disseminating innovative analytical approaches and related software for the discovery and
Potential limitations of the Breast Cancer Family Registry need to be considered when using its resources. There are differences across and even within sites in designs, eligibility criteria, sampling schemes, and data collection modes that can be used to advantage, but can be problematic if not understood. Proper analysis and interpretation of family studies across sites require a clear understanding of the ascertainment procedures that have been used. As in all family studies, biospecimens are not available from all eligible family members, and study designs and analyses need to consider this limitation. Lastly, there is also some incompleteness in the collection of questionnaire data and tumor samples. The impact of each of these issues on specific studies needs to be considered and, if possible, minimized.
The Breast Cancer Family Registry offers several challenges for theoretical and applied statisticians in developing optimal methods for the design and analysis of studies using its resources. Such challenges include the following: analyzing data from individuals who are related and from families for whom data collection is incomplete; making inferences either about a measured genetic marker or about the characteristics (mode of inheritance, allele frequency, effects on risk) of a presumed unmeasured genetic effect against the background of other familial effects of unknown origin, such as polygenic inheritance or shared family environmental factors [32,34]; and making appropriate adjustment for non-random or non-systematic ascertainment of families. Researchers from the Breast Cancer Family Registry are collaborating with other statisticians to develop methods and software to facilitate appropriate and optimal analyses and to make these developments available to researchers using the resource.
The development of the Breast Cancer Family Registry has already resulted in the initiation of more than 80 hypothesis-driven research projects among participating sites and with the greater international research community, and has already produced numerous publications. The Breast Cancer Family Registry website lists continuing collaborative research projects (http://www.cfr.epi.uci.edu/ic_registries/breast/approved.htm) and publications (http://www.cfr.epi.uci.edu/ic_registries/breast/breast_publications.htm).
The Breast Cancer Family Registry data and biospecimens are available to the scientific community. Researchers interested in initiating collaborative research projects using the resources are invited to access the Breast Cancer Family Registry website for preliminary information (http://epi.grants.cancer.gov/BCFR/index.html), and to make initial contact with the Program Officer at the National Cancer Institute to discuss the process of developing a collaborative proposal (details are available from the corresponding author). Interested investigators will then be referred to the relevant investigators and Working Groups to discuss the details of their proposal, to become acquainted with site-specific recruitment issues, to establish collaborations, and then to submit to the Advisory and Steering Committees a concise proposal that includes information on the study design and requirements for data and/or biospecimens. Approval for the use of human subjects in accordance with the requirements of the Office for Human Research Protections is required. For approved proposals requesting the use of biospecimens, a Material Transfer Agreement is required.

Conclusion
Nearly 12,000 families have been enrolled in the Breast Cancer Family Registry, a novel research infrastructure. Data and biospecimen resources are available for collaborative, interdisciplinary, and translational studies of the genetic epidemiology of breast cancer.