The Million Women Study: design and characteristics of the study population



To describe the design of the Million Women Study and the characteristics of the study population.

Study design

Population-based cohort study of women aged 50-64 in the UK.


Women are asked to join the Million Women Study when they are invited to routine screening for breast cancer at 61 of the screening centres of the UK National Health Service Breast Screening Programme (NHSBSP). An estimated 71% of women screened by the NHSBSP return a completed questionnaire.


More than 800 000 women were recruited between May 1996 and June 1999, and a further 200 000 are expected to be recruited by the year 2000.


The characteristics of the first 121 000 women recruited into the Million Women Study are described here. At recruitment 33% of the study population were currently using hormone replacement therapy and 47% had used it at some time. Over half (54%) had used oral contraceptives, and 18% were current smokers at the time of recruitment. Before they were screened 1.4% of the women had been diagnosed with breast cancer in the past, 6% had a mother with a history of breast cancer and 3.7% had a sister with a history of breast cancer. It is estimated that 1 million women will have been recruited by early in the year 2000, and that by the end of the year 2002 there will be 5000 screen-detected breast cancers and 23 000 deaths in the cohort, the majority of which will be attributed to cancer (12 600 deaths) and circulatory disease (8000 deaths).


By the end of the year 2002, the Million Women Study will have sufficient statistical power to detect relative risks of 0.8 or less, or of 1.2 or more, in current users compared with never users of hormone replacement therapy for mortality from breast, colorectal, lung and ovarian cancer, ischaemic heart disease and stroke.


The Million Women Study is a nationwide collaborative research project in the UK, the chief aim of which is to describe the relationship between use of hormone replacement therapy (HRT) and the risk of various conditions, particularly breast cancer. The study began in May 1996 and the plan is to recruit and follow up a cohort of 1 million women invited to attend the UK National Health Service Breast Screening Programme (NHSBSP).

The NHSBSP was set up in 1988. Once every 3 years each woman in the UK aged between 50 and 64 years who is registered with the NHS is sent a letter by the NHSBSP, offering her routine screening for breast cancer by mammography. About 1 million women are screened annually by the NHSBSP and about 5000 of them have a breast cancer detected on mammography [1]. The day-to-day organization and screening activities are performed by about 100 separate screening offices throughout the UK, and the work is monitored and statistics gathered centrally by a national co-ordinating centre.

The present paper describes the design of the Million Women Study and the characteristics of the study population.


The Million Women Study is a population-based cohort study. Women are recruited when they are invited for routine breast cancer screening, and the main outcomes to be examined at follow-up are the incidence of screen-detected breast cancer and cause-specific mortality.

Attendance at screening

About three-quarters of the women who are invited for screening by the NHSBSP subsequently attend for mammography [1]. Before the study could be launched, it was necessary to demonstrate that inviting women to join the Million Women Study would not reduce uptake of screening offered by the NHSBSP. During 1994 and 1995 a total of 6000 women who were due to be invited for breast cancer screening in Oxford and West London were randomly divided into two groups. One group was sent the usual invitation for screening and the other group was sent the study questionnaire, accompanying the usual invitation to screening. Attendance rates for screening were similar, at 71%, among those who were and were not sent an accompanying questionnaire [2].

Recruitment procedures

Women are asked to join the Million Women Study by participating NHSBSP screening centres at the time that, or just before, they are sent their usual invitation for routine breast cancer screening. A questionnaire is included with each woman's invitation and, if the woman wishes to join the study, she is asked to complete the questionnaire, to give signed permission for follow-up, and to return the questionnaire at the time she is screened. A freephone number is provided for women who have any questions or problems filling out the questionnaire. The questionnaire is four pages long (A4 size) and includes questions about lifestyle and sociodemographic factors, reproductive history, past use of oral contraceptives, use of HRT, past medical history and family history of breast cancer. Completed questionnaires are transferred periodically from the participating screening centres to the study co-ordinating centre at the Imperial Cancer Research Fund Cancer Epidemiology Unit (CEU), Oxford, UK.

Data storage, entry and checking

The confidential completed questionnaires are stored securely at all times. Once they reach the CEU they are checked and coded by trained staff and then scanned electronically. The scanned data are 'captured' using computerized intelligent character recognition and optical mark reading software (Eyes and Hands®; Readsoft Inc, Slough, UK). Range and logical checks are performed at the time of data entry. Any inconsistency or information that is not recognised by the data capture software is verified manually by trained data entry staff, who also validate computer-interpreted data and check each questionnaire to confirm whether signed consent for follow up has been granted. Each week the verified data for about 50 individuals are checked against the original questionnaires and the error rate is consistently below 1%. This partially automated process thus permits data to be entered rapidly and with high accuracy.
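The range and logical checks described above could take a form like the following minimal Python sketch. The field names, the year-of-birth bounds and the individual rules are illustrative assumptions; the paper does not describe the checks actually applied by the data capture software.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QuestionnaireRecord:
    # Hypothetical field names; the real questionnaire coding is not given in the paper.
    year_of_birth: Optional[int]
    ever_used_hrt: Optional[bool]
    currently_using_hrt: Optional[bool]
    consent_signed: bool


def check_record(rec: QuestionnaireRecord) -> List[str]:
    """Return a list of problems that would need manual verification."""
    problems = []

    # Range check: the bounds are an assumption chosen for illustration only.
    if rec.year_of_birth is None:
        problems.append("year_of_birth missing")
    elif not 1900 <= rec.year_of_birth <= 1950:
        problems.append("year_of_birth out of range")

    # Logical check: a current user of HRT must also be an ever user.
    if rec.currently_using_hrt and rec.ever_used_hrt is False:
        problems.append("current HRT use reported without ever use")

    # Consent check: only women who gave signed consent can be followed up.
    if not rec.consent_signed:
        problems.append("no signed consent for follow up")

    return problems
```

A record that returns an empty list would pass straight through; anything else would be routed to trained staff for comparison with the original questionnaire.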

Follow up for breast cancer

Each screening centre of the NHSBSP is required to compile annual statistics on its activities, which include details of all breast cancers detected at mammography [1]. A list of women enrolled into the Million Women Study at each centre is cross-checked at regular intervals against the list of the women diagnosed with screen-detected breast cancer at that centre. If a breast cancer has been diagnosed at screening in a study participant, routinely recorded details of the cancer are abstracted, including tumour location, histology, size, grade, invasive status and involvement of axillary lymph nodes. Information on hormone receptor status and treatment is abstracted when it is available. Several approaches are being used to identify breast cancers diagnosed subsequent to screening. One will involve record linkage with cancer registry data. Also, women will be contacted directly 2-3 years after they were screened, and asked about new illnesses, including any new breast cancers, that may have been diagnosed (see Additional follow up, below). This will permit the identification of both screen-detected and interval cancers.

Follow-up for deaths

Deaths are identified annually by computerized matching of the name, date of birth and NHS number of the women who gave signed consent for follow up in the Million Women Study against the national death files held by the Office for National Statistics. For each death thus identified, the date of death and the underlying and associated causes of death are provided by the Office for National Statistics.
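As an illustration of matching on name, date of birth and NHS number, the sketch below links cohort members to death records in Python. It is a simplified, deterministic rule written for this example (NHS number plus date of birth first, then name plus date of birth); the record layout and the function names are assumptions, and the actual matching procedure is not described in the paper.

```python
def normalise(name):
    """Upper-case a name and collapse repeated whitespace."""
    return " ".join(name.upper().split())


def match_deaths(cohort, deaths):
    """Map study_id -> death record for cohort members found in the death file.

    Both arguments are lists of dicts with the hypothetical keys
    'name', 'date_of_birth' and 'nhs_number' (cohort rows also carry 'study_id').
    """
    by_nhs = {d["nhs_number"]: d for d in deaths if d.get("nhs_number")}
    by_name_dob = {(normalise(d["name"]), d["date_of_birth"]): d for d in deaths}

    matches = {}
    for woman in cohort:
        # First pass: NHS number, confirmed by date of birth.
        death = by_nhs.get(woman.get("nhs_number"))
        if death is not None and death["date_of_birth"] != woman["date_of_birth"]:
            death = None
        # Second pass: normalised name plus date of birth.
        if death is None:
            key = (normalise(woman["name"]), woman["date_of_birth"])
            death = by_name_dob.get(key)
        if death is not None:
            matches[woman["study_id"]] = death
    return matches
```

A production linkage would also need to handle misspelt names, changes of surname and duplicate name and date-of-birth pairs, none of which this sketch attempts.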

Additional follow up

Participants will be sent a follow-up questionnaire about 2-3 years after recruitment, to ascertain changes in use of HRT and incident morbidity, for example breast cancers, diagnosed outside the screening programme.


The most important variables for this study are the subjects' identification details, their use of HRT, any diagnosis of breast cancer and the recording of deaths. To assess the accuracy of the subjects' identification details and of the recording of deaths, a random sample of 5000 women recruited in 1996 has been selected for flagging on the NHS Central Register (NHSCR). Identification details recorded for the study (name, address, date of birth and NHS number) enabled all but 10 (0.2%) of the 5000 women to be identified on the NHSCR. The completeness and accuracy of the reported deaths will be validated in the future against those recorded in the NHSCR for these 5000 women.

The reliability of diagnosis of screen-detected breast cancers is monitored by various quality control procedures within the NHSBSP. Screen-detected breast cancers are verified according to defined procedures, and the invasive status, size and type of cancer are recorded for virtually 100% of the cancers.

The validity of reported information on use of HRT, including the type and dose, is being examined and a full report will be published in due course. Preliminary comparisons with the prescription records from one general practice in Oxfordshire indicate at least 95% agreement for reported current use of HRT, including the hormonal constituents of the preparation used most recently (Banks et al, unpublished data).


After it was demonstrated that the Million Women Study questionnaire did not alter attendance rates [2], each screening centre in England, Scotland and Wales was invited to participate in the study. Almost all the centres expressed enthusiasm for the study, although practical problems precluded the involvement of some centres. The most frequent reason for screening centres not participating was that the Million Women Study questionnaire could not readily be packaged together with the letters and other information normally posted to women when they are invited for screening.

Accrual of the cohort

Recruitment of women into the study began in May 1996. Most of the participating screening centres began recruitment during 1997, and 61 centres were taking part by late 1998. The locations of these centres are shown in Figure 1. Before recruitment could begin at any centre, local ethical committee approval was required, and this often entailed contacting more than one ethical committee for each centre. In total, 126 local ethical committees were approached and approval for the study was obtained without exception.

Figure 1 Location of UK National Health Service Breast Screening Centres that began recruitment into the Million Women Study before December 1998.

Figure 2 shows the numbers of questionnaires returned to the CEU between May 1996 and June 1999. More than 800 000 questionnaires had been returned by the middle of 1999, and at this accrual rate it is estimated that a cohort of 1 million women will have been recruited by early in the year 2000.

Figure 2 Accrual of women into the Million Women Study, up to June 1999.

Response rate

Statistics presented here are based on the first 227 000 questionnaires, which were printed between May 1996 and February 1997. This represented a convenient point in the accrual of women to assess response rate, because the layout and colour of the questionnaire were modified at this stage. Table 1 shows the numbers of questionnaires dispatched and returned, and whether the respondents also gave signed permission for follow up. Overall 121 000 (53%) of the questionnaires sent out were returned to the CEU. Women who returned a questionnaire are referred to as 'respondents', and it is estimated that they comprise about 71% of the women screened at the participating centres. Not all respondents can be included in the cohort to be followed, however, because 7% of them did not give signed consent or gave insufficient personal details for follow up. The remaining 93% of the respondents who can be followed are referred to as 'the cohort' or as 'participants'.
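The arithmetic linking these percentages can be reproduced with a few lines of Python. The input figures are those quoted in the paragraph above; the variable names are chosen for this example, and the final step assumes that every questionnaire dispatched accompanied exactly one screening invitation.

```python
questionnaires_sent = 227_000
questionnaires_returned = 121_000
share_of_screened_responding = 0.71    # respondents as a share of women screened
share_of_respondents_followable = 0.93  # respondents with consent and sufficient details

# Response rate among all women who were sent a questionnaire.
response_rate = questionnaires_returned / questionnaires_sent

# Respondents who can be followed up, i.e. 'the cohort'.
cohort_size = round(questionnaires_returned * share_of_respondents_followable)

# Number of women screened implied by the 71% figure, and the
# screening attendance this would correspond to among women invited.
women_screened = questionnaires_returned / share_of_screened_responding
implied_attendance = women_screened / questionnaires_sent

print(f"Response rate: {response_rate:.1%}")
print(f"Followable cohort: {cohort_size}")
print(f"Implied number screened: {women_screened:.0f}")
print(f"Implied attendance at screening: {implied_attendance:.1%}")
```

Under these assumptions the response rate is about 53%, the followable cohort is about 112 500 women, and the implied attendance of about 75% is in line with the roughly three-quarters of invited women who attend for mammography.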

Table 1 Response rate for the Million Women Study*

Characteristics of 121 000 respondents

Table 2 summarizes certain characteristics of the first 121 000 respondents, including details of their age, use of HRT, reproductive history, past use of oral contraceptives and consumption of cigarettes. It can be seen that most women are aged between 50 and 64 at recruitment (a small number of women are screened just before their 50th birthday and women aged over 65 can be screened by the NHSBSP if they specifically request it). It can also be seen that for most variables there is little missing data. One-third (33%) of the women reported currently using HRT, and almost half (47%) had used it at some time. More than half (54%) had used oral contraceptives and 18% were current smokers.

Table 2 Some characteristics of 121 000 respondents in the Million Women Study

Table 3 summarizes the history and family history of breast disease, including breast cancer, in the respondents: 1.4% of the women had breast cancer diagnosed before recruitment and 9% reported that their mother and/or sister(s) had breast cancer diagnosed in the past. Table 4 summarizes the respondents' history of various other illnesses and operations. It can be seen that a substantial proportion of women have had hypertension diagnosed or are being treated for it, that one in four women have had a hysterectomy, one in five have been sterilized and one in 14 have had a bilateral oophorectomy.

Table 3 History of breast disease, including breast cancer, in 121 000 respondents to the Million Women Study and their relatives
Table 4 Previous illnesses and conditions for which women are now being treated in 121 000 respondents to the Million Women Study

Comparison of participants and nonparticipants

The overwhelming reason for nonparticipation in the Million Women Study is failure to attend for breast cancer screening after being invited to do so. Women are asked to bring the completed questionnaire with them when they are screened, and thus far over 99% of the respondents were recruited in this way. Although no envelope or pre-paid postage is provided, a small number of the respondents posted their questionnaire back to the screening or co-ordinating centre, and virtually all of them also attended for breast cancer screening.

A direct comparison of those who agreed to participate in the study with those who did not has been performed in one general practice in Oxfordshire and similar comparisons are planned for other areas. A full report of these findings will be published in the future, but preliminary results suggest that there are few substantial differences between participants and nonparticipants. At this stage the main difference between the groups appears to be that nonparticipants are more likely than participants to be prescribed medications for the treatment of hypertension (Banks et al, unpublished data).

About 7% of the respondents returned the study questionnaire but did not give sufficient information and/or signed permission for follow up. Table 5 compares their characteristics with those of the 93% who can be followed. It can be seen that the main difference between these two groups is that the women who gave consent and sufficient information to be followed were more likely to be current users of HRT (33 versus 25%) and to have ever used oral contraceptives (54 versus 45%) than the women who cannot be followed.

Table 5 Comparison of those who did and did not give sufficient information and/or permission for follow up

Expected numbers and statistical power

At the present accrual rate it is expected that a cohort of 1 million women will have been recruited by the year 2000. Based on national statistics from the NHSBSP [1], about 5000 screen-detected breast cancers would be expected in this cohort. Given these numbers, and the expected proportions of current and never users of HRT at recruitment, the study should have 80% power to detect a relative risk of 1.1, compared with never users, both in current users overall and in current users with at least 5 years of use.
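A rough check of this kind of power statement can be sketched in Python using a normal approximation for the log relative risk. This is not the authors' calculation: the function name, the two-sided 5% significance level, the approximation SE(log RR) ≈ sqrt(1/d1 + 1/d0) for d1 and d0 events in the two groups, and the use of the recruitment proportions reported in this paper (33% current users, and 53% never users taken as the complement of the 47% ever users) are all assumptions made for illustration.

```python
from math import log, sqrt
from statistics import NormalDist


def power_to_detect_rr(total_events, p_exposed, p_unexposed, rr, alpha=0.05):
    """Approximate power to detect relative risk `rr` (exposed vs unexposed).

    total_events: events expected in the whole cohort
    p_exposed, p_unexposed: shares of the cohort in the two compared groups
    """
    # Events falling in the two groups being compared (others are ignored).
    events_compared = total_events * (p_exposed + p_unexposed)

    # Split those events between the groups, weighting the exposed by rr.
    weight_exposed = p_exposed * rr
    d_exposed = events_compared * weight_exposed / (weight_exposed + p_unexposed)
    d_unexposed = events_compared - d_exposed

    # Standard error of the log relative risk for rare events.
    se_log_rr = sqrt(1 / d_exposed + 1 / d_unexposed)

    # Two-sided test; the negligible contribution of the opposite tail is ignored.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(log(rr)) / se_log_rr - z_crit)


power = power_to_detect_rr(total_events=5000, p_exposed=0.33,
                           p_unexposed=0.53, rr=1.1)
print(f"Approximate power: {power:.2f}")
```

With these inputs the approximate power comes out above 0.8, which is at least consistent with the statement in the text, although the published figure may rest on different assumptions.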

Another aim of the study is to examine the relationship between use of HRT and mortality from various causes, the objective being to present findings with respect to the most important causes of death within 5 years. Table 6 shows the expected numbers of deaths from various causes by the end of 2002, assuming that 1 million women are recruited by the beginning of the year 2000. As with other cohort studies of women taking hormonal agents [3], it is likely that mortality in these women will be somewhat lower than that in the general population because of self-selection of relatively healthy subjects into the study. The expected numbers in Table 6 have, therefore, been calculated assuming that death rates from causes other than breast cancer are 20% lower than the national rate and that breast cancer death rates are 30% lower than the national rate, thus taking into account the additional expected benefit of screening [4]. It can be seen that by the end of 2002 about 23 000 deaths will have occurred, with the majority being attributed to cancer (12 600 deaths) or to diseases of the circulatory system (8000 deaths).

Table 6 Estimated number of deaths up to the end of the year 2002 in the Million Women Study cohort and minimum and maximum detectable relative risks for certain common conditions

Previous studies have suggested that both recency and duration of HRT use are important in determining its effect on breast cancer, and perhaps on other diseases [5,6]. However, because women tend to stop taking HRT when they become ill, there are problems in interpreting differences in mortality according to HRT use at the time of death. One way of overcoming these problems is to examine mortality in relation to use of HRT before diagnosis of any serious illness. Analyses of cause-specific mortality in relation to use of HRT within the Million Women Study will, therefore, exclude women with serious illnesses at the time of recruitment and be based on use as recorded at the time of entry into the cohort.

Table 6 shows the least extreme detectable relative risks for each of the main causes of death to be examined, for various patterns of HRT use as compared with never use. These power calculations show that for a common cancer, such as colorectal cancer, there should be sufficient power to detect an increase or decrease in mortality of as little as about 20% in current users compared with nonusers, and of about 25% in current users of long duration compared with nonusers. Even for endometrial cancer, which is the least common of the causes listed, it should be possible to detect quite modest increases or decreases in mortality of around 40% in current users compared with never users, and of about 45% in current users of long duration compared with never users.

By the end of the year 2002, the largest numbers of expected deaths among these women will be due to breast cancer and ischaemic heart disease. Thus, the effect of HRT use on deaths from these two causes will be particularly important in determining the net benefit or risk to mortality in HRT users as compared with nonusers. For both of these conditions relative risks of greater than 1.1 or less than 0.9 would be detectable among current versus never users. The corresponding figures among current users with durations of use of 5 or more years are 1.2 and 0.8, respectively.

Other questions

Many other questions about women's health can also be answered by this study. The cohort is sufficiently large to provide reliable data on the health effects of many lifestyle factors, including the consumption of tobacco and alcohol, and on the effects of past use of other hormonal agents, such as oral contraceptives. In addition, the Medical Research Council is supporting an extension of the study to evaluate the effect of HRT on the efficacy of mammography. In that study, women recalled for further assessment after screening and women with interval cancers are being identified. Information on interval cancers is being sought from cancer registries and also directly by sending women a follow-up questionnaire 2-3 years after their initial screen and asking about recent morbidity, including diagnosis of breast cancer. This will allow estimation of how HRT affects the sensitivity and specificity of mammography.


The main purpose of the Million Women Study is to examine the relationship between breast cancer and use of HRT, in a context where use of hormonal therapy is recorded as reliably as possible and breast cancers are diagnosed as uniformly and consistently as possible. Obtaining details of use of HRT before any breast cancer is diagnosed will minimise possible reporting biases of use of such therapy. Moreover, studying screen-detected breast cancers overcomes the potential bias that women who are taking HRT may be more likely to be screened than women who do not use such therapy. The limitation of examining screen-detected cancers alone, however, is that use of HRT may itself reduce the efficacy of mammographic screening. The plan, therefore, is to follow the women screened for interval breast cancer, and to include those cancers in the analyses of the relation between use of HRT and breast cancer.

Because the entire cohort is being followed up for deaths, it will also be possible to look at the relationship between use of HRT and mortality from various causes. Women prescribed HRT tend to be healthier than those who are not, however, and so it is crucial that analyses take proper account of the so-called 'healthy user effect' [7]. In designing the study attention has been given to the recording of detailed information about illnesses present at the time of recruitment. It can be seen in Tables 3 and 4 that a substantial proportion of women recruited have had illnesses such as hypertension and other cardiovascular disease in the past that would affect their risk of death from circulatory disease and other causes. The plan is to analyse results separately according to history of previous illness, and most weight will be given to the findings in women who had no previous illness.

Randomized clinical trials of HRT are now underway. These trials will have sufficient statistical power to detect a substantial reduction in ischaemic heart disease, but will not be able to pick up important, but modest, changes in the risk of cancer [7]. Thus, there will be a continued need for observational data to look at the effects of HRT on disease.


The Million Women Study is one of the largest cohort studies ever devised. Recruitment is proceeding rapidly and the study is on target to accrue a cohort of 1 million women by the year 2000. Preliminary results indicate that the women joining the Million Women Study do not differ substantially from women of a similar age in the general population.

It is expected that, within 5 years, the study will have sufficient statistical power to answer questions about the role of HRT in mortality from breast cancer and other specific conditions of interest.

This cohort may ultimately include about one woman in every five aged between 50 and 64 years in the UK. This excellent co-operation at a national level reflects the efficient organization of the NHSBSP. It is also indicative, perhaps, of concern by women at the lack of reliable knowledge about the long-term effects of HRT and the fact that in the UK today there is substantial use of this type of therapy.


NHS Breast Screening Centres that began recruitment before December 1998 (in alphabetical order) are as follows: Avon, Aylesbury, Barnsley, Basingstoke, Bedfordshire & Hertfordshire, Cambridge & Huntingdon, Chelmsford & Colchester, Chester, Cornwall, Crewe, Cumbria, Doncaster, Dorset, East Berkshire, East Cheshire, East Devon, East of Scotland, East Suffolk, Gateshead, Gloucestershire, Great Yarmouth, Hereford & Worcester, Kings Lynn, Leicestershire, Liverpool, Manchester, Milton Keynes, Newcastle, North Birmingham, North East Scotland, North Lancashire, North Middlesex, North Nottingham, North of Scotland, North Tees, North Yorkshire, Nottingham, Oxford, Portsmouth, Rotherham, South Birmingham, South East Scotland, South East Staffordshire, Sheffield, Shropshire, Somerset, South Derbyshire, South Essex, South Lancashire, South West Scotland, Surrey, Warrington Halton St Helens & Knowsley, Warwickshire Solihull & Coventry, West Berkshire, West Devon, West of London, West Suffolk, West Sussex, Wiltshire, Winchester and Wycombe.

The Million Women Study Co-ordinating Centre staff are as follows: Emily Banks, Valerie Beral, Anna Brown, Diana Bull, Becky Cameron, Barbara Crossley, Diane Deciacco, Dave Ewart, Laura Gerrard, Julie Hall, Sally Hall, Elizabeth Hilton, Ann Hogg, Carol Keene, Nikki Langley, Nicky Langston, Gillian Reeves, Moya Simmonds.

The Steering Committee members are Joan Austoker, Emily Banks, Valerie Beral, Ruth English, Julietta Patnick, Richard Peto, Gillian Reeves, Martin Vessey and Matthew Wallis.

The Writing Committee members are Emily Banks, Valerie Beral and Gillian Reeves.


References

  1. Patnick J: NHS Breast Screening Programme Review. NHSBSP: Sheffield. 1996.

  2. Banks E, Richardson A, Beral V, et al: Effect on attendance at breast cancer screening of adding a self administered questionnaire to the usual invitation to breast screening in southern England. J Epidemiol Commun Health. 1998, 52: 116-119.

  3. Beral V, Hermon C, Kay C, et al: Mortality associated with oral contraceptive use: 25 year follow-up of a cohort of 46000 women from the Royal College of General Practitioners' oral contraception study. BMJ. 1999, 318: 96-100.

  4. Nystrom L, Rutqvist LE, Wall S, et al: Breast cancer screening with mammography: overview of Swedish randomised trials. Lancet. 1993, 341: 973-978. 10.1016/0140-6736(93)91067-V.

  5. Beral V, Doll R, Bull D, et al: Breast cancer and hormone replacement therapy: collaborative reanalysis of data from 51 epidemiological studies of 52705 women with breast cancer and 108411 women without breast cancer. Lancet. 1997, 350: 1047-1059. 10.1016/S0140-6736(97)08233-0.

  6. Beral V, Reeves G, Bull D, Key T, Peto R: Breast cancer and hormone replacement therapy: putting the risk into context. IV European Congress on Menopause. Vienna, October 8-12. 1998, 267-277.

  7. Beral V, Banks E, Reeves G, Appleby P: Use of hormone replacement therapy and the subsequent risk of cancer. J Epidemiol Biostat. 1999.


Acknowledgements

The main acknowledgement is undoubtedly to each of the 1 million women participating in this study. The contribution from many individuals at each of the collaborating NHS Breast Screening Centres (listed above) is also gratefully acknowledged.

The Million Women Study Collaborative Group. The Million Women Study: design and characteristics of the study population. Breast Cancer Res 1, 73 (1999).
