The Million Women Study: design and characteristics of the study population
© Current Science Ltd 1999
Received: 31 March 1999
Accepted: 7 July 1999
Published: 1 December 1999
To describe the design of the Million Women Study and the characteristics of the study population.
Population-based cohort study of women aged 50-64 in the UK.
Women are asked to join the Million Women Study when they are invited to routine screening for breast cancer at 61 of the screening centres of the UK National Health Service Breast Screening Programme (NHSBSP). An estimated 71% of women screened by the NHSBSP return a completed questionnaire.
800 000 women were recruited between May 1996 and June 1999, and it is planned that an additional 200 000 will be recruited by the year 2000.
The characteristics of the first 121 000 women recruited into the Million Women Study are described here. At recruitment 33% of the study population were currently using hormone replacement therapy and 47% had used it at some time. Over half (54%) had used oral contraceptives, and 18% were current smokers at the time of recruitment. Before they were screened 1.4% of the women had been diagnosed with breast cancer in the past, 6% had a mother with a history of breast cancer and 3.7% had a sister with a history of breast cancer. It is estimated that 1 million women will have been recruited by early in the year 2000, and that by the end of the year 2002 there will be 5000 screen-detected breast cancers and 23 000 deaths in the cohort, the majority of which will be attributed to cancer (12 600 deaths) and circulatory disease (8000 deaths).
By the end of the year 2002, the Million Women Study will have sufficient statistical power to detect relative risks of 0.8 or less, or of 1.2 or more in current users compared with never users of hormone replacement therapy for mortality from breast cancer, colorectal cancer, lung and ovarian cancer, ischaemic heart disease and stroke.
The Million Women Study is a nationwide collaborative research project in the UK, the chief aim of which is to describe the relationship between use of hormone replacement therapy (HRT) and the risk of various conditions, particularly breast cancer. The study began in May 1996 and the plan is to recruit and follow up a cohort of 1 million women invited to attend the UK National Health Service Breast Screening Programme (NHSBSP).
The NHSBSP was set up in 1988. Once every 3 years each woman in the UK aged between 50 and 64 years who is registered with the NHS is sent a letter by the NHSBSP, offering her routine screening for breast cancer by mammography. About 1 million women are screened annually by the NHSBSP and about 5000 of them have a breast cancer detected on mammography. The day-to-day organization and screening activities are performed by about 100 separate screening offices throughout the UK, and the work is monitored and statistics gathered centrally by a national co-ordinating centre.
The present paper describes the design of the Million Women Study and the characteristics of the study population.
The Million Women Study is a population-based cohort study. Women are recruited when they are invited for routine breast cancer screening, and the main outcomes to be examined at follow-up are the incidence of screen-detected breast cancer and cause-specific mortality.
Attendance at screening
About three-quarters of the women who are invited for screening by the NHSBSP subsequently attend for mammography. Before the study could be launched, it was necessary to demonstrate that inviting women to join the Million Women Study would not reduce uptake of screening offered by the NHSBSP. During 1994 and 1995 a total of 6000 women who were due to be invited for breast cancer screening in Oxford and West London were randomly divided into two groups. One group was sent the usual invitation for screening and the other group was sent the study questionnaire together with the usual invitation to screening. Attendance rates for screening were similar, at 71%, among those who were and were not sent an accompanying questionnaire.
Women are asked to join the Million Women Study by participating NHSBSP screening centres at the time that, or just before, they are sent their usual invitation for routine breast cancer screening. A questionnaire is included with each woman's invitation and, if the woman wishes to join the study, she is asked to complete the questionnaire, to give signed permission for follow-up, and to return the questionnaire at the time she is screened. A freephone number is provided for women who have any questions or problems filling out the questionnaire. The questionnaire is four pages long (A4 size) and includes questions about lifestyle and sociodemographic factors, reproductive history, past use of oral contraceptives, use of HRT, past medical history and family history of breast cancer. Completed questionnaires are transferred periodically from the participating screening centres to the study co-ordinating centre at the Imperial Cancer Research Fund Cancer Epidemiology Unit (CEU), Oxford, UK.
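To give a sense of how precisely a trial of this size can compare attendance, the following minimal sketch computes the half-width of a 95% confidence interval for the difference between two attendance rates that are both close to 71%. It assumes that the 6000 women were split into two equal groups of 3000, which the text does not state, and it uses a simple normal approximation rather than whatever analysis the authors performed.

```python
import math
from statistics import NormalDist


def diff_ci_half_width(p, n_per_group, level=0.95):
    """Half-width of the CI for a difference of two proportions, both near p."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(2 * p * (1 - p) / n_per_group)
    return z * se


# Assumption: two equal groups of 3000 women (the split is not given in the text).
half_width = diff_ci_half_width(p=0.71, n_per_group=3000)
print(f"95% CI half-width: {100 * half_width:.1f} percentage points")
```

Under these assumptions the interval is about 2.3 percentage points either side of the observed difference, so a reduction in attendance much larger than that would have been unlikely to go undetected.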
Data storage, entry and checking
The confidential completed questionnaires are stored securely at all times. Once they reach the CEU they are checked and coded by trained staff and then scanned electronically. The scanned data are 'captured' using computerized intelligent character recognition and optical mark reading software (Eyes and Hands®; Readsoft Inc, Slough, UK). Range and logical checks are performed at the time of data entry. Any inconsistency or information that is not recognised by the data capture software is verified manually by trained data entry staff, who also validate computer-interpreted data and check each questionnaire to confirm whether signed consent for follow up has been granted. Each week the verified data for about 50 individuals are checked against the original questionnaires and the error rate is consistently below 1%. This partially automated process thus permits data to be entered rapidly and with high accuracy.
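The range and logical checks mentioned above can be illustrated with a short sketch. The field names, the permitted ranges and the specific rules are hypothetical, chosen only to show the two kinds of check; they are not the rules built into the data capture software used by the study.

```python
def check_record(rec):
    """Return a list of problems found in one questionnaire record.

    All field names and limits are illustrative assumptions.
    """
    problems = []
    age = rec.get("age")
    n_children = rec.get("n_children")
    age_first_birth = rec.get("age_first_birth")

    # Range checks: each value must fall within a plausible interval.
    if age is None or not 50 <= age <= 64:
        problems.append("age missing or outside 50-64")
    if n_children is None or not 0 <= n_children <= 20:
        problems.append("number of children missing or implausible")

    # Logical checks: answers must be consistent with one another.
    if n_children == 0 and age_first_birth is not None:
        problems.append("age at first birth given but no children reported")
    if age is not None and age_first_birth is not None and age_first_birth > age:
        problems.append("age at first birth exceeds current age")
    if rec.get("current_hrt") and not rec.get("ever_hrt"):
        problems.append("current HRT use reported without ever use")

    return problems


record = {"age": 57, "n_children": 2, "age_first_birth": 24,
          "ever_hrt": True, "current_hrt": True}
print(check_record(record))
```

Records that return a non-empty list would be the ones passed to data entry staff for manual verification.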
Follow-up for breast cancer
Each screening centre of the NHSBSP is required to compile annual statistics on its activities, which include details of all breast cancers detected at mammography. A list of women enrolled into the Million Women Study at each centre is cross-checked at regular intervals against the list of the women diagnosed with screen-detected breast cancer at that centre. If a breast cancer has been diagnosed at screening in a study participant, routinely recorded details of the cancer are abstracted, including tumour location, histology, size, grade, invasive status and involvement of axillary lymph nodes. Information on hormone receptor status and treatment is abstracted when it is available. Several approaches are being used to identify breast cancers diagnosed subsequent to screening. One will involve record linkage with cancer registry data. Also, women will be contacted directly 2-3 years after they were screened, and asked about new illnesses, including any new breast cancers, that may have been diagnosed (see Additional follow up, below). This will permit the identification of both screen-detected and interval cancers.
Follow-up for deaths
Deaths are identified annually by computerized matching of name, date of birth and NHS number of the women who gave signed consent for follow up in the Million Women Study, with the national death files held by the Office for National Statistics. For each death thus identified the date of death and underlying and associated causes of death are provided by the Office for National Statistics.
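The matching step can be sketched as a simple deterministic linkage on the three identifiers named above. This is an illustration only: the record layout and field names are hypothetical, and the sketch requires exact agreement on all three identifiers, whereas the matching rules actually used are not described here and may tolerate partial agreement.

```python
def normalise(name):
    """Upper-case a name and collapse repeated whitespace."""
    return " ".join(name.upper().split())


def link_deaths(cohort, death_file):
    """Map study IDs to death records that agree on NHS number, date of birth and name."""
    index = {}
    for death in death_file:
        key = (death["nhs_number"], death["dob"], normalise(death["name"]))
        index[key] = death

    matches = {}
    for woman in cohort:
        # Only women who gave signed consent are followed up.
        if not woman["consent"]:
            continue
        key = (woman["nhs_number"], woman["dob"], normalise(woman["name"]))
        if key in index:
            matches[woman["study_id"]] = index[key]
    return matches


# Entirely fictitious example records.
cohort = [
    {"study_id": 1, "name": "Jane  Smith", "dob": "1940-02-01",
     "nhs_number": "A123", "consent": True},
    {"study_id": 2, "name": "Mary Jones", "dob": "1938-07-15",
     "nhs_number": "B456", "consent": False},
]
death_file = [
    {"name": "JANE SMITH", "dob": "1940-02-01", "nhs_number": "A123",
     "date_of_death": "1999-05-20", "underlying_cause": "example"},
    {"name": "MARY JONES", "dob": "1938-07-15", "nhs_number": "B456",
     "date_of_death": "1999-06-02", "underlying_cause": "example"},
]
linked = link_deaths(cohort, death_file)
print(sorted(linked))
```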
Additional follow up
Participants will be sent a follow-up questionnaire about 2-3 years after recruitment, to ascertain changes in use of HRT and incident morbidity, for example breast cancers, diagnosed outside the screening programme.
The most important variables for this study are the subjects' identification details, their use of HRT, any diagnosis of breast cancer and the recording of deaths. To assess the accuracy of the subjects' identification details and of the recording of deaths, a random sample of 5000 women recruited in 1996 has been selected for flagging on the NHS Central Register (NHSCR). Identification details recorded for the study (name, address, date of birth and NHS number) enabled all but 10 (0.2%) of the 5000 women to be identified on the NHSCR. The completeness and accuracy of the reported deaths will be validated in the future against those recorded in the NHSCR for these 5000 women.
The reliability of diagnosis of screen-detected breast cancers is monitored by various quality control procedures within the NHSBSP. Screen-detected breast cancers are verified according to defined procedures, and the invasive status, size and type of cancer are recorded for virtually 100% of the cancers.
The validity of reported information on use of HRT, including the type and dose, is being examined and a full report will be published in due course. Preliminary comparisons with the prescription records from one general practice in Oxfordshire indicate at least 95% agreement for reported current use of HRT, including the hormonal constituents of the preparation used most recently (Banks et al, unpublished data).
After it was demonstrated that the Million Women Study questionnaire did not alter attendance rates, each screening centre in England, Scotland and Wales was invited to participate in the study. Almost all the centres expressed enthusiasm for the study, although practical problems precluded the involvement of some centres. The most frequent reason for screening centres not participating was that the Million Women Study questionnaire could not readily be packaged together with the letters and other information normally posted to women when they are invited for screening.
Accrual of the cohort
Response rate for the Million Women Study*
Estimated number of women who were sent a questionnaire when they were invited for screening
Estimated number who were screened (75% of those invited)
Number of women who returned a questionnaire (53% of those invited and 71% of those screened)
Number of women who returned a questionnaire with full identification details and signed permission for follow-up (50% of those invited, 66% of those screened and 93% of those returning a questionnaire)
Number of women who returned a questionnaire without full identification details and/or signed permission for follow-up (3.5% of those invited, 4.7% of those screened and 6.6% of those returning a questionnaire)
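The percentages quoted in the response rate table are linked by simple multiplication, which the following sketch makes explicit. It starts from three of the quoted rates (75% of those invited were screened, 71% of those screened returned a questionnaire, and 93% or 6.6% of those returning a questionnaire did or did not give full details) and derives the remaining figures. The variable names are illustrative, and the small discrepancies are due to rounding in the published percentages.

```python
screened_of_invited = 0.75    # proportion of invited women who were screened
returned_of_screened = 0.71   # proportion of screened women returning a questionnaire
full_of_returned = 0.93       # full details and signed permission
partial_of_returned = 0.066   # incomplete details and/or no permission

returned_of_invited = screened_of_invited * returned_of_screened
full_of_screened = returned_of_screened * full_of_returned
full_of_invited = returned_of_invited * full_of_returned
partial_of_screened = returned_of_screened * partial_of_returned
partial_of_invited = returned_of_invited * partial_of_returned

for label, value in [
    ("returned, of invited", returned_of_invited),
    ("full details, of screened", full_of_screened),
    ("full details, of invited", full_of_invited),
    ("incomplete, of screened", partial_of_screened),
    ("incomplete, of invited", partial_of_invited),
]:
    print(f"{label}: {100 * value:.1f}%")
```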
Characteristics of 121 000 respondents
Some characteristics of 121 000 respondents in the Million Women Study
Age at screening (years)
Use of hormone replacement therapy
Number of children
Age at birth of first child (years)
Past use of oral contraceptives
History of breast disease, including breast cancer, in 121 000 respondents to the Million Women Study and their relatives
Previous screening for breast cancer
Previous surgery for breast disease
Previous breast cancer
History of maternal breast cancer
History of breast cancer in a sister
Previous illnesses and conditions for which women are now being treated in 121 000 respondents to the Million Women Study
Women with a history of:
Hypertension, when pregnant
Hypertension, when not pregnant
Blood clot in leg
Women currently being treated for:
Women with previous operations:
Comparison of participants and nonparticipants
The overwhelming reason for nonparticipation in the Million Women Study is not attending for breast cancer screening, having been invited to do so. Women are asked to bring the completed questionnaire with them when they are screened, and thus far over 99% of the respondents were recruited in this way. Although no envelope or pre-paid postage is provided, a small number of the respondents posted their questionnaire back to the screening or co-ordinating centre, and virtually all of them also attended for breast cancer screening.
A direct comparison of those who agreed to participate in the study with those who did not has been performed in one general practice in Oxfordshire and similar comparisons are planned for other areas. A full report of these findings will be published in the future, but preliminary results suggest that there are few substantial differences between participants and nonparticipants. At this stage the main difference between the groups appears to be that nonparticipants are more likely than participants to be prescribed medications for the treatment of hypertension (Banks et al, unpublished data).
Comparison of those who did and did not give sufficient information and/or permission for follow-up
Sufficient information and/or permission for follow-up
Yes (n = 113 104)
No (n = 8025)
Average age (years)
Average age at leaving school (years)
Average height (m)
Average weight (kg)
Current smoker (%)
Drink alcohol (%)
Exercise at least once a week (%)
Average number of children
Average age at first birth (years)
Previous breast cancer screen (%)
Previous diagnosis of breast cancer (%)
Mother or sister with a history of breast cancer (%)
Previous use of oral contraceptives (%)
Current use of HRT (%)
Past use of HRT (%)
Expected numbers and statistical power
At the present accrual rate it is expected that a cohort of 1 million women will have been recruited by the year 2000. Based on national statistics from the NHSBSP, about 5000 screen-detected breast cancers would be expected in this cohort. Given these numbers, and the expected proportion of current and never users of HRT at recruitment, the study should have 80% power to detect a relative risk of 1.1 both in current users overall and in current users with durations of use of at least 5 years, compared with never users.
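The order of magnitude of this figure can be checked with a rough calculation. The sketch below is not the authors' method, which is not described here. It assumes a two-sided 5% significance level, approximates the variance of the log relative risk by the sum of the reciprocals of the expected numbers of cases in each group, and takes the proportions of current users (33%) and ever users (47%) reported for the first 121 000 women as representative of the full cohort. Expected case counts are allocated in proportion to group size, as they would be if HRT had no effect.

```python
import math
from statistics import NormalDist


def min_detectable_rr(cases_exposed, cases_unexposed, alpha=0.05, power=0.80):
    """Smallest relative risk above 1 detectable with the given power.

    Uses the approximation Var(log RR) = 1/cases_exposed + 1/cases_unexposed.
    """
    normal = NormalDist()
    z = normal.inv_cdf(1 - alpha / 2) + normal.inv_cdf(power)
    se_log_rr = math.sqrt(1 / cases_exposed + 1 / cases_unexposed)
    return math.exp(z * se_log_rr)


total_cases = 5000      # expected screen-detected breast cancers
current_users = 0.33    # proportion currently using HRT at recruitment
never_users = 1 - 0.47  # proportion who had never used HRT

rr = min_detectable_rr(total_cases * current_users, total_cases * never_users)
print(f"Minimum detectable relative risk: {rr:.2f}")
```

Under these assumptions the calculation gives a value of roughly 1.09, in line with the figure of 1.1 quoted above. The corresponding detectable relative risk below 1 is the reciprocal of this value.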
Estimated number of deaths up to the end of the year 2002 in the Million Women Study cohort and minimum and maximum detectable relative risks for certain common conditions
Current versus never users
Current users of 5+ years duration versus never users
Cause of death
Expected number of deaths*
Minimum detectable RR† > 1
Maximum detectable RR† < 1
Minimum detectable RR† > 1
Maximum detectable RR† < 1
All circulatory disease
Ischaemic heart disease
All causes of death
Previous studies have suggested that both recency and duration of HRT use are important in determining its effect on breast cancer, and perhaps on other diseases[5,6]. However, because women tend to stop taking HRT when they become ill, there are problems in interpreting differences in mortality according to HRT use at the time of death. One way of overcoming these problems is to examine mortality in relation to use of HRT before diagnosis of any serious illness. Analyses of cause-specific mortality in relation to use of HRT within the Million Women Study will, therefore, exclude women with serious illnesses at the time of recruitment and be based on use as recorded at the time of entry into the cohort.
Table 6 shows the least extreme detectable relative risks for each of the main causes of death to be examined for various patterns of HRT use as compared with never users. These power calculations show that for a common cancer, such as colorectal cancer, there should be sufficient power to detect an increase or decrease in mortality of as little as about 20% in current users compared with nonusers, and of about 25% in current users of long duration compared with nonusers. Even for endometrial cancer, which is the least common of the causes listed, it should be possible to detect quite modest increases or decreases in mortality of around 40% in current users compared with never users, and of about 45% in current users of long durations compared with never users.
By the end of the year 2002, the largest numbers of expected deaths among these women will be due to breast cancer and ischaemic heart disease. Thus, the effect of HRT use on deaths from these two causes will be particularly important in determining the net benefit or risk to mortality in HRT users as compared with nonusers. For both of these conditions relative risks of greater than 1.1 or less than 0.9 would be detectable among current versus never users. The corresponding figures among current users with durations of use of 5 or more years are 1.2 and 0.8, respectively.
Many other questions about women's health can also be answered by this study. The cohort is sufficiently large to provide reliable data on the health effects of many lifestyle factors, including the consumption of tobacco and alcohol, and on the effects of past use of other hormonal agents, such as oral contraceptives. In addition, the Medical Research Council is supporting an extension of the study to evaluate the effect of HRT on the efficacy of mammography. In that study, women recalled for further assessment after screening and women with interval cancers are being identified. Information on interval cancers is being sought from cancer registries and also directly by sending women a follow-up questionnaire 2-3 years after their initial screen and asking about recent morbidity, including diagnosis of breast cancer. This will allow estimation of how HRT affects the sensitivity and specificity of mammography.
The main purpose of the Million Women Study is to examine the relationship between breast cancer and use of HRT, in a context where use of hormonal therapy is recorded as reliably as possible and breast cancers are diagnosed as uniformly and consistently as possible. Obtaining details of use of HRT before any breast cancer is diagnosed will minimise possible reporting biases of use of such therapy. Moreover, studying screen-detected breast cancers overcomes the potential bias that women who are taking HRT may be more likely to be screened than women who do not use such therapy. The limitation of examining screen-detected cancers alone, however, is that use of HRT may itself reduce the efficacy of mammographic screening. The plan, therefore, is to follow the women screened for interval breast cancer, and to include those cancers in the analyses of the relation between use of HRT and breast cancer.
Because the entire cohort is being followed up for deaths, it will also be possible to look at the relationship between use of HRT and mortality from various causes. Women prescribed HRT tend to be healthier than those who are not, however, and so it is crucial that analyses take proper account of the so-called 'healthy user effect'. In designing the study attention has been given to the recording of detailed information about illnesses present at the time of recruitment. It can be seen in Tables 3 and 4 that a substantial proportion of women recruited have had illnesses such as hypertension and other cardiovascular disease in the past that would affect their risk of death from circulatory disease and other causes. The plan is to analyse results separately according to history of previous illness, and most weight will be given to the findings in women who had no previous illness.
Randomized clinical trials of HRT are now underway. These trials will have sufficient statistical power to detect a substantial reduction in ischaemic heart disease, but will not be able to pick up important, but modest, changes in the risk of cancer. Thus, there will be a continued need for observational data to look at the effects of HRT on disease.
The Million Women Study is one of the largest cohort studies ever devised. Recruitment is proceeding rapidly and the study is on target to accrue a cohort of 1 million women by the year 2000. Preliminary results indicate that the women joining the Million Women Study do not differ substantially from women of a similar age in the general population.
It is expected that, within 5 years, the study will have sufficient statistical power to answer questions about the role of HRT in mortality from breast cancer and other specific conditions of interest.
This cohort may ultimately include about one woman in every five aged between 50 and 64 years in the UK. This excellent co-operation at a national level reflects the efficient organization of the NHSBSP. It is also indicative, perhaps, of concern by women at the lack of reliable knowledge about the long-term effects of HRT and the fact that in the UK today there is substantial use of this type of therapy.
NHS Breast Screening Centres that began recruitment before December 1998 (in alphabetical order) are as follows: Avon, Aylesbury, Barnsley, Basingstoke, Bedfordshire & Hertfordshire, Cambridge & Huntingdon, Chelmsford & Colchester, Chester, Cornwall, Crewe, Cumbria, Doncaster, Dorset, East Berkshire, East Cheshire, East Devon, East of Scotland, East Suffolk, Gateshead, Gloucestershire, Great Yarmouth, Hereford & Worcester, Kings Lynn, Leicestershire, Liverpool, Manchester, Milton Keynes, Newcastle, North Birmingham, North East Scotland, North Lancashire, North Middlesex, North Nottingham, North of Scotland, North Tees, North Yorkshire, Nottingham, Oxford, Portsmouth, Rotherham, South Birmingham, South East Scotland, South East Staffordshire, Sheffield, Shropshire, Somerset, South Derbyshire, South Essex, South Lancashire, South West Scotland, Surrey, Warrington Halton St Helens & Knowsley, Warwickshire Solihull & Coventry, West Berkshire, West Devon, West of London, West Suffolk, West Sussex, Wiltshire, Winchester and Wycombe.
The Million Women Study Co-ordinating Centre staff are as follows: Emily Banks, Valerie Beral, Anna Brown, Diana Bull, Becky Cameron, Barbara Crossley, Diane Deciacco, Dave Ewart, Laura Gerrard, Julie Hall, Sally Hall, Elizabeth Hilton, Ann Hogg, Carol Keene, Nikki Langley, Nicky Langston, Gillian Reeves and Moya Simmonds.
The Steering Committee members are Joan Austoker, Emily Banks, Valerie Beral, Ruth English, Julietta Patnick, Richard Peto, Gillian Reeves, Martin Vessey and Matthew Wallis.
The Writing Committee members are Emily Banks, Valerie Beral and Gillian Reeves.
The main acknowledgement is undoubtedly to each of the 1 million women participating in this study. The contribution from many individuals at each of the collaborating NHS Breast Screening Centres (listed above) is also gratefully acknowledged.
1. Patnick J: NHS Breast Screening Programme Review. NHSBSP: Sheffield. 1996.
2. Banks E, Richardson A, Beral V, et al: Effect on attendance at breast cancer screening of adding a self administered questionnaire to the usual invitation to breast screening in southern England. J Epidemiol Commun Health. 1998, 52: 116-119.
3. Beral V, Hermon C, Kay C, et al: Mortality associated with oral contraceptive use: 25 year follow-up of a cohort of 46000 women from the Royal College of General Practitioners' oral contraception study. BMJ. 1999, 318: 96-100.
4. Nystrom L, Rutqvist LE, Wall S, et al: Breast cancer screening with mammography: overview of Swedish randomised trials. Lancet. 1993, 341: 973-978. 10.1016/0140-6736(93)91067-V.
5. Beral V, Doll R, Bull D, et al: Breast cancer and hormone replacement therapy: collaborative reanalysis of data from 51 epidemiological studies of 52705 women with breast cancer and 108411 women without breast cancer. Lancet. 1997, 350: 1047-1059. 10.1016/S0140-6736(97)08233-0.
6. Beral V, Reeves G, Bull D, Key T, Peto R: Breast cancer and hormone replacement therapy: putting the risk into context. IV European Congress on Menopause. Vienna, October 8-12. 1998, 267-277.
7. Beral V, Banks E, Reeves G, Appleby P: Use of hormone replacement therapy and the subsequent risk of cancer. J Epidemiol Biostat. 1999.