A comparison of five methods of measuring mammographic density: a case-control study

Background: High mammographic density is associated with both risk of cancers being missed at mammography, and increased risk of developing breast cancer. Stratification of breast cancer prevention and screening requires mammographic density measures predictive of cancer. This study compares five mammographic density measures to determine the association with subsequent diagnosis of breast cancer and the presence of breast cancer at screening.

Methods: Women participating in the “Predicting Risk Of Cancer At Screening” (PROCAS) study, a study of cancer risk, completed questionnaires to provide personal information to enable computation of the Tyrer-Cuzick risk score. Mammographic density was assessed by visual analogue scale (VAS), thresholding (Cumulus) and fully-automated methods (Densitas, Quantra, Volpara) in contralateral breasts of 366 women with unilateral breast cancer (cases) detected at screening on entry to the study (Cumulus 311/366) and in 338 women with cancer detected subsequently. Three controls per case were matched using age, body mass index category, hormone replacement therapy use and menopausal status. Odds ratios (OR) between the highest and lowest quintile, based on the density distribution in controls, for each density measure were estimated by conditional logistic regression, adjusting for classic risk factors.

Results: The strongest predictor of screen-detected cancer at study entry was VAS, OR 4.37 (95% CI 2.72–7.03) in the highest vs lowest quintile of percent density after adjustment for classical risk factors. Volpara, Densitas and Cumulus gave ORs for the highest vs lowest quintile of 2.42 (95% CI 1.56–3.78), 2.17 (95% CI 1.41–3.33) and 2.12 (95% CI 1.30–3.45), respectively. Quantra was not significantly associated with breast cancer (OR 1.02, 95% CI 0.67–1.54). Similar results were found for subsequent cancers, with ORs of 4.48 (95% CI 2.79–7.18), 2.87 (95% CI 1.77–4.64) and 2.34 (95% CI 1.50–3.68) in highest vs lowest quintiles of VAS, Volpara and Densitas, respectively. Quantra gave an OR in the highest vs lowest quintile of 1.32 (95% CI 0.85–2.05).

Conclusions: Visual density assessment demonstrated a strong relationship with cancer, despite known inter-observer variability; however, it is impractical for population-based screening. Percentage density measured by Volpara and Densitas also had a strong association with breast cancer risk, amongst the automated measures evaluated, providing practical automated methods for risk stratification.

Electronic supplementary material: The online version of this article (10.1186/s13058-018-0932-z) contains supplementary material, which is available to authorized users.


Background
High mammographic density, the relative proportion of fibroglandular to fatty tissue in the breast, reduces the effectiveness of mammographic screening [1][2][3][4] and increases risk of developing breast cancer [5,6]. The relationship of density with risk was established using expert visual assessment of film mammograms [7], with computer-assisted methods providing more reproducible estimates [8,9]. With increasing uptake of full-field digital mammography (FFDM), the association between automated density assessment methods and cancer risk is under investigation [10][11][12].
The most widely used method of assessing mammographic density in the USA is the Breast Imaging Reporting And Data System (BI-RADS) categorisation, where experts assign mammograms to one of four classes, the upper two being considered "dense" [13]. The class descriptors were changed in 2013 to better identify women whose cancers may be masked by dense parenchymal tissue [14]. Visual assessment of percentage density may be recorded on visual analogue scales (VAS), providing a continuous measure. This yielded a strong relationship with breast cancer risk for film mammograms, with an odds ratio (OR) of approximately 7 for 76-100% density relative to 0-25% [15]. Likewise, Boyd's 6-class categorisation of percent visual density gave a relative risk in the highest category (> 75% dense) compared with the lowest of 6.05 (95% confidence interval (CI) = 2.82-12.97) in a case-control study with 354 cases [8]. Cumulus, a semi-automated thresholding method, was developed to improve reproducibility [8,9] and has a well-established relationship with cancer risk [8,12]. However, this method also requires trained observers, and whilst separating the breast from the mammogram background is reproducible, judgement of the best threshold to separate dense tissue from fat is less so. Boyd category, VAS and Cumulus are all relative, area-based methods, so density estimates can vary depending on breast positioning and patient weight [8,16]. Weight change disproportionately alters the fatty component of the breast [17] and percentage density measures should be adjusted to take body mass index (BMI) into account [18]. Now that digital mammography is standard throughout the UK, volumetric measures of mammographic density, made by calibrating pixel values in the raw ("for processing") FFDM image using a model of x-ray physics and imaging parameters [19], are available. These can be expressed either in percentage terms (volumetric percent dense) or as absolute measures of dense and non-dense tissue.
The availability of fully automated density assessment paves the way for risk stratification in screening [20], allowing selection of the most appropriate imaging modality and screening frequency for the individual [21,22]. The addition of mammographic density to breast cancer risk models based on other risk factors has demonstrated increased predictive power, depending on the method used for density estimation [23][24][25]. It is therefore important to determine which density methods are suitable for risk-adapted screening; more accurate risk prediction will enable better targeting of risk-reducing interventions including chemoprevention and lifestyle modification [26,27].
A previous case-control analysis, carried out in London, compared density measured in the unaffected (contralateral) breast in 414 women diagnosed with unilateral breast cancer at one hospital with that of 685 unmatched controls attending routine breast screening. Comparing the highest percentage density quintile with the lowest, and adjusting for age, BMI and reproductive variables, the strongest association with risk of developing breast cancer was for Volpara, with an OR of 8.26 (95% CI 4.28-15.96), followed by Quantra, OR 3.94 (2.26-6.86) and Cumulus, OR 3.38 (2.00-5.72) [13]. However, mammographic density was assessed at the time of detection of cancer, so the ability of density to predict women who would later develop the disease was not assessed. Here we address this by evaluating the association between five mammographic density methods and the presence of cancer at the time of screening, and the association between four mammographic density methods and cancer detected subsequently, either between screening rounds or at a later screen, using data from the Predicting Risk of Cancer At Screening study (PROCAS) [20].

Study design
Women invited to the Greater Manchester Breast Screening Service for routine 3-yearly mammographic screening between October 2009 and March 2015 were also invited to participate in the "Predicting Risk Of Cancer At Screening" (PROCAS) study, which aimed to provide women with a personalised estimate of their breast cancer risk based on mammographic density and classic breast cancer risk factors obtained via a questionnaire and quantified by the Tyrer-Cuzick risk score [28]. After October 2012 only women attending their first (prevalent round) screen were invited. At the time of recruitment informed consent was obtained from all participants.
In order to assess density using fully automated methods, the raw FFDM (for processing) image data from GE Senographe Essential mammography systems were obtained. Cancers (invasive and ductal carcinoma in situ) were identified through hospital records or through the North West Cancer Intelligence Service; women who moved out of the area were considered ineligible. Two case-control datasets were created. In study 1, cases were women with breast cancer detected at the screen on entry to PROCAS and in study 2 cases were women who were breast cancer free at the screen on entry to PROCAS but had breast cancer detected subsequently, either between screening rounds or at a later screen. In these women we analysed the density of the screen on entry to PROCAS.
Three controls without cancer were matched to each cancer case based on age (±12 months), BMI category (missing, < 24.9, 25.0-29.9, 30+ kg/m²), hormone replacement therapy (HRT) use (current vs never/ever) and menopausal status (premenopausal, perimenopausal or postmenopausal). In both studies all controls had a subsequent cancer-free screening mammogram so it was unlikely that early signs of cancer were visible, and in study 2, controls were also matched on year of mammogram at entry.
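The matching rule described above can be sketched in a few lines of Python. This is an illustration only, not the procedure used in PROCAS: the `Woman` record, its field names, the category labels and the "closest in age first" tie-break are all assumptions introduced for the example, and the text does not say how controls were chosen when more than three were eligible.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Woman:
    # Hypothetical record; field names and category labels are illustrative.
    woman_id: str
    age_months: int
    bmi_category: str        # e.g. "missing", "<25", "25-29.9", "30+"
    hrt_current: bool        # current HRT use vs not
    menopausal_status: str   # "pre", "peri" or "post"


def is_match(case: Woman, control: Woman, age_tolerance_months: int = 12) -> bool:
    """True if the control agrees with the case on all four matching factors."""
    return (
        abs(case.age_months - control.age_months) <= age_tolerance_months
        and case.bmi_category == control.bmi_category
        and case.hrt_current == control.hrt_current
        and case.menopausal_status == control.menopausal_status
    )


def match_controls(cases: List[Woman], pool: List[Woman],
                   n_controls: int = 3) -> Dict[str, List[str]]:
    """Greedily assign up to n_controls unused controls to each case."""
    used = set()
    matched: Dict[str, List[str]] = {}
    for case in cases:
        eligible = [c for c in pool
                    if c.woman_id not in used and is_match(case, c)]
        # Assumption: prefer controls closest in age (not stated in the text).
        eligible.sort(key=lambda c: abs(c.age_months - case.age_months))
        chosen = eligible[:n_controls]
        used.update(c.woman_id for c in chosen)
        matched[case.woman_id] = [c.woman_id for c in chosen]
    return matched
```

In study 2 the year of mammogram at entry would be added as a fifth condition in `is_match`.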

Mammographic density measurement

Visual estimation of percentage density
Processed FFDM images were displayed on Planar Dome E5 5MP self-calibrating high-resolution monitors. Two of nineteen readers (usually a consultant radiologist or breast physician and an advanced practitioner radiographer) independently recorded density estimates on a paper form showing four 10-cm horizontal VAS, one for each view, labelled 0% and 100% at the ends of the scale. Forms were read using custom software and visual percentage density calculated. VAS readings were averaged between readers and views, and analysed in quintiles and as Boyd categories (0%, > 0-10%, > 10-25%, > 25-50%, > 50-75% and > 75%) [8]. Due to the small number of cases in the highest category (three in study 1 and six in study 2), the top two Boyd categories were combined for analysis. Intra-observer and inter-observer agreement for 120 mammograms randomly selected across deciles of VAS density scores, from the PROCAS study, were assessed by 11 readers, on two occasions, 3 years apart. The majority of readers had excellent intra-observer agreement (intraclass correlation coefficient (ICC) > 0.80), and inter-observer agreement for consistency was excellent (ICC = 0.82) and was substantial for absolute agreement (ICC = 0.69) [29].
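As a rough illustration of the averaging and categorisation steps just described, the sketch below averages VAS readings over readers and views and maps the result to a Boyd category, with the top two categories optionally combined as in the analysis. The function names and label strings are assumptions made for the example; the custom form-reading software is not reproduced here.

```python
from statistics import mean
from typing import Sequence


def average_vas(readings: Sequence[Sequence[float]]) -> float:
    """Average percentage density over all readers and views.

    `readings` holds one sequence of per-view percentages per reader.
    With the same number of views per reader, this equals the mean of
    the per-reader means.
    """
    return mean(value for reader in readings for value in reader)


def boyd_category(percent_density: float, combine_top: bool = True) -> str:
    """Map a percentage density to a Boyd category.

    With combine_top=True the two highest categories (> 50-75% and > 75%)
    are merged into a single "> 50%" group.
    """
    if not 0 <= percent_density <= 100:
        raise ValueError("percent density must lie between 0 and 100")
    if percent_density == 0:
        return "0%"
    if percent_density <= 10:
        return "> 0-10%"
    if percent_density <= 25:
        return "> 10-25%"
    if percent_density <= 50:
        return "> 25-50%"
    if combine_top:
        return "> 50%"
    if percent_density <= 75:
        return "> 50-75%"
    return "> 75%"
```

For example, two readers scoring four views each as `[[10, 20, 30, 40], [20, 30, 40, 50]]` give an average of 30%, which falls in the "> 25-50%" category.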

Cumulus
Cumulus (Sunnybrook Health Sciences Centre, Toronto, ON, Canada) density assessment was undertaken by a single reader (JS) trained in August 2010 and validated by a member of the PROCAS team (JW) who had herself been trained by the group that developed the software. Reader performance was validated on test sets of data developed for this purpose by the trainers. Processed FFDM images were analysed. Cumulus was undertaken on a single contralateral mediolateral oblique (MLO) view of a subset of the study 1 dataset comprising 311 screen-detected cancers and their matched controls. The reader was blind to case-control status.

Quantra™
Quantra version 2.0 (Hologic Inc, Bedford, MA, USA) was used to assess density from the raw FFDM images for each view, each breast and each woman, giving breast and fibroglandular tissue volume (cm³), and the dense tissue area as a percentage of breast volume. It also provides a quantized BI-RADS-like score for each view and per breast.

Volpara™
Volpara Density Algorithm 1.5.0 (Volpara Health Technologies, Wellington, New Zealand) was also used to assess density from the raw FFDM images for each view, giving breast volume and fibroglandular tissue volume (cm³) and percentage density by volume. Volpara provided a macro, which produced per-patient results including Volpara Density Grade (VDG 4th and 5th Edition), designed to correlate with BI-RADS 4th and 5th Edition [15]. This also computes the percentage density of the two breasts following outlier removal.

Densitas™
Densitas version 2.0.0 (Densitas Inc, Halifax, NS, Canada) analyses processed FFDM images, giving breast and fibroglandular area (cm²) and percentage density by area for each image and per patient. It also produces per-patient measures of BI-RADS 4th and 5th edition [15].

Statistical methods
In study 1, mammographic density was assessed in the contralateral breast in women with cancer and the breast on the same side in matched controls, whereas in study 2, density was assessed in both breasts at entry to PROCAS and the average was used.
Categorical data were compared using the chi-square test for proportions. For ordinal variables, a chi-square test for trend was also conducted. Continuous variables were assessed by the median and Mann-Whitney U test.
The relationship between density assessment and case-control status was analysed using conditional logistic regression. Density measures were modelled as quintiles based on the density distributions of controls, and also as continuous measures, transformed to approximately follow a normal distribution (square root transformation for VAS and Cumulus, and a logarithm transformation for Volpara, Quantra and Densitas). Univariate models were fitted initially, and multivariate models were fitted to adjust for the logarithm of the 10-year Tyrer-Cuzick (v.6) risk score. In study 2 we also adjusted for parity, due to imbalance between cases and controls. We also performed an analysis in a subset of women who had been assessed using all density methods to determine which model performed best; differences between models were compared using the likelihood-ratio chi-square test. The matched concordance (mC) index, a modification of the concordance index (or area under the receiver operating characteristic curve (AUC)) for matched case-control studies, was calculated to compare the discrimination performance of risk factors. It gives an average concordance index within matched groups (where 1.0 would indicate perfect discrimination after allowing for matching factors) and was calculated with empirical bootstrap confidence intervals [30]. All p values were two-sided. Analysis was performed in SPSS version 22 [31] and R 3.3.1 [32].
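Two of the steps above lend themselves to a short sketch: deriving quintile cut-points from the control distribution, and averaging concordance within matched sets. The code below is a minimal illustration under stated assumptions, not the analysis code (which was run in SPSS and R). The function names are hypothetical, the quantile interpolation method and the handling of values equal to a cut-point are choices made for the example, and the concordance function follows a plain reading of "average concordance index within matched groups", counting ties as one half; the bootstrap confidence intervals are not included.

```python
from bisect import bisect_left
from statistics import mean, quantiles
from typing import Iterable, List, Sequence, Tuple


def control_quintile_cutpoints(control_values: Sequence[float]) -> List[float]:
    """Four cut-points splitting the control distribution into fifths."""
    # Assumption: "inclusive" interpolation; the paper does not specify one.
    return quantiles(control_values, n=5, method="inclusive")


def assign_quintile(value: float, cutpoints: Sequence[float]) -> int:
    """Return 1 (lowest fifth) to 5 (highest fifth) for a case or control.

    A value exactly equal to a cut-point is placed in the lower group.
    """
    return bisect_left(cutpoints, value) + 1


def matched_concordance(
    matched_sets: Iterable[Tuple[float, Sequence[float]]]
) -> float:
    """Average, over matched sets, of the share of controls the case outranks.

    Each element is (case_score, control_scores). A tie counts as 0.5.
    Sets with no controls are skipped.
    """
    per_set = []
    for case_score, control_scores in matched_sets:
        if not control_scores:
            continue
        wins = sum(
            1.0 if case_score > c else 0.5 if case_score == c else 0.0
            for c in control_scores
        )
        per_set.append(wins / len(control_scores))
    return mean(per_set)
```

Because the cut-points come from controls only, cases need not be spread evenly across the five groups, which is what allows an odds ratio for the highest vs lowest quintile to differ from 1.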

Results
Of the 57,905 women recruited to PROCAS, raw FFDM image data were available for 44,658 women (77%). Unavailability of raw FFDM images was predominantly due to the use of film mammography initially. There were 1004 cases of cancer occurring after consent up to November 2015, of which 704 were included in the analysis. The excluded women comprised 39 women with a preexisting diagnosis of breast cancer, 13 with synchronous bilateral breast cancer, 118 with film mammograms and 130 with FFDM but for which raw image data was unavailable. Of the 704 women eligible for the analysis, 366 were women with breast cancer detected at the screen on entry to PROCAS (study 1) and 338 were women who were found to be breast cancer free at the screen on entry to PROCAS but had breast cancer detected subsequently, either between screening rounds or at a later screen (study 2). Of the latter, 114 women developed an interval cancer within 5-46 months of entry (IQR 13-31) and 224 women had breast cancer detected at a subsequent screen 17-55 months after entry (IQR 35-38).
Matching was satisfactory for both studies (Table 1). There was a difference in 10-year Tyrer-Cuzick score, with the score higher in cases (study 1, 2.95 vs 2.72, p = 0.003; study 2: 2.91 vs 2.63, p < 0.001). The reported rate of a previous breast biopsy in cases was 17.8% (study 1) and 22.5% (study 2), and in controls it was 14.5% (study 1) and 15.1% (study 2). The difference in biopsy rate between cases and controls was statistically significant in study 2 (p = 0.005), but was similar (in study 2) to the PROCAS study as a whole. In study 1 significantly fewer cases than controls reported being of "white" ethnic origin (91.3% vs 94.5%, p = 0.003), and fewer cases than controls reported having children in study 2 (85.8% vs 90.2%, p = 0.023).
In study 1, VAS results were missing for 46 cases of cancer, Quantra failed to produce results for one case and one control, Volpara failed for one case, and Densitas failed for 6 cases and 62 controls. In study 2 there were missing density results for two cases of cancer assessed by VAS, for one case and one control assessed by Quantra and for 7 cases and 34 controls assessed by Densitas.

Study 1: screen-detected cancers
In study 1 after full adjustment, the strongest predictor of breast cancer risk was visually assessed density (Table 2, Fig. 1), with an odds ratio (OR) of 4.37 (95% CI 2.72-7.03) in the highest quintile of density compared with the lowest. When quantized in Boyd categories (Table 3), the adjusted OR of those with greater than 50% density was 6.73 (95% CI 3.64-12.45) compared to those with density 10% or lower. Volpara percent density provided the next strongest association with cancer, with an OR for the highest quintile of 2.42 (95% CI 1.56-3.78) (Table 2, Fig. 1). When quantized in Volpara Density Grades (VDG 5th edition), the OR of VDG4 was 4.39 (95% CI 2.28-8.48) compared with VDG1 (Table 3). Both visually assessed density and Volpara percent density showed a significant and clear trend with increasing density (χ² trend 35.6, p < 0.001 and 11.2, p < 0.001, respectively). Percent density measured by Densitas and Cumulus was also statistically significant (Table 2, Fig. 1), with ORs of 2.17 (95% CI 1.41-3.33) and 2.12 (95% CI 1.30-3.45), respectively in the highest quintile of percent density compared with the lowest, and for Quantra there was no significant association (OR = 1.02, 95% CI 0.67-1.54). The relationship with dense volume is shown in Table 2; generally associations tended to be slightly lower than those for percent density. In the subset of women with all density measures VAS was a significantly better predictor of breast cancer risk than all other methods (Table 2, Additional file 1: Table S2). The matched concordance index for VAS was 0.651 (95% CI 0.611-0.691) demonstrating better discrimination between cases and controls than all other methods (Table 4).
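A reader checking the odds ratios quoted above can use the fact that a Wald-type confidence interval from logistic regression is symmetric on the log scale, so the point estimate is the geometric mean of the two limits. The sketch below assumes the reported intervals are of this kind (the text does not state how they were computed); the function names are illustrative.

```python
import math


def implied_point_estimate(lower: float, upper: float) -> float:
    """Geometric mean of the CI limits, i.e. the OR implied by a log-symmetric CI."""
    return math.sqrt(lower * upper)


def implied_log_or_se(lower: float, upper: float, z: float = 1.96) -> float:
    """Standard error of log(OR) implied by a 95% CI (z = 1.96)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# VAS, highest vs lowest quintile in study 1: OR 4.37 (95% CI 2.72-7.03)
vas_or = implied_point_estimate(2.72, 7.03)
vas_se = implied_log_or_se(2.72, 7.03)
print(f"implied OR {vas_or:.2f}, implied SE of log(OR) {vas_se:.3f}")
```

Agreement between the implied and reported estimates to within rounding is consistent with, but does not prove, the Wald assumption.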

Study 2: prior mammograms
In study 2 visually assessed density had the strongest association with subsequent cancer in the fully adjusted model (Fig. 2). The ORs for the highest vs lowest quintile were 4.48 (95% CI 2.79-7.18) for VAS, 2.87 (95% CI 1.77-4.64) for Volpara, 2.34 (95% CI 1.50-3.68) for Densitas and 1.32 (95% CI 0.85-2.05) for Quantra. VAS predicted breast cancer risk significantly better than all other density methods in the subset of women who had density measured by all four methods (Table 2, Additional file 1: Table S2). The matched concordance index for VAS was 0.647 (95% CI 0.607-0.688) demonstrating better discrimination between cases and controls than all other methods (Table 4).

Discussion
Visual assessment of breast density recorded on a VAS was the strongest predictor of breast cancer risk, both in the contralateral breast of women with screen-detected cancers and in the average of bilateral mammographic views prior to the detection of cancer. It is unlikely that the presence of cancer influenced visual assessment in study 1, since a blinded re-read of images from the contralateral breast by four readers showed no evidence of bias [23] and the ORs were similar to those in study 2. There is strong association between the VAS and breast cancer despite known inter-observer variability [32]; since the average VAS score of two readers was used it is likely that cases falling into the top and bottom quintiles of density do so unequivocally. Volpara and Densitas percent density had the next strongest associations with cancer in both studies, with categorisation into VDG having the largest odds ratio. Volpara, Quantra and Cumulus did not have as strong an association with breast cancer in study 1 as previously reported [13]. This may be due to differences in the approach used; Eng et al. analysed 414 cases from one hospital and 685 unmatched controls from a screening service based in London, adjusting for age, BMI and reproductive variables in the analysis, whilst we analysed 366 cases using Volpara and Quantra, and 311 using Cumulus, with 3 well-matched controls per case all recruited from the same screening programme. There were also a number of differences between the study populations, with our study population tending to be younger, with more women of white ethnicity and with higher BMI and being less likely to be postmenopausal and to have had children. Density distributions also differed across the two studies, with the current study having lower median (IQR) percent density assessed by Volpara (4.9, 3.5-7.4) and Quantra (11, 8-14), but higher percent density for Cumulus (20.3, 11.6-30.3) [13].
Our version of Volpara was later (1.5.0 vs 1.0) and we applied a Volpara macro for outlier rejection; our version of Quantra was also more recent (2.0 vs 1.3). For Cumulus, the difference might be due to reader experience. Study 2 examined the relationship between mammographic density in mammograms prior to the detection of cancer, and in matched controls that subsequently remained cancer free. This enables us to evaluate which mammographic density methods are most appropriate for stratifying women attending breast screening. Whilst visual assessment was most strongly associated with cancer, it is unlikely to be used widely for population-based stratified screening; we conclude that Volpara or Densitas percentage density provide a pragmatic solution. However, we hypothesise that methods that measure purely the quantity or relative proportion of dense tissue do not fully capture the mammographic risk in the same way as visual assessment by experts, who can see not only the quantity of dense tissue but the location and pattern. The addition of algorithms that automatically quantify mammographic pattern to automated density software could potentially provide a solution that more closely reproduces visual assessment. Recent research in this area has proved promising [33][34][35][36], although there is as yet no consensus as to the best method of encapsulating texture information within risk assessment.

Strengths and limitations
The strengths of this study include the ability to assess the relationship between several measures of mammographic density and risk of breast cancer. As well as examining the association between mammographic density and breast cancer risk, we were able to establish the temporal relationship in study 2. We also gathered detailed information in relation to a number of covariates (demographic, hormonal, reproductive, lifestyle and family history) via a self-reported questionnaire at entry to PROCAS [20]. Uptake to PROCAS was relatively low (38%), which may have biased the population to those with higher or lower risk; for example, the proportion of women in the PROCAS study who were overweight or obese was significantly lower than in the general population of Greater Manchester [37]. In addition, in study 1, because controls had to have had a subsequent cancer-free mammogram after entry to the PROCAS study, the year of mammogram in controls tended to be earlier than in cases. This may have had an impact on density measures due to changes in mammography technology and the use of different mammographic machines over time.

Conclusions
Visual assessment of density, recorded on a VAS and averaged between two independent readers, is a strong predictor of breast cancer risk both in mammograms taken before the detection of cancer and in images of the opposite breast at the time of detection. Percentage density measured by Volpara and Densitas also showed a strong association with breast cancer risk amongst the automated measures evaluated, providing practical automated methods for risk stratification in personalized screening programmes.

Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Authors' contributions
SMA conceived and designed the study, analysed and interpreted the data and drafted the manuscript. EFH performed the statistical analysis, interpreted the data and drafted the manuscript. JCS carried out the Cumulus density assessment after training from JW. ARB participated in the statistical analysis. PS made substantial contributions to acquisition of questionnaire data. MW, UB, SG, YL, AJ, SB and NB made substantial contributions to acquisition of density data, each reading more than 4000 mammograms. VR was responsible for mammography reporting. RW, DGRE, AH and JC conceived and designed the study, analysed and interpreted the data, and helped to draft the manuscript. All authors were involved in critically revising the paper for intellectual content, and all read and approved the final manuscript.

Ethics approval and consent to participate
Ethics approval for the study was through the North Manchester Research Ethics Committee (09/H1008/81). Informed consent was obtained from all participants on entry to the PROCAS study.

Consent for publication
Not applicable.

Competing interests
Software licences for Volpara, Quantra and Densitas were provided free of charge under a research agreement by Volpara Health Technologies (Wellington, New Zealand), Hologic Inc (Marlborough, MA, USA) and Densitas Inc (Halifax, NS, Canada) respectively. The authors declare that they have no competing interests.
