A clinical model for identifying the short-term risk of breast cancer

Background: Most mammography screening programs are not individualized. To efficiently screen for breast cancer, the individual risk of the disease should be determined. We describe a model that could be used at most mammography screening units without adding substantial cost.

Methods: The study was based on the KARMA cohort, which included 70,877 participants. Mammograms were collected up to 3 years following the baseline mammogram. A prediction protocol was developed using mammographic density, computer-aided detection of microcalcifications and masses, use of hormone replacement therapy (HRT), family history of breast cancer, menopausal status, age, and body mass index. Relative risks were calculated using conditional logistic regression. Absolute risks were calculated using the iCARE protocol.

Results: Comparing women at highest and lowest mammographic density yielded a fivefold higher risk of breast cancer for women at highest density. When adding microcalcifications and masses to the model, high-risk women had a nearly ninefold higher risk of breast cancer than those at lowest risk. In the full model, taking HRT use, family history of breast cancer, and menopausal status into consideration, the AUC reached 0.71.

Conclusions: Measures of mammographic features and information on HRT use, family history of breast cancer, and menopausal status enabled early identification of women within the mammography screening program at such a high risk of breast cancer that additional examinations are warranted. In contrast, women at low risk could probably be screened less intensively.

Electronic supplementary material: The online version of this article (doi:10.1186/s13058-017-0820-y) contains supplementary material, which is available to authorized users.


Background
Risk prediction models for breast cancer use lifestyle factors [1], family history of breast cancer [2], mammographic density [3], genetic determinants [4], or any combination of these factors to predict risk of developing the disease [3]. Mammographic density is one of the strongest risk factors for breast cancer and consists of the radiographically dense fibroglandular part of the mammogram. Women with dense breasts have both an increased risk of breast cancer and a lesser likelihood of a cancer being detected. In 27 U.S. states, it is currently mandatory by law to report the level of mammographic density to a woman undergoing mammography, but there is no obligation to report her risk of breast cancer.
Computer-aided detection (CAD) is designed to support radiologists at mammographic screening units in diagnosing early breast cancer. Such software can indicate suspicious microcalcifications and masses. We used fully automated CAD and breast density measurement systems and predicted the probability for a woman with a negative mammogram result to be diagnosed with breast cancer within 2 years. We wanted to create an easily implementable prediction tool for individualized breast cancer screening without adding substantial cost or effort to the health care system.
We merged established risk factors, such as use of hormone replacement therapy (HRT), family history of breast cancer, menopausal status, body mass index (BMI), and mammographic density with microcalcifications and masses, using U.S. Food and Drug Administration-approved CAD software [5]. We were able to identify high-risk women who would probably benefit from intensified breast cancer screening or would be in immediate need of clinical examinations. In parallel, we identified women with such a low breast cancer risk that they might not benefit from screening. To achieve these goals, we used a unique, prospective Swedish population-based screening cohort: the Karolinska Mammography Project for Risk Prediction of Breast Cancer (KARMA) cohort (karmastudy.org).

Methods
In Sweden, women aged 40-74 years are invited every 18-24 months to the national screening program [6]. Women who attended mammographic screening at four hospitals in Sweden were invited to be included in the KARMA cohort between January 2011 and March 2013. A total of 70,877 women chose to participate (age range 31-79 years) [7]. Participants answered a comprehensive web-based questionnaire, donated blood, allowed storage of mammograms, and accepted linkage to national breast cancer registers. By October 2015, a total of 570 incident breast cancers had been identified. Women diagnosed with breast cancer within 3 months of a negative entry mammogram were omitted because it could not be excluded that a cancer was detected at the screening visit. A total of 137 patients lacked information on one or several risk factors, leaving 433 breast cancer cases to be used for the model development. However, the 137 women lacking information were included in calculating the absolute risk estimates, whereby missing data were replaced with the average risk of that risk factor. Four control subjects were matched on age to each case in a prospective nested case-control design.
Full-field digital mammograms from the mediolateral oblique (MLO) and craniocaudal (CC) views of the left and right breasts were used to measure mammographic density using the area-based STRATUS method (Additional file 1: Supplementary methods 1). The percentage mammographic density was calculated by dividing the dense area by the total breast area. Breast density was categorized using cutpoints (2%, 18%, 49%) into four breast composition groups reflecting the clinically accepted Breast Imaging Reporting and Data System (BI-RADS; American College of Radiology, Reston, VA, USA) score [5, 8-10] (Additional file 1: Supplementary methods 2). The computer-generated score is hereafter called cBIRADS.
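The density categorization can be sketched in a few lines of Python. This is an illustration only, not the STRATUS implementation: the function names are hypothetical, and the text does not say to which group a value lying exactly on a cutpoint belongs, so the sketch assumes it goes to the higher group.

```python
from bisect import bisect_right

# Cutpoints (percent density) quoted in the text.
CUTPOINTS = (2.0, 18.0, 49.0)


def percent_density(dense_area, breast_area):
    """Percent density: dense area divided by total breast area, times 100."""
    if breast_area <= 0:
        raise ValueError("breast_area must be positive")
    return 100.0 * dense_area / breast_area


def cbirads(pd):
    """Map percent density to a category 1-4.

    Assumption of this sketch: a value exactly on a cutpoint is assigned
    to the higher category.
    """
    return bisect_right(CUTPOINTS, pd) + 1
```

With these assumptions, a breast with 23 units of dense area out of 100 units of total area has a percent density of 23.0 and falls in category 3.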
The CAD software (M-Vu CAD®; iCAD, Nashua, NH, USA) identifies suspicious microcalcifications and masses and presents the findings to the radiologist or as digital text information. Raw mammograms of the MLO and CC views of right and left breasts were used to identify microcalcifications and masses. On the basis of the distribution of microcalcifications among control subjects, the number of microcalcifications was categorized into five groups: 0, 1-10, 11-20, 21-40, and >40 microcalcifications. The number of masses was given as the true number. Level of density and number of microcalcifications and masses, as well as the differences in density and number of microcalcifications and masses between breasts, were used in the model.
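The grouping of microcalcification counts described above could be coded as follows. The function name and the numeric labels 0 to 4 are assumptions made for illustration; the text gives only the count ranges.

```python
def microcalc_category(count):
    """Group a microcalcification count into the five ranges in the text.

    The labels 0-4 are an assumption of this sketch:
    0 -> none, 1 -> 1-10, 2 -> 11-20, 3 -> 21-40, 4 -> more than 40.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return 0
    if count <= 10:
        return 1
    if count <= 20:
        return 2
    if count <= 40:
        return 3
    return 4
```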
On the basis of self-reported information, dichotomous variables were created for current use of HRT, history of breast cancer in first-degree relatives, and menopausal status. Current use of HRT was defined as use within the last 12 months. BMI and age were assessed at the time of study entry, which was the time the baseline mammogram was taken. Screening-detected breast cancer was defined as breast cancer diagnosed within 3 months of a screen. An interval breast cancer was defined as a breast cancer diagnosed at least 3 months after a negative screen but before the date of the next scheduled screen [11].
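The two definitions of mode of detection can be expressed as a small decision rule. The sketch below is hypothetical: the function name, the "unclassified" fallback, and the choice of 91 days to represent 3 months are assumptions that do not come from the text.

```python
from datetime import date, timedelta

# Assumption of this sketch: "3 months" is approximated as 91 days.
THREE_MONTHS = timedelta(days=91)


def mode_of_detection(diagnosis, last_screen, next_scheduled):
    """Classify a cancer as screening-detected or interval.

    diagnosis:      date of breast cancer diagnosis
    last_screen:    date of the most recent screen before diagnosis
    next_scheduled: date of the next scheduled screen
    """
    if diagnosis < last_screen:
        raise ValueError("diagnosis precedes the screen")
    if diagnosis - last_screen < THREE_MONTHS:
        return "screening-detected"
    if diagnosis < next_scheduled:
        return "interval"
    return "unclassified"  # falls outside both definitions in the text
```

For example, a diagnosis one year after a negative screen and one year before the next scheduled screen is classified as an interval cancer.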
Descriptive statistics were presented for participant characteristics and to describe mammographic features in the tumor breast side (where the tumor eventually was diagnosed) versus the nontumor breast side in the cases. Differences between the breasts were calculated without assuming knowledge of the tumor breast side. These absolute differences were calculated as the standard deviation (SD) of the two breasts for each mammographic feature and were used as continuous predictors in the final model.
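For two measurements, the SD is a simple rescaling of the absolute left-right difference, which is why it can be computed without knowing the tumor side. The sketch below assumes the sample SD (denominator n - 1), which the text does not specify; the function name is hypothetical.

```python
import statistics


def breast_asymmetry(left, right):
    """Asymmetry of one mammographic feature as the SD of the two breasts.

    Assumption of this sketch: the sample SD is used, so the result
    equals abs(left - right) / sqrt(2) and is symmetric in its arguments.
    """
    return statistics.stdev([left, right])
```

Under this assumption, densities of 23.0% and 21.7% give an asymmetry of about 0.92 percentage points, and identical breasts give 0.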
The continuous predictors in the conditional logistic regression model were tested for the best transformation using the Sauerbrei method [12] with fractional polynomials, and the predictors for the absolute breast differences were transformed as reciprocal numbers. The functional form of the final model was assessed using the branch-and-bound Furnival and Wilson statistics for main effects and interaction terms [13]. Relative risks were reported as HRs in this prospective study design.
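To make the regression step concrete, the following sketch shows the contribution of a single matched set (one case and its control subjects) to the conditional log-likelihood. It is a generic textbook form written for illustration, not the SAS procedure used in the study; the function name and the coefficient and covariate values in the example are hypothetical.

```python
import math


def matched_set_log_likelihood(beta, case_x, control_xs):
    """Conditional log-likelihood contribution of one matched set.

    beta:       list of regression coefficients (log relative risks)
    case_x:     covariate values of the case
    control_xs: list of covariate lists, one per matched control subject
    """
    def linear_predictor(x):
        return sum(b * v for b, v in zip(beta, x))

    scores = [linear_predictor(case_x)]
    scores += [linear_predictor(x) for x in control_xs]
    # Log-sum-exp over the whole matched set, shifted for numerical stability.
    top = max(scores)
    log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
    return scores[0] - log_denominator
```

With all coefficients equal to zero, a case with four control subjects contributes log(1/5), because each of the five women is then equally likely to be the case. Summing this quantity over all matched sets and maximizing over beta gives the coefficient estimates.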
Absolute risks were calculated using the Individualized Coherent Absolute Risk Estimator (iCARE) package in R [14]. The Swedish national incidence rates of breast cancer and competing mortality risks were used and calculated as the average rates from 2007 to 2011. Prevalence rates of HRT use and family history of breast cancer were derived from the KARMA cohort, and the relative risks from the regression analyses were entered into the model matrix. Missing data from nonreported risk factors were imputed with model averaged risk estimates using the iCARE protocol (Additional file 1: Supplementary methods 3).
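The idea behind an absolute risk that accounts for competing mortality can be illustrated with a simplified yearly calculation. This is a sketch under stated assumptions and not the iCARE algorithm: hazards are taken as constant within each year, the baseline hazard is approximated by the population incidence divided by the mean relative risk in the population, and the function name and all rates in the example are hypothetical.

```python
import math


def absolute_risk(incidence, mortality, relative_risk, mean_relative_risk=1.0):
    """Probability of a breast cancer diagnosis over len(incidence) years.

    incidence: yearly population breast cancer incidence rates
    mortality: yearly competing mortality rates (same length)
    relative_risk: the woman's relative risk from the regression model
    mean_relative_risk: population mean relative risk (calibration term)
    """
    if len(incidence) != len(mortality):
        raise ValueError("need one mortality rate per incidence rate")
    risk = 0.0
    event_free = 1.0  # probability of being alive and cancer-free so far
    for inc, mort in zip(incidence, mortality):
        hazard = inc * relative_risk / mean_relative_risk
        total = hazard + mort
        if total > 0.0:
            # Probability of leaving the event-free state this year,
            # split between cancer and death in proportion to the hazards.
            risk += event_free * (1.0 - math.exp(-total)) * hazard / total
            event_free *= math.exp(-total)
    return risk


# Hypothetical rates: 3 per 1000 incidence and 5 per 1000 mortality per year.
two_year_risk = absolute_risk([0.003, 0.003], [0.005, 0.005], relative_risk=2.0)
```

Without competing mortality and with a constant hazard h, the two-year risk from this sketch reduces to 1 - exp(-2h), which is a useful sanity check.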
Using the same data, a cross-validated AUC was calculated and compared with values generated by the established Tyrer-Cuzick and Gail risk models. The numbers of invasive and in situ cases that were diagnosed during follow-up were tabulated by quintile of the 2-year absolute risks predicted at baseline. The increase in number of diagnosed cases by quintile of baseline risk was calculated and tested for linear trend.
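As a reminder of what the AUC measures, the sketch below computes it as the proportion of case-control pairs in which the case received the higher predicted risk, with ties counted as one half. It is a plain, non-cross-validated version that ignores the matching, written only for illustration; the function name and example scores are hypothetical.

```python
def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(case_scores) * len(control_scores))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of cases from control subjects.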
All statistical tests were two-sided at a significance level of 0.05 and calculated using SAS version 9.4 software (SAS Institute, Cary, NC, USA) for descriptive statistics and relative risks. Absolute risks were evaluated with R 3.3.0 software using the iCARE package 1.0.0.

Results
In all, 433 women had a negative mammogram result more than 3 months prior to diagnosis and had full information on risk factors. The data of these women were used to develop the model (Table 1). The median follow-up time between the baseline mammogram and diagnosis of breast cancer was 1.7 years, mean age at breast cancer diagnosis was 59.0 years, 88% of the breast cancers were invasive, and 63% were detected by screening. Significantly more cases were current users of HRT (6.9% in cases and 4.4% in control subjects, p = 0.05) and had a family history of breast cancer (19% of cases and 13% of control subjects, p = 4.5 × 10⁻⁴) (Table 1).
At baseline, the median mammographic density on the tumor side in cases (i.e., the side where the tumor was diagnosed at follow-up) was 23.0%, compared with 12.2% in the corresponding breast in control subjects (p = 4.0 × 10⁻¹⁰) (Table 2). The corresponding figures for the contralateral side in cases and control subjects were 21.7% and 12.5%, respectively (p = 2.5 × 10⁻⁷). Comparing density pairwise between the tumor side and nontumor side in cases showed a mean difference of 1.1% (p = 3.4 × 10⁻³) (Table 2).
The mean number of microcalcifications in cases and control subjects was significantly different on both the tumor side (for cases versus corresponding side for control subjects, 6.1 vs. 2.6; p = 4.0 × 10⁻²⁰) and the contralateral side in cases and control subjects (3.4 vs. 2.6, p = 0.03). The comparison between tumor and nontumor sides in cases showed a mean difference of 2.7 microcalcifications (p = 1.9 × 10⁻³) (Table 2).
The mean number of detected masses in cases versus control subjects was significantly different on the tumor side (for cases and corresponding side for control subjects, 0.77 vs. 0.56; p = 8.4 × 10⁻⁶) but not on the contralateral side in cases and control subjects. The pairwise comparison between tumor and nontumor sides in cases showed a mean difference of 0.26 masses (p = 9.2 × 10⁻³) (Table 2).
In the lower part of Table 2, the absolute differences between the breasts are presented to contrast cases and control subjects. It can be seen that cases have a more uneven distribution of mammographic density (p = 1.7 × 10⁻⁶), microcalcifications (p = 4.0 × 10⁻¹⁶), and masses (p = 0.02).
Relative risks of breast cancer within 3 years from a negative mammographic screening result at baseline were calculated using two models (Table 3). In the fully adjusted model, the HR for breast cancer in women with a family history of the disease was 1.3 (95% CI 1.0-1.7). Significant differences were seen for women with the highest versus lowest cBIRADS scores (HR 4.8), for women with microcalcifications in category 4 compared with no microcalcifications (HR 2.0), and for women with a significant difference between the left and right breasts in density (HR 1.9) or in microcalcifications (HR 2.8) (Table 3). A more detailed stratification is provided in Additional file 1: Table S1.
Dividing cases into invasive (n = 383) and in situ (n = 50) cancers (Additional file 1: Table S2) revealed that microcalcifications were significantly more likely to identify risk of in situ cancer than risk of invasive cancer. We also observed that all mammographic features, including the absolute differences between the breasts, were more likely to identify interval cancers than screening-detected cancers (Additional file 1: Table S2). Women with a cBIRADS score of 4, microcalcifications in category 3 or higher, and three or more masses had a nearly ninefold higher risk of breast cancer than women with a cBIRADS score of 1 and no microcalcifications or masses (Table 4).
The final model including the selected risk factors, stratified by menopausal status, is provided in Additional file 1: Table S3 and was used for calculating absolute risks. We plotted the frequency distribution of the predicted absolute risk of breast cancer using the generated relative risks and prevalence of risk factors in 570 incident breast cancer cases and 60,237 healthy women in the KARMA cohort (Fig. 1). This was done after exclusion of women with previous breast cancers and/or lack of mammograms (Additional file 1: Supplementary methods 3).
There was a significant linear trend in the association between increasing 2-year absolute baseline risk and larger proportion of cancers diagnosed during the study follow-up. For each quintile of 2-year baseline risk, 56.7 more cases were found to be diagnosed (p = 0.04) (Additional file 1: Table S4).

Discussion
Using the KARMA cohort, including 570 patients with breast cancer and 60,237 healthy control subjects, we generated a comparatively simple breast cancer risk prediction model for clinical use. Exploiting three fully automatically measured mammographic features enabled identification of women at an approximately ninefold greater risk of developing breast cancer when we compared the high- and low-risk groups (Table 4). In the full model, taking HRT use, family history of breast cancer, and menopausal status into consideration, the AUC reached 0.71 (Table 5). Several studies have shown mammographic density to be an excellent predictor of breast cancer risk where women with high breast density have a four to six times higher risk than women with low breast density [16]. Reassuringly, we observed a relative risk of 4.8 (95% CI 2.6-8.8) when we compared the highest with the lowest cBIRADS scores (Table 3). Comparing the highest with the lowest numbers of microcalcifications and masses each gave significant relative risks of approximately 2 (Table 3). In addition, the difference in number of microcalcifications between the breasts gave a risk of 2.8 (95% CI 1.8-4.5).
It should be underlined that our model identifies women at short-term risk of being diagnosed with breast cancer. These women are in their later progression but earlier stage, have a negative screening mammogram result, and are within 2 years of being diagnosed with either an interval cancer or a cancer at the next screening visit. The interval cancers were also shown to be at the highest risk (Additional file 1: Table S2). There are studies presenting extremely high relative risks of mammographic density (OR 17.8) for interval cancer [17]. We would have got similar results if we had not considered that interval cancers should be compared with control subjects also having clinical examinations and not with control subjects having an ordinary scheduled screening mammogram.
Adding the clinical observation that differences in density, microcalcifications, and masses between the breasts are indicators of malignancy developed our model further. It has long been known that breast asymmetry is a risk factor for breast cancer [18]. In our model, the influence on risk from breast asymmetry was as strong as that from the total number of microcalcifications and masses (Table 3). This means that the risk association with the total number of microcalcifications was driven mainly by the increase of microcalcifications in one of the breasts. This indicates that the difference in calcifications between the breasts was the important risk marker for malignancy, although a dose-response relationship with the total number of microcalcifications might be seen with multifocal tumors. The risk from breast asymmetry was also significantly higher in interval cancers than in screening-detected cancers (Additional file 1: Table S2).
The biology behind microcalcifications is not well understood. One hypothesis is that epithelial cells acquire mesenchymal characteristics and, as a sign of carcinogenic transformation, become capable of producing breast microcalcifications [19]. Because we found microcalcifications to be more abundant on the tumor side and that density was almost doubled in cases versus control subjects, it could be argued that microcalcifications are signs of a precursor lesion, whereas density is a general sign of increased breast cancer risk. In our full model including mammographic density, microcalcifications, and masses, the AUC reached 0.71, as compared with the Tyrer-Cuzick and Gail models, with AUCs of 0.63 and 0.56, respectively (Table 5) [1,20]. We thus found that our model added substantial discriminatory effect. Approximately half (n = 284) of the patients (n = 570) who developed breast cancer during the study follow-up were predicted at baseline to be in the highest quintile of the 2-year absolute risk score (Additional file 1: Table S4). It should also be noted that the relative risks of cancer in situ were significantly higher than those of invasive cancer when we compared women with and without microcalcifications (Additional file 1: Table S2), although the model predominantly identified increased number of invasive cancers at higher risk levels compared with the Tyrer-Cuzick and Gail models (Additional file 1: Table S4). All models showed the same tendency with increased numbers of invasive and in situ cases by increased levels of risk.
In Sweden, approximately 6 of 1000 women are diagnosed with breast cancer at each round of biennial screening [21]. We managed to identify a low-risk group of approximately 10% of all women in which 1 woman in 1000 will be diagnosed with breast cancer. In contrast, in the highest risk category, 20 of 1000 women will have cancer detected within 2 years (Fig. 1).
The individualized screening protocol requires information on mammographic features, age, BMI, family history of breast cancer, use of HRT, and menopausal status of the woman. The mammographic measures are fully automated and require approximately 120 seconds of computation time to be generated. The remaining risk factors are easily collected through an online questionnaire at the time of the mammography visit. External validations of the result are needed to verify the performance of our risk model. It will be of utmost importance to understand which types of tumors the model predicts. Most established models target receptor-positive and highly differentiated tumors (i.e., tumors seen as less aggressive). In future studies, it will also be of importance to understand the relationship between the localization of the mammographic features and subsequent tumors and how different cut-off points for defining interval cancer will influence the risk estimates.
The KARMA cohort is large, but the follow-up time is only a few years. The obvious weakness of our study is the low number of breast cancer cases. For women with missing data on a risk factor, imputation was performed according to the protocol established with each risk model. We calculated a cBIRADS score that mimics the established BI-RADS score to help clinical implementation, but we do not know how the true BI-RADS score would influence our model. As a unique strength, we built our model on one of the few existing population-based prospective screening cohorts with detailed information on factors that possibly influence the risk of breast cancer.

Conclusions
Our model includes three mammographic features that could easily be derived from raw mammograms. By adding information on a few established risk factors, it has the potential to individualize screening and improve clinical care by identifying women in need of additional examination procedures. At the same time, there may be a substantial proportion of women who will have very little benefit from mammography screening, owing to their low risk of breast cancer.

Additional file
Additional file 1: Table S1. Relative risk of developing breast cancer in relation to mammographic density, number of microcalcifications, and number of masses. Table S2. Relative risks of developing breast cancer in relation to tumor invasiveness and mode of detection.