Overconfident Physician Opinion on the Effectiveness of BRCA1 Risk Reduction Measures
© BioMed Central Ltd 2000
Published: 1 April 2001
We explored whether clinicians are overconfident in their judgments about the effectiveness of risk reduction measures in women with mutations in the BRCA1 gene. In this context, "overconfidence" is defined as the expression of too much certainty in subjective estimates, regardless of whether estimates are large or small.
We asked physicians to estimate the percent decrease in the lifetime probability of breast and ovarian cancer in carriers who received various prophylactic interventions. Respondents were also asked to indicate their 90% plausibility interval. Subjects were breast cancer clinicians and principal investigators on NCI-sponsored Specialized Programs of Research Excellence (SPOREs) in breast cancer at six US cancer centers.
Clinicians varied widely in their estimates of effectiveness. Many had plausibility intervals that did not include the best estimate offered by other clinicians. It was not uncommon to find two clinicians with plausibility intervals that did not overlap. In addition, many clinicians expressed 90% plausibility intervals so narrow that they did not capture findings from large, robust studies of the effectiveness of prophylaxis. While, by definition, 10% of clinicians should have been surprised to learn that a scientific finding was outside their 90% plausibility interval, we found that 34-67% would have been surprised. This is because their plausibility intervals were too narrow.
We found that clinicians are overconfident in their estimates of the effectiveness of BRCA1 risk-reduction measures.
The psychological literature on probability judgment is replete with studies demonstrating that people express high levels of confidence in their fallible judgments. For example, in a classic study outside the medical context, Alpert and Raiffa asked Harvard Business School students to estimate obscure quantities such as the percent of their classmates who preferred bourbon to scotch, the total egg production in the US, and the 1967 toll collection of the Panama Canal in millions of dollars. Naturally, students were uncertain about these facts. The authors also asked them to specify a lower bound estimate and an upper bound estimate such that they were 98% sure that the true value was between these two extremes. If students had specified intervals that were sufficiently wide given their uncertainty, then 98% of them should have captured the true value and 2% should have been surprised upon learning that the true value was outside their interval. The authors found, however, that 42% failed to capture the true value. This is because the students offered intervals that were too narrow, an indication of overconfidence.
Someone with an appropriate level of confidence should be able to express 98% confidence intervals such that the true value tends to lie outside their intervals 2% of the time. Similarly, they should be able to offer 90% intervals such that the true value lies outside 10% of the time, or 80% confidence intervals such that the true value lies outside 20% of the time. If the true value lies outside the specified interval too often, that is evidence of overconfidence. If it lies inside too often, that is evidence of underconfidence.
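The calibration check described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in this study: the function names, the 0.05 tolerance, and the example numbers are all hypothetical.

```python
from typing import Sequence, Tuple

# Each judgment is (lower_bound, upper_bound, true_value).
Judgment = Tuple[float, float, float]


def surprise_rate(judgments: Sequence[Judgment]) -> float:
    """Fraction of judgments whose true value falls outside the stated interval."""
    if not judgments:
        raise ValueError("at least one judgment is required")
    misses = sum(1 for lo, hi, truth in judgments if not (lo <= truth <= hi))
    return misses / len(judgments)


def calibration_verdict(rate: float, nominal_coverage: float,
                        tolerance: float = 0.05) -> str:
    """Compare the observed surprise rate with the rate the assessor implied.

    The tolerance is an arbitrary illustrative threshold, not a statistical test.
    """
    expected = 1.0 - nominal_coverage  # e.g. 0.10 for 90% intervals
    if rate > expected + tolerance:
        return "overconfident"
    if rate < expected - tolerance:
        return "underconfident"
    return "roughly calibrated"


# Hypothetical 90% intervals from one assessor (illustrative numbers only).
example = [
    (10, 50, 42),
    (100, 200, 250),   # true value above the upper bound
    (0.5, 0.9, 0.7),
    (30, 40, 55),      # true value above the upper bound
    (5, 15, 9),
]
rate = surprise_rate(example)
verdict = calibration_verdict(rate, nominal_coverage=0.90)
print(f"surprise rate = {rate:.2f}, verdict = {verdict}")
```

With only a handful of judgments the observed rate is very noisy, so a verdict of this kind is meaningful only over many intervals.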
Note that the degree of knowledge that a person has is not necessarily related to the appropriateness of their confidence in their knowledge. For example, someone with very little knowledge of a given fact can express an appropriate level of confidence (or lack thereof) simply by offering confidence intervals that are wide. Conversely, someone with considerable knowledge of the same fact might actually have an inappropriate level of confidence because they offer confidence intervals that are not sufficiently narrow given their expertise.
If physicians are inappropriately confident about their medical knowledge, this might have important consequences for patient welfare. An overconfident physician might dissuade patients from seeking a second opinion, or convince a risk-averse patient that the likelihood of an adverse outcome is more remote than the facts would suggest. In addition, an overconfident physician may be less likely to seek, perceive, and assimilate new information.
We sought to assess whether clinicians were overconfident in their estimates of the effectiveness of prophylactic measures for risk reduction in women with a mutation in the BRCA1 gene. That is, we did not seek to assess what clinicians know, but whether they "know what they know."
Subjects were 18 breast oncologists and principal investigators on NCI-sponsored Specialized Programs of Research Excellence (SPOREs) in breast cancer at six US cancer centers. Each completed a one-page questionnaire.
We asked clinicians to consider several interventions: bilateral prophylactic mastectomy, oophorectomy with and without estrogen replacement, tamoxifen, and certain combinations. For each prophylactic measure, they indicated the "percent decrease in the lifetime probability of developing breast cancer among 30 year old women with a BRCA1 mutation (who have completed their childbearing)." In addition to their best estimate, we asked clinicians to give their 90% plausibility interval by indicating a "lower bound estimate such that there is only a 0.05 probability that the true value is lower, and ... upper bound estimate such that there is only a 0.05 probability that the true value is higher." For each intervention we also asked them to specify their best estimate and range for ovarian cancer risk reduction.
Results and discussion
Figure 1 depicts each clinician's best estimate and plausibility range for the breast cancer risk reduction offered by prophylactic mastectomy. Best estimates varied considerably, ranging from 45% to 98%. There was also considerable variation in the degree of expressed uncertainty. Clinician 2, for example, felt that mastectomy in carriers probably reduced risk by 85%, but might reduce risk by as little as 10% or as much as 90%. Clinician 18 expressed greater confidence, saying that mastectomy would reduce risk by 98% and that risk reduction was unlikely to be lower than 97% or higher than 99%. It is noteworthy that neither of these two clinicians' plausibility intervals contains the best estimate offered by the other, and in fact their plausibility intervals do not even overlap; yet, by definition, each is 90% confident that the true value lies within their specified range. Figure 2 shows clinicians' estimates of the impact of oophorectomy on breast cancer, and Figure 3 shows the impact of tamoxifen.
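The comparison between Clinician 2 and Clinician 18 can be checked mechanically. The sketch below uses the values quoted in the text; the helper names and the dictionary layout are hypothetical and chosen only for illustration.

```python
from typing import Tuple

Interval = Tuple[float, float]


def contains(interval: Interval, value: float) -> bool:
    """True if value lies within the closed interval."""
    lo, hi = interval
    return lo <= value <= hi


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """True if two closed intervals share at least one point."""
    return max(a[0], b[0]) <= min(a[1], b[1])


# Percent risk reduction from mastectomy, as reported in the text.
clinician_2 = {"best": 85, "interval": (10, 90)}
clinician_18 = {"best": 98, "interval": (97, 99)}

# Does each clinician's interval contain the other's best estimate?
c2_covers_c18 = contains(clinician_2["interval"], clinician_18["best"])
c18_covers_c2 = contains(clinician_18["interval"], clinician_2["best"])
overlap = intervals_overlap(clinician_2["interval"], clinician_18["interval"])

print(c2_covers_c18, c18_covers_c2, overlap)
```

Both containment checks and the overlap check come out false, which is the disagreement described above.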
Our results reveal clear evidence of overconfidence. There was no estimate of effectiveness such that 90% of subjects captured that estimate in their 90% plausibility intervals. Clinicians systematically gave plausibility intervals that were too narrow. This was true for all interventions and both breast and ovarian cancer.
In this study we did not seek to judge the knowledge of clinicians or the accuracy of their estimates. An oncologist might be less knowledgeable about mastectomy and more knowledgeable about tamoxifen, for example, while a surgeon would have the opposite expertise. This is to be expected. However, even a physician who is very uncertain could exhibit a level of confidence appropriate to their (lack of) knowledge simply by specifying a wide plausibility interval. In this study we sought to assess the appropriateness of their level of confidence in their estimates. Our results offer clear evidence that in this instance clinicians are systematically overconfident. A well-calibrated clinician should be able to delineate 90% plausibility intervals that contain the true value roughly 90% of the time. We found, however, that clinicians captured the results from rigorous studies only 39-67% of the time.
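To give a rough sense of how far these capture rates fall below what calibration implies, the following sketch computes a binomial tail probability. It is not an analysis performed in this study. It assumes that the 18 clinicians' intervals are independent (which may not hold, since they draw on shared training and literature) and that the upper figure of 67% corresponds to 12 of 18 clinicians.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n_clinicians = 18        # from the text
nominal_coverage = 0.90  # each interval is a 90% plausibility interval
captured = 12            # assumption: 67% of 18 clinicians, rounded

# Number of clinicians expected to capture the finding if all were calibrated.
expected_captures = n_clinicians * nominal_coverage

# Probability of seeing 12 or fewer captures under perfect calibration.
p_tail = binom_cdf(captured, n_clinicians, nominal_coverage)

print(f"expected captures: {expected_captures:.1f} of {n_clinicians}")
print(f"P(captures <= {captured}) = {p_tail:.4f}")
```

Under these assumptions, even the best observed capture rate would occur less than 1% of the time among well-calibrated clinicians.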
This research shows that even some of the most respected breast cancer clinicians in the United States hold conflicting opinions and are, in general, overconfident in those opinions. This lack of consensus combined with high levels of confidence may be a source of great confusion to women seeking advice on prophylaxis from two or more physicians. Physician overconfidence in the effectiveness of preventive measures serves to compound the difficulty of the decisions surrounding genetic testing.
This work was funded in part by the National Cancer Institute through a Specialized Program of Research Excellence (SPORE) grant in Breast Cancer at Duke University, P50 CA68438. We gratefully acknowledge the assistance of those breast cancer clinicians affiliated with NCI-sponsored SPOREs around the country who anonymously provided us with subjective cancer risk reduction estimates.
- Lichtenstein S, Fischhoff B, Phillips LD: Calibration of probabilities: the state of the art to 1980. In Kahneman D, Slovic P, and Tversky A, ed. Judgment Under Uncertainty: Heuristics and Biases. New York: Cambridge University Press, 1982.
- Alpert M, Raiffa H: A progress report on the training of probability assessors. In Kahneman D, Slovic P, and Tversky A, ed. Judgment Under Uncertainty: Heuristics and Biases. New York: Cambridge University Press, 1982.
- Hartmann LC, Schaid DJ, Woods JE, Crotty TP, Myers JL, Arnold PG, Petty PM, Sellers TA, Johnson JL, McDonnell SK, et al: Efficacy of bilateral prophylactic mastectomy in women with a family history of breast cancer. N Engl J Med. 1999, 340: 77-84. 10.1056/NEJM199901143400201.
- Rebbeck TR, Levin AM, Snyder C, Watson P, Cannon-Albright L, Isaacs C, Olopade O, Garber JE, Godwin AK, et al: Breast cancer risk after bilateral prophylactic oophorectomy in BRCA1 mutation carriers. J Nat Cancer Inst. 1999, 91: 1475-1479. 10.1093/jnci/91.17.1475.
- Fisher B, Costantino JP, Wickerham DL, Redmond CK, Kavanah M, Cronin WM, Vogel V, Robidoux A, Dimitrov N, Atkins J, et al: Tamoxifen for prevention of breast cancer: report of the National Surgical Adjuvant Breast and Bowel Project P-1 Study. J Nat Cancer Inst. 1998, 90: 1371-1388. 10.1093/jnci/90.18.1371.