Open Access
Reanalysis of the NCCN PD-L1 companion diagnostic assay study for lung cancer in the context of PD-L1 expression findings in triple-negative breast cancer
Breast Cancer Research volume 21, Article number: 72 (2019)
The companion diagnostic test for checkpoint inhibitor immune therapy is an immunohistochemical test for PD-L1. The test has been shown to be reproducible for expression in tumor cells, but not in immune cells. Immune cell scoring was used in the IMpassion130 trial, which showed that PD-L1 expression was associated with a better outcome. Two large studies have assessed immune cell PD-L1 expression in lung cancer. Here, we reanalyze one of those studies to show that, even with an easier scoring method, there is still only poor agreement between assays and between pathologists for immune cell PD-L1 expression.
Companion diagnostic testing has gained increased importance over the last few years. The earliest companion tests were immunohistochemistry (IHC) based (estrogen receptor and HER2). These have recently been followed by a series of molecular, mutation-based tests (EGFR and BRAF) and, most recently, another IHC test for PD-L1. When the FDA clears or approves companion diagnostic tests, it is widely assumed that these tests are accurate, reproducible, and robust. In fact, the Summary of Safety and Effectiveness Data (SSED) documents released by the FDA provide the evidence to justify the assumption that the tests are worthy of consumer, payer, and physician confidence. Examination of the SSEDs for the PD-L1 tests shows that the FDA clears assays after review by only 2 or 3 pathologists, often showing high overall percent agreement (OPA) that may not reflect real-world outcomes. Indeed, when PD-L1 assays were assessed by multiple observers, some FDA-approved categories were found to be unreproducible, specifically including immune cell expression of PD-L1 [1, 2].
In October of 2018, Schmid and colleagues from Genentech reported the results of the IMpassion130 trial in the first-line metastatic setting in breast cancer [3]. In a trial of atezolizumab or placebo in combination with nab-paclitaxel, this work showed a statistically significant extension of median overall survival from 15.5 to 25 months in patients with “PD-L1 positive” tumors and no benefit in PD-L1 negative tumors. While this is exciting for breast cancer patients, it is a challenge for pathologists and oncologists. Pathologists are responsible for PD-L1 status determination, and the approach used in this breast cancer study conflicts with previous efforts in lung, gastric, head and neck, and cervical cancer. The standard PD-L1 expression test for atezolizumab is the Ventana SP142 assay, which has been shown to have lower sensitivity than other PD-L1 assays in many studies [1, 2, 4, 5]. As such, it is impossible to validate this assay accurately in a CLIA lab, since there is no comparator assay, as there is for LDTs and the other FDA assays, which have been shown to be equivalent. Furthermore, in breast cancer, the assay is read as a two-category immune cell (IC) score, compared to the three- or four-category IC reading that was tested in two large, multi-institutional biomarker studies in lung cancer tissue [1, 2]. Both the NCCN [1] and the Blueprint 2 [2] studies concluded that pathologists cannot accurately or reproducibly read the three- or four-category IC score, with an intraclass correlation coefficient (ICC) between 0.19 and 0.28.
Here, we reanalyzed the data from the NCCN study [1] using the original IC readings of 13 pathologists, collapsed into a two-category scale, and assessed agreement using OPA (the two categories mimic the IC scoring in the IMpassion130 study, < 1% or ≥ 1% immune cells). With three categories, the OPA between the four assays is 29%, but with the two-category scale, the OPA rises to 54%. Similarly, inter-pathologist OPA goes from 0% (no complete agreement among 13 pathologists on 90 slides with three-category scoring) to 18% for two-category scoring (or 67% if outlier pathologist 12 in Fig. 1 is excluded). Thus, collapsing the scoring system from three to two categories improves both assay and pathologist OPA, although both remain low. For comparison, ER/PR and HER2 scores have OPAs in the 90–95% range [6, 7].
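The effect of collapsing categories on agreement can be illustrated with a minimal sketch. It assumes that OPA is computed as the fraction of slides on which every reader assigns the same category, in line with the "complete agreement" description above, and that category 0 corresponds to < 1% immune cells. The function names, the 0/1/2 coding, and the toy scores are hypothetical and are not the NCCN study data.

```python
from typing import Sequence


def collapse(score: int) -> int:
    """Map a three-category IC score (0, 1, 2) to two categories.

    Assumption: category 0 is negative (< 1% immune cells) and
    categories 1 and 2 are both positive.
    """
    return 0 if score == 0 else 1


def complete_agreement_opa(scores: Sequence[Sequence[int]]) -> float:
    """Fraction of slides on which all readers give the same category.

    scores[i][j] is the category that reader j assigned to slide i.
    """
    if not scores:
        raise ValueError("need at least one slide")
    agreed = sum(1 for slide in scores if len(set(slide)) == 1)
    return agreed / len(scores)


# Hypothetical toy data: 4 slides read by 3 readers (not study data).
three_cat = [
    [0, 0, 0],
    [1, 2, 1],
    [2, 2, 2],
    [0, 1, 2],
]
two_cat = [[collapse(s) for s in slide] for slide in three_cat]

opa_three = complete_agreement_opa(three_cat)  # 2 of 4 slides agree
opa_two = complete_agreement_opa(two_cat)      # 3 of 4 slides agree
print(opa_three, opa_two)
```

In this toy example, the slide scored 1, 2, 1 becomes concordant once categories 1 and 2 are merged, whereas the slide scored 0, 1, 2 stays discordant because the readers still disagree across the positive/negative cut point.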
The low agreement between the assays is likely attributable to the previously demonstrated lower sensitivity of SP142 compared to other FDA-approved and laboratory-developed test (LDT) assays [1, 2]. It is unclear whether CLIA labs already performing more sensitive PD-L1 assays will be expected to switch to, or add, the less sensitive SP142 assay for therapeutic eligibility determination. The survey data indicate that most labs are using 22C3, followed by an LDT using E1L3N. To test whether re-categorization of the IC component of this assay fixes this sensitivity problem, the IC scores of each NCCN study pathologist were plotted and collapsed into two categories (Fig. 1). This analysis suggests that for about one-third of the pathologists, the positive/negative scoring system makes the assays equivalent, but another one-third of the pathologists find dramatically fewer cases positive with the SP142 assay compared to the other assays. The variable sensitivity of the assays was unknown when the IMpassion130 trial began, but it would be unprecedented to have multiple assays with differential sensitivity for a single biomarker in one lab. Similarly, there is no precedent for how these variable assays could be separately standardized.
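The per-pathologist comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis behind Fig. 1: each reader's share of IC-positive cases is computed per assay after collapsing to positive/negative, and the SP142 share is subtracted from the mean share of the other assays. The helper names, the 0/1/2 coding, and all readings are hypothetical.

```python
from typing import Dict, List


def positive_fraction(scores: List[int]) -> float:
    """Share of cases scored IC-positive (any category above 0)."""
    return sum(1 for s in scores if s > 0) / len(scores)


def sp142_gap(by_assay: Dict[str, List[int]]) -> float:
    """Mean positive fraction of the other assays minus that of SP142.

    A value near 0 means the reader calls the assays equivalently;
    a large positive value means fewer SP142-positive cases.
    """
    others = [
        positive_fraction(scores)
        for assay, scores in by_assay.items()
        if assay != "SP142"
    ]
    return sum(others) / len(others) - positive_fraction(by_assay["SP142"])


# Hypothetical readings for two readers on four cases (not study data).
reader_a = {
    "SP142": [0, 1, 0, 2],
    "22C3": [0, 1, 0, 2],
    "E1L3N": [0, 2, 0, 1],
}
reader_b = {
    "SP142": [0, 0, 0, 1],
    "22C3": [1, 1, 0, 2],
    "E1L3N": [0, 1, 1, 2],
}

gap_a = sp142_gap(reader_a)  # assays look equivalent for this reader
gap_b = sp142_gap(reader_b)  # SP142 yields fewer positives for this reader
print(gap_a, gap_b)
```

A gap of this kind only summarizes positivity rates for each reader; it does not show whether the same cases are called positive across assays, which would require a case-level comparison.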
In summary, this analysis raises a significant concern for pathologists who need to provide accurate and reproducible companion diagnostic results for PD-L1. While the NCCN study data presented here are from lung cancer, not breast cancer tissue, there is no evidence that the biochemistry of the interaction differs between the tumor sites. While the lung cancer pathologists in the NCCN study were not trained to read IC scores, the Blueprint 2 study included 1.5 days of training for 15 pathologists and still found very low concordance, suggesting that training will not solve this problem. We look forward to Genentech’s help in solving this problem. A potential solution would be a reanalysis using the SP263 assay (produced by the same vendor as the SP142 assay) or a bridging study between the SP142 assay and the SP263 assay using the IMpassion130 tissues.
Availability of data and materials
Data are available from the authors on request.
References
1. Rimm DL, Han G, Taube JM, Yi ES, Bridge JA, Flieder DB, et al. A prospective, multi-institutional, pathologist-based assessment of 4 immunohistochemistry assays for PD-L1 expression in non-small cell lung cancer. JAMA Oncol. 2017;3(8):1051–8.
2. Tsao MS, Kerr KM, Kockx M, Beasley MB, Borczuk AC, Botling J, et al. PD-L1 immunohistochemistry comparability study in real-life clinical samples: results of Blueprint phase 2 project. J Thorac Oncol. 2018;13(9):1302–11.
3. Schmid P, Adams S, Rugo HS, Schneeweiss A, Barrios CH, Iwata H, et al. Atezolizumab and nab-paclitaxel in advanced triple-negative breast cancer. N Engl J Med. 2018;379(22):2108–21.
4. Hirsch FR, McElhinny A, Stanforth D, Ranger-Moore J, Jansson M, Kulangara K, et al. PD-L1 immunohistochemistry assays for lung cancer: results from phase 1 of the Blueprint PD-L1 IHC assay comparison project. J Thorac Oncol. 2017;12(2):208–22.
5. Buttner R, Gosney JR, Skov BG, Adam J, Motoi N, Bloom KJ, et al. Programmed death-ligand 1 immunohistochemistry testing: a review of analytical assays and clinical implementation in non-small-cell lung cancer. J Clin Oncol. 2017;35(34):3867–76.
6. Zhang H, Han M, Varma KR, Clark BZ, Bhargava R, Dabbs DJ. High fidelity of breast biomarker metrics: a 10-year experience in a single, large academic institution. Appl Immunohistochem Mol Morphol. 2018;26(10):697–700.
7. Hammond ME, Hayes DF, Dowsett M, Allred DC, Hagerty KL, Badve S, et al. American Society of Clinical Oncology/College of American Pathologists guideline recommendations for immunohistochemical testing of estrogen and progesterone receptors in breast cancer. Arch Pathol Lab Med. 2010;134(6):907–22.
This work was supported in part by funding from the Yale SPORE in Lung Cancer (P50CA196530) and funds from the Yale Comprehensive Cancer Center (P30CA016359).
This study was funded by Bristol-Myers Squibb in collaboration with the National Comprehensive Cancer Network Oncology Research Program and by the Breast Cancer Research Foundation.
Ethics approval and consent to participate
Consent for publication
Dr. Rimm reported serving as a consultant for AstraZeneca, Agendia, Bethyl Labs, Biocept, Bristol-Myers Squibb, Cell Signaling Technology, ClearSight, Genoptix/Novartis, Merck, OptraScan, Perkin Elmer, and Ultivue and receiving research grant support from Cepheid, Genoptix, Gilead Sciences, Pierre Fabre, and Perkin Elmer. Dr. Taube reported serving as a consultant for Bristol-Myers Squibb, AstraZeneca, and Merck and receiving research funding from Bristol-Myers Squibb. Dr. Bridge reported serving as a consultant for Cepheid Inc and National Comprehensive Cancer Network–Peregrine. Dr. Anders reported serving on the advisory board of Adaptive Biotechnologies and receiving research support from Bristol-Myers Squibb, Five Prime Diagnostics, National Institutes of Health, Stand Up 2 Cancer, and Fibrolamellar Cancer Foundation. Dr. Hirsch reported serving on the advisory boards of Genentech/Roche, Bristol-Myers Squibb, Merck, AstraZeneca, Novartis, Pfizer, Ventana, and HTG Molecular Diagnostics Inc. Dr. Wistuba reported receiving honoraria from Genentech/Roche, Bristol-Myers Squibb, Boehringer Ingelheim, AstraZeneca, Pfizer, Ariad, HTG Molecular Diagnostics Inc, and Asuragen and receiving research grants from Genentech/Roche, Bristol-Myers Squibb, HTG Molecular Diagnostics Inc, and Asuragen. Drs. Batenchuk and Burns were employed by Bristol-Myers Squibb. Dr Pusztai has received consulting fees and honoraria from Astra Zeneca, Merck, Novartis, Genentech, Eisai, Pieris, Immunomedics, Seattle Genetics, Almac and Syndax.