Comparison of SP142 and 22C3 PD-L1 assays in a population-based cohort of triple-negative breast cancer patients in the context of their clinically established scoring algorithms
Breast Cancer Research volume 25, Article number: 123 (2023)
Immunohistochemical (IHC) PD-L1 expression is commonly employed as a predictive biomarker for checkpoint inhibitors in triple-negative breast cancer (TNBC). However, IHC evaluation methods are non-uniform and further studies are needed to optimize clinical utility.
We compared the concordance, prognostic value and gene expression between PD-L1 IHC expression by SP142 immune cell (IC) score and 22C3 combined positive score (CPS; companion IHC diagnostic assays for atezolizumab and pembrolizumab, respectively) in a population-based cohort of 232 early-stage TNBC patients.
The expression rates of PD-L1 for SP142 IC ≥ 1%, 22C3 CPS ≥ 10, 22C3 CPS ≥ 1 and 22C3 IC ≥ 1% were 50.9%, 27.2%, 53.9% and 41.8%, respectively. The analytical concordance (kappa values) between SP142 IC+ and these three different 22C3 scorings were 73.7% (0.48, weak agreement), 81.5% (0.63) and 86.6% (0.73), respectively. The SP142 assay was better at identifying 22C3 positive tumors than the 22C3 assay was at detecting SP142 positive tumors. PD-L1 (CD274) gene expression (mRNA) showed a strong positive association with all two-categorical IHC scorings of the PD-L1 expression, irrespective of antibody and cut-off (Spearman Rho ranged from 0.59 to 0.62; all p-values < 0.001). PD-L1 IHC positivity and abundance of tumor infiltrating lymphocytes were of positive prognostic value in univariable regression analyses in patients treated with (neo)adjuvant chemotherapy, where it was strongest for 22C3 CPS ≥ 10 and distant relapse-free interval (HR = 0.18, p = 0.019). However, PD-L1 status was not independently prognostic when adjusting for abundance of tumor infiltrating lymphocytes in multivariable analyses.
Our findings indicate that the SP142 and 22C3 IHC assays, with their respective clinically applied scoring algorithms, are not analytically equivalent: they identify partially non-overlapping subpopulations of TNBC patients and cannot be substituted for one another regarding PD-L1 detection.
Trial registration The Swedish Cancerome Analysis Network - Breast (SCAN-B) study, retrospectively registered 2nd Dec 2014 at ClinicalTrials.gov; ID NCT02306096.
Patients with triple-negative breast cancer (TNBC) have a poorer prognosis than patients with other breast cancer subtypes and fewer treatment options due to the absent or low expression of estrogen and progesterone receptors and HER2. Exploration of alternative therapy options for TNBC patients is ongoing and immune checkpoint inhibitors (ICIs) targeting the programmed death 1 (PD-1)/programmed death-ligand 1 (PD-L1) interaction are now approved in TNBC. However, questions remain to be answered regarding the optimal selection of patients who might benefit from ICI treatment.
PD-L1 protein expression determined by immunohistochemistry (IHC) is currently the only clinically applied predictive biomarker for checkpoint inhibition in TNBC and is relevant in the unresectable locally advanced or metastatic setting. However, PD-L1 evaluation in breast cancer varies: each ICI comes with a companion/complementary IHC assay, and the antibodies, scoring systems, definitions of positivity and predictive thresholds differ across assays [3,4,5]. For optimal clinical use of PD-L1 as a biomarker, a single harmonized IHC workflow and scoring system should be developed. Several phase III clinical trials with different ICIs in TNBC have shown mixed results, but some have been promising in the metastatic and neoadjuvant setting. Commonly investigated ICIs in TNBC are atezolizumab together with the SP142 Ventana IHC assay (IMpassion trials) and pembrolizumab together with the 22C3 Dako IHC assay (Keynote trials), and these ICIs have been approved in TNBC in combination with chemotherapy [6,7,8,9,10,11,12,13,14,15,16,17,18,19]. SP142 has been found to have less sensitivity for PD-L1 staining on tumor cells (TCs) than on immune cells (ICs) in TNBC and the scoring system for SP142 is based on the proportion of tumor area occupied by PD-L1 expressing ICs [20, 21]. For SP142, a predictive threshold value of 1% was found when adding atezolizumab to nab-paclitaxel in metastatic TNBC in the IMpassion130 phase III trial, which led to the first accelerated approval of an ICI in TNBC. The combination is approved outside of the US but has been withdrawn by the FDA, since continued approval was contingent upon the results of the IMpassion131 trial, which failed to show significant clinical benefit of atezolizumab in combination with paclitaxel [6, 11, 22]. On the other hand, the scoring system for the 22C3 antibody is a combined positive score (CPS) that is based on PD-L1 expression in TCs and ICs as a proportion of the total number of TCs.
For 22C3, a predictive threshold value of CPS 10 has been found in the metastatic setting in the Keynote-355 phase III trial. In contrast, atezolizumab and pembrolizumab have in phase III trials shown clinical benefit in the neoadjuvant setting irrespective of PD-L1 status by SP142 and 22C3, respectively [9, 10, 14].
The IC+ scoring method is not clinically applied for 22C3, and CPS is not clinically applied for SP142. Several studies have shown inter-assay variability and discordance between the SP142 and 22C3 assays, each detecting partially non-overlapping subpopulations of PD-L1 positive TNBC patients [21, 23,24,25,26,27,28,29]. However, these studies have not been consistent in their comparison of scoring methods and their prognostic impacts. To our knowledge, only a few studies have so far evaluated the agreement of the clinically established scoring algorithms of these assays in TNBC, reporting impaired concordance [21, 23, 25].
In our current study, we investigated the agreement between the SP142 and 22C3 assays in the context of their clinically used scoring systems in TNBC, assessed their correlation to PD-L1 expression at the mRNA level to investigate if the assays differ in their association with the mRNA status, and evaluated their prognostic value in a population-based early-stage TNBC cohort. The overall aim was to provide additional data about assay interchangeability to support PD-L1 analysis in TNBC and clinical decision making.
Material and methods
The origin of our TNBC cohort has been previously described by Staaf et al. Briefly, a total of 408 TNBC patients were identified in Region Skåne between 2010/09 and 2015/03 by the Swedish National Breast Cancer Quality (NKBC) registry. Of those, 340 were enrolled in the Swedish Cancerome Analysis Network - Breast (SCAN-B) study (ClinicalTrials.gov ID NCT02306096), a population-based study in the southern health care region of Sweden for which all patients with primary breast cancer are eligible (https://www.scan-b.lu.se/). Eighty-four patients were thereafter excluded because of unclear TNBC status or insufficient tissue material. Of the 256 remaining patients included in our tissue microarray (TMA), 13 were excluded due to metastatic disease at diagnosis or prior to start of adjuvant chemotherapy (n = 8), bilateral breast cancer (n = 3), loss to follow-up before treatment start (n = 1), or non-TNBC status (n = 1). Clinicopathological characteristics and follow-up data were collected through clinical chart review and the last date for counting events was 18th Oct 2019. An additional 11 patients were excluded since they only had TMA cores from residual disease after neoadjuvant chemotherapy (n = 6) or due to unevaluable TMA cores for the 22C3 staining (n = 5). Of the remaining 232 patients, who all underwent primary surgery (mastectomy or partial mastectomy), 166 received chemotherapy (CT-cohort) according to national guidelines, of whom 155 received adjuvant and 11 neoadjuvant CT. Of these, 98.2% (163 of 166) received FEC or EC (5-fluorouracil, epirubicin, cyclophosphamide) based treatment with or without a taxane and three patients (1.8%) received less than 50% of planned CT. The remaining 66 patients did not receive any (neo)adjuvant CT, most often due to age or comorbidities (non-CT-cohort). Checkpoint inhibitors were not given to the patients in the cohort. Adjuvant radiotherapy was given according to national guidelines.
All of the 166 CT patients were eligible for overall survival (OS) analysis, 165 for invasive disease-free survival (IDFS) and 163 for distant relapse-free interval (DRFI). In the non-CT-cohort, 64 were eligible for OS, 65 for IDFS and 63 for DRFI (Fig. 1, study flowchart). Clinicopathological characteristics in the CT-cohort (prior to any (neo)adjuvant CT) and the non-CT-cohort are presented in Table 1. RNA sequencing data for gene expression profiling (GEX) were available for 84% of the patients (194 out of 232 patients) through the SCAN-B consortium.
PD-L1 immunohistochemistry (IHC) and tissue microarray (TMA)
Scoring of PD-L1 expression by immunohistochemical testing was assessed in formalin-fixed, paraffin-embedded tumor samples in a TMA, using two different PD-L1 antibody clones: SP142 with Ventana BenchMark Ultra platform (Ventana Medical Systems, Inc., AZ, U.S) and 22C3 with Dako Autostainer Link 48 platform (Agilent, Inc., CA, U.S) IHC assays. Preparation and staining were done according to the manufacturer's instructions. The TMA images were assessed in PathXL Philips Xplore (Koninklijke Philips N.V., NL). Each sample was represented by two TMA cores, each of 1.0 mm in diameter. PD-L1 in the adjuvant treated patients and the non-CT-cohort was evaluated on TMA cores from the surgical specimen. For the neoadjuvant patients, PD-L1 was evaluated on TMA cores from core needle biopsies taken prior to neoadjuvant treatment.
PD-L1 IHC scoring
We evaluated PD-L1 staining according to two scoring methods: CPS and staining in ICs. CPS was defined as the combined number of PD-L1 stained TCs, tumor infiltrating lymphocytes (TILs) and macrophages (intratumorally and in adjacent stroma) divided by the total number of TCs, multiplied by 100. We evaluated CPS at a threshold of 1 and 10 according to PD-L1 evaluation in clinical phase III TNBC studies with pembrolizumab and the 22C3 assay [7, 8, 10]. The IC+ score was defined as the percentage of the tumor area (non-necrotic, non-sclerotic area) covered by PD-L1 stained tumor infiltrating ICs and evaluated at a threshold of 1% as performed in phase III TNBC trials with atezolizumab and the SP142 assay [6, 9, 11, 13]. The score from the TMA core with the highest value was set as the respective CPS and IC+ score for the tumor. PD-L1 expression in TCs (in CPS) included partial or complete membranous staining and in ICs (in CPS and IC+) membranous and/or cytoplasmic staining. Scoring of SP142 PD-L1 expression was done by a physician and a board-certified breast cancer pathologist, and consensus on non-matching scores had to be reached for 4.7% of the tumors (Additional file 1: Table S1). The 22C3 scoring was performed by a physician; in cases that were not clear-cut, a board-certified breast cancer pathologist was consulted and consensus reached. IHC staining examples are illustrated in Fig. 2A. We scored CPS using the 22C3 assay and IC+ with both SP142 and 22C3 (note that scoring IC+ with 22C3 is experimental, since the IC+ scoring is not clinically applied for 22C3). We did not evaluate CPS for SP142 since it has been shown to have impaired sensitivity for PD-L1 staining in TCs in TNBC [20, 21]. When investigating the concordance between the assays, the SP142 IC+ of ≥ 1% and 22C3 CPS of ≥ 10 scores were compared as they are the only clinically established predictive cut-offs in TNBC.
Moreover, since 22C3 CPS ≥ 1 also has been investigated in clinical trials, the concordance between SP142 IC+ ≥ 1% and 22C3 CPS ≥ 1 was evaluated. In addition to this, to compare under more similar, but explorative, scoring conditions, the concordance between SP142 IC+ and 22C3 IC+ was evaluated.
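The scoring rules described above can be written out as a minimal Python sketch. The function names, parameter names and cell counts are illustrative assumptions and are not taken from either assay's documentation; the sketch only restates the definitions given in this section (CPS formula, highest-scoring TMA core per tumor, and the cut-offs of IC ≥ 1%, CPS ≥ 1 and CPS ≥ 10).

```python
def combined_positive_score(pos_tumor_cells, pos_lymphocytes,
                            pos_macrophages, total_tumor_cells):
    """CPS = PD-L1 stained (TCs + TILs + macrophages) / total TCs * 100."""
    if total_tumor_cells <= 0:
        raise ValueError("total_tumor_cells must be positive")
    stained = pos_tumor_cells + pos_lymphocytes + pos_macrophages
    return 100.0 * stained / total_tumor_cells


def tumor_score(core_scores):
    """Per-tumor score: the highest value among the evaluable TMA cores."""
    if not core_scores:
        raise ValueError("at least one evaluable core is required")
    return max(core_scores)


def pd_l1_status(sp142_ic_percent, cps_22c3):
    """Dichotomize the per-tumor scores at the cut-offs used in the study."""
    return {
        "SP142 IC >= 1%": sp142_ic_percent >= 1.0,
        "22C3 CPS >= 1": cps_22c3 >= 1.0,
        "22C3 CPS >= 10": cps_22c3 >= 10.0,
    }


# Hypothetical tumor represented by two TMA cores (counts are made up).
core_cps = [
    combined_positive_score(4, 3, 1, 200),    # core 1
    combined_positive_score(10, 12, 2, 200),  # core 2
]
cps = tumor_score(core_cps)
ic_percent = tumor_score([0.5, 2.0])  # hypothetical SP142 IC area percentages
status = pd_l1_status(ic_percent, cps)
print(cps, status)
```

Because the higher-scoring core defines the tumor score, a single positive core is enough to classify the tumor as positive in this sketch.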
Evaluation of tumor infiltrating lymphocytes (TILs)
Abundance of stromal TILs was evaluated by a board-certified breast cancer pathologist on hematoxylin–eosin stained whole slides from the surgical specimen before any adjuvant chemotherapy and from pre-treatment core needle biopsies for the neoadjuvant treated patients. Abundance was calculated as the percentage of TILs occupying the tumoral stromal area according to the international TILs working group (https://www.tilsinbreastcancer.org/). If more than one slide was available per patient, the average score was applied. The threshold for high versus low TILs (as a binary variable) was set to 30%, as performed in a previous pooled analysis of the prognostic value of TILs in early-stage TNBC patients, which was also near the mean value of TIL abundance in our cohort (27% in the overall cohort, 29% in the CT-cohort).
In the survival analyses, OS, IDFS and DRFI were defined as endpoints with support of the STEEP criteria. OS was the time from diagnosis of primary breast cancer to death of any cause. IDFS was the time from primary diagnosis to the diagnosis of a breast cancer related invasive event (locoregional or distant) or, if no relapse had occurred, to death of any cause. In the absence of an event in OS and IDFS, the case was censored at last follow-up. DRFI was defined as the time from diagnosis to the diagnosis of a distant relapse of breast cancer or breast cancer related death; the case was censored at death of any other cause or at last follow-up if no DRFI event had occurred. Contralateral breast cancer and distant recurrences with uncertain origin were not included in DRFI but were included in IDFS. Follow-up time was defined as the time from diagnosis to date of death or to last follow-up.
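The endpoint definitions above can be illustrated with a short Python sketch that turns one patient record into a (time, event) pair for each endpoint. The record layout and all field names are hypothetical and were chosen only for this illustration; in particular, `distant_relapse` is assumed to hold only distant relapses of confirmed breast cancer origin, while `invasive_event` also covers contralateral breast cancer and distant recurrences of uncertain origin.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class Patient:  # hypothetical record layout
    diagnosis: date
    last_follow_up: date
    death: Optional[date] = None
    breast_cancer_death: bool = False        # True if the death was breast cancer related
    invasive_event: Optional[date] = None    # first invasive event counted for IDFS
    distant_relapse: Optional[date] = None   # first distant breast cancer relapse (DRFI)


def _days(start: date, end: date) -> int:
    return (end - start).days


def overall_survival(p: Patient) -> Tuple[int, bool]:
    """Event: death of any cause; otherwise censored at last follow-up."""
    if p.death is not None:
        return _days(p.diagnosis, p.death), True
    return _days(p.diagnosis, p.last_follow_up), False


def idfs(p: Patient) -> Tuple[int, bool]:
    """Event: invasive event, or death of any cause if no relapse occurred."""
    if p.invasive_event is not None:
        return _days(p.diagnosis, p.invasive_event), True
    if p.death is not None:
        return _days(p.diagnosis, p.death), True
    return _days(p.diagnosis, p.last_follow_up), False


def drfi(p: Patient) -> Tuple[int, bool]:
    """Event: distant relapse or breast cancer related death.
    Death of any other cause censors the case at the date of death."""
    if p.distant_relapse is not None:
        return _days(p.diagnosis, p.distant_relapse), True
    if p.death is not None:
        return _days(p.diagnosis, p.death), p.breast_cancer_death
    return _days(p.diagnosis, p.last_follow_up), False


# Hypothetical patient who died of a non-breast-cancer cause without relapse.
example = Patient(diagnosis=date(2012, 1, 1),
                  last_follow_up=date(2014, 1, 1),
                  death=date(2014, 1, 1))
print(overall_survival(example), idfs(example), drfi(example))
```

In this example the death counts as an event for OS and IDFS but only censors the case for DRFI, which is the mechanism by which non-breast-cancer deaths affect the three endpoints differently.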
Statistical analyses and analyses of RNA sequencing data
Analyses of RNAseq data were performed in R (v 3.6.1), all remaining statistical analyses with SPSS (v 26.0). Concordance rate (expressed as percentage) was calculated to evaluate IHC inter-test reliability and the kappa statistic was applied as a measure of the level of agreement. A kappa coefficient of ≥ 0.80 was interpreted as strong agreement, 0.60–0.79 as good, 0.40–0.59 as weak, 0.21–0.39 as minimal and < 0.20 as no agreement. Area-proportional Venn diagrams were drawn with https://www.biovenn.nl/. The chi-square test was applied when comparing categorical values between groups (chi-square test for trend if more than two groups were compared). The nonparametric Mann–Whitney test was applied to compare non-categorical values between two groups. Survival data were analyzed by Kaplan–Meier estimates along with the log-rank test and with Cox regression, reporting hazard ratios (HRs) and 95% confidence intervals (CIs). Multivariable Cox regression analyses were performed by including, aside from PD-L1 status, other traditional prognostic factors: age at diagnosis, tumor size, lymph node status, Nottingham histologic grade (NHG) and TIL abundance as binary covariates. Four multivariable regression analyses were performed, i.e., one for each PD-L1 scoring method: SP142 IC+, 22C3 CPS 10, 22C3 CPS 1 and 22C3 IC+. RNA sequencing data were matched against patient data, generating a list of 16,258 genes across 194 samples. FPKM values were log2-transformed, imputed (missing data set to 0), mean-centered and scaled (samples and genes). The correlation between PD-L1 gene expression (by RNAseq) and PD-L1 protein expression (analyzed by IHC) was estimated using the Spearman method and visualized with boxplots (the median is indicated by the central line, the upper and lower limits of the box represent the upper and lower quartiles and the whiskers extend to 1.5 × the interquartile range). A p-value less than 0.05 was considered statistically significant and all tests were two-sided.
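To make the correlation step concrete, the following is a minimal pure-Python sketch of a Spearman coefficient between a dichotomous IHC status and a continuous expression value. It is not the R code used in the study: the helper names are illustrative, the data are invented, ties (unavoidable with a binary variable) are handled by average ranks, and no p-value is computed.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: 0/1 IHC status and scaled CD274 expression per tumor.
ihc_positive = [0, 0, 0, 0, 1, 1, 1, 1]
cd274_expr = [-1.2, -0.4, 0.1, -0.8, 0.3, 1.5, -0.1, 0.9]
rho = spearman_rho(ihc_positive, cd274_expr)
print(round(rho, 2))
```

With a two-category IHC variable the coefficient is bounded below 1 even when every positive tumor has higher expression than every negative one, which is worth keeping in mind when comparing coefficients between two-category and three-category scorings.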
Frequency of PD-L1 IHC expression
A higher positive detection rate (Table 1) was observed for SP142 IC ≥ 1% than for 22C3 CPS ≥ 10, 50.9% (118/232) versus 27.2% (63/232), when using these clinically applied predictive cut-offs (from the advanced TNBC setting).
Since 22C3 CPS ≥ 1 has also been investigated in clinical trials, we analyzed the percentage of PD-L1 positivity using this lower cut-off for 22C3. As expected, this resulted in a higher positive detection rate (53.9% (125/232)) compared to 22C3 CPS ≥ 10.
In an explorative analysis, to evaluate 22C3 under more similar conditions as SP142, we applied the IC+ scoring method to 22C3. The positive detection rate for 22C3 IC ≥ 1% was 41.8% (97/232).
Comparison between SP142 and 22C3
When comparing SP142 and 22C3 with the clinically applied scoring methods and cut-offs (IC ≥ 1% and CPS ≥ 10, respectively), a kappa value of 0.48 was obtained (interpreted as weak agreement). Approximately half of the tumors (47.8%; 111/232) were negative with both antibodies, whereas 60 tumors (25.9%) were positive with both antibodies, resulting in a concordance rate of 73.7%. Fifty-eight tumors (25%) were positive with SP142, but negative with 22C3, whereas three tumors (1.3%) showed the opposite pattern (Fig. 2B). Taken together, almost half of the tumors (49.2%; 58/118) that stained PD-L1 positive with SP142 were considered to be negative with 22C3, when using these clinically established predictive cut-offs.
The kappa value increased to 0.63 (interpreted as good agreement) and the concordance rate to 81.5% when a threshold of ≥ 1 for CPS was applied for 22C3 where 189 tumors (out of 232) showed concordant PD-L1 status (89 tumors negative with both antibodies and 100 tumors positive with both; Fig. 2C). A lower number of tumors that stained positive with SP142 but negative with 22C3 was found than when using the ≥ 10 cut-off for CPS (n = 18 vs. n = 58). The number of tumors with the opposite pattern (i.e. negative with SP142 but positive with 22C3) was increased from 3 to 25.
Next, we evaluated the concordance between the two antibodies when scored with the same scoring method and cut-off, i.e. IC ≥ 1% (Fig. 2D, note that IC+ is not normally employed for the 22C3 antibody). This comparison resulted in the best concordance rate of 86.6% (201 concordant tumors: 109 negative with both and 92 positive with both) and a kappa-value of 0.73 (interpreted as good agreement). Five tumors were negative with SP142 but positive with 22C3 and 26 showed the opposite pattern.
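The concordance rates and kappa values of the three comparisons can be recomputed from the 2 × 2 counts reported in this section. The sketch below uses the standard Cohen's kappa formula in plain Python; the function names are illustrative, and assigning kappa values that fall between the stated interpretation bands to the lower band is an assumption made here.

```python
def concordance_and_kappa(both_pos, sp142_only, c22c3_only, both_neg):
    """Return (observed agreement, Cohen's kappa) for a 2 x 2 table."""
    n = both_pos + sp142_only + c22c3_only + both_neg
    observed = (both_pos + both_neg) / n
    sp142_pos = both_pos + sp142_only
    c22c3_pos = both_pos + c22c3_only
    # Agreement expected by chance from the marginal positive/negative rates.
    expected = (sp142_pos * c22c3_pos
                + (n - sp142_pos) * (n - c22c3_pos)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


def interpret_kappa(kappa):
    """Interpretation bands as used in the statistical methods of this study."""
    if kappa >= 0.80:
        return "strong"
    if kappa >= 0.60:
        return "good"
    if kappa >= 0.40:
        return "weak"
    if kappa >= 0.21:
        return "minimal"
    return "none"


# (both positive, SP142 only, 22C3 only, both negative), from the text above.
comparisons = {
    "SP142 IC >= 1% vs 22C3 CPS >= 10": (60, 58, 3, 111),
    "SP142 IC >= 1% vs 22C3 CPS >= 1": (100, 18, 25, 89),
    "SP142 IC >= 1% vs 22C3 IC >= 1%": (92, 26, 5, 109),
}
for label, counts in comparisons.items():
    observed, kappa = concordance_and_kappa(*counts)
    print(f"{label}: {100 * observed:.1f}% concordant, "
          f"kappa {kappa:.2f} ({interpret_kappa(kappa)})")
# Concordance 73.7%, 81.5% and 86.6%; kappa 0.48 (weak), 0.63 (good), 0.73 (good)
```

Kappa is consistently lower than the raw concordance rate because roughly half of the observed agreement in each comparison would be expected by chance given the marginal positivity rates.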
Association of SP142 and 22C3 with PD-L1 (CD274) gene expression (mRNA)
We detected a significant positive association between PD-L1 IHC expression and PD-L1 (CD274) gene expression (mRNA) in the overall cohort. The Spearman correlation coefficients were similar between PD-L1 gene expression and all the two-categorical IHC scorings (rs = 0.59 for SP142 IC+, rs = 0.60 for both 22C3 CPS 1 and CPS 10, rs = 0.62 for 22C3 IC+; all p-values < 0.001; Fig. 3A–D). When stratifying the 22C3 CPS into three categories (i.e. < 1, 1–9 and ≥ 10), a positive stepwise association between PD-L1 (CD274) gene expression and PD-L1 protein expression was observed (Fig. 3E; rs = 0.67), establishing a good degree of association between transcript and protein measurements.
We also investigated PD-L1 gene expression levels in SP142 IC and 22C3 CPS concordant and discordant groups, respectively (Fig. 3F, G). Here, transcript levels in the discordant groups (i.e. 22C3 CPS < 10 and SP142 IC ≥ 1% or 22C3 CPS ≥ 10 and SP142 IC < 1%) were found to be at an intermediate level between the concordant positive group and the concordant negative group. No significant difference in PD-L1 mRNA expression was found between the two discordant groups.
Clinicopathological features in the CT-cohort and the non-CT-cohort
Clinicopathological characteristics differed in patients receiving (neo)adjuvant CT and in those not receiving CT. The patients in the CT-cohort were younger (p < 0.001), had higher median TIL abundance, more proliferative (p = 0.007) and higher-grade tumors (p = 0.004), higher rate of PD-L1 expressing tumors (p = 0.006 for SP142 IC and p = 0.028 for 22C3 CPS status) and tended to have fewer deaths (p = 0.059) but had similar rate of relapses as compared to the non-CT-cohort (Table 1). Due to these differences, we chose to evaluate clinicopathological features in relation to PD-L1 status and perform outcome analyses separately in the CT-cohort and the non-CT-cohort.
Association of PD-L1 status with clinicopathological features
In the CT-cohort, tumors with SP142 IC ≥ 1% were significantly associated with higher NHG (p = 0.004), higher Ki-67 proliferation index (p = 0.005), histological medullary features (p = 0.001) and increased stromal TIL abundance (p < 0.001), whereas age at diagnosis, tumor size and lymph node status were not significantly associated with PD-L1 status (Table 2). When using CPS ≥ 10 for 22C3, only medullary features and TIL abundance were significantly associated with PD-L1 status (both p values < 0.001) and the association between PD-L1 and NHG and Ki-67 did not reach statistical significance (Table 2). With the other cut-offs for 22C3 (CPS ≥ 1 and IC ≥ 1%; Additional file 2: Table S2), the results were similar to those obtained for SP142, with significant associations to NHG (p < 0.001 and p = 0.012, respectively), Ki-67 level (p = 0.009 and p = 0.006, respectively), medullary features (p = 0.002 for both 22C3 CPS 1 and 22C3 IC+) and TIL abundance (p < 0.001 for both CPS 1 and IC+).
In the non-CT-cohort, a significant positive association between TIL abundance and PD-L1 status was observed, irrespective of PD-L1 IHC evaluation method (all p-values < 0.001 for SP142 IC+, 22C3 CPS 1 and 22C3 IC+; for 22C3 CPS 10: p = 0.006 for median TIL score and p = 0.026 for TILs as binary covariate). Associations between the other clinicopathological parameters and SP142 IC ≥ 1%, 22C3 CPS ≥ 10 or 22C3 IC ≥ 1% did not reach significance (Additional file 3: Table S3). When using the 22C3 CPS ≥ 1 cut-off, NHG was significantly associated with PD-L1 IHC expression (p = 0.045) and Ki-67 was of borderline significance (p = 0.051; Additional file 3: Table S3).
Association of PD-L1 with patient outcome in the CT-cohort
When using the clinically established cut-offs for both SP142 (IC ≥ 1%) and 22C3 (CPS ≥ 10) in univariable Cox regression analyses, a positive PD-L1 status was significantly associated with a better DRFI (HR = 0.47, 95% CI 0.22–1.00, p = 0.049 for SP142 IC+ and HR = 0.18, 95% CI 0.04–0.76, p = 0.019 for 22C3 CPS 10; Table 3 and Fig. 4). The HRs for IDFS and OS also indicated a better prognosis for patients with PD-L1 positive tumors (HRs ranging from 0.46 to 0.53), but reached significance only for IDFS and SP142 IC status (95% CI 0.26–0.89, p = 0.02). The results for 22C3 CPS ≥ 1 and 22C3 IC ≥ 1% showed a similar pattern, although reaching significance only for 22C3 CPS 1 and IDFS (HR = 0.53, 95% CI 0.29–0.98, p = 0.043; Table 3 and Additional file 4: Fig. S1).
Next, we performed a subgroup analysis where we divided the 22C3 CPS 10 negative group (i.e., those with CPS < 10) into one group positive with SP142 (i.e. IC ≥ 1%; n = 47) and one group negative with SP142 (IC < 1%, n = 71). No significant difference in DRFI was observed between these two groups (log rank p = 0.562; Fig. 5). For the group with 22C3 CPS ≥ 10, a similar division was not meaningful since all the patients in the CT-cohort that had 22C3 CPS ≥ 10 also scored SP142 IC ≥ 1%. These results suggest that if information for PD-L1 status with 22C3 CPS 10 is available, SP142 does not add any further prognostic information for DRFI.
In multivariable Cox regression analysis, PD-L1 status was not significantly associated with outcome for any of the clinical endpoints, irrespective of IHC assay and cut-off (Table 3). Of note though, a trend towards better DRFI was observed for 22C3 CPS ≥ 10 staining (HR = 0.26, 95% CI 0.06–1.20, p = 0.084). Stromal TIL abundance was the only covariate showing an independent significant association with outcome in multivariable analyses, where it was associated with improved IDFS irrespective of the PD-L1 assay and cut-off included in the analysis (HRs ranging from 0.24 to 0.27 and p-values from 0.003 to 0.007, Table 3A-D) and with a better DRFI in a multivariable model where SP142 IC+ was included (HR = 0.33, 95% CI 0.11–0.99, p = 0.047; Table 3A).
Association of PD-L1 with patient outcome in the non-CT-cohort
The scarcity of patients in the non-CT-cohort did not allow for robust multivariable Cox regression analyses. PD-L1 status was not significantly associated with DRFI in univariable analysis (HRs ranging from 0.56 to 0.77, p-values not significant). For IDFS, the HRs for PD-L1 status were similar to those in the CT-cohort (HRs ranging from 0.53 to 0.65 compared with 0.46 to 0.57 in the CT-cohort), but in this small group of TNBC patients not treated with (neo)adjuvant CT and with few events, the p-values were not significant (Additional file 5: Table S4 and Additional file 6: Fig. S2). Stromal TIL abundance was not significantly associated with any of the clinical endpoints in univariable analyses (HRs ranging from 0.90 to 1.34). Age (as continuous variable) was negatively associated with OS (HR = 1.07, 95% CI 1.01–1.14, p = 0.027) and IDFS (HR = 1.05, 95% CI 1.00–1.11, p = 0.043), tumor size was negatively associated with all the endpoints (HR = 2.41, 95% CI 1.08–5.39, p = 0.003 for IDFS; HR = 2.54, 95% CI 1.02–6.32, p = 0.045 for OS; HR = 4.63, 95% CI 1.00–21.53, p = 0.051 for DRFI) and lymph node status was negatively associated with DRFI (HR = 8.06, 95% CI 2.13–30.59, p = 0.002; Additional file 5: Table S4).
To date, two different immune checkpoint inhibitors (ICIs) have been incorporated in the treatment of TNBC: pembrolizumab in both early-stage and metastatic TNBC and atezolizumab in the metastatic setting. Atezolizumab is still approved outside of the US but has been withdrawn by the FDA for metastatic TNBC. Each of these ICIs comes with a different PD-L1 IHC antibody assay, Dako 22C3 and Ventana SP142, respectively, that have different scoring methods and cut-offs [17, 18]. It is of clinical interest to harmonize these assays in the attempt to simplify the use of PD-L1 IHC expression as a predictive biomarker for checkpoint inhibition response. In this context, it has been recommended that a concordance rate of at least 90% is needed for assays to be considered analytically equivalent. In our analysis, the comparison between SP142 IC ≥ 1% and 22C3 CPS ≥ 10, the currently clinically applied scoring methods and predictive cut-offs, showed a concordance rate of only 73.7% and a kappa value of 0.48. These results indicate a weak concordance, as previously reported [21, 23]. This low rate of concordance in our cohort was mainly driven by the low positive percentage agreement of only 50.8% (of the 118 tumors with SP142 IC ≥ 1%, 60 also had 22C3 CPS ≥ 10): SP142 IC ≥ 1% expression was much more frequent than 22C3 CPS ≥ 10, and 22C3 CPS 10 did not identify almost half (49.2%) of the tumors that scored positive with SP142. Conversely, SP142 IC+ failed to identify 4.8% of tumors that scored positive with 22C3 CPS 10. We found a better concordance rate of 81.5% (kappa value 0.63) when comparing SP142 IC ≥ 1% and 22C3 CPS ≥ 1, in line with two previously published studies [21, 25], though higher than the 63.5% reported by the IMpassion 130 sub-study. The 22C3 CPS 1 scoring was not able to identify 15.3% of tumors that scored positive with SP142 and, on the other hand, SP142 was not able to identify 20.0% of tumors that scored positive with 22C3 CPS 1.
We observed the best concordance rate of 86.6% between the two assays using the IC+ scoring for both (kappa value 0.73), which was in line with some previous results [25, 27, 28], but better than the 68.8% reported in the IMpassion 130 sub-study. The deviation of our findings from the IMpassion 130 sub-study might be explained by the lower rate of 22C3 CPS 1 and 22C3 IC+ positivity in our study, which in turn led to a substantially better negative percentage agreement in our cohort, resulting in a higher concordance rate.
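The percentages discussed above follow directly from the cross-tabulated counts in the Results. As a worked check, the short Python sketch below recomputes them; the function names are illustrative, and "missed" is used here as shorthand for reference-positive tumors that the other assay scored negative.

```python
def positive_percentage_agreement(both_pos, reference_only):
    """Share of reference-positive tumors also positive with the comparator."""
    return both_pos / (both_pos + reference_only)


def missed_fraction(both_pos, reference_only):
    """Share of reference-positive tumors the comparator scored negative."""
    return reference_only / (both_pos + reference_only)


# Counts from the Results: (both positive, SP142 only, 22C3 only).
cps10 = (60, 58, 3)    # SP142 IC >= 1% vs 22C3 CPS >= 10
cps1 = (100, 18, 25)   # SP142 IC >= 1% vs 22C3 CPS >= 1

both, sp142_only, c22c3_only = cps10
print(round(100 * positive_percentage_agreement(both, sp142_only), 1))  # 50.8
print(round(100 * missed_fraction(both, sp142_only), 1))   # 49.2, missed by 22C3 CPS 10
print(round(100 * missed_fraction(both, c22c3_only), 1))   # 4.8, missed by SP142

both, sp142_only, c22c3_only = cps1
print(round(100 * missed_fraction(both, sp142_only), 1))   # 15.3, missed by 22C3 CPS 1
print(round(100 * missed_fraction(both, c22c3_only), 1))   # 20.0, missed by SP142
```

The asymmetry at the CPS ≥ 10 cut-off (49.2% versus 4.8%) is what underlies the statement that SP142 identifies 22C3 positive tumors better than 22C3 identifies SP142 positive tumors.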
PD-L1 (CD274) gene expression (mRNA) showed a strong positive association with all the IHC scorings of PD-L1 expression, irrespective of antibody and cut-off. PD-L1 gene expression could not explain the difference between SP142 and 22C3 CPS since both discordant groups (i.e. 22C3 CPS < 10 and SP142 IC ≥ 1% or 22C3 CPS ≥ 10 and SP142 < 1%) had similar PD-L1 gene expression levels.
We found that PD-L1 expression was positively associated with TIL abundance, NHG, Ki-67 level and histological medullary features. We also investigated the prognostic value of the different PD-L1 IHC scorings and found that PD-L1 expression when evaluated with SP142 IC+ and 22C3 CPS had a significant protective effect in patients that received (neo)adjuvant CT. However, PD-L1 status was not independently prognostic in multivariable regression analyses when adjusting for TIL abundance and other traditional prognostic features, where only TILs had an independent effect on outcome. Of the four different PD-L1 scorings and the three clinical endpoints, the prognostic impact of PD-L1 was strongest for 22C3 CPS ≥ 10 and DRFI. When dividing the CT-subgroup that had 22C3 CPS < 10 into SP142 IC positive and SP142 negative, we found that the SP142 status did not add any further prognostic value regarding DRFI if information for PD-L1 status with 22C3 is available. It should be kept in mind, though, that SP142 is relevant in predicting response to atezolizumab in the metastatic setting [6, 23]. It has previously been suggested that 22C3 is a better prognostic marker than SP142 in primary breast cancer patients and our results suggest that 22C3 CPS at a threshold of 10 gives a better division into DRFI prognostic groups than SP142 IC+ in early-stage TNBC.
We chose to perform outcome analyses separately in the CT-cohort and the non-CT-cohort for several reasons. Older age and comorbidity (the primary reasons why (neo)adjuvant CT was not administered in the non-CT-cohort), and thereby non-breast cancer related deaths in the non-CT-cohort, are competing risks with respect to breast cancer specific events and dilute the OS results and, in part, the IDFS analyses. Moreover, TIL abundance and PD-L1 expression, both of which were lower in the non-CT-cohort than in the CT-cohort, are known to be positively associated with CT-response and prognosis in early TNBC [9, 10, 14, 33, 39, 40]. This in turn might partly explain why the prognostic impact of TILs and PD-L1 status in the non-CT-cohort was weaker than in the CT-cohort and not significant.
The population-based cohort is the main strength of our study, as it represents PD-L1 and TIL status in an early-stage TNBC population. A weakness is the small tissue cores in the TMA, potentially leading to inaccurate evaluations of PD-L1 expression due to intra-tumoral PD-L1 heterogeneity when compared to scoring on histological whole sections [21, 41,42,43,44,45]. Interestingly, neoadjuvant CT in TNBC is being administered more frequently and is becoming a standard of care in preference to adjuvant CT. The evaluation of PD-L1 would be performed on core needle biopsies instead of whole sections in these patients, as is often the case for metastatic lesions. A core needle biopsy is more comparable with a TMA in terms of size than whole section slides, and this aspect needs to be taken into consideration in the clinical setting when choosing thresholds for PD-L1 expression. Another caveat of our study is that we scored PD-L1 in primary TNBC tumors, in which PD-L1 expression has been found in a meta-analysis to differ from that in metastatic lesions. We have explored the analytical concordance of the SP142 and 22C3 assays. Unfortunately, we cannot explore the predictive value of the interchangeability of these assays due to the retrospective, non-randomized nature of our study, in which the patients did not receive immune checkpoint blockade. Further studies addressing that issue are warranted.
In summary, the PD-L1 IHC staining concordance between the clinically validated scoring algorithms for SP142 (IC ≥ 1%) and 22C3 (CPS ≥ 10) was weak in our early-stage TNBC cohort. The concordance was better when 22C3 was evaluated with CPS ≥ 1, or when the same IC+ scoring method was used for both assays. The SP142 assay is better at identifying 22C3 positive tumors than the 22C3 assay is at identifying SP142 positive tumors. PD-L1 expression was of positive prognostic value in patients treated with (neo)adjuvant CT, where it was strongest for DRFI and 22C3 CPS ≥ 10. However, PD-L1 status was not independently prognostic when adjusting for TIL abundance in multivariable analyses. Our findings suggest that these two antibody assays, with their respective clinically established scoring methods and cut-offs, detect partially non-overlapping subpopulations of TNBC patients in the early-stage setting and are not substitutable with one another regarding PD-L1 detection and prognostic value. Further studies are warranted to investigate the predictive value of the interchangeability of these assays.
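The asymmetry between the two assays can be illustrated with a small worked example. The Python sketch below computes overall percent agreement, Cohen's kappa and the two directional positive agreements from a 2×2 table. The counts are illustrative only: they were back-calculated from the rounded percentages given in the abstract (SP142 IC ≥ 1% in 50.9%, 22C3 CPS ≥ 10 in 27.2%, overall agreement 73.7%) under the assumption that all 232 tumors were scored with both assays, and they are not taken from the authors' tables. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-tabulation of two binary assay calls, A and B."""
    both_pos: int   # A+ / B+
    a_only: int     # A+ / B-
    b_only: int     # A- / B+
    both_neg: int   # A- / B-

    @property
    def n(self) -> int:
        return self.both_pos + self.a_only + self.b_only + self.both_neg


def overall_agreement(t: TwoByTwo) -> float:
    """Fraction of tumors given the same call by both assays."""
    return (t.both_pos + t.both_neg) / t.n


def cohen_kappa(t: TwoByTwo) -> float:
    """Agreement corrected for the agreement expected by chance."""
    po = overall_agreement(t)
    a_pos = t.both_pos + t.a_only
    b_pos = t.both_pos + t.b_only
    pe = (a_pos * b_pos + (t.n - a_pos) * (t.n - b_pos)) / t.n ** 2
    return (po - pe) / (1 - pe)


def directional_positive_agreement(t: TwoByTwo) -> tuple[float, float]:
    """Return (share of B-positives that A also calls positive,
    share of A-positives that B also calls positive)."""
    a_finds_b = t.both_pos / (t.both_pos + t.b_only)
    b_finds_a = t.both_pos / (t.both_pos + t.a_only)
    return a_finds_b, b_finds_a


# Illustrative counts (assumption, see text): A = SP142 IC >= 1%,
# B = 22C3 CPS >= 10, n = 232.
ILLUSTRATIVE = TwoByTwo(both_pos=60, a_only=58, b_only=3, both_neg=111)

if __name__ == "__main__":
    sp142_finds_22c3, c22c3_finds_sp142 = directional_positive_agreement(ILLUSTRATIVE)
    print(f"overall agreement: {overall_agreement(ILLUSTRATIVE):.3f}")
    print(f"Cohen's kappa:     {cohen_kappa(ILLUSTRATIVE):.2f}")
    print(f"SP142 identifies 22C3+: {sp142_finds_22c3:.3f}")
    print(f"22C3 identifies SP142+: {c22c3_finds_sp142:.3f}")
```

With these illustrative counts the overall agreement is about 73.7% and kappa about 0.48, while SP142 flags roughly 95% of the 22C3 CPS ≥ 10 positive tumors and 22C3 CPS ≥ 10 flags only about 51% of the SP142 positive tumors. This is the pattern behind the statement that the two assays identify partially non-overlapping subpopulations.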
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request (clinicopathological features, PD-L1 and TIL scoring also available in Additional file 7: Table S5).
References
1. Li X, Yang J, Peng L, Sahin AA, Huo L, Ward KC, et al. Triple-negative breast cancer has worse overall survival and cause-specific survival than non-triple-negative breast cancer. Breast Cancer Res Treat. 2017;161(2):279–87.
2. Jacob SL, Huppert LA, Rugo HS. Role of immunotherapy in breast cancer. JCO Oncol Pract. 2023. https://doi.org/10.1200/op.22.00483.
3. Miglietta F, Griguolo G, Guarneri V, Dieci MV. Programmed cell death ligand 1 in breast cancer: technical aspects, prognostic implications, and predictive value. Oncologist. 2019;24(11):e1055–69.
4. Chen N, Higashiyama N, Hoyos V. Predictive biomarkers of immune checkpoint inhibitor response in breast cancer: looking beyond tumoral PD-L1. Biomedicines. 2021. https://doi.org/10.3390/biomedicines9121863.
5. Isaacs J, Anders C, McArthur H, Force J. Biomarkers of immune checkpoint blockade response in triple-negative breast cancer. Curr Treat Options Oncol. 2021;22(5):38.
6. Schmid P, Adams S, Rugo HS, Schneeweiss A, Barrios CH, Iwata H, et al. Atezolizumab and nab-paclitaxel in advanced triple-negative breast cancer. N Engl J Med. 2018;379(22):2108–21.
7. Cortes J, Cescon DW, Rugo HS, Nowecki Z, Im SA, Yusof MM, et al. Pembrolizumab plus chemotherapy versus placebo plus chemotherapy for previously untreated locally recurrent inoperable or metastatic triple-negative breast cancer (KEYNOTE-355): a randomised, placebo-controlled, double-blind, phase 3 clinical trial. Lancet. 2020;396(10265):1817–28.
8. Winer EP, Lipatov O, Im SA, Goncalves A, Muñoz-Couselo E, Lee KS, et al. Pembrolizumab versus investigator-choice chemotherapy for metastatic triple-negative breast cancer (KEYNOTE-119): a randomised, open-label, phase 3 trial. Lancet Oncol. 2021. https://doi.org/10.1016/S1470-2045(20)30754-3.
9. Mittendorf EA, Zhang H, Barrios CH, Saji S, Jung KH, Hegg R, et al. Neoadjuvant atezolizumab in combination with sequential nab-paclitaxel and anthracycline-based chemotherapy versus placebo and chemotherapy in patients with early-stage triple-negative breast cancer (IMpassion031): a randomised, double-blind, phase 3 trial. Lancet. 2020;396(10257):1090–100.
10. Schmid P, Cortes J, Pusztai L, McArthur H, Kümmel S, Bergh J, et al. Pembrolizumab for early triple-negative breast cancer. N Engl J Med. 2020;382(9):810–21.
11. Miles D, Gligorov J, André F, Cameron D, Schneeweiss A, Barrios C, et al. Primary results from IMpassion131, a double-blind, placebo-controlled, randomised phase III trial of first-line paclitaxel with or without atezolizumab for unresectable locally advanced/metastatic triple-negative breast cancer. Ann Oncol. 2021;32(8):994–1004.
12. Emens LA. Immunotherapy in triple-negative breast cancer. Cancer J. 2021;27(1):59–66.
13. Gianni L, Huang CS, Egle D, Bermejo B, Zamagni C, Thill M, et al. Pathologic complete response (pCR) to neoadjuvant treatment with or without atezolizumab in triple-negative, early high-risk and locally advanced breast cancer: NeoTRIP Michelangelo randomized study. Ann Oncol. 2022;33(5):534–43.
14. Schmid P, Cortes J, Dent R, Pusztai L, McArthur H, Kümmel S, et al. Event-free survival with pembrolizumab in early triple-negative breast cancer. N Engl J Med. 2022;386(6):556–67.
15. Cortes J, Rugo HS, Cescon DW, Im SA, Yusof MM, Gallardo C, et al. Pembrolizumab plus chemotherapy in advanced triple-negative breast cancer. N Engl J Med. 2022;387(3):217–26.
16. Emens LA, Adams S, Barrios CH, Diéras V, Iwata H, Loi S, et al. First-line atezolizumab plus nab-paclitaxel for unresectable, locally advanced, or metastatic triple-negative breast cancer: IMpassion130 final overall survival analysis. Ann Oncol. 2021;32(8):983–93.
17. Korde LA, Somerfield MR, Hershman DL. Use of immune checkpoint inhibitor pembrolizumab in the treatment of high-risk, early-stage triple-negative breast cancer: ASCO guideline rapid recommendation update. J Clin Oncol. 2022;40(15):1696–8.
18. Gennari A, André F, Barrios CH, Cortés J, de Azambuja E, DeMichele A, et al. ESMO Clinical Practice Guideline for the diagnosis, staging and treatment of patients with metastatic breast cancer. Ann Oncol. 2021;32(12):1475–95.
19. Kwapisz D. Pembrolizumab and atezolizumab in triple-negative breast cancer. Cancer Immunol Immunother. 2021;70(3):607–17.
20. Emens LA, Molinero L, Loi S, Rugo HS, Schneeweiss A, Diéras V, et al. Atezolizumab and nab-paclitaxel in advanced triple-negative breast cancer: biomarker evaluation of the impassion130 study. J Natl Cancer Inst. 2021;113(8):1005–16.
21. Carter JM, Polley MC, Leon-Ferre RA, Sinnwell J, Thompson KJ, Wang X, et al. Characteristics and spatially defined immune (micro)landscapes of early-stage PD-L1-positive triple-negative breast cancer. Clin Cancer Res. 2021;27(20):5628–37.
22. Emens LA, Loi S. Immunotherapy approaches for breast cancer patients in 2023. Cold Spring Harb Perspect Med. 2023. https://doi.org/10.1101/cshperspect.a041332.
23. Rugo HS, Loi S, Adams S, Schmid P, Schneeweiss A, Barrios CH, et al. PD-L1 immunohistochemistry assay comparison in atezolizumab plus nab-paclitaxel-treated advanced triple-negative breast cancer. J Natl Cancer Inst. 2021;113(12):1733–43.
24. Ahn S, Woo JW, Kim H, Cho EY, Kim A, Kim JY, et al. Programmed death ligand 1 immunohistochemistry in triple-negative breast cancer: evaluation of inter-pathologist concordance and inter-assay variability. J Breast Cancer. 2021;24(3):266–79.
25. Huang X, Ding Q, Guo H, Gong Y, Zhao J, Zhao M, et al. Comparison of three FDA-approved diagnostic immunohistochemistry assays of PD-L1 in triple-negative breast carcinoma. Hum Pathol. 2021;108:42–50.
26. Lee SE, Park HY, Lim SD, Han HS, Yoo YB, Kim WS. Concordance of programmed death-ligand 1 expression between SP142 and 22C3/SP263 assays in triple-negative breast cancer. J Breast Cancer. 2020;23(3):303–13.
27. Noske A, Wagner DC, Schwamborn K, Foersch S, Steiger K, Kiechle M, et al. Interassay and interobserver comparability study of four programmed death-ligand 1 (PD-L1) immunohistochemistry assays in triple-negative breast cancer. Breast. 2021;60:238–44.
28. Pang JB, Castles B, Byrne DJ, Button P, Hendry S, Lakhani SR, et al. SP142 PD-L1 scoring shows high interobserver and intraobserver agreement in triple-negative breast carcinoma but overall low percentage agreement with other PD-L1 clones SP263 and 22C3. Am J Surg Pathol. 2021;45(8):1108–17.
29. Schmidt G, Guhl MM, Solomayer EF, Wagenpfeil G, Hammadeh ME, Juhasz-Boess I, et al. Immunohistochemical assessment of PD-L1 expression using three different monoclonal antibodies in triple negative breast cancer patients. Arch Gynecol Obstet. 2022. https://doi.org/10.1007/s00404-022-06529-w.
30. Staaf J, Glodzik D, Bosch A, Vallon-Christersson J, Reuterswärd C, Häkkinen J, et al. Whole-genome sequencing of triple-negative breast cancers in a population-based clinical study. Nat Med. 2019;25(10):1526–33.
31. Saal LH, Vallon-Christersson J, Häkkinen J, Hegardt C, Grabau D, Winter C, et al. The sweden cancerome analysis network - breast (SCAN-B) initiative: a large-scale multicenter infrastructure towards implementation of breast cancer genomic analyses in the clinical routine. Genome Med. 2015;7(1):20.
32. Salgado R, Denkert C, Demaria S, Sirtaine N, Klauschen F, Pruneri G, et al. The evaluation of tumor-infiltrating lymphocytes (TILs) in breast cancer: recommendations by an International TILs working group 2014. Ann Oncol. 2015;26(2):259–71.
33. Loi S, Drubay D, Adams S, Pruneri G, Francis PA, Lacroix-Triki M, et al. Tumor-infiltrating lymphocytes and prognosis: a pooled individual patient analysis of early-stage triple-negative breast cancers. J Clin Oncol. 2019;37(7):559–69.
34. Hudis CA, Barlow WE, Costantino JP, Gray RJ, Pritchard KI, Chapman JA, et al. Proposal for standardized definitions for efficacy end points in adjuvant breast cancer trials: the STEEP system. J Clin Oncol. 2007;25(15):2127–32.
35. McHugh ML. Interrater reliability: the kappa statistic. Biochem Med (Zagreb). 2012;22(3):276–82.
36. Hulsen T, de Vlieg J, Alkema W. BioVenn - a web application for the comparison and visualization of biological lists using area-proportional Venn diagrams. BMC Genomics. 2008;9:488.
37. Fitzgibbons PL, Bradley LA, Fatheree LA, Alsabeh R, Fulton RS, Goldsmith JD, et al. Principles of analytic validation of immunohistochemical assays: guideline from the College of American Pathologists Pathology and Laboratory Quality Center. Arch Pathol Lab Med. 2014;138(11):1432–43.
38. Cha YJ, Kim D, Bae SJ, Ahn SG, Jeong J, Lee HS, et al. PD-L1 expression evaluated by 22C3 antibody is a better prognostic marker than SP142/SP263 antibodies in breast cancer patients after resection. Sci Rep. 2021;11(1):19555.
39. Denkert C, von Minckwitz G, Darb-Esfahani S, Lederer B, Heppner BI, Weber KE, et al. Tumour-infiltrating lymphocytes and prognosis in different subtypes of breast cancer: a pooled analysis of 3771 patients treated with neoadjuvant therapy. Lancet Oncol. 2018;19(1):40–50.
40. Loibl S, Untch M, Burchardi N, Huober J, Sinn BV, Blohmer JU, et al. A randomised phase II study investigating durvalumab in addition to an anthracycline taxane-based neoadjuvant therapy in early triple-negative breast cancer: clinical results and biomarker analysis of GeparNuevo study. Ann Oncol. 2019;30(8):1279–88.
41. Stovgaard ES, Bokharaey M, List-Jensen K, Roslind A, Kümler I, Høgdall E, et al. PD-L1 diagnostics in the neoadjuvant setting: implications of intratumoral heterogeneity of PD-L1 expression in triple negative breast cancer for assessment in small biopsies. Breast Cancer Res Treat. 2020;181(3):553–60.
42. Dill EA, Gru AA, Atkins KA, Friedman LA, Moore ME, Bullock TN, et al. PD-L1 expression and intratumoral heterogeneity across breast cancer subtypes and stages: an assessment of 245 primary and 40 metastatic tumors. Am J Surg Pathol. 2017;41(3):334–42.
43. Noske A, Steiger K, Ballke S, Kiechle M, Oettler D, Roth W, et al. Comparison of assessment of programmed death-ligand 1 (PD-L1) status in triple-negative breast cancer biopsies and surgical specimens. J Clin Pathol. 2023. https://doi.org/10.1136/jcp-2022-208637.
44. Choi H, Ahn SG, Bae SJ, Kim JH, Eun NL, Lee Y, et al. Comparison of programmed cell death ligand 1 status between core needle biopsy and surgical specimens of triple-negative breast cancer. Yonsei Med J. 2023;64(8):518–25.
45. Dobritoiu F, Baltan A, Chefani A, Billingham K, Chenard MP, Vaziri R, et al. Tissue selection for PD-L1 testing in triple negative breast cancer (TNBC). Appl Immunohistochem Mol Morphol. 2022;30(8):549–56.
46. Cardoso F, Kyriakides S, Ohno S, Penault-Llorca F, Poortmans P, Rubio IT, et al. Early breast cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up†. Ann Oncol. 2019;30(8):1194–220.
47. Boman C, Zerdes I, Mårtensson K, Bergh J, Foukakis T, Valachis A, et al. Discordance of PD-L1 status between primary and metastatic breast cancer: a systematic review and meta-analysis. Cancer Treat Rev. 2021;99:102257.
Acknowledgements
The authors would like to acknowledge all patients, clinicians, and hospital staff participating in the SCAN-B study, the staff at the central SCAN-B laboratory at Division of Oncology, Lund University, the Swedish National Quality Register for Breast Cancer (NKBC), Regional Cancer Center South, and the South Swedish Breast Cancer Group (SSBCG). We gratefully thank Kristina Lövgren, Lena Tran and Susanne André for excellent technical assistance.
Funding
Open access funding provided by Lund University. The study was made possible through support from the Mrs. Berta Kamprad Foundation, Governmental Funding of Research within the Swedish National Health Service (ALF), Swedish Breast Cancer Association, the Swedish Cancer Society (Cancerfonden), Region Skåne, Anna-Lisa and Sven-Erik Lundgren Foundation, the Anna and Edwin Berger Foundation, Lund University Research Foundation, Skåne University Hospital Research Foundation and the Marcus and Marianne Wallenberg Foundation.
Ethics approval and consent to participate
The Regional Ethical Review Board in Lund, Sweden, has approved the SCAN-B study (applicable registration number 2009/658, 2015/277, 2016/742, 2018/267 and 2019/01252). Written informed consent was received from all enrolled participants in the SCAN-B study and the study was performed in accordance with the Declaration of Helsinki.
Consent for publication
Competing interests
The authors declare no competing interests except for JH, who has obtained speaker's honoraria or advisory board remunerations from Roche, Novartis, Pfizer, Eli Lilly, MSD, Veracyte and Exact Sciences, has received institutional research support from Cepheid, Roche and Novartis, and is a co-founder and shareholder of Stratipath AB.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1. Showing inter-core and interobserver PD-L1 concordances in the overall cohort
Additional file 2. Showing clinicopathological features in the CT-cohort in relation to 22C3 CPS 1 and 22C3 IC PD-L1 status
Additional file 3. Showing clinicopathological features in the non-CT-cohort in relation to PD-L1 status
Additional file 4. Demonstrating Kaplan Meier estimates according to 22C3 CPS 1 and 22C3 IC status in the CT-cohort
Additional file 5. Containing results from univariable regression analyses in the non-CT-cohort
Additional file 6. Demonstrating Kaplan Meier estimates according to PD-L1 status in the non-CT-cohort
Additional file 7: Table S5. Containing clinicopathological features, follow-up data, PD-L1 and TIL scores used in our study
Cite this article
Sigurjonsdottir, G., De Marchi, T., Ehinger, A. et al. Comparison of SP142 and 22C3 PD-L1 assays in a population-based cohort of triple-negative breast cancer patients in the context of their clinically established scoring algorithms. Breast Cancer Res 25, 123 (2023). https://doi.org/10.1186/s13058-023-01724-2