Quantitative assessment of the spatial heterogeneity of tumor-infiltrating lymphocytes in breast cancer
© The Author(s). 2016
Received: 21 March 2016
Accepted: 20 July 2016
Published: 29 July 2016
Tumor-infiltrating lymphocyte (TIL) count in breast cancer carries prognostic information and represents a potential predictive marker for emerging immunotherapies. However, the distribution of the lymphocyte subpopulations is not well defined. The goals of this study were to examine intratumor heterogeneity in TIL subpopulation counts in different fields of view (FOV) within each section, in different sections from the same biopsy, and between biopsies from different regions of the same cancer using quantitative immunofluorescence (QIF).
We used multiplexed QIF to quantify cytokeratin-positive epithelial cells, and CD3-positive, CD8-positive and CD20-positive lymphocytes in tissue sections from multiple biopsies obtained from different areas of 31 surgically resected primary breast carcinomas (93 samples total). Log2-transformed QIF scores were used for concordance and variance component analyses with linear mixed-effects models. Cohen’s kappa index [k] of high versus low scores, defined as above and below the median, was used to measure sample similarity between areas.
We found a strong positive correlation between CD3 and CD8 levels across all patients (Pearson correlation coefficient [CC] = 0.827). CD3 and CD8 showed a weaker but significant association with CD20 (CC = 0.446 and 0.363, respectively). For each marker, the variation between different FOVs in the same section was higher than the variation between sections or between biopsies of the same cancer. The intraclass correlation coefficients (ICC) were 0.411 for CD3, 0.324 for CD8, and 0.252 for CD20. In component analysis, 66–69 % of the variance was attributable to differences between FOVs in the same section and 30–33 % was due to differences between biopsies from different areas of the same cancer. Section to section differences were negligible. Concordance for low versus high marker status assignment in single biopsies compared to all three biopsies combined yielded k = 0.705 for CD3, k = 0.655 for CD8, and k = 0.603 for CD20.
T and B lymphocytes show more heterogeneity across the dimensions of a single section than between different sections or regions of a given breast tumor. This observation suggests that the average lymphocyte score from a single biopsy of a tumor is reasonably representative of the whole cancer.
With the advent of effective immunotherapies for cancer, there has been renewed interest in the tumor immune infiltrate. A number of studies have shown the prognostic value of the presence of tumor-infiltrating lymphocytes (TILs) in a range of cancer types [1–3]. Increased levels of CD8+ cytotoxic T cells, in particular, have been associated with both better outcome and response to programmed cell death receptor 1 (PD-1) therapy in melanoma and microsatellite instable-high colorectal carcinoma. Similarly, patients who express high levels of programmed cell death ligand 1 (PD-L1) display prominent TILs and have been shown to respond more favorably to immunostimulatory therapy.
The spatial organization of the infiltrating lymphocytes is not well defined and represents a potential confounding factor in the assessment of TILs. The constituents of anti-tumor immunity such as macrophages, natural killer (NK) cells, mast cells, and lymphocytes are organized in different spatial patterns both in and around tumors, presumably representing a range of immune responses. Tertiary lymphoid structures, analogous to structures of the lymph node with germinal centers, dendritic cells, highly proliferative B cells, and high endothelial venules, have been noted in a variety of neoplastic malignancies and tend to correlate with more favorable outcomes. In contrast, T cells may infiltrate in either a sporadic or a more uniform manner. These variable patterns of immune cell infiltration produce a highly heterogeneous appearance and present a challenge to reproducibly quantifying and meaningfully defining TILs.
Traditionally, TILs, like most other cancer biomarkers, have been scored by pathologists using standard hematoxylin and eosin (H&E)-stained tissue sections. Though conventional and efficient, this method is only semiquantitative and is prone to high interscorer subjectivity and variability. The issue of assessment of TILs has been addressed by an international consortium of pathologists in a round robin study. The initial efforts showed only modest reproducibility, but subsequent studies where scoring aids were used resulted in good concordance between pathologists. While this method appears to be sufficient for pathologist-based assessment, it is nonquantitative, cannot discriminate between TIL subsets, and would be insufficient for assessment of the distribution of TILs within a tumor. In addition, the possible impact of intratumor heterogeneity on the use of TILs as a tissue biomarker in breast cancer remains unknown. Here, we have used a previously validated multiplexed quantitative immunofluorescence (QIF) approach to measure TILs in prospectively collected biopsies from three different regions of resected breast tumors. We describe the distribution of different TIL phenotypic markers, both within separate regions of a tumor and within a given tissue section, and then apply statistical analysis to determine the degree of variance for each marker across the tumor.
Tissue collection and patient cohort
Tissue sampling and heterogeneity assessment
Multiplexed quantitative immunofluorescence staining for TILs
CD3, CD8, CD20, cytokeratin, and 4’,6-diamidino-2-phenylindole (DAPI) signals were quantified simultaneously on the same slide for every patient, as previously detailed by Brown et al. Briefly, fresh cuts of whole tissue sections were deparaffinized and rehydrated before undergoing antigen retrieval using an EDTA buffer (pH = 8) for 20 minutes at 97 °C (PT module, Lab Vision, Thermo Fisher Scientific, Waltham, MA, USA). Slides were then incubated with dual endogenous peroxidase block (Dako, Glostrup, Denmark) for 10 minutes to block endogenous peroxidase activity and then incubated with 0.3 % bovine serum albumin in a 0.05 % Tween solution for 30 minutes to block nonspecific antigens. Fluorescent staining for pancytokeratin, CD3, CD8, and CD20 was performed using a sequential multiplexed protocol with different isotype-specific primary antibodies. Antibodies against these targets were used to detect epithelial tumor cells (cytokeratin, clone M3515, Dako), all T lymphocytes (CD3 IgG, clone E272, Novus Biologicals, Littleton, CO, USA), cytotoxic T lymphocytes (CD8 IgG1, clone C8/144B, Dako), and B lymphocytes (CD20, IgG2a, clone L26, Dako). All nuclei were then tagged with DAPI (Life Technologies, Carlsbad, CA, USA). Secondary antibodies conjugated to horseradish peroxidase (HRP) and specific to each primary antibody isotype were used (anti-rabbit EnVision, Dako; anti-mouse IgG1, eBioscience, San Diego, CA, USA; anti-mouse IgG2a, Dako), while tyramide-bound fluorophores were added to bind to the HRPs (biotinylated tyramide, PerkinElmer, Waltham, MA, USA; streptavidin-Alexa750, Life Technologies; TSA™Plus fluorescein-tyramide, PerkinElmer; cyanine 5, PerkinElmer, respectively). A fluorophore-conjugated goat anti-rabbit secondary antibody was used against the cytokeratin antibody (Goat anti-Rabbit Alexa546, Life Technologies). Residual, unbound HRPs were blocked between incubations with a 0.15 % hydrogen peroxide benzoic hydrazide solution.
Slides were stained in three batches with a LabVision autostainer, in which samples from 10 to 11 patients were stained in each run. All biopsies from the same tumor were stained in the same batch to reduce experimental variability of expression of each target within patient samples. Morphologically normal human tonsil whole tissue sections were included in each batch as lymphocytic-positive control slides and to account for any variability in protein expression between batches. Additional file 1: Figure S1 shows small batch-to-batch variation for each marker.
Fluorescence measurement and scoring
Quantitative measurement of fluorescent signal was obtained using automated quantitative analysis (AQUA®) technology (Genoptix, Inc., Carlsbad, CA, USA), which allows for objective and accurate measurement of protein expression within predetermined tumor and/or other subcellular compartments, as previously described. FOVs, or areas of interest, on each slide were selected in a preliminary low-resolution scan based on nuclear DAPI staining. Each FOV was then captured at high resolution five times, at fluorescent wavelengths matching the five fluorophores used during staining (DAPI, FITC, Cy3, Cy5, and Cy7).
In order to accurately quantify the signal intensity of the emission wavelengths in each fluorescent channel with AQUA® software, areas not containing invasive breast carcinoma as demonstrated by cytokeratin staining [e.g., normal breast tissue, ductal carcinoma in situ (DCIS)] were excluded from analysis, as were any preparative artifacts (e.g., folded or damaged tissue). QIF scores for each FOV were generated for each channel by dividing the target marker pixel intensity by the total tissue area in that particular FOV (as defined by DAPI staining). Scores were normalized to exposure time and bit depth at the time of capture to allow proper comparison across all samples. We used the average QIF score of a given marker from all FOVs in a given section to represent marker expression at the section level. We calculated the average QIF score from all sections from a single biopsy to represent marker expression at the biopsy level.
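The scoring and averaging steps described above can be sketched in Python as follows. This is a minimal illustration, not the AQUA® implementation: the function names, the argument names, and the exact form of the exposure-time and bit-depth normalization are assumptions, since the text does not give the formula.

```python
from statistics import mean


def qif_score(target_intensity_sum, tissue_area_pixels, exposure_ms, bit_depth=8):
    """Hypothetical QIF-like score for one FOV and one channel.

    Divides the summed target-marker pixel intensity by the DAPI-defined
    tissue area, then normalizes by exposure time and by the maximum pixel
    value for the given bit depth (assumed form of the normalization).
    """
    if tissue_area_pixels <= 0 or exposure_ms <= 0:
        raise ValueError("tissue area and exposure time must be positive")
    max_pixel_value = 2 ** bit_depth - 1
    intensity_per_area = target_intensity_sum / tissue_area_pixels
    return intensity_per_area / (exposure_ms * max_pixel_value)


def section_score(fov_scores):
    """Section-level score: average of all FOV scores in the section."""
    return mean(fov_scores)


def biopsy_score(sections):
    """Biopsy-level score: average of the section-level scores.

    `sections` is a list of sections, each a list of FOV scores.
    """
    return mean(section_score(fovs) for fovs in sections)
```

For example, a biopsy with two sections whose FOV scores are `[1.0, 3.0]` and `[4.0, 6.0]` has section scores of 2.0 and 5.0 and a biopsy score of 3.5.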
For statistical analysis, QIF scores for each marker were log2 transformed to minimize the possible effects of the differential score distribution across cases. Pearson’s correlation coefficients (CC) and intraclass correlation coefficients (ICC) were calculated for each marker to measure the similarity of marker scores between FOVs, section to section in the same biopsy, and between biopsies from the same tumor. The variance components analysis used a linear mixed-effects (LME) regression model to estimate the contribution of each source of variation to the total variation observed. Biopsies were also categorized into lymphocyte low versus high groups using the median for each lymphocyte subtype marker. Cohen’s kappa coefficient (k) was then used to calculate the concordance in TIL category obtained from assessing a single, randomly chosen biopsy versus the averaged results from all three biopsies from a given cancer. Statistical analyses were performed using the R v3.2.2 statistical platform (R Foundation for Statistical Computing) and GraphPad Prism v6.0 for Windows (GraphPad Software, Inc, San Diego, CA, USA).
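As an illustration of the first two steps, the sketch below log2-transforms scores and computes an intraclass correlation coefficient. The text does not state which ICC form was used, so the one-way random-effects form for equal-sized groups is an assumption here, and the function names are hypothetical. The original analysis was run in R, not Python.

```python
import math


def log2_scores(scores):
    """Log2-transform a list of strictly positive QIF scores."""
    return [math.log2(s) for s in scores]


def icc_oneway(groups):
    """One-way random-effects ICC for equal-sized groups (assumed form).

    `groups` is a list of lists, e.g. one inner list of log2 scores per
    tumor. ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW are
    the between-group and within-group mean squares and k is the group size.
    """
    n = len(groups)
    k = len(groups[0])
    if n < 2 or k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need at least 2 groups of equal size >= 2")
    group_means = [sum(g) / k for g in groups]
    grand_mean = sum(group_means) / n
    msb = k * sum((m - grand_mean) ** 2 for m in group_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (n * (k - 1))
    denominator = msb + (k - 1) * msw
    if denominator == 0:
        raise ValueError("all values are identical; ICC is undefined")
    return (msb - msw) / denominator
```

With this form, groups whose members are identical but differ from one another give an ICC of 1, while groups that share the same mean and differ only internally give a negative value.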
Validation of quantitative immunofluorescence staining
Systematic staining of serially sectioned, morphologically normal human tonsil tissue was used as a positive control. It revealed the expected levels and expression patterns of CD3, CD8, and CD20 (Fig. 1b). CD3 and CD8 showed a membranous staining pattern and were located predominantly in areas outside of the follicular germinal centers. Staining with CD20 showed membranous staining of cells exclusively within the germinal centers, which are typically rich in B cells. Cytokeratin positivity was found in the squamous epithelium lining of the crypts, characteristic of tonsil tissue. QIF scores for CD3, CD8, and CD20 showed high concordance and remained reproducible for each marker between tonsil slides stained in different batches, indicating limited interbatch variation (Additional file 1: Figure S1).
Quantitative assessment of TILs by immunofluorescence of breast cancer
Intratumor variability of TIL subpopulations
When sources of variation in a single cancer were examined, variance components analysis using a linear mixed-effects regression showed that for all three markers, 30–33 % of the variation in expression levels is a result of between-biopsy differences while 66–69 % of variation is due to variable scores between FOVs in the same section. Only 2 % or less of the variation is due to differences in scores between serial sections of the same biopsy (Fig. 5b).
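To make the idea of partitioning variance across nested levels concrete, the sketch below estimates biopsy, section, and FOV variance components for a single tumor. It deliberately swaps in a simpler technique than the one used in the study: a method-of-moments estimator from a balanced nested ANOVA, not a linear mixed-effects fit. The function name, the data layout, and the requirement of a balanced design are all assumptions made for illustration.

```python
def nested_variance_fractions(data):
    """Share of variance at each level of a balanced nested design.

    `data[i][j]` is the list of log2 FOV scores for section j of biopsy i.
    Every biopsy must have the same number of sections and every section
    the same number of FOVs. Negative moment estimates are truncated to 0.
    """
    a, b, n = len(data), len(data[0]), len(data[0][0])
    if a < 2 or b < 2 or n < 2:
        raise ValueError("need at least 2 biopsies, 2 sections, and 2 FOVs")
    if any(len(bio) != b or any(len(sec) != n for sec in bio) for bio in data):
        raise ValueError("design must be balanced")

    sec_means = [[sum(sec) / n for sec in bio] for bio in data]
    bio_means = [sum(means) / b for means in sec_means]
    grand_mean = sum(bio_means) / a

    # Mean squares for FOVs within sections, sections within biopsies,
    # and biopsies within the tumor.
    ms_fov = sum(
        (x - sec_means[i][j]) ** 2
        for i in range(a) for j in range(b) for x in data[i][j]
    ) / (a * b * (n - 1))
    ms_sec = n * sum(
        (sec_means[i][j] - bio_means[i]) ** 2
        for i in range(a) for j in range(b)
    ) / (a * (b - 1))
    ms_bio = b * n * sum((m - grand_mean) ** 2 for m in bio_means) / (a - 1)

    var_fov = ms_fov
    var_sec = max(0.0, (ms_sec - ms_fov) / n)
    var_bio = max(0.0, (ms_bio - ms_sec) / (b * n))
    total = var_fov + var_sec + var_bio
    if total == 0:
        raise ValueError("no variation in the data")
    return {
        "biopsy": var_bio / total,
        "section": var_sec / total,
        "fov": var_fov / total,
    }
```

A result resembling the pattern reported above would show most of the variance in the `fov` entry, a smaller share in `biopsy`, and a `section` share close to zero.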
We also calculated average QIF scores for each biopsy and used the median score of the entire population as the threshold to assign high TIL (i.e., above median) and low TIL (i.e., below median) categories to each biopsy. To test whether a single biopsy can provide a representative score for the entire cancer, we calculated Cohen’s kappa coefficient (k) for agreement in TIL classifications (high versus low) obtained from one random biopsy and from the average score from all three biopsies of the same cancer (only the 26 patients with all three viable biopsies were included in this analysis). Kappa values were 0.705 for CD3, 0.655 for CD8, and 0.603 for CD20, indicating very good to excellent agreement.
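The median split and the kappa calculation can be sketched as follows. This is a minimal pure-Python illustration with hypothetical function names. How scores exactly equal to the median were handled is not stated in the text, so assigning them to the low group is an assumption.

```python
from statistics import median


def dichotomize(scores):
    """Label each score True (high TIL) if it is above the median.

    Scores equal to the median are labeled low (assumed tie rule).
    """
    cutoff = median(scores)
    return [s > cutoff for s in scores]


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two lists of binary (True/False) labels.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from the marginal rates.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    p_observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    rate_a = sum(labels_a) / n
    rate_b = sum(labels_b) / n
    p_expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)
```

In the setting described above, `labels_a` would hold the high/low calls from one randomly chosen biopsy per patient and `labels_b` the calls from the three-biopsy average for the same patients.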
To assess the representativeness of any given core, we assumed the mean of three cores to be representative of the tumor, then determined how often any given core was more than one standard deviation from the mean. Since the analysis is continuous, it is hard to prove that this method defines the number of cores that could be nonrepresentative of the whole case, but it provides an estimate of that parameter. We found that only two out of the 89 total cores showed a CD3 FOV mean score outside of the standard deviation of the overall average CD3 score for the case. By this definition, any given core is 97.8 % likely to be representative for CD3 expression. For CD8, none of the 89 cores in question had CD8 FOV mean scores outside of the standard deviation for each patient, and only one core had a mean CD20 score outside of the patient’s combined CD20 scores (98.9 % accurate).
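The check described in this paragraph, and the percentages derived from it, can be sketched as follows. The text does not specify how the standard deviation was computed, so the pooling of all FOV scores across a case's cores is an assumption, as are the function names.

```python
from statistics import mean, stdev


def flag_nonrepresentative_cores(cores):
    """Flag cores whose mean FOV score is more than 1 SD from the case mean.

    `cores` is a list of lists of FOV scores, one inner list per core.
    The case mean is the mean of the core means; the SD is the sample SD of
    all FOV scores pooled across the case's cores (assumed definition).
    """
    core_means = [mean(fovs) for fovs in cores]
    case_mean = mean(core_means)
    pooled_sd = stdev([score for fovs in cores for score in fovs])
    return [abs(m - case_mean) > pooled_sd for m in core_means]


def percent_representative(n_outside, n_total):
    """Percentage of cores that fall within one SD of their case mean."""
    return 100.0 * (n_total - n_outside) / n_total


# Reproduces the arithmetic reported in the text for 89 cores.
print(round(percent_representative(2, 89), 1))  # CD3: prints 97.8
print(round(percent_representative(1, 89), 1))  # CD20: prints 98.9
```

The two printed values follow directly from 87/89 and 88/89; with no cores flagged, as reported for CD8, the same function returns 100.0.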
There is growing interest in quantifying and reporting total TIL count and TIL subpopulations in clinical specimens due to their prognostic and possibly predictive value for immunotherapies. An international group of pathologists recently proposed a standardized method to assess TILs with the intent to facilitate including this parameter in routine pathology reports. The preferred way to establish the histologic diagnosis of breast cancer is core needle biopsy. These biopsies yield small amounts of tissue and sampling bias may influence biomarker results obtained from needle biopsies. Many clinically relevant markers [estrogen receptor (ER), human epidermal growth factor receptor 2 (HER2), Ki67] can be determined reliably from small biopsies of a cancer. However, markers that display high intratumor heterogeneity may yield results that are not representative of the entire tumor. The purpose of this study was to examine intratumor heterogeneity of TIL subpopulations. We examined the distribution of TIL populations using quantitative immunofluorescence and measured intratumor heterogeneity at three levels: heterogeneity between microscopic (20× magnification) FOVs in the same section, heterogeneity in average TIL scores between sections of the same biopsy, and heterogeneity in average scores between biopsies from the same cancer.
This method of analysis led us to a few key observations. The first is that TILs, predominantly as measured by CD3 or CD8, show reasonable homogeneity between cores from specimens that were biopsied in three spatially distinct locations. The mean levels of either CD3 or CD8 (Fig. 2) show good concordance across the three cores from the same specimen. Thus, while heterogeneity was broad within a specimen, the overall assessment of a given core, in most cases, appears to be representative of the mean for the entire case. This is reassuring in that it suggests that a single core biopsy, as often obtained in a clinical setting, may be sufficient to represent the TILs from the entire tumor. We also observed that B cells exist in small clusters in the tumor microenvironment. The concept of small clusters of B cells, tertiary lymphoid structures, is well established. This anatomical feature of B cell infiltrates explains the greater intratumor heterogeneity of CD20 scores.
Our study is not without limitations. For instance, the use of fluorescence intensity scores that incorporate the intensity of each marker per cell prevents the accurate determination of the absolute number of lymphocytes and reduces our capacity to compare the relative abundance of each cell subpopulation. In addition, the inclusion in our study of cases with different biological breast cancer subtypes [ER+, HER2+, triple-negative breast cancer (TNBC)], without uniform treatment and with relatively short follow-up, limits our capacity to explore the clinical implications of TIL heterogeneity in breast cancer. Finally, we have used AQUA® technology, which averages intensity over an FOV rather than actually counting each cell. While these values are not equal and have the potential to assess very different variables, we have shown a comparison of AQUA® scores and CD8 and CD20 cell counts in a sample lung TMA in Additional file 1: Figure S4.
In summary, we have applied an objective and reproducible immunofluorescence-based assay to quantify the distribution of TIL expression in multiple spatially separate regions from a population of breast tumors. Though our patient cohort was relatively small, we demonstrated that, in this population, CD3, CD8, and CD20 show substantial heterogeneity but that heterogeneity is greatest within the core biopsy and to a lesser extent between biopsies of the same tumor. While this is a small study, our data suggest that a single core may be sufficient to estimate the TIL heterogeneity for an entire breast tumor. Future studies with larger patient populations with outcome data are needed to validate this observation.
CC, correlation coefficient; ER, estrogen receptor; DCIS, ductal carcinoma in situ; FOV, field of view; H&E, hematoxylin and eosin; HER2, human epidermal growth factor receptor 2; ICC, intraclass correlation coefficient; PD-1, programmed cell death receptor 1; PD-L1, programmed cell death ligand 1; QIF, quantitative immunofluorescence; TIL, tumor-infiltrating lymphocytes; TNBC, triple-negative breast cancer
We would like to thank Lori Charette and Yale Pathology Tissue Services for production of the patient tissue slides used in this study. This study was also supported by the Connecticut Breast Health Initiative, the American Society of Breast Surgeons Foundation, Gilead Sciences, the Lion Heart Foundation (ABC) and the Breast Cancer Research Foundation.
NM carried out the immunofluorescence staining and analysis of data and drafted the manuscript. KS participated in the design of the study, made substantial contributions to the analysis and interpretation of data, and helped revise the manuscript. CH provided statistical analysis and interpretation of data and helped revise the manuscript. OS provided tissue and pathology review and helped revise the manuscript. FT provided tissue and pathology review and helped revise the manuscript. MB provided tissue annotation and technical assistance and helped revise the manuscript. AC provided the tissue of all patients in the study and helped revise the manuscript. LP designed the study, made substantial contributions to the analysis and interpretation of data, helped revise the manuscript, and provided financial support to carry out the study. DR conceived and designed the study, revised the final version of the manuscript for submission, and provided finances and resources to carry out the study. All authors have read and approved the final version of the manuscript.
Within the last 12 months, DLR has served as a paid consultant or advisor to Genoptix/Novartis, Cernostics, BMS, Biocept, FivePrime, Perkin Elmer, and Metamark Genetics. All other authors declare no competing interests.
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Salgado R, Denkert C, Campbell C, Savas P, Nucifero P, Aura C, de Azambuja E, Eidtmann H, Ellis CE, Baselga J, et al. Tumor-infiltrating lymphocytes and associations with pathological complete response and event-free survival in HER2-positive early-stage breast cancer treated with lapatinib and trastuzumab: a secondary analysis of the NeoALTTO Trial. JAMA Oncol. 2015;1(4):448–54.
- Dieci MV, Mathieu MC, Guarneri V, Conte P, Delaloge S, Andre F, Goubar A. Prognostic and predictive value of tumor-infiltrating lymphocytes in two phase III randomized adjuvant breast cancer trials. Ann Oncol. 2015;26(8):1698–704.
- Schalper KA, Brown J, Carvajal-Hausdorf D, McLaughlin J, Velcheti V, Syrigos KN, Herbst RS, Rimm DL. Objective measurement and clinical significance of TILs in non-small cell lung cancer. J Natl Cancer Inst. 2015;107(3):9.
- Liu S, Lachapelle J, Leung S, Gao D, Foulkes WD, Nielsen TO. CD8+ lymphocyte infiltration is an independent favorable prognostic indicator in basal-like breast cancer. Breast Cancer Res. 2012;14(2):R48.
- Tumeh PC, Harview CL, Yearley JH, Shintaku IP, Taylor EJ, Robert L, Chmielowski B, Spasic M, Henry G, Ciobanu V, et al. PD-1 blockade induces responses by inhibiting adaptive immune resistance. Nature. 2014;515(7528):568–71.
- Llosa NJ, Cruise M, Tam A, Wicks EC, Hechenbleikner EM, Taube JM, Blosser RL, Fan H, Wang H, Luber BS, et al. The vigorous immune microenvironment of microsatellite instable colon cancer is balanced by multiple counter-inhibitory checkpoints. Cancer Discov. 2015;5(1):43–51.
- Herbst RS, Soria JC, Kowanetz M, Fine GD, Hamid O, Gordon MS, Sosman JA, McDermott DF, Powderly JD, Gettinger SN, et al. Predictive correlates of response to the anti-PD-L1 antibody MPDL3280A in cancer patients. Nature. 2014;515(7528):563–7.
- Fridman WH, Pages F, Sautes-Fridman C, Galon J. The immune contexture in human tumours: impact on clinical outcome. Nat Rev Cancer. 2012;12(4):298–306.
- Martinet L, Garrido I, Filleron T, Le Guellec S, Bellard E, Fournie JJ, Rochaix P, Girard JP. Human solid tumors contain high endothelial venules: association with T- and B-lymphocyte infiltration and favorable prognosis in breast cancer. Cancer Res. 2011;71(17):5678–87.
- Pages F, Galon J, Dieu-Nosjean MC, Tartour E, Sautes-Fridman C, Fridman WH. Immune infiltration in human tumors: a prognostic factor that should not be ignored. Oncogene. 2010;29(8):1093–102.
- Bindea G, Mlecnik B, Tosolini M, Kirilovsky A, Waldner M, Obenauf AC, Angell H, Fredriksen T, Lafontaine L, Berger A, et al. Spatiotemporal dynamics of intratumoral immune cells reveal the immune landscape in human cancer. Immunity. 2013;39(4):782–95.
- Salgado R, Denkert C, Demaria S, Sirtaine N, Klauschen F, Pruneri G, Wienert S, Van den Eynden G, Baehner FL, Penault-Llorca F, et al. The evaluation of tumor-infiltrating lymphocytes (TILs) in breast cancer: recommendations by an International TILs Working Group 2014. Ann Oncol. 2015;26(2):259–71.
- Denkert C, Wienert S, Poterie A, Loibl S, Budczies J, Badve S, Bago-Horvath Z, Bane A, Bedri S, Brock J, Chmielik E, Christgen M, Colpaert C, Demaria S, Van den Eynden G, Floris G, Fox SB, Gao D, Ingold Heppner B, Kim SR, Kos Z, Kreipe HH, Lakhani SR, Penault-Llorca F, Pruneri G, Radosevic-Robin N, Rimm DL, Schnitt SJ, Sinn BV, Sinn P, Sirtaine N, O’Toole SA, Viale G, Van de Vijver K, de Wind R, von Minckwitz G, Klauschen F, Untch M, Fasching PA, Reimer T, Willard-Gallo K, Michiels S, Loi S, Salgado R. Standardized evaluation of tumor-infiltrating lymphocytes in breast cancer: results of the ring studies of the international immuno-oncology biomarker working group. Mod Pathol. 2016. (electronic ahead of print).
- Brown JR, Wimberly H, Lannin DR, Nixon C, Rimm DL, Bossuyt V. Multiplexed quantitative analysis of CD3, CD8, and CD20 predicts response to neoadjuvant chemotherapy in breast cancer. Clin Cancer Res. 2014;20(23):5995–6005.
- Lee HJ, Park IA, Song IH, Shin SJ, Kim JY, Yu JH, Gong G. Tertiary lymphoid structures: prognostic significance and relationship with tumour-infiltrating lymphocytes in triple-negative breast cancer. J Clin Pathol. 2016;69(5):422–30.