Skip to main content
  • Research article
  • Open access
  • Published:

Computational pathology of pre-treatment biopsies identifies lymphocyte density as a predictor of response to neoadjuvant chemotherapy in breast cancer



There is a need to improve prediction of response to chemotherapy in breast cancer in order to improve clinical management and this may be achieved by harnessing computational metrics of tissue pathology. We investigated the association between quantitative image metrics derived from computational analysis of digital pathology slides and response to chemotherapy in women with breast cancer who received neoadjuvant chemotherapy.


We digitised tissue sections of both diagnostic and surgical samples of breast tumours from 768 patients enrolled in the Neo-tAnGo randomized controlled trial. We subjected digital images to systematic analysis optimised for detection of single cells. Machine-learning methods were used to classify cells as cancer, stromal or lymphocyte and we computed estimates of absolute numbers, relative fractions and cell densities using these data. Pathological complete response (pCR), a histological indicator of chemotherapy response, was the primary endpoint. Fifteen image metrics were tested for their association with pCR using univariate and multivariate logistic regression.


Median lymphocyte density proved most strongly associated with pCR on univariate analysis (OR 4.46, 95 % CI 2.34-8.50, p < 0.0001; observations = 614) and on multivariate analysis (OR 2.42, 95 % CI 1.08-5.40, p = 0.03; observations = 406) after adjustment for clinical factors. Further exploratory analyses revealed that in approximately one quarter of cases there was an increase in lymphocyte density in the tumour removed at surgery compared to diagnostic biopsies. A reduction in lymphocyte density at surgery was strongly associated with pCR (OR 0.28, 95 % CI 0.17-0.47, p < 0.0001; observations = 553).


A data-driven analysis of computational pathology reveals lymphocyte density as an independent predictor of pCR. Paradoxically an increase in lymphocyte density, following exposure to chemotherapy, is associated with a lack of pCR. Computational pathology can provide objective, quantitative and reproducible tissue metrics and represents a viable means of outcome prediction in breast cancer.

Trial registration NCT00070278; 03/10/2003


Women with high-risk early breast cancer are increasingly being offered chemotherapy before definitive surgery because neoadjuvant chemotherapy can enable breast-conserving surgery [1]. Complete eradication of tumour cells in the surgically removed tumour bed or pathological complete response (pCR) is associated with improved survival [2, 3]. The likelihood of pCR is profoundly affected by the oestrogen receptor (ER) and human epidermal growth factor receptor 2 (HER2) status of the primary tumour [2, 4]. In spite of these differences, improved prediction of the probability of pCR is needed because alternative regimens of chemotherapy or enrolment in clinical trials might be offered to patients deemed unlikely to experience pCR at baseline.

Novel pathological and genomic predictors of pCR have been described. Hatzis et al. used gene-expression microarrays to generate a gene set encompassing modules for response to endocrine therapy and cytotoxic chemotherapy, which identified patients likely to undergo pCR and to have longer survival [5]. The proportion of tumour infiltrating lymphocytes has also been shown to predict pCR in several studies of neoadjuvant chemotherapy [612]. Automated quantitative estimates of tumour morphology using digital images of tissue sections have been shown to be associated with prognosis [13, 14]. Therefore, comparable computational analysis of histological sections might provide a similar method for prediction of pCR.

We hypothesized that systematic quantitative analysis of tumour morphology at diagnosis would objectively identify tissue characteristics associated with pCR. We undertook a digital pathology study using a newly developed image processing method for single cell detection and material from the Neo-tAnGo randomized controlled trial [15], both from diagnosis and at surgery, in order to objectively identify tissue features associated with pCR and to investigate changes in quantitative morphological metrics between pre-treatment and post-treatment samples and their relationship to pCR.


Patients and clinical samples

Neo-tAnGo was a phase III, randomized trial with two-by-two factorial design addressing both the role of gemcitabine in a sequential neoadjuvant chemotherapy regimen of epirubicin/cyclophosphamide and paclitaxel, and the role of sequencing of these treatment components [15]. The trial recruited women with high-risk early breast cancer between 2005 and 2007 across 57 centres in the UK. Women with HER2-positive disease did not receive neo-adjuvant trastuzumab although most did receive adjuvant trastuzumab depending on local protocols. A total of 812 patients were included in the primary endpoint analysis [15]. Pathological complete response, the primary endpoint, was defined as the complete absence of tumour cells in resected breast tissue and axillary lymph nodes. Whether pCR had occurred was determined by independent analysis of histopathology reports by two investigators (EP and HME) as previously described [16]. The trial found no effect of the addition of gemcitabine on the proportion of cases with pCR but did find that sequencing of taxanes before anthracyclines led to an overall increase in pCR [15]. Details of eligibility and ascertainment of clinical characteristics are provided in the main trial report [15]. The trial was approved by the multicentre research ethics committee and subsequently by the local research ethics committees at all participating centres (full names and details are provided in Additional file 1). All patients provided written informed consent and the trial was registered ( NCT00070278).

Image acquisition, processing and pathology review

Haematoxylin and eosin (H&E)-stained histological slides from formalin-fixed paraffin embedded (FFPE) pre-treatment core biopsies and tumours resected at surgery were requested from all centres for central review and digitization. Image analysis was conducted by AD at the Institute of Astronomy in Cambridge, as part of a collaboration with Oncology [17]. Our image-processing pipeline is summarized in Fig. 1. Our approach is entirely automated and consists of identifying tissue for analysis, segmenting cell nuclei and finally using machine-learning to classify nuclei as cancer, stromal or lymphocyte based on a training set. Adipocytes were included in the stromal category because based on nuclear features alone, it was not possible to reliably identify them. Further details are provided in Additional file 1.

Fig. 1
figure 1

Overview of the image processing method. a Full-face H&E scanned images consist of four levels (L0–L3) across a gradation of resolutions. The levels L3 (lowest resolution) and L0 (highest resolution) are used to process each image. b Automated identification of regions of interest was performed by dividing image layer L3 into several small blocks (grid) and by analysing the pixel intensity distribution of each block. c Each image block found to contain tissue was mapped onto layer L0 and image segmentation and object detection (green ellipses) was conducted to construct an object catalogue. d Illustrative representation as a contour map of lymphocyte density derived using a k-nearest neighbour algorithm of the 50 nearest like-class neighbours. e Distribution of lymphocyte metrics by categories of lymphocytic infiltration based on central pathology review. SVM support vector machine

Statistical analyses

All image analyses were conducted prior to receipt of clinical data from the trial statistician. Associations between continuous automated image metrics and categorical clinical variables were tested using Wilcoxon’s rank-sum test or the Kruskal-Wallis test. Pearson’s correlation coefficient was used to investigate the relationship between continuous variables. Logistic regression was used to test for associations with pCR providing an odds ratio (OR), 95 % confidence interval (CI) and p value. Candidate image-based predictors were initially tested against pCR in univariate analysis, with absolute cell counts as log-transformed variables. Next, a multivariate model was fitted iteratively in a backward stepwise manner to retain metrics significantly associated with pCR after adjustment for others remaining in the model. These metrics were finally included in a multivariate analysis further adjusted for patient age, radiological tumour size at diagnosis, lymph node status, histological grade, ER status and HER2 status. All image metrics were modelled as continuous variables. Whether an image metric predicted pCR differently between taxane sequencing groups (first vs second) was investigated by inclusion of an interaction term in a logistic regression model and the fit of this model compared to a model lacking the term using a likelihood ratio test. Statistical analyses were conducted in Intercooled Stata version 11.2 (Stata Corp, College Station, TX, USA). This study was conducted in compliance with the Reporting recommendations for tumor marker prognostic studies (REMARK) criteria [18] as detailed in Additional file 1.


A digital pathology resource and image-processing pipeline for quantitative cell-level metrics

We used material from 765 patients enrolled in the Neo-tAnGo randomized controlled trial. Additional file 2 depicts the flow of patients through each analytical stage in this study. A total of 2,436 slides were received and digitized, of which 1,992 contained tumour or tumour bed with the remainder containing adjacent normal breast tissue. All images used for analyses are available to download together with open-source image analysis code at and

On average, each patient’s tumour or residual tumour bed was represented in 1.2 pre-treatment slides (range 1–23) and 1.8 slides from the post-treatment surgical specimen (range 1–35). Figure 1 summarizes our novel image processing pipeline: scanned slide images were extracted as four layers differing in their size and resolution, and using the lowest resolution image, areas that contained tissue were automatically identified and white space was excluded from further analysis. Next, using the highest resolution layer, single cell nuclei were detected, and using a support-vector-machine (SVM) approach trained using around 1,000 objects per class, cells were sub-classified into readily distinguishable categories (cancer, stromal or lymphocyte). Metrics relating to these three cell types and describing absolute cell number (count), relative cell type proportion (fraction) and density (minimum, median and maximum) based on the 50 nearest like-class neighbors, were computed. That is, every detected cell was assigned a density estimate based on the distance between it and the 50 nearest cells of the same type. Pathologist assessment of lymphocytic infiltration in pre-treatment biopsies with paired automated data, to which the pathologist was blinded, was available for comparison in 377 samples. All automated metrics of lymphocytic infiltration were significantly associated with pathologist scores (p <0.0001 for all three comparisons; Fig. 1), attesting to the validity of the automated image analysis approach.

Image metrics were calculated for a total of 765 patients of which 623 provided data on pre-treatment biopsies and 699 provided data on post-treatment surgical samples, with paired (pre- and post-treatment) data available in 557 patients. Patient and tumor characteristics are detailed in Table 1. Fifteen machine-learning-derived image metrics were used for downstream correlative analysis with clinical and pathological features. Additional files 3 and 4 depict histograms of the distributions of image metrics for pre-treatment biopsies and post-treatment surgical samples. The correlation matrix of image metrics from data generated using pre-treatment biopsies is depicted in Additional file 5 and from surgical samples in Additional file 6. In pre-treatment biopsies, the strongest positive correlation was between the absolute number of lymphocytes and the number of cancer cells (rho = 0.84) and, conversely, the strongest negative correlation was between the relative fraction of cancer cells and the fraction of lymphocytes (rho = –0.59). These contrasting correlations highlight the extent to which absolute cell counts simply reflect sample cellularity, warranting cautious interpretation.

Table 1 Patient and tumour characteristics

Image metrics reflect molecular subtype of the primary tumor

Figure 2 depicts the distribution of image metrics by the molecular subtype of the tumour based on ER and HER2 status. The distribution of cancer cell fraction significantly differed between groups (p = 0.04), with the highest median cancer cell fraction observed in ER-negative, HER2-negative tumours. Similarly, the distribution of stromal cell fraction was significantly different between groups (p = 0.00001), with the highest median level observed in ER-positive, HER2-negative tumours. The distribution of lymphocyte fraction was also significantly different between groups (p = 0.002) with the highest median level observed in ER-negative, HER2-positive tumours. Median lymphocyte density was not correlated with patient age at diagnosis (Pearson’s correlation coefficient –0.01), suggesting that the association with tumour molecular subtype was not age-related. However, these relationships did not hold following chemotherapy in post-treatment surgical samples for which none of the metrics were significantly different across molecular subtypes (Additional file 7), likely due to the diminishing influence of tumour cells on the composition of samples, owing both to widespread cell death and to other effects of chemotherapy on the tissue environment.

Fig. 2
figure 2

Distribution of pre-treatment sample image metrics by tumour molecular subtype. Horizontal grey lines represent median values. Results of the Kruskal-Wallis test are depicted within graphs; red text denotes p values <0.05. ER oestrogen receptor, HER2 human epidermal growth factor receptor 2

Chemotherapy response is most strongly associated with median lymphocyte density

Univariate analysis of the 15 image metrics was conducted using pretreatment biopsies in 614 patients, of whom 98 (16 %) had tumors that underwent pCR. Six of the fifteen metrics were significantly (at a nominal p value <0.05) associated with pCR (Fig. 3 and Additional file 8). Of these six, four were related to lymphocytes, one to cancer cells and one to stromal cells. The association between median lymphocyte density and pCR was by far the strongest (OR 4.46, 95 % CI 2.34-8.50, p <0.0001) with the next most significant being maximum lymphocyte density (OR 3.48, 95 % CI 1.54, 7.86, p = 0.003). To determine whether sample cellularity influenced the relationship between median lymphocyte density and pCR, a model adjusted for the total number of cells in a sample was fit. This showed that sample cellularity had little effect on this association, which remained significant in the model (OR 4.56, 95 % CI 2.27, 9.15, p <0.0001). A multivariate model comprising these 15 predictors was modified in a backward stepwise manner resulting in a final model of 5 significant predictors, including median lymphocyte density (Additional file 8). However, on further adjustment of this model for clinical variables only median lymphocyte density was significantly associated with pCR (Additional file 9). When included in a multivariate model with only clinical predictors, median lymphocyte density remained significantly associated with pCR (OR 2.42, 95 % CI 1.08, 5.40, p = 0.03; Table 2). However, approximately one third of observations were lost between univariate (n = 614) and multivariate analyses (n = 406). For deciles of median lymphocyte density, differences in the proportion of cases that underwent pCR varied between 4.9 % (3/61) for the first decile up to 35.5 % (22/62) for the last.

Fig. 3
figure 3

Association between image metrics and pathological complete response (pCR). Manhattan plot illustrates p values (–log10) from univariate logistic regression analyses testing the association between 15 image metrics and pCR

Table 2 Univariate and multivariate logistic regression models of clinical factors and median lymphocyte density

Figure 4 depicts the distribution of pCR, receptor status and cellular composition in all samples ranked according to the median lymphocyte density for each sample. In addition to depicting the sample-level relationship between these variables, the relationship between median lymphocyte density, pCR and molecular subtype is also depicted in Fig. 4. To address whether the association between median lymphocyte density and pCR significantly differed by ER status, we compared the fit of two logistic regression models: one with an interaction term between median lymphocyte density and ER status, and one without. The likelihood ratio test comparing these models was not significant (p = 0.72), however, it should be noted that for comparison of effect between subgroups these analyses are relatively underpowered, precluding reliable conclusions. In addition, Fig. 4 highlights that although higher median lymphocyte density is generally associated with a larger lymphocyte fraction (rho = 0.69), it is not simply a reflection of higher numbers of infiltrating lymphocytes. There are both instances where the relative fraction of lymphocytes is high but where the density of lymphocytes is low compared to other samples, and conversely in some instances the fraction of lymphocytes is relatively low but their median density is high (Fig. 4). Additional file 10 depicts this relationship as a scatter plot, further highlighting the existence of outlier cases. This suggests that a measure of lymphocytic density may reflect a functional aspect of the immune response not entirely encompassed by lymphocyte fraction. Additional file 11 depicts example images together with contour representations of lymphocyte density and automated metrics for each image. Median lymphocyte density was positively associated with both axillary lymph node status (p = 0.002) and histological grade (p <0.00001) but not with tumor size (p = 0.47) as depicted in Additional file 12.

Fig. 4
figure 4

Relationship between median lymphocyte density, cellular composition, pathological complete response (pCR) and receptor status. Bar plots illustrating the relationship between pathological complete response, oestrogen receptor (ER) status, human epidermal growth factor receptor 2 (HER2) status, cellular composition and median lymphocyte density. Plots are sorted by increasing level of median lymphocyte density. For illustration, median lymphocyte density has been rescaled to positive values

Increased post-treatment lymphocyte density is associated with relative chemoresistance

We further investigated whether the change in median lymphocyte density between pre- and post-treatment samples was reflected in the probability of pCR and the extent to which the degree of change varied between patients. Change in median lymphocyte density across all 557 samples for which paired data was available is depicted as a waterfall plot in Fig. 5. In the majority of cases (75.6 %, 421/557 cases) lymphocyte density decreased in the surgical specimen. In addition, on assessment of the association between change in lymphocyte density and pCR in the 553 cases with sufficient data, there was strong association between reduction in lymphocyte density in the surgical sample and higher likelihood of pCR (OR 0.28, 95 % CI 0.17, 0.47, p <0.0001) with 17 % (71/418) of cases with a decrease in density undergoing pCR, in comparison to just 6.7 % (9/135) of those where there was an increase. Change in lymphocyte density was negatively correlated with median lymphocyte density at diagnosis (Correlation coefficient –0.6, p <0.0001). In a model adjusted for median lymphocyte density at diagnosis, change in lymphocyte density remained significantly associated with pCR (OR 0.38, 95 % CI 0.21, 0.71, p = 0.002).

Fig. 5
figure 5

Change in lymphocyte density and association with pathological complete response (pCR). Plot depicts the change in median lymphocyte density between paired pre-treatment and post-treatment surgical samples. The primary sort key is taxane sequence and the secondary sort key is change in lymphocyte density. Annotated rug depicts oestrogen receptor (ER), human epidermal growth factor receptor 2 (HER2) status and taxane sequence for samples corresponding to those depicted in the plot above

The main finding of the Neo-tAnGo trial was that administration of a taxane prior to other chemotherapy agents led to a significant increase in pCR [15]. Therefore, we next evaluated whether the association between increased post-treatment lymphocyte density and reduced likelihood of pCR was equally distributed according to whether a taxane was received first or second (Fig. 5). The association between increased post-treatment lymphocyte density and reduced likelihood of pCR was significantly stronger where a taxane was received second compared to where it was received first (likelihood-ratio test for interaction, p = 0.02). The OR in the taxane second group was 0.12 (95 % CI 0.05, 0.31, p <0.0001) compared to 0.45 (95 % CI 0.24, 0.84, p = 0.01) in the taxane first group. It should be noted that the proportion of cases with increased post-treatment lymphocyte density was equally distributed across taxane groups (24 % and 25 %).


We used computational pathology to generate image metrics of both pre- and post-treatment biopsies in a randomized controlled trial of neoadjuvant chemotherapy in breast cancer in order to investigate associations with chemosensitivity. Median lymphocyte density in pre-treatment biopsies emerged as the best predictor of response to chemotherapy, improving prediction based on known clinical factors. In addition, change in lymphocyte density between pre and post-treatment samples revealed that an increase in lymphocyte density was, paradoxically, associated with relative chemoresistance.

Computational pathology was used here to generate objective quantitative estimates of the cellular composition of tissue samples and, importantly, of the spatial heterogeneity of different cell types across a tissue section. Median density of lymphocytes, a spatial estimate, outperformed simple cellular quantification. Similarly, we have previously reported that the spatial distribution of stromal cells is prognostic in breast cancer and that this feature is not easily measurable by genomic assays [13]. Previous work in the context of breast and prostate cancer, has demonstrated the capacity of a computational approach to interrogate spatial, relational and geometric features of tissues for outcome prediction [14, 19, 20]. Many of these parameters could not be practically estimated by other means, and in this respect machine learning can provide deeper insight into tissue morphology than is possible by manual evaluation.

We performed a data-driven selection of tissue features associated with pCR. Median lymphocyte density emerged as the best predictor of chemosensitivity. In a previous study we have reported an association between automated estimates of lymphocytic infiltration and breast cancer survival in ER-negative disease [13, 21]. Similarly, studies of patients who received neo-adjuvant chemotherapy based on genomic assays and histopathology have also reported an association between the immune response and pCR [22]. Ignatiadis et al. conducted a meta-analysis of gene-expression data from pre-surgical specimens in 996 patients and investigated associations between 17 previously reported gene modules and pCR [22]. They found that the gene modules most reliably associated with pCR across cancer subtypes were those relating to the immune response. Tumour infiltrating lymphocytes estimated by a pathologist from H&E sections have also been found to be associated with outcome and response to chemotherapy [8, 23], including some in the neo-adjuvant setting [6, 10], largely in accord with the findings of this study.

To our knowledge, this is the first report showing that an increase in lymphocyte density following the perturbation of chemotherapy is associated with a lower likelihood of pCR. Previous studies have also shown that the composition of the post-treatment immune repertoire is associated with survival [24, 25]. In addition, we found that the sequence in which chemotherapy agents were administered affected the strength of this association. Where patients received a taxane second, increased lymphocyte density was more strongly associated with relative resistance to chemotherapy than in patients who received a taxane first.

A limitation of this study is that we were not able to digitize and analyse samples from all patients enrolled in the Neo-tAnGo trial and that associated clinical data were not complete, leading to a loss of around one third of observations in multivariate analyses. This is inevitable in the context of large multicentre trials. A second limitation is that by using H&E sections we have not accounted for the immune phenotype or functional state of infiltrating lymphocytes.

The functional basis of the interaction we observed between the immune response and chemotherapy is uncertain. It should, however, be noted that all patients received an anthracycline (epirubicin) as part of their treatment. Anthracyclines have been extensively investigated in pre-clinical studies as a chemotherapeutic agent with a tumoricidal effect that can in part be attributed to stimulation of the immune response [26]. For example, a recent study reported that upon exposure to an anthracycline tumour cells produce type I interferons evoking an immunological cascade reminiscent of that seen in cells infected by a virus and that this effect may be partly explained by the release of self-RNAs by dying cells [27]. That greater clinical benefit of anthracyclines is significantly associated with the presence of a pre-existing immune response (tumour-infiltrating lymphocytes) has also been shown in several clinical studies [8, 23, 28]. However, our results suggest that the effect of chemotherapy may be more complex than simply boosting pre-existing immune attack. First, we find that where the density of lymphocytes is increased following treatment, fewer tumours undergo pCR. That is, in this subset of around one quarter of patients, tumour cells apparently continue to resist the effects of immune attack in spite of its increased intensity. Second, we find that this effect is significantly greater where a taxane (paclitaxel) was administered after, as opposed to before, other agents. In the Neo-tAnGo trial, giving paclitaxel before the other agents significantly increased the proportion of cases with pCR [15]. While the association between increased post-treatment lymphocyte density and treatment resistance holds whether paclitaxel is given first or second, the effect is significantly larger in cases where it is given second. Collectively, these findings raise the possibility that not only is a chemotherapy-stimulated immune response not universally effective, but that the efficacy of this response can be influenced by the sequence in which tumour cells are exposed to different chemotherapeutic agents, most notably taxanes and anthracyclines.

The existence of a substantial subgroup of relatively resistant tumours in which lymphocyte density is increased following chemotherapy further suggests a clinical opportunity. The variability in the immune response between primary breast tumours is well-known and recent genomic analyses suggest that some of this difference may be explained by the mutational burden of the primary tumour [29]. Analyses of clinical trials of immune checkpoint inhibitors report that responses are best where there is a significant pre-existing immune response to the primary tumour [30, 31]. Given that a large subset of breast tumours evoke only a mild immune response if any [23, 32], methods for increasing immune attack against immunologically quiescent tumours are needed. Therefore the subgroup we have identified may benefit from receiving chemotherapy first, to increase the immune response, followed by immune checkpoint inhibitors to amplify its effect.

Digitization of pathology slides from clinical trials affords the important advantages of providing an enduring archive of tumour pathology and the opportunity for systematic image analysis. We anticipate analyses such as ours becoming more common as digital pathology is implemented more widely in clinical trials. We provide a valuable resource of digital pathology images to the research community, together with all our image-analysis codes. In addition, linked multiplatform genomic annotation (gene expression, copy number, targeted sequencing) will be made available for a subset of cases, following primary reporting of the data. While invaluable resources such as the Human Protein Atlas already provide access to an enormous array of tissue images [33], some already utilized in translational studies [34, 35], widely available digital images of tumour tissue from large high-quality clinical studies with molecular annotation such as ours are currently exceedingly rare. However, general access to such resources will be necessary to fulfil the potential of computational pathology as a novel modality that spans the research and clinical arenas.


This data-driven analysis of computational pathology metrics reveals that median lymphocyte density is an independent predictor of response to neoadjuvant chemotherapy in breast cancer. Paradoxically, we find that an increase in the density of lymphocytes following chemotherapy is associated with relative resistance to chemotherapy. Computational pathology is a novel and quantitative method for interrogation of tumour tissues, which can improve prediction of clinical endpoints.



confidence interval


oestrogen receptor


formalin-fixed paraffin-embedded


haematoxylin and eosin


human epidermal growth factor receptor 2


odds ratio


pathological complete response


support vector machine


  1. Killelea BK, Yang VQ, Mougalian S, Horowitz NR, Pusztai L, Chagpar AB, et al. Neoadjuvant chemotherapy for breast cancer increases the rate of breast conservation: results from the National Cancer Database. J Am Coll Surg. 2015;220(6):1063–9.

    Article  PubMed  Google Scholar 

  2. Cortazar P, Zhang L, Untch M, Mehta K, Costantino JP, Wolmark N, et al. Pathological complete response and long-term clinical benefit in breast cancer: the CTNeoBC pooled analysis. Lancet. 2014;384(9938):164–72.

    Article  PubMed  Google Scholar 

  3. Berruti A, Amoroso V, Gallo F, Bertaglia V, Simoncini E, Pedersini R, et al. Pathologic complete response as a potential surrogate for the clinical outcome in patients with breast cancer after neoadjuvant therapy: a meta-regression of 29 randomized prospective studies. J Clin Oncol. 2014;32(34):3883–91.

    Article  PubMed  Google Scholar 

  4. Houssami N, Macaskill P, von Minckwitz G, Marinovich ML, Mamounas E. Meta-analysis of the association of breast cancer subtype and pathologic complete response to neoadjuvant chemotherapy. Eur J Cancer. 2012;48(18):3342–54.

    Article  CAS  PubMed  Google Scholar 

  5. Hatzis C, Pusztai L, Valero V, Booser DJ, Esserman L, Lluch A, et al. A genomic predictor of response and survival following taxane-anthracycline chemotherapy for invasive breast cancer. JAMA. 2011;305(18):1873–81.

    Article  CAS  PubMed  Google Scholar 

  6. Denkert C, Loibl S, Noske A, Roller M, Muller BM, Komor M, et al. Tumor-associated lymphocytes as an independent predictor of response to neoadjuvant chemotherapy in breast cancer. J Clin Oncol. 2010;28(1):105–13.

    Article  CAS  PubMed  Google Scholar 

  7. Denkert C, von Minckwitz G, Brase JC, Sinn BV, Gade S, Kronenwett R, et al. Tumor-infiltrating lymphocytes and response to neoadjuvant chemotherapy with or without carboplatin in human epidermal growth factor receptor 2-positive and triple-negative primary breast cancers. J Clin Oncol. 2015;33(9):983–91.

    Article  CAS  PubMed  Google Scholar 

  8. Loi S, Sirtaine N, Piette F, Salgado R, Viale G, Van Eenoo F, et al. Prognostic and predictive value of tumor-infiltrating lymphocytes in a phase III randomized adjuvant breast cancer trial in node-positive breast cancer comparing the addition of docetaxel to doxorubicin with doxorubicin-based chemotherapy: BIG 02-98. J Clin Oncol. 2013;31(7):860–7.

    Article  CAS  PubMed  Google Scholar 

  9. Salgado R, Denkert C, Demaria S, Sirtaine N, Klauschen F, Pruneri G, et al. The evaluation of tumor-infiltrating lymphocytes (TILs) in breast cancer: recommendations by an International TILs Working Group 2014. Ann Oncol. 2015;26(2):259–71.

    Article  CAS  PubMed  Google Scholar 

  10. Adams S, Gray RJ, Demaria S, Goldstein L, Perez EA, Shulman LN, et al. Prognostic value of tumor-infiltrating lymphocytes in triple-negative breast cancers from two phase III randomized adjuvant breast cancer trials: ECOG 2197 and ECOG 1199. J Clin Oncol. 2014;32(27):2959–66.

    Article  PubMed  PubMed Central  Google Scholar 

  11. Ocana A, Diez-Gonzalez L, Adrover E, Fernandez-Aramburo A, Pandiella A, Amir E. Tumor-infiltrating lymphocytes in breast cancer: ready for prime time? J Clin Oncol. 2015;33(11):1298–9.

    Article  PubMed  Google Scholar 

  12. Tsoutsou PG, Bourhis J, Coukos G. Tumor-infiltrating lymphocytes in triple-negative breast cancer: a biomarker for use beyond prognosis? J Clin Oncol. 2015;33(11):1297–8.

    Article  PubMed  Google Scholar 

  13. Yuan Y, Failmezger H, Rueda OM, Ali HR, Graf S, Chin SF, et al. Quantitative image analysis of cellular heterogeneity in breast tumors complements genomic profiling. Sci Transl Med. 2012;4(157):157ra143.

    Article  PubMed  Google Scholar 

  14. Beck AH, Sangoi AR, Leung S, Marinelli RJ, Nielsen TO, van de Vijver MJ, et al. Systematic analysis of breast cancer morphology uncovers stromal features associated with survival. Sci Transl Med. 2011;3(108):108ra113.

    Article  PubMed  Google Scholar 

  15. Earl HM, Vallier AL, Hiller L, Fenwick N, Young J, Iddawela M, et al. Effects of the addition of gemcitabine, and paclitaxel-first sequencing, in neoadjuvant sequential epirubicin, cyclophosphamide, and paclitaxel for women with high-risk early breast cancer (Neo-tAnGo): an open-label, 2x2 factorial randomised phase 3 trial. Lancet Oncol. 2014;15(2):201–12.

    Article  CAS  PubMed  Google Scholar 

  16. Provenzano E, Vallier AL, Champ R, Walland K, Bowden S, Grier A, et al. A central review of histopathology reports after breast cancer neoadjuvant chemotherapy in the neo-tango trial. Br J Cancer. 2013;108(4):866–72.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  17. Ali HR, Irwin M, Morris L, Dawson SJ, Blows FM, Provenzano E, et al. Astronomical algorithms for automated analysis of tissue protein expression in breast cancer. Br J Cancer. 2013;108(3):602–12.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  18. Altman DG, McShane LM, Sauerbrei W, Taube SE. Reporting recommendations for tumor marker prognostic studies (REMARK): explanation and elaboration. BMC Med. 2012;10:51.

    Article  PubMed  PubMed Central  Google Scholar 

  19. Donovan MJ, Hamann S, Clayton M, Khan FM, Sapir M, Bayer-Zubek V, et al. Systems pathology approach for the prediction of prostate cancer progression after radical prostatectomy. J Clin Oncol. 2008;26(24):3923–9.

    Article  PubMed  Google Scholar 

  20. Cordon-Cardo C, Kotsianti A, Verbel DA, Teverovskiy M, Capodieci P, Hamann S, et al. Improved prediction of prostate cancer recurrence through systems pathology. J Clin Invest. 2007;117(7):1876–83.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  21. Curtis C, Shah SP, Chin SF, Turashvili G, Rueda OM, Dunning MJ, et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature. 2012;486(7403):346–52.

    CAS  PubMed  PubMed Central  Google Scholar 

  22. Ignatiadis M, Singhal SK, Desmedt C, Haibe-Kains B, Criscitiello C, Andre F, et al. Gene modules and response to neoadjuvant chemotherapy in breast cancer subtypes: a pooled analysis. J Clin Oncol. 2012;30(16):1996–2004.

    Article  CAS  PubMed  Google Scholar 

  23. Ali HR, Provenzano E, Dawson SJ, Blows FM, Liu B, Shah M, et al. Association between CD8+ T-cell infiltration and breast cancer survival in 12 439 patients. Ann Oncol. 2014;25(8):1536–43.

    Article  CAS  PubMed  Google Scholar 

  24. Ladoire S, Mignot G, Dabakuyo S, Arnould L, Apetoh L, Rébé C, et al. In situ immune response after neoadjuvant chemotherapy for breast cancer predicts survival. J Pathol. 2011;224(3):389–400.

    Article  CAS  PubMed  Google Scholar 

  25. Garcia-Martinez E, Gil GL, Benito AC, Gonzalez-Billalabeitia E, Conesa MA, Garcia Garcia T, et al. Tumor-infiltrating immune cell profiles and their change after neoadjuvant chemotherapy predict response and prognosis of breast cancer. Breast Cancer Res. 2014;16(6):488.

    Article  PubMed  PubMed Central  Google Scholar 

  26. Mattarollo SR, Loi S, Duret H, Ma Y, Zitvogel L, Smyth MJ. Pivotal role of innate and adaptive immunity in anthracycline chemotherapy of established tumors. Cancer Res. 2011;71(14):4809–20.

    Article  CAS  PubMed  Google Scholar 

  27. Sistigu A, Yamazaki T, Vacchelli E, Chaba K, Enot DP, Adam J, et al. Cancer cell-autonomous contribution of type I interferon signaling to the efficacy of chemotherapy. Nat Med. 2014;20(11):1301–9.

    Article  CAS  PubMed  Google Scholar 

  28. West NR, Milne K, Truong PT, Macpherson N, Nelson BH, Watson PH. Tumor-infiltrating lymphocytes predict response to anthracycline-based chemotherapy in estrogen receptor-negative breast cancer. Breast Cancer Res. 2011;13(6):R126.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  29. Rooney MS, Shukla SA, Wu CJ, Getz G, Hacohen N. Molecular and genetic properties of tumors associated with local immune cytolytic activity. Cell. 2015;160(1-2):48–61.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  30. Herbst RS, Soria JC, Kowanetz M, Fine GD, Hamid O, Gordon MS, et al. Predictive correlates of response to the anti-PD-L1 antibody MPDL3280A in cancer patients. Nature. 2014;515(7528):563–7.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  31. Powles T, Eder JP, Fine GD, Braiteh FS, Loriot Y, Cruz C, et al. MPDL3280A (anti-PD-L1) treatment leads to clinical activity in metastatic bladder cancer. Nature. 2014;515(7528):558–62.

    Article  CAS  PubMed  Google Scholar 

  32. Ali HR, Glont SE, Blows FM, Provenzano E, Dawson SJ, Liu B, et al. PD-L1 protein expression in breast cancer is rare, enriched in basal-like tumours and associated with infiltrating lymphocytes. Ann Oncol. 2015;26(7):1488–93.

    CAS  PubMed  Google Scholar 

  33. Uhlen M, Fagerberg L, Hallstrom BM, Lindskog C, Oksvold P, Mardinoglu A, et al. Proteomics. Tissue-based map of the human proteome. Science. 2015;347(6220):1260419.

    Article  PubMed  Google Scholar 

  34. Oh EY, Christensen SM, Ghanta S, Jeong JC, Bucur O, Glass B, et al. Extensive rewiring of epithelial-stromal co-expression networks in breast cancer. Genome Biol. 2015;16(1):128.

    Article  PubMed  PubMed Central  Google Scholar 

  35. Kumar A, Rao A, Bhavani S, Newberg JY, Murphy RF. Automated analysis of immunohistochemistry images identifies candidate location biomarkers for cancers. Proc Natl Acad Sci U S A. 2014;111(51):18249–54.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

Download references


We thank the patients who enrolled in the Neo-tAnGo trial and permitted use of their tissue for research. We are grateful to the many healthcare professionals who recruited patients to the trial and enabled the conduct of translational research by provision of bio specimens. We are grateful to the Histopathology Core Facility at the CRUK Cambridge Institue for digital slide archiving. We acknowledge funding from Cancer Research UK and NIHR Cambridge Biomedical Research Centre. HRA is an NIHR Academic Clinical Lecturer supported by a Career Development Fellowship from the Pathological Society of Great Britain and Northern Ireland and a Starter Grant for Clinical Lecturers from the Academy of Medical Sciences.

Author information

Authors and Affiliations


Corresponding authors

Correspondence to Helena M. Earl or Carlos Caldas.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

HRA and CC conceived of and designed the study. AD and HRA conducted image analysis. HRA conducted statistical analyses. NW and MJI supervised image analysis methods. HB conducted slide scanning and associated data curation. EP was the lead trial pathologist and conducted pathology review of pre-treatment biopsies. LH was the trial statistician and provided matched, curated clinical data. HME, A-LV, LH, JD, SB and CC led the Neo-tAnGo trial. JA, MI, TH, KM and SH recruited patients and provided samples. HRA, AD, CC, PDP, JB, MI and NW led collaborative studies between Oncology and Astronomy. HRA and CC wrote the manuscript and all authors provided edits and agreed to its publication.

Additional files

Additional file 1:

Supplementary methods. (DOCX 38 kb)

Additional file 2:

CONSORT diagram illustrating the flow of patients at each analytical stage. (PDF 7 kb)

Additional file 3:

Histograms depicting the distribution of all image metrics from pre-treatment samples. (PDF 25 kb)

Additional file 4:

Histograms depicting the distribution of all image metrics from post-treatment surgical samples. (PDF 26 kb)

Additional file 5:

Correlation matrix of all image metrics from pre-treatment samples. Correlation matrix of 15 image metrics derived from pre-treatment biopsies. Correlations are Pearson’s coefficients. Text size is proportional to the strength of the correlation. Pink boxes denote positive correlations and blue boxes denote negative correlations. (PDF 82 kb)

Additional file 6:

Correlation matrix of all image metrics from post-treatment surgical samples. Correlation matrix of 15 image metrics derived from post-treatment surgical samples. Correlations are Pearson’s coefficients. Text size is proportional to the strength of the correlation. Pink boxes denote positive correlations and those blue boxes denote negative correlations. (PDF 82 kb)

Additional file 7:

Distribution of post-treatment sample image metrics by tumor molecular subtype. Horizontal gray lines represent median values. Results of Kruskal-Wallis tests are depicted within graphs; red text denotes p values <0.05. (PDF 177 kb)

Additional file 8:

Univariate and multivariate logistic regression models for all 15 image metrics. (DOCX 16 kb)

Additional file 9:

Multivariate logistic regression model including clinical prognostic factors and image metrics derived from model depicted in Additional File 8. (DOCX 14 kb)

Additional file 10:

Scatter plot of median lymphocyte density and lymphocyte fraction in pre-treatment biopsies. Depicted rho value is a Pearson’s correlation coefficient. a.u. arbitrary units. (PDF 19 kb)

Additional file 11:

Examples of images and associated metrics from cases with varying median lymphocyte density. Depicted density metrics have been arbitrarily offset to depict positive values but are normalized across samples, hence comparable between plots (right y-axis). Note that absolute counts are depicted against different scales (left y-axis). (PDF 63 kb)

Additional file 12:

Distribution of median lymphocyte density from pre-treatment biopsies by clinical variables. Horizontal gray lines represent median values. Results of Kruskal-Wallis tests are depicted within graphs; red text denotes p values <0.05. (PDF 62 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Ali, H.R., Dariush, A., Provenzano, E. et al. Computational pathology of pre-treatment biopsies identifies lymphocyte density as a predictor of response to neoadjuvant chemotherapy in breast cancer. Breast Cancer Res 18, 21 (2016).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: