- Research article
- Open Access
On-treatment biomarkers can improve prediction of response to neoadjuvant chemotherapy in breast cancer
Breast Cancer Researchvolume 21, Article number: 73 (2019)
Neoadjuvant chemotherapy is increasingly given preoperatively to shrink breast tumours prior to surgery. This approach also provides the opportunity to study the molecular changes associated with treatment and evaluate whether on-treatment sequential samples can improve response and outcome predictions over diagnostic or excision samples alone.
This study included a total of 97 samples from a cohort of 50 women (aged 29–76, with 46% ER+ and 20% HER2+ tumours) with primary operable breast cancer who had been treated with neoadjuvant chemotherapy. Biopsies were taken at diagnosis, at 2 weeks on-treatment, mid-chemotherapy, and at resection. Fresh frozen samples were sequenced with Ion AmpliSeq Transcriptome yielding expression values for 12,635 genes. Differential expression analysis was performed across 16 patients with a complete pathological response (pCR) and 34 non-pCR patients, and over treatment time to identify significantly differentially expressed genes, pathways, and markers indicative of response status. Prediction accuracy was compared with estimations of established gene signatures, for this dataset and validated using data from the I-SPY 1 Trial.
Although changes upon treatment are largely similar between the two cohorts, very few genes were found to be consistently different between responders and non-responders, making the prediction of response difficult. AAGAB was identified as a novel potential on-treatment biomarker for pathological complete response, with an accuracy of 100% in the NEO training dataset and 78% accuracy in the I-SPY 1 testing dataset. AAGAB levels on-treatment were also significantly predictive of outcome (p = 0.048, p = 0.0036) in both cohorts. This single gene on-treatment biomarker had greater predictive accuracy than established prognostic tests, Mammaprint and PAM50 risk of recurrence score, although interestingly, both of these latter tests performed better in the on-treatment rather than the accepted pre-treatment setting.
Changes in gene expression measured in sequential samples from breast cancer patients receiving neoadjuvant chemotherapy resulted in the identification of a potentially novel on-treatment biomarker and suggest that established prognostic tests may have greater prediction accuracy on than before treatment. These results support the potential use and further evaluation of on-treatment testing in breast cancer to improve the accuracy of tumour response prediction.
Chemotherapy is among the most common effective treatments for breast cancer, alongside radiotherapy, hormone therapy, and targeted treatments. Neoadjuvant chemotherapy is given prior to surgery with the aim to reduce the tumour burden and to provide early information on the response to treatment . Studies have shown patients with tumours that have a pathological complete response (pCR) following neoadjuvant chemotherapy are much less likely to recur than those in women with residual disease . Neoadjuvant chemotherapy is now considered as the standard of care in breast cancer and has seen a rise in recent years with data from powered studies suggesting that the pathological complete response achieved following neoadjuvant chemotherapy might be a surrogate of good prognosis . A recent meta-analysis also showed significant tumour response and an increase in the rate of breast-conserving surgery following NACT with good rates of long-term local recurrence (5.5% vs. 15.9% adjuvant chemotherapy), however with an increase in the rate of short-term local relapses (1.35 RR 0–4 years, 1.53 RR 5–9 years) .
Neoadjuvant treatment provides a “window of opportunity” (Fig. 1a), where sequential sampling of a tumour enables observation of the changes that occur in response to treatment to be measured and considered in the context of response and outcome . Neoadjuvant therapy studies and pre-surgical treatments allow for a unique in vivo analysis of tumour treatment response , as well as the possibility of predicting the response to treatment earlier in the treatment . It has been suggested that on-treatment biomarkers may be superior to those measured before exposure to treatment [3, 7]. On-treatment information has already been shown to be informative for the accurate prediction of response to endocrine therapy . Here, it was found that patients with elevated Ki67 levels (higher than 10%) at 2 or 4 weeks exhibited resistance to endocrine therapy and were triaged to neoadjuvant chemotherapy . We have also demonstrated the potential of on-treatment biomarkers by developing a four-gene signature which combined pre-treatment expression levels or two biomarkers (IL6ST and NGFRAP1) with patient-matched 2-week on-treatment expression levels of two proliferation markers (ASPM, MCM4) to accurately predict the response to endocrine therapy in a blinded independent validation set .
Gene expression-based studies of neoadjuvant chemotherapy treatment to date have largely been limited to studying the association of pre-treatment samples with pathological response [9, 10]. Patient-matched sequential sampling gene expression studies have been previously attempted; however, they have not evaluated the predictive capacity or proposed new on-treatment predictive biomarkers [11,12,13].
In this study, we present the largest sequentially sampled patient-matched analysis of neoadjuvant chemotherapy-treated breast cancer tumours to evaluate whether on-treatment biomarkers can improve the accuracy of predicting response before resection. Numbers of patients with sequential breast tumour samples are limited, but we compare and validate our results with the data from the I-SPY 1 Trial.
Materials and methods
Patients, response criteria, and samples
The NEO study consists of 50 breast cancer patients with sequentially sampled biopsies at four time points, pre-treatment (PT, 34 samples), 2 weeks on treatment (T2, 12 samples), mid-chemo (TM, 23 samples), and at surgical resection (TS, 24 samples) with three clinically defined response statuses: complete responders (pCR by resection), good responders (tumour volume reduction, but lack of pCR), and non-responders (progressive disease or small tumour volume changes on treatment). Patients were of mixed histological grade and HER2 status; ages ranged from 29 to 76. Patients were primarily treated with 3 cycles of FEC and docetaxel with Herceptin where appropriate. Three patients received paclitaxel, one patient received additional carboplatin, one patient received Epi-cyclophosphamide and paclitaxel, and one patient received docetaxel and cyclophosphamide. Eligible patients were women with histologically confirmed invasive breast tumours and with no evidence of distant metastatic disease, no prior history of malignancy, and fit enough to receive chemotherapy in the opinion of the responsible clinician irrespective of age. All cases were discussed at the breast MDM in Edinburgh Breast Unit at the Western General Hospital, and consensus from this meeting was to be treated with neoadjuvant chemotherapy.
Core needle (16-gauge) biopsies were taken from the primary breast tumours before treatment (PT) and between 10 and 14 days after the first dose (T2) of chemotherapy. A third sample was taken at the mid-chemotherapy point day 20–21 (TM), and finally, a core biopsy was taken from the excision specimen (TS) after it has been removed prior to submission to pathology. Fixed and frozen samples of normal and tumour tissue were collected from all specimens.
Gene expression profiling
RNA extraction was performed via Ribo0-RNAseq, and whole transcriptome sequencing was performed with Life Sciences Ion AmpliSeq™ Transcriptome Human Gene Expression Kit. This generated greater than 8 M reads per sample with an average of more than 90% valid reads for 12,365 targeted genes. Most analyses were performed in R (http://www.r-project.org) using packages available through CRAN (http://cran.r-project.org/) and Bioconductor (http://www.bioconductor.org/). Outside of the R environment, the stand-alone application Multiple Experiment Viewer (http://mev.tm4.org/) was utilised for pairwise ranked product feature selection, and DAVID (https://david.ncifcrf.gov/) was used for pathway identification. Additionally, the python package scikit-learn  was used for unsupervised clustering analysis. Ninety-seven samples were analysed over 13 AmpliSeq chips, but no systematic batch effects were evident and no batch correction was performed within the training data. Gene expression data for the NEO study has been made publicly available at the NCBI GEO data repository under accession GSE122630.
The I-SPY 1 Trial is composed of patients with invasive breast cancer > 3 cm, or at least one tumour-positive axillary lymph node . Patients were treated with an anthracycline-based chemotherapy followed by taxanes . Samples were normalised and corrected for background red/green signal; Bioconductor R packages marray and limma  were used to this end. From the original 221 patients, only 36 had matching pre- and on-treatment samples, and 39 had matching biopsy and excision samples; pathological complete response was used for response criteria. Pairwise gene expression was handled with SAM and follow-up analysis with Ingenuity Pathway Analysis from QIAGEN Bioinformatics. I-SPY 1 Trial data is hosted at NCBI GEO under accession GSE32603 .
Statistical analysis methods
Principal component analysis (PCA) was performed on unsupervised gene lists to reduce dimensionality and visualise differences in response at all times and to identify present differences between patient treatment statuses. Local Fisher discriminant analysis (LFDA)  was used at each time point to determine if the response groups could be distinguished with treatment time with a semi-supervised clustering approach, concurrently with class advised K-means clustering. LFDA is a form of supervised dimensionality reduction that maximises between-class scattering and minimises within class scatter, and is a refined version of normal Fisher discriminant analysis ; this exploratory analysis was used in order to visualise comparative differences in treatment time, not as a means of feature selection. Pair-wise significance analysis of microarrays  using the siggenes package in R was used to consider the consistency of differentially expressed genes due to treatment in the sequential patient-matched samples. Rank Product analysis was used to identify differentially expressed genes between response classes at each time point. Successive levels of standard p value (0.05, 0.01, 0.001), without correction for multiple testing, were used in order to determine the number of differentially expressed genes, and at lower p values which the time points had the most strongly differentiating genes. Significance analysis of microarrays was also performed using varying false discovery rates (1%, 5%, 10%) to try to identify common differentially expressed genes between responders and non-responders across both datasets at each time point. Gene score enrichment analysis was used to validate the time point selection by looking for the highest number of enriched pathways. The gene list from the most differential time point (TM) using the NEO dataset was extracted and used in a random forest model (10,000 trees, m-try as the square root of the feature number) using pCR status as the class label (clinician-identified pCR and non-pCR). The most deterministic genes for class prediction were fed into a classification and regression tree in order to produce a maximally reduced and repeatable model; this methodology is further described by Turnbull et al. . The CART decision tree was applied to the NEO dataset for training and tested in the independent I-SPY 1 dataset using the same cut-points determined by mean-centring the datasets. This protocol was repeated using the gene list from the pre-treatment only samples, using the same p values and tree configurations for selection. Survival analysis was performed at different time points using the log-rank test. Intrinsic subtypes, Mammaprint, and risk or relapse scores were estimated from the gene expression data using the GeneFu R package .
Gene expression differences between responding and non-responding breast cancer tumours treated with chemotherapy are subtle and time dependent
Unsupervised principal component analysis was first used to assess whether sequential patient-matched samples from patients receiving chemotherapy (Fig. 1b) would cluster by time point or response status. There was no significant grouping of patients according to sampling time: pre, early, or later after chemotherapy in either the NEO or I-SPY 1 studies (Fig. 1b). There were no significant differences between the two cohorts in terms of age, grade hormone receptor, and HER2 status, and the subset of patients with mid-chemo samples was not significantly different from the whole NEO cohort (Table 1). Patient-matched samples enable the pairwise analysis to look for consistent changes in the gene expression during treatment. Pairwise significance analysis of microarray analysis using a 10% false discovery rate (FDR) identified a relatively small proportion of overlapping upregulated (5%) and downregulated (4%) genes between the two studies. However, genes that were increased or decreased in response to treatment in one study were also clearly and consistently increased or decreased in the other study (Additional file 1: Figure S1A), further suggesting it would be difficult to discriminate responders from non-responders. Indeed, there was no clustering by response status before or during treatment (Additional file 1: Figure S1B). These results likely reflect the considerable inter-patient differences being substantially larger and more significant than the subtler commonalities in gene expression of a particular time point or response class of each tumour. More encouragingly, semi-supervised LFDA of each time point revealed significant separation on-treatment that was not apparent in pre-treatment samples; this indicated that there are meaningful differences between the classes, as early as 2 weeks on-treatment (Fig. 2a). Complete responders and non-responsive patients were more clearly separated than partially responding patients. These results suggest that there is a potentially greater predictive value looking at on-treatment than pre-treatment biomarkers.
Responding and non-responding tumours are more different upon exposure to chemotherapy
In an attempt to quantify the molecular differences between the response groups at each time point, rank product analysis was performed at different standard p values (0.05, 0.01, and 0.001). This approach was hampered by different numbers of samples at each time point (with T2 having very few samples); however, the number of genes differentially expressed at all p values tended to be greater during rather than before treatment (Fig. 2b). Similar results were also seen using 1%, 5%, and 10% FDR (Fig. 2b). The biggest differences between the response classes were at TM (mid-chemo), which agrees with the LDFA results, which showed the least amount of overlap of the response classes at TM. Gene set enrichment analysis across the response classes at each time point also demonstrated more enriched pathways after 2 weeks of treatment (29), mid-chemo (30), and resection (29), compared to pre-treatment (18) (Additional file 2: Figure S2A). Next, we sought to examine common differentially expressed genes between responders and non-responders across the two datasets. Far more genes were commonly significantly differentially expressed (FDR = 10% between responders and non-responders on-treatment in the NEO and I-SPY 1 datasets compared with pre-treatment. In accordance with the LFDA results, more significantly differentially expressed genes (1814) were observed between on-treatment samples, with 6% (197), but only one was common between NEO and I-SPY pre-treatment (Additional file 2: Figure S2B and Additional file 4). Examination of the 468 most significantly differentially expressed genes (p < 0.001) between responders and non-responders in the NEO dataset at mid-chemo did not clearly distinguish between response groups or time points illustrated by the heatmap in Additional file 3: Figure S3, further demonstrating that identifying biomarkers of response to chemotherapy is very difficult.
We were also keen to evaluate whether the intrinsic subtype assigned to tumours would alter upon treatment. Looking at the NEO and I-SPY datasets, together we found that basal tumours were relatively stable with only 2/19 (11%) tumours changing. More tumours were classified as Luminal A or normal-like on-treatment, which likely reflects a reduction in the expression of proliferation genes during chemotherapy (Fig. 2c).
AAGAB is a promising potential novel on-treatment biomarker of response to chemotherapy
The mid-chemo gene list from the NEO dataset (1102 genes, unadjusted p value = 0.01) was fed to a random forest model for further feature selection and classification and regression tree (CART) model, which reported AAGAB as the most predictive gene for response prediction in the NEO training dataset with 100% accuracy for pCR prediction on the mid-chemo samples (Fig. 3a). Validation was conducted completely independently on publicly available sequentially sampled chemotherapy data from the I-SPY 1 Trial  and reported 76% accuracy using AAGAB at the same expression level on the scaled and centred expression data at the on-treatment time point prior to resection (T2). For comparison, the pre-treatment only sample gene lists were put through the same protocol in order to consider whether highly predictive models could be generated before chemotherapy. IGF1R was the most predictive pre-treatment marker with an accuracy of 74% and 63% in the NEO and I-SPY datasets, respectively (Table 2). AAGAB was the sixth most accurate predictor (65%, 57%); receiver operator curves show the relative specificity and sensitivity of this marker pre- and on-treatment (Fig. 3b). Gene expression levels of AAGAB were lower in responders across all time points in the NEO cohort but were most significantly different at mid-chemo. In the I-SPY dataset, AAGAB was significantly lower before treatment and at excision (Fig. 3c). We wondered whether AAGAB was lower in responders due to a reduction in proliferation, but Pearson correlation analysis with common proliferation-associated genes (TOP2A, BUB1, MKI67, MCM2, FOXM1, and PCNA) demonstrated no significant correlation to any of these genes (Fig. 3d), suggesting that AAGAB is independent of proliferation. Survival analysis demonstrated that response status predicted by AAGAB level, at mid chemo in the NEO study and at 2 weeks in the I-SPY 1, was significantly associated with the outcome (NEO p = 0.048, I-SPY 1 p = 0.0036) (Fig. 3e). Interestingly, the level of AAGAB before treatment was not associated with the outcome in either cohort (p = 0.71 and p = 0.2, Fig. 3e). None of the other top 10 pre- or on-treatment markers was significantly associated with the outcome in both datasets (Table 2); only one gene (ARF5) was associated with the outcome in the NEO dataset (p = 0.004). Taken together, the single gene on-treatment biomarker AAGAB appears to outperform novel pre-treatment markers and established prognostic tests in predicting pCR and long-term outcome to chemotherapy.
Comparison of pre- and on-treatment predictions of response and outcome
We were also keen to assess whether estimations of established prognostic signatures might be different upon treatment and if on-treatment might be more accurate. All and almost all responding patients were predicted to have poor outcomes with the estimated Mammaprint , PAM50 , or rorS  signatures in pre-treatment samples of the NEO cohort, whereas around half of the responding patients were predicted as good outcome using on-treatment data (Fig. 4a). Overall accuracy improved by 2–8% using on- rather than pre-treatment data; however, improvement in the predictive power of these tests was not uniform between response classes. Good outcome predictions for responders to neoadjuvant chemotherapy saw an aggregate increase in predictive power from 11 to 44.4%, whilst poor outcome predictions for non-responders saw a moderate decrease in accuracy, 75 to 63%. None of the gene expression signatures either pre- or on-treatment or established prognostic markers (NPI, Grade, Her2 status) was significantly associated with the outcome in contrast to the remarkable performance of on-treatment measurement of AAGAB (Fig. 4b).
Determining molecular differences between tumours to select the most effective treatment is the defining feature of precision oncology. Accurately predicting which patients will respond to treatment before exposure relies on a highly specific target. In breast cancer, ER status is a good indicator of response to endocrine treatment, but resistance, both primary and acquired, is common. Chemotherapy is an unselective treatment, relying on cancer cells growing faster than normal cells. The results presented here, along with others [7, 8], suggest on-treatment biomarkers have improved value in predicting whether tumours respond to treatment and are associated with the outcome. Changes in gene expression in sequential patient-matched were fairly consistent in response to chemotherapy across two independent datasets, regardless of the response status. Identifying molecular markers between responding and non-responding tumours was much more challenging. We previously demonstrated that lobular and ductal breast cancers respond to endocrine treatment in the same way, despite clear histological and molecular distinctions that are apparent and maintained on-treatment , demonstrating that pre-treatment variations do not necessarily lead to differences in response. The results of this study are somewhat exploratory, rather than definitive, but further illustrate the considerable potential value of on-treatment sampling.
There are no universally agreed-upon markers predictive of response to chemotherapy, and the few that have been investigated in the neoadjuvant setting typically centre around established markers including ER, P53 HER2, and Ki-67 ; thus, the introduction of new novel biomarkers can expand the currently available clinical options for physicians. A study published over a decade ago stated that the differences in gene expression between responders and non-responders to neoadjuvant chemotherapy must be rather subtle . The results presented here confirm this statement; however, our results suggest that on-treatment biomarkers may provide important information for predicting response.
As cancer is inherently a proliferative disease, measuring the change in markers of proliferation on-treatment is logical and genes like ki-67 have been demonstrated previously to be potentially a new clinical tool for disease prognosis and prediction [24, 25]. It is therefore all the more interesting that the potentially novel biomarker identified in this study, AAGAB is not tightly correlated with known markers of proliferation. AAGAB has primarily been studied for its role in punctate palmoplantar keratoderma  and the role of adaptin in the clathrin-independent endocytosis of epidermal growth factors. The level of AAGAB was found to be prognostic of response (p < 0.001) in renal cancers (favourably) and in thyroid cancers (unfavourably) from the TCGA study, and expression is elevated in breast cancer, relative to the normal breast (p < 0.001). However, the exact role of AAGAB in breast cancer is currently unclear and potentially warrants further investigation. Clearly, further validation of the role of AAGAB in breast cancer is warranted and will be performed as new neoadjuvant chemotherapy datasets become available. This study supports the use and identification of genes or markers from on-treatment biopsies as a tool for improving patient response classification. We propose that the use of on-treatment samples offers valuable insight into the dynamic changes correlated with response, and submit our findings as support for continued neoadjuvant sampling, and novel biomarker generation.
We have identified AAGAB as a novel on-treatment biomarker for accurate prediction of pCR and outcome in patients treated with neoadjuvant chemotherapy. A semi-supervised analysis and evaluation of estimations of established molecular signatures also highlight the potential value of on-treatment biomarkers. Combining on-treatment biomarkers with known clinical prognostic factors could further improve the accuracy of response predictions and deserve further study. On-treatment expression changes in the neoadjuvant setting may offer greater possibilities for the identification and creation of more future novel biomarkers.
Availability of data and materials
Gene expression data associated from the study has been made publicly available at NCBI GEO under the accession number GSE122630.
Schott AF, Hayes DF. Defining the benefits of neoadjuvant chemotherapy for breast cancer. J Clin Oncol. 2012;30:1747–9.
Esserman LJ, Berry DA, DeMichele A, Carey L, Davis SE, Buxton M, Hudis C, Gray JW, Perou C, Yau C, Livasy C, Krontiras H, Montgomery L, Tripathy D, Lehman C, Liu MC, Olopade OI, Rugo HS, Carpenter JT, Dressler L, Chhieng D, Singh B, Mies C, Rabban J, Chen Y-Y, Giri D, van ‘t Veer L, Hylton N. Pathologic complete response predicts recurrence-free survival more effectively by cancer subset: results from the I-SPY 1 TRIAL--CALGB 150007/150012, ACRIN 6657. J Clin Oncol. 2012;30:3242–9.
Untch M, Konecny GE, Paepke S, von Minckwitz G. Current and future role of neoadjuvant therapy for breast cancer. Breast. 2014;23:526–37.
Early Breast Cancer Trialists’ Collaborative Group (EBCTCG). Long-term outcomes for neoadjuvant versus adjuvant chemotherapy in early breast cancer: meta-analysis of individual patient data from ten randomised trials. Lancet Oncol. 2018;19:27–39.
Sims AH, Bartlett JM. Approaches towards expression profiling the response to treatment. Breast Cancer Res. 2008;10:115.
Macaskill EJ, Dixon JM. Neoadjuvant endocrine therapy. In: Kuerer HM, editor. Breast surgical oncology. New York: McGraw Hill Medical; 2010.
Turnbull AK, Arthur LM, Renshaw L, Larionov AA, Kay C, Dunbier AK, Thomas JS, Dowsett M, Sims AH, Dixon JM. Accurate prediction and validation of response to endocrine therapy in breast cancer. J Clin Oncol. 2015;33:2270–8.
Ellis MJ, Suman VJ, Hoog J, Goncalves R, Sanati S, Creighton CJ, DeSchryver K, Crouch E, Brink A, Watson M, Luo J, Tao Y, Barnes M, Dowsett M, Budd GT, Winer E, Silverman P, Esserman L, Carey L, Ma CX, Unzeitig G, Pluard T, Whitworth P, Babiera G, Guenther JM, Dayao Z, Ota D, Leitch M, Olson JA, Allred DC, et al. Ki67 proliferation index as a tool for chemotherapy decisions during and after neoadjuvant aromatase inhibitor treatment of breast cancer: results from the American College of Surgeons Oncology Group Z1031 Trial (Alliance). J Clin Oncol. 2017;35:1061–9.
Sorlie T, Perou CM, Fan C, Geisler S, Aas T, Nobel A, Anker G, Akslen LA, Botstein D, Borresen-Dale AL, Lonning PE. Gene expression profiles do not consistently predict the clinical treatment response in locally advanced breast cancer. Mol Cancer Ther. 2006;5:2914–8.
Hatzis C, Pusztai L, Valero V, Booser DJ, Esserman L, Lluch A, Vidaurre T, Holmes F, Souchon E, Wang H, Martin M, Cotrina J, Gomez H, Hubbard R, Chacón JI, Ferrer-Lozano J, Dyer R, Buxton M, Gong Y, Wu Y, Ibrahim N, Andreopoulou E, Ueno NT, Hunt K, Yang W, Nazario A, DeMichele A, O’Shaughnessy J, Hortobagyi GN, Symmans WF. A genomic predictor of response and survival following taxane-anthracycline chemotherapy for invasive breast cancer. JAMA. 2011;305:1873–81.
Magbanua MJM, Wolf DM, Yau C, Davis SE, Crothers J, Au A, Haqq CM, Livasy C, Rugo HS, Esserman L, Park JW, van ‘t Veer LJ. Serial expression analysis of breast tumors during neoadjuvant chemotherapy reveals changes in cell cycle and immune pathways associated with recurrence and response. Breast Cancer Res. 2015;17:73.
Hannemann J, Oosterkamp HM, Bosch CAJ, Velds A, Wessels LFA, Loo C, Rutgers EJ, Rodenhuis S, van de Vijver MJ. Changes in gene expression associated with response to neoadjuvant chemotherapy in breast cancer. J Clin Oncol. 2005;23:3331–42.
Gonzalez-Angulo AM, Iwamoto T, Liu S, Chen H, Do KA, Hortobagyi GN, Mills GB, Meric-Bernstam F, Symmans WF, Pusztai L. Gene expression, molecular class changes, and pathway analysis after neoadjuvant systemic therapy for breast cancer. Clin Cancer Res. 2012;18:1109–19.
Abraham A, Pedregosa F, Eickenberg M, Gervais P, Muller A, Kossaifi J, Gramfort A, Thirion B, Varoquaux G. Machine learning for neuroimaging with scikit-learn. Front Neuroinform. 2014;8:14.
Smyth GK, Michaud J, Scott HS. Use of within-array replicate spots for assessing differential expression in microarray experiments. Bioinformatics. 2005;21:2067–75.
Tang Y, Li W. lfda: an R package for local Fisher discriminant analysis and visualization; 2016.
Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci. 2001;98:5116–21.
Gendoo DM, Ratanasirigulchai N, Schröder MS, Paré L, Parker JS, Prat A, Haibe-Kains B. Genefu: an R/Bioconductor package for computation of gene expression-based signatures in breast cancer. Bioinformatics. 2016;32(7):1097–9.
van ‘t Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, Schreiber GJ, Kerkhoven RM, Roberts C, Linsley PS, Bernards R, Friend SH. Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002;415:530.
Parker JS, Mullins M, Cheang MC, Leung S, Voduc D, Vickery T, Davies S, Fauron C, He X, Hu Z, Quackenbush JF, Stijleman IJ, Palazzo J, Marron JS, Nobel AB, Mardis E, TO N, Ellis MJ, Perou CM, Bernard PS. Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol. 2009;27:1160.
Wallden B, Storhoff J, Nielsen T, Dowidar N, Schaper C, Ferree S, Liu S, Leung S, Geiss G, Snider J, Vickery T, Davies SR, Mardis ER, Gnant M, Sestak I, Ellis MJ, Perou CM, Bernard PS, Parker JS. Development and verification of the PAM50-based Prosigna breast cancer gene signature assay. BMC Med Genet. 2015;8:54.
Arthur LM, Turnbull AK, Webber VL, Larionov AA, Renshaw L, Kay C, Thomas JS, Dixon JM, Sims AH. Molecular changes in lobular breast cancers in response to endocrine therapy. Cancer Res. 2014;74:5371–6.
Tewari M, Krishnamurthy A, Shukla HS. Predictive markers of response to neoadjuvant chemotherapy in breast cancer. Surg Oncol. 2008;17:301–11.
Colozza M, Azambuja E, Cardoso F, Sotiriou C, Larsimont D, Piccart MJ. Proliferative markers as prognostic and predictive tools in early breast cancer: where are we now? Ann Oncol Off J Eur Soc Med Oncol. 2005;16:1723–39.
Urruticoechea A, Smith IE, Dowsett M. Proliferation marker Ki-67 in early breast cancer. J Clin Oncol. 2005;23:7212–20.
Pohler E, Mamai O, Hirst J, Zamiri M, Horn H, Nomura T, Irvine AD, Moran B, Wilson NJ, Smith FJD, Goh CSM, Sandilands A, Cole C, Barton GJ, Evans AT, Shimizu H, Akiyama M, Suehiro M, Konohana I, Shboul M, Teissier S, Boussofara L, Denguezli M, Saad A, Gribaa M, Dopping-Hepenstal PJ, McGrath JA, Brown SJ, Goudie DR, Reversade B, et al. Haploinsufficiency for AAGAB causes clinically heterogeneous forms of punctate palmoplantar keratoderma. Nat Genet. 2012;44:1272–6.
We are grateful for the help and support of the Wellcome Trust Clinical Research Facility under the direction of Lee Murphy.
Cancer Research UK is thanked for providing funding for a PhD stipend and a development fund to the Edinburgh CRUK Centre. AHS is grateful for the funding from Breast Cancer Now.
Ethics approval and consent to participate
The protocol for the “NEO study: a study of factors predicting response to neoadjuvant chemotherapy in breast cancer” was approved by the South East Scotland Research Ethics Committee 01 with REC reference: 13/SS/0236. All patients signed an informed consent at least 24 h after receiving the patient information leaflet and after having the chance to discuss thoroughly their participation to this study either with treating clinician or research nurses.
Consent for publication
All authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Figure S1. A, Pairwise significant analysis of microarrays (FDR = 10%) demonstrating that whilst only a relatively small proportion of genes are significantly up- or downregulated in response to chemotherapy in both datasets, overall changes in patient-matched sequential samples response to treatment are highly consistent. Red = upregulated, blue = downregulated on- relative to pre-treatment. Gene lists are in Additional file 4: Table S1. B, Unsupervised principal component analysis cannot distinguish responding from non-responding breast tumours receiving chemotherapy, before or on-treatment. (JPG 126 kb)
Figure S2. A, Gene set enrichment analysis (GSEA) results showing greater numbers of enriched pathways between responders and non-responders on-treatment compared to pre-treatment in the NEO dataset. B, Venn diagrams indicating that there were many more overlapping significantly differentially expressed genes between responders and non-responders across the two studies on-treatment compared to pre-treatment. Gene lists are for FDR = 10% (see Additional file 4: Table S3). (JPG 72 kb)
Figure S3. Heatmap of the 468 most significantly differentially expressed genes (p < 0.001) between responders and non-responders in the NEO dataset at mid-chemo, demonstrating rather poor separation between the response groups and time points. Gene list is in Additional file 4: Table S2. (JPG 91 kb)
Table S1. Gene lists of pairwise analysis of pre- and on-treatment sequential patient-matched samples from NEO and I-SPY datasets using Significance analysis of microarrays with FDR = 10%. Table S2. Gene lists distinguishing between responders and non-responders at different time points across the NEO and ISPY datasets using rank product analysis with p < 0.05, p < 0.01, and p < 0.001. Pre-treatment (TP), 2 weeks (T2), mid-chemo (TM), and surgery (TS). Table S3. Gene lists distinguishing between responders and non-responders at different time points across the NEO and I-SPY datasets using significance analysis of microarrays with FDR = 10%, FDR = 5%, and FDR = 1%. Pre-treatment (TP), 2 weeks (T2), mid-chemo (TM), and surgery (TS). (XLSX 1184 kb)