Evaluation of ultra-deep targeted sequencing for personalized breast cancer care

Introduction: The increasing number of targeted therapies, together with a deeper understanding of cancer genetics and drug response, has prompted major healthcare centers to implement personalized treatment approaches relying on high-throughput tumor DNA sequencing. However, the optimal way to implement this transformative methodology is not yet clear. Current assays may miss important clinical information such as the mutation allelic fraction, the presence of sub-clones or chromosomal rearrangements, or the distinction between inherited variants and somatic mutations. Here, we present the evaluation of ultra-deep targeted sequencing (UDT-Seq) to generate and interpret the molecular profile of 38 breast cancer patients from two academic medical centers.

Methods: We sequenced 47 genes in matched germline and tumor DNA samples from 38 breast cancer patients. The selected genes, or the pathways they belong to, can be targeted by drugs or are important in familial cancer risk or drug metabolism.

Results: By sequencing matched tumor and germline DNA and using a dedicated analysis, UDT-Seq achieves high sensitivity for identifying mutations in tumors with low malignant cell content. Applying UDT-Seq to matched tumor and germline specimens from the 38 patients resulted in a proposal for at least one targeted therapy for 22 patients, the identification of tumor sub-clones in 3 patients, the suggestion of potential adverse drug effects in 3 patients and a recommendation for genetic counseling for 2 patients.

Conclusion: Overall, our study highlights the additional benefits of a sequencing strategy that includes germline DNA and is optimized for heterogeneous tumor tissues.

sample collection or assay. Remnant tissue samples were taken for research use at the time of clinically indicated surgeries under the direct supervision of licensed members of the pathology department at the respective institutions. Each patient who consented to donate a tissue sample was also asked to provide either a blood or a saliva sample.
Blood was obtained by venipuncture of the median cubital vein into an uncoated serum tube and allowed to clot at room temperature. Saliva was collected in an Oragene saliva kit (OGR-500) (DNA Genotek, Ottawa, Canada). Detailed standard pathologic characteristics are provided in Table S1. ASCO/CAP guidelines [1,2] were followed for the evaluation of estrogen receptor, progesterone receptor and Her2. The cellularity of the frozen sample was estimated as the average cellularity of the tumor tissue from two adjacent H&E-stained paraffin-embedded sections. Most specimens were collected from untreated primary tumors, but our cohort also included patients with regional or distant disease treated with neoadjuvant therapy prior to surgery or biopsy.

Description:
We included 38 patients from our breast cancer clinics for whom a frozen tissue specimen was available from the biorepository and who had properly consented to genetic analysis (Table S1). Three specimens were from distant sites (lymph node, liver and brain metastasis), two were from primary invasive lobular carcinomas, and 33 were from primary invasive ductal carcinomas, including 5 with mixed ductal/lobular features. The majority were diagnosed with Nottingham histologic grade II (N=16) or grade III (N=12). All specimens underwent routine molecular testing for receptor status (ER and PR using immunohistochemical staining). Her2 status was assayed using either immunostaining (N=19), in situ hybridization (N=13), or both (N=5), resulting in the identification of 6 specimens that were Her2-overexpressed by ASCO/CAP standards.
Histological inspection revealed a broad range of cellularity (Figure 1). More than half of the tumors contained fewer than 60% malignant cells in the section studied for diagnosis. These samples would have failed the inclusion criteria of most cancer genomic studies, such as the TCGA study. Ten samples had fewer than 40% tumor cells and therefore failed the criteria for analysis by classical Sanger sequencing. Finally, 4 tumors had less than 20% cellularity, failing to meet the criteria for genetic profiling at most diagnostic laboratories, including those using next-generation sequencing. We anticipated that the sensitivity of our assay would overcome this difficulty and permit the identification of actionable mutations in the majority of the samples collected.
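To illustrate why low cellularity is limiting, the following minimal sketch estimates the allelic fraction expected for a clonal heterozygous somatic mutation at the cellularity thresholds quoted above. It assumes a diploid locus with no copy-number change in either tumor or normal cells; this simplification and the function name `expected_het_vaf` are illustrative assumptions, not part of the analysis described here.

```python
def expected_het_vaf(cellularity: float) -> float:
    """Expected allelic fraction of a clonal heterozygous somatic mutation.

    Assumption (illustrative only): the locus is diploid in both tumor and
    normal cells, so one of two alleles is mutated in each tumor cell.
    """
    if not 0.0 <= cellularity <= 1.0:
        raise ValueError("cellularity must be between 0 and 1")
    return cellularity / 2.0


# Cellularity thresholds quoted in the text (60%, 40% and 20%).
for cellularity in (0.60, 0.40, 0.20):
    vaf = expected_het_vaf(cellularity)
    print(f"{cellularity:.0%} tumor cells -> expected allelic fraction {vaf:.0%}")
```

Under these assumptions, a tumor with 20% cellularity would carry a heterozygous mutation in only about 10% of reads, and sub-clonal mutations would fall lower still.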

UDT-Seq assay design
Gene selection: We assembled a panel of 47 genes to be analyzed by UDT-Seq. The genes were selected for their clinical importance or their relevance to breast cancer genetics and treatment (Table 1). Thirteen genes were selected because of the availability of an FDA-approved drug targeting the genes or their pathway (e.g. ABL, EGFR, ERBB2), with the rationale of repurposing approved drugs based on mutational profile. An additional ten genes were selected by the same rationale for drugs that were in clinical trials at the time of the study (e.g. PTEN, PIK3CA and KRAS). Importantly, for some of the trials, the presence of a genetic alteration is an eligibility criterion. We included 12 additional genes known to be mutated in at least 1% of breast cancers [3]. Finally, we added 11 genes for their importance in the germline only: 5 in the DNA repair pathway (e.g. MLH1, MSH6, CHK2), 5 pharmacogenomic (PKG) genes (CYP2D6, DPYD, CYP2C9, TPMT, VKORC1), and CFTR. Of note, 3 PKG genes and CFTR were not selected for their relevance to breast cancer, but rather to explore the feasibility of including additional types of genes in a clinical setting.
Primer design: The custom primer library was designed using the Primer3 v2.3.0 algorithm, and the primers were aligned back to the genome using isPCR v33, retaining only primer pairs that resulted in a single predicted amplicon. Our primer design pipeline performed exhaustive primer selection across the targeted intervals with the following ideal parameters: 200 bp product length, 16-20 bp primer length, 40-60% GC content, and no overlap with known SNPs (dbSNP135). When no primer could be designed under the ideal parameters, the automated pipeline relaxed the parameters in the following order: product length, primer length and GC content, until the entire target region was covered (Table S2). After designing the locus-specific portion of the primers, sequence tails corresponding to a portion of the Illumina adaptor sequence were added to the sequence-specific portion of the primers prior to synthesis. The sequence added to the forward primers was CGCTCTTCCGATCTCTG and the sequence added to the reverse primers was CGCTCTTCCGATCTGAC. The final tri-nucleotide of each tail (CTG for forward primers, GAC for reverse primers) was inserted between the specific target and the universal primer to confer adapter-strand specificity, so that only the reads originating from the same end of the amplicon would be sequenced simultaneously. This ensured that the sequencing error rate could be computed as a function of the read direction (forward or reverse). All primer pairs were used to prepare a primer droplet library (RainDance Technologies) using 5 pairs per droplet. While alternative approaches are frequently designed to examine only mutational hotspots [4], we chose to sequence the entire coding regions of the genes in the panel, since hotspots continue to be identified and an increasing number of studies report functional mutations outside of hotspots. We designed 1,736 pairs of primers that targeted 99% (889/894) of the selected exons. Remarkably, 83% of the primer pairs were designed with the most stringent criteria controlling the presence of SNPs, GC content and amplicon size, to ensure a more uniform amplification. The panel of amplicons had a total size of 228 kb, of which 154 kb covered the targeted exons.
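The primer criteria and tailing step described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the helper names and the example primer sequences are hypothetical, the SNP-overlap and product-length checks are omitted, and the tails are assumed to be prepended at the 5' end of the locus-specific sequence. Only the two tail sequences and the length and GC ranges are taken from the text.

```python
# Tail sequences quoted in the text (each ends with its strand-specific tri-nucleotide).
FORWARD_TAIL = "CGCTCTTCCGATCTCTG"
REVERSE_TAIL = "CGCTCTTCCGATCTGAC"


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def meets_ideal_parameters(primer: str) -> bool:
    """Check the ideal length (16-20 bp) and GC content (40-60%) stated in the text."""
    return 16 <= len(primer) <= 20 and 40.0 <= gc_percent(primer) <= 60.0


def add_tails(forward: str, reverse: str) -> tuple[str, str]:
    """Prepend the adaptor tails (assumed here to sit 5' of the locus-specific part)."""
    return FORWARD_TAIL + forward, REVERSE_TAIL + reverse


# Hypothetical locus-specific primers, for illustration only.
fwd, rev = "ACGTGCTAGCTAGGATCA", "TTGACCGGATACGCTAAG"
if meets_ideal_parameters(fwd) and meets_ideal_parameters(rev):
    tailed_fwd, tailed_rev = add_tails(fwd, rev)
    print(tailed_fwd)
    print(tailed_rev)
```

A fuller version would loop over progressively relaxed parameter sets (product length first, then primer length, then GC content) whenever no pair passes the ideal criteria.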

Germline Variation Summary
We identified a total of 586 inherited variants in the 38 patients, with a median of 140 per patient. Of these, 498 were present in dbSNP (482 SNVs and 16 indels) and 88 were novel (61 SNVs and 27 indels). A total of 253 variants were unique to one patient (175 in dbSNP, 78 novel). A total of 182 variants were non-silent (Table S8), and of those, MutationTaster [5] predicted 57 to be deleterious missense, 3 nonsense, 19 frameshift and 7 other mutations (splice sites, codon indels, start loss). In total, 92% (79/86) of these predicted deleterious variants were seen in 3 or fewer patients, and they affected 28 genes. These figures are in agreement with the current understanding of human population genetics [6,7].
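As a quick check on how the counts above relate to one another, the short tally below recomputes the totals from the reported subcategories. All numbers are taken from the text; the variable names are illustrative.

```python
# Counts reported in the text.
dbsnp = {"snv": 482, "indel": 16}
novel = {"snv": 61, "indel": 27}
unique_to_one_patient = {"dbsnp": 175, "novel": 78}
predicted_deleterious = {"missense": 57, "nonsense": 3, "frameshift": 19, "other": 7}

total_dbsnp = sum(dbsnp.values())                      # variants present in dbSNP
total_novel = sum(novel.values())                      # novel variants
total_variants = total_dbsnp + total_novel             # all inherited variants
total_unique = sum(unique_to_one_patient.values())     # variants seen in one patient
total_deleterious = sum(predicted_deleterious.values())

# Share of predicted deleterious variants seen in 3 or fewer patients (79 reported).
rare_fraction = 79 / total_deleterious

print(total_dbsnp, total_novel, total_variants, total_unique, total_deleterious)
print(f"{rare_fraction:.0%}")
```

The subcategories add up to the reported totals (498, 88, 586, 253 and 86), and 79/86 rounds to the stated 92%.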