Molecular mechanism and clinical impact of APOBEC3B-catalyzed mutagenesis in breast cancer

Cancer genomic DNA sequences enable identification of all mutations and suggest targets for precision medicine. The identities and patterns of the mutations themselves also provide critical information for deducing the originating DNA damaging agents, causal molecular mechanisms, and thus additional therapeutic targets. A classic example is ultraviolet light, which crosslinks adjacent pyrimidines and leads to C-to-T transitions. A new example is the DNA cytosine deaminase APOBEC3B, which was identified recently as a source of DNA damage and mutagenesis in breast, head/neck, cervix, bladder, lung, ovary, and to lesser extents additional cancer types. This enzyme is normally an effector protein in the innate immune response to virus infection but upregulation in these cancer types causes elevated levels of genomic C-to-U deamination events, which manifest as C-to-T transitions and C-to-G transversions within distinct DNA trinucleotide contexts (preferentially 5’-TCA and 5’-TCG). Genomic C-to-U deamination events within the same trinucleotide contexts also lead to cytosine mutation clusters (kataegis), and may precipitate visible chromosomal aberrations such as translocations. Clinical studies indicate that APOBEC3B upregulation correlates with poorer outcomes for estrogen receptor-positive breast cancer patients, including shorter durations of disease-free survival and overall survival after surgery. APOBEC3B may therefore have both diagnostic and prognostic potential. APOBEC3B may also be a candidate for therapeutic targeting because inhibition of this non-essential enzyme is predicted to decrease tumor mutation rates and diminish the likelihood of undesirable mutation-dependent outcomes such as recurrence, metastasis, and the development of therapy resistant tumors.


Introduction -passive versus active mutational processes in cancer
During all developmental stages, even when cells are not actively dividing, our DNA is subjected to continual damage by a wide variety of agents and mechanisms. The majority of these insults are mitigated by DNA repair processes, which usually restore the original DNA sequence in an error-free manner. However, some DNA damage events escape repair and manifest as somatic mutations. These mutations, by and large, occur randomly across the genome over the course of an individual's lifetime. In some instances, however, the 'wrong combination' of somatic mutations can transform a normal cell into a tumor cell ( Figure 1). The ongoing accumulation of additional mutations also contributes to the growth of local tumor cells, the development of metastatic outgrowths, and the emergence of therapy resistance.
Breast cancer is an extremely heterogeneous disease [1][2][3]. This review discusses the likelihood that a considerable proportion of this heterogeneity and associated phenotypes are due to a dominant acting enzyme called apolipoprotein B editing catalytic subunit 3B (APOBEC3B; see Box 1) that deaminates genomic DNA cytosines, promotes higher than normal mutation rates, and thereby enables accelerated tumor evolution. APOBEC3B does not fit into classical tumor suppressor/oncogene paradigms. Rather, it defines a new class of cancer facilitator, a type of 'enabling characteristic' [4], because it promotes the genetic diversity that provides tumors with increased adaptive capabilities. For instance, in contrast to kinase activation, which can provide a tumor cell and all of its descendants with a measurable growth advantage, APOBEC3B will cause a different repertoire of mutations in every cell it affects. APOBEC3B could have been responsible for the initial kinase activating mutation, as well as additional genetic changes that occur in descendent cells, including therapy resistance mutations, especially if they confer a selective advantage. However, APOBEC3B will not confer the same overt phenotype in every cell in which it is expressed; rather, the compounded effects of the somatic mutations will dictate the overall phenotype of each cell and the larger tumor.
Before proceeding, it is important to distinguish somatic and germline mutations. Germline mutations are inherited from our parents. These are most frequently passed from generation to generation in a Mendelian manner, but they can also arise de novo in our parent's germ cells. The best-known examples in breast cancer occur in the BRCA1 and BRCA2 genes [5]. Mutations in BRCA1 and BRCA2 can compromise recombination repair, which results in elevated rates of some types of DNA damage and, consequently, an elevated risk of acquiring a cancerous combination of mutations [6]. Additional common inherited DNA repair defects are unlikely to have major roles in breast cancer, although corrupted DNA repair processes clearly contribute to many other tumor types. A prominent example is inherited defects in mismatch repair (MMR), which cause persistence of DNA replication errors, elevated mutation rates, and predisposition to colorectal tumors [7]. Moreover, BRCA2 is also known as FANCD1, because some mutations in this gene predispose to a syndrome called Fanconi anemia, characterized by bone marrow failure and blood cancer [8]. In comparison, somatic mutations occur during all stages of development, including carcinogenesis. These mutations are attributable to diverse types of DNA damage that by genetic or stochastic means escape repair.
The recombination and mismatch repair processes discussed above can be considered 'passive mutational processes' , and the DNA repair enzymes involved genomic 'custodians' or 'caretakers' [6,9]. The sum of all DNA damage events that escape repair can be considered, for lack of a better term, the spontaneous mutation level. These passively and randomly acquired mutations will happen regardless of other factors and, therefore, are largely 'unavoidable'. The overall spontaneous mutation level and the associated composite mutation pattern are therefore expected to be similar from person to person within the human population. Defects in a particular repair process will result in an elevation of the type(s) of DNA damage typically repaired by this system.
In contrast, a growing number of DNA damage sources can be considered 'active mutational processes'. Accordingly, these processes may be 'avoidable' (at least, in theory). Well-known external DNA damaging agents are ultraviolet light (UV) and tobacco compounds such as nicotinederived nitrosamine ketone (NNK), which are associated with skin and lung cancers, respectively [10,11]. UV causes DNA pyrimidine dimers (C∧C, C∧T, T∧C, and T∧T), which are substrates for nucleotide excision repair [10]. However, a lapse in error-free repair allows pyrimidine dimers to persist and template the insertion of adenine bases during lesion by-pass DNA replication (Figure 2a). If an adenine base becomes 'mispaired' with a UV-linked cytosine, the next round of DNA replication or repair yields a C-to-T transition mutation in a dipyrimidine context. NNK from tobacco smoke can become activated and alkylate guanine bases (*G) [11]. These adducts are 'read' poorly by DNA replication polymerases, resulting in *G/A mispairs that lead to G-to-T transversion mutations. This type of tobacco-induced mutation is not known to occur within any particular DNA sequence context [12].
In addition to avoidable external sources of mutation, there are at least two significant and potentially preventable internal sources of mutation. The first comprises a group of translesion synthesis (TLS) DNA polymerases including POL κ (kappa), POL ι (iota), POL η (eta), POL ζ (zeta), and REV1, which are low fidelity enzymes that lack the capacities to proofread and perform processive DNA synthesis [13]. These enzymes are recruited to sites of DNA damage to temporarily take the place of a replicative DNA polymerase and perform lesion bypass synthesis. For example, as a specialized excision repair component POL η will usually insert the correct DNA base opposite UV-induced pyrimidine dimers [14], but other DNA POLs, such as the major replicative polymerases δ and ε, and the TLS POL κ are not adapted to this type of lesion and have a tendency to insert adenine bases that result in hallmark C-to-T transitions [15] (Figure 2a). The existence of these mechanisms can be rationalized by the fact that lesion bypass synthesis is relatively benign in comparison to a failure to complete DNA replication before cell division. Notably, multiple DNA polymerases also have important roles in developing B lymphocytes by contributing to somatic hypermutation and the overall process of antibody diversification and affinity maturation, which, interestingly, is initiated by the APOBEC-related DNA cytosine deaminase AID (activation-induced DNA cytosine deaminase) [16] (elaborated below). As enzymatic sources of mutation, DNA polymerases are theoretically inhibitable and their mutagenic contributions preventable, although the consequences of doing so may be undesirable at the organismic level because the benefits of copying DNA, bypassing DNA lesions, as well as generating adaptive immunity through antibody diversification may far outweigh the disadvantages of inhibiting these processes (even selectively or transiently).
Recent work has resulted in the discovery of APO-BEC3B as an active, enzymatic source of mutation in breast cancer [17]. APOBEC3B was described originally as an antiviral enzyme, and a member of a larger family of DNA cytosine deaminases [18]. APOBEC3B uses water to catalyze the conversion of cytosine to uracil bases in single-stranded DNA. Uracil in DNA is a premutagenic lesion because DNA polymerases 'read' it as thymine and pair it properly through two hydrogen bonds to adenine during DNA replication ( Figure 2b). Thus, one of several hallmarks of APOBEC3B mutagenesis is C-to-T transition mutations within specific local DNA sequence contexts based on the intrinsic biochemical nature of the deaminase itself (elaborated below).

Discovery and biological functions of enzyme-catalyzed DNA cytosine deamination
Humans have the potential to encode up to nine enzymes with DNA cytosine deaminase activity (APOBEC1, AID, and APOBEC3A/B/C/D/F/G/H), as well as two related proteins that have yet to elicit this activity (APOBEC2 and APOBEC4) [18]. The DNA cytosine deaminase activity of APOBEC1, AID, and several APOBEC3 enzymes was discovered originally using an Escherichia coli-based drug resistance assay [19,20]. Like all organisms, this bacterium has a characteristic mutation frequency, and cells expressing AID, APOBEC1, APOBEC3C, or APOBEC3G were shown to elicit 4-to 20-fold higher mutation frequencies. E. coli engineered to lack the uracil-specific DNA repair enzyme uracil DNA glycosylase (UDG) also have higher mutation frequencies. However, the combination of APOBEC expression and UDG deficiency caused synergistic increases in mutation frequency, in some instances >100-fold, demonstrating that these proteins catalyze C-to-U lesions in DNA and that uracil base excision repair counteracts most of the damage [19,20]. In addition, each APOBEC3 enzyme was shown to induce a distinct spectrum of drug-resistance mutations. AID preferentially caused mutations at cytosine bases preceded by purines (5'-AC and GC), APOBEC1 at cytosines preceded by thymines (5'-TC), APOBEC3C also at cytosines preceded by thymines but with a more relaxed distribution (5'-TC), and APOBEC3G uniquely at cytosines preceded by another cytosine (5'-CC). These data indicated that the DNA cytosine deamination Figure 2 Local mutation preferences for UV-A and APOBEC3B-induced mutagenesis. Top row (a): ultraviolet (UV)-A crosslinks adjacent pyrimidine bases (lesion depicted by double dash sign (=)). Several DNA polymerases will bypass this lesion by inserting two adenines. Excision repair or another round of DNA replication will convert these C/A mispairs into C-to-T transition mutations. Bottom row (b): APOBEC3B (A3B) catalyzes the hydrolytic deamination of single-stranded DNA cytosine into uracil (lesion depicted in biochemically preferred 5'-TCA context). Uracil in DNA is recognized as a 'normal' thymine by DNA polymerases, and it therefore templates the insertion of an adenine in the complementary DNA strand. Uracil base excision repair or another round of DNA replication will convert the U-A base pair into a C-to-T transition mutation. Additional mutagenic outcomes are depicted in Figure 3.
mechanism is a conserved hallmark of this protein family and, importantly, that each enzyme has distinct intrinsic local substrate preferences. Subsequent studies have demonstrated DNA cytosine deaminase activity for all human APOBEC family members, except APOBEC2 and APOBEC4, in analogous mutator assays, biochemical studies with recombinant enzymes, and a variety of biological systems [18].
The DNA cytosine deaminase activity of AID is essential for antibody gene diversification through the distinct processes of somatic hypermutation and class switch recombination [16]. In antigen responding B lymphocytes, AID catalyzed C-to-U lesions in expressed heavy and light chain variable regions are processed by uracil base excision repair, MMR, and TLS polymerases into all six types of base substitutions. This mutagenic process is coupled to a selection process for B cells expressing antibodies with higher affinities for foreign antigen and results in large numbers of mutations within antibody gene variable regions. Simultaneously, AID catalyzed Cto-U lesions in switch regions of the expressed heavy chain gene are also processed by uracil base excision repair and MMR proteins into DNA breaks, which lead to recombination of upstream and downstream switch regions and expression of a new antibody isotype. Although these processes are complex, they are relevant here because they provide critical precedents for DNA deaminationinduced mutagenesis in cancer, most importantly by demonstrating that uracil lesions can lead to both simple and complex mutational outcomes dependent upon downstream 'repair' processes. Moreover, considerable evidence indicates that misprocessed AID-catalyzed deamination events at immunoglobulin loci, as well as off-target deamination events, contribute to B cell lymphomas [21]. APOBEC1, the namesake, is the only family member with physiologically confirmed RNA substrates. APOBEC1 and editing cofactors are required for converting APOB mRNA C6666 into uracil, which yields a premature stop codon and a truncated APOB protein [22]. RNA sequencing comparisons of wild-type and Apobec1-null mice have revealed dozens more editing events, mostly in mRNA 3' untranslated regions with unclear functional roles [23]. However, APOBEC1 is also a potent DNA Cto-U editing enzyme [19,24], with likely roles in innate immunity (related to APOBEC3 proteins discussed below) [18,25]. Moreover, dysregulated APOBEC1 may also play a part in cancer mutagenesis because its overexpression in transgenic mice has been shown to induce liver and colon tumors [26]. At least for mice, deep sequencing has yet to determine whether this cancer association is due to DNA and/or RNA cytosine editing activity. However, recent biochemical, computational, and functional experiments have implicated human APOBEC1 in esophageal adenocarcinomas, and tended to favor a DNA deamination mechanism [27].
Over 1,000 studies this past decade have focused on APOBEC3 enzymes because they have the remarkable capability to deaminate retrovirus and retrotransposon cDNA replication intermediates [18,25,28]. Most studies to date have focused on the mechanism of HIV-1 restriction. The current model posits that APOBEC3D/F/G/H package into assembling viral particles, travel with the particles until a new target cell is infected, and then deaminate nascent viral cDNA cytosines during reverse transcription. The resulting viral cDNA uracils then template the insertion of genomic strand adenines, and immortalize the original lesions as G-to-A mutations. Unrestrained APOBEC3 activity can completely inactivate HIV-1, but this rarely occurs in vivo because the viral protein Vif nucleates the formation of an E3 ubitquitin ligase complex that protects the genetic integrity of the virus by degrading the APOBEC3 enzymes. Nevertheless, APOBEC3 associated G-to-A mutations are frequently observed in viral sequences from infected patients, implying that this mutagenic process may actually facilitate HIV-1 pathogenesis by contributing to virus evolution, immune escape, and drug resistance. APOBEC3 proteins have also been implicated in restricting a large number of DNA-based parasites, including cancer-associated viruses Epstein-Barr virus, hepatitis B virus, human papilloma virus, and human T-lymphotropic virus-1 [18,25,28]. The prevailing model indicates that the entire APOBEC family provides an overlapping innate immune defense to a wide variety of known and likely many more unknown DNAbased parasites.

Discovery and mechanism of APOBEC3B mutagenesis in breast cancer
Over the past several years, multiple lines of evidence have hinted at a novel enzymatic process in cancer mutagenesis and inspired experiments to directly test an APOBEC mutator hypothesis. First, initial speculations were provoked by the discovery of a family of enzymes with DNA deaminase activity and distinct local preferences [19,29,30]. Second, array hybridization experiments suggested upregulation of at least one APOBEC3 in several cancers [19]. Third, precedents established by AID indicated that chromosomal DNA could be susceptible to enzymatic deamination [16]. Fourth, genetic experiments in yeast indicated that APOBEC3 enzymes could mutate the genomic DNA of a eukaryote [31]. Fifth, early cancer gene DNA sequencing studies using Sanger methodology revealed mutation patterns biased to cytosine bases [32][33][34]. Sixth, biochemical studies showed that methylcytosine is more prone to spontaneous deamination than normal cytosine [35], yet in many breast tumors the majority of mutations do not occur at potentially methylated 5'-CG dinucleotide motifs [32][33][34]. Last, strand-coordinated cytosine mutation clusters could be induced by chemicals or APOBEC3G overexpression in yeast selection experiments [36,37], and analogous clusters biased to 5'-TC dinucleotides were discovered by deepsequencing breast and other cancer genomes [36,38,39]. Such mutational clusters have been termed kataegis, due to a likeness with the concentration and intensity of thundershowers [38]. It is important to note that although each of these studies supports, to varying degrees, a role for enzymatic DNA cytosine deamination in cancer mutagenesis, none was demonstrative and other mechanisms were plausible [40].
The hybridization and sequencing studies cited above, plus availability of matched normal and tumor tissues and a wide variety of cell lines, indicated that breast cancer might provide an opportunity to test an APOBEC3 mutator hypothesis. Quantitative PCR assays were used to determine which APOBEC3 family members are expressed in cancerous breast tissues in comparison with matched adjacent or contralateral normal breast tissues [17]. Several family members were expressed weakly or undetectably in both tissue types (AID, APOBEC1, APOBEC2, and APOBEC4), and others were expressed similarly in both tissue types (APOBEC3A/D/F/G/H). APOBEC3C was expressed at lower levels in tumor tissues. Only one family member, APOBEC3B, was expressed at significantly higher levels in tumor tissues in comparison with matched normal tissues, and it was barely detectable in normal breast tissues [17]. Similar proportions and magnitudes of APOBEC3B overexpression were evident in RNA sequencing data sets from independent tumors [17]. Moreover, nearly two-thirds of all breast cancer cell lines showed upregulated levels of APOBEC3B in comparison with control lines such as MCF10A [17]. Taken together, these data showed that APOBEC3B is significantly and constitutively upregulated in a large proportion of breast tumors and cancer cell lines.
Two additional lines of evidence strongly linked APO-BEC3B overexpression to breast cancer mutagenesis. First, elevated APOBEC3B expression levels correlated with higher C-to-T and overall base substitution mutation loads [17]. These analyses were done by comparing APOBEC3B mRNA levels and tumor mutation counts. A significant positive correlation was observed for the entire cohort, but correlations were clearest upon comparison of pooled data from the bottom and top third of APOBEC3B expressing patients. The top third has a median of 30 C-to-T mutations per exome, whereas the lower third has less than 20. Moreover, the top third has a median of nearly 70 base substitutions per exome, whereas the lower third has less than 40 (these differences are explained in a model below). Second, a series of biochemical experiments was done with the catalytic domain of APOBEC3B to deduce its intrinsic DNA cytosine deamination preference or 'signature' in vitro, and this was shown to closely resemble the actual cytosine mutation pattern in breast tumors [17]. Specifically, recombinant APOBEC3B preferred to deaminate cytosines within 5'-TCA and 5'-TCG motifs at least 5-fold and, in many instances, >50-fold better than any other trinucleotidecontaining single-stranded DNA. The same two trinucleotides, 5'-TCA and 5'-TCG, were the most commonly mutated cytosine-containing motifs in two independent breast cancer mutation data sets.
Finally, genetic knockdown experiments were used to demonstrate that APOBEC3B is responsible for elevated levels of DNA damage and mutation in several breast cancer cell lines [17]. APOBEC3B mRNA knockdown caused a corresponding depletion of all measurable DNA deaminase activity in breast cancer cell line nuclear extracts. APOBEC3B knockdown cells also had lower steady state levels of genomic uracil as quantified by mass spectrometry and lower mutation frequencies as judged by drug resistance experiments. Moreover, fewer C-to-T mutations were found in the TP53 and c-MYC genes of APOBEC3B-depleted cells in comparison with controls.
Independent studies have since implicated APOBEC3B mutagenesis in other types of cancer, including head/ neck, lung, bladder, cervical, ovarian, and as many as 16 out of 30 cancers examined to date [41][42][43][44][45][46][47]. Additional work with yeast has suggested that both the immediate product of DNA deamination, uracil, and the derivative downstream lesion, the abasic site, may be important intermediates in APOBEC3B mutagenesis [48,49]. Specifically, expression of various APOBEC3 family members in yeast induced high levels of C-to-T and C-to-G mutations, as well as kataegic clusters, and deleting uracil DNA glycosylase collapsed the mutation pattern to dispersed C-to-T mutations. Deletion of the TLS DNA polymerase REV1 as well as expression of a REV1 catalytic mutant also caused similar phenotypes. Therefore, C-to-G transversions in yeast are most likely due to REV1-dependent deoxy-cytidine insertion opposite an abasic site created by the combined action of APOBEC3B deamination and UDG base excision. Kataegis events most likely require APOBEC3B at two levels, first to create a lesion that results in the initial single-or double-stranded DNA break, and second to deaminate (perhaps processively) the resulting exposed single-stranded DNA repair intermediates. Because UDG is uniquely able to excise uracils from both double-and single-stranded DNA substrates, these clustered DNA C-to-U lesions lead to both C-to-T transitions and C-to-G transversions depending on the polymerase recruited to assist with repair. An additional feature of kataegis is a strong DNA strand bias with most cytosine mutations occurring on the same strand, likely reflecting the existence of an extensive single-stranded intermediate. Recent work in human cell lines has shown that a U/G mismatch in transfected plasmid DNA can lead to single-stranded regions susceptible to APOBEC3mediated deamination [50].
Overall, a compelling case has been made for APO-BEC3B mutagenesis in breast and additional cancers. These studies have led to a model in which APOBEC3Bdependent DNA C-to-U deamination events underlie a variety of mutagenic outcomes (Figure 3; based on [17,41,43]). As discussed, uracils can directly template the insertion of adenines during DNA replication or local DNA synthesis. However, based on studies in yeast [48] and the fact that UDG is a very efficient enzyme that excises uracils from both single-and double-stranded DNA [51], it is likely that abasic sites are major intermediates for both dispersed and clustered C-to-T mutations and C-to-G mutations. Most polymerases will follow the ' A-rule' and insert an A opposite an abasic site, accounting for C-to-T transitions, whereas REV1 will uniquely insert a C opposite an abasic site, which accounts for C-to-G transversions. An additional DNA polymerase may also be involved because C-to-A mutations have been observed in other APOBEC3B-associated cancers such as ovarian carcinomas [43,45]. Uracil excision followed by abasic site processing by the major abasic site endonuclease, APEX, will generate a single-stranded break that could easily be processed into a double-stranded break by a variety of mechanisms, including the generation of a nick on the opposing strand or a replication fork collision. Nucleolytic resection of nicked or broken DNA may create singlestranded DNA substrates for APOBEC3B. As additional full genome sequences become available, it will be interesting to determine whether larger-scale chromosomal aberrations such as insertions, deletions, translocations, and amplifications correlate with APOBEC3B expression and/or define hotspots for dispersed or clustered APOBEC3B deamination events. Initial studies with breast cancer genomic DNA sequences have indicated that a portion of kataegis events may be located near sites of DNA rearrangement [36,38,52], although global levels of segmental copy number changes do not appear to correlate [44].
Possible involvement of at least one other APOBEC family member in breast cancer mutagenesis APOBEC3B is absent from a substantial fraction of the global population due to a 29.5 kb deletion allele that joins the homologous 3' untranslated regions of APOBEC3A and APOBEC3B and removes the entire coding sequence Figure 3 Model for APOBEC3B mutagenesis in cancer. The central pathway goes left to right and then circles back to depict error-free base excision repair of two C-to-U lesions catalyzed by APOBEC3B (A3B). Most genomic uracils are probably repaired in this manner. However, intermediates in this repair process can lead to a variety of mutagenic outcomes. Top left: C-to-T mutations can result from DNA synthesis over uracilated templates (as in Figure 2) or from synthesis over an abasic site because most DNA polymerases insert deoxy-adenosine opposite this non-instructional lesion.
Bottom left: C-to-G transversions most likely occur when REV1 inserts deoxy-cytidine opposite an abasic site followed by repair of the original lesion or a round of DNA synthesis. Top right: a single-stranded DNA break (SSB) can result from cleavage of the phosphodiester backbone by APEX (normal component of BER). Bottom right: A double-stranded DNA break (DSB) can result from opposing APEX-mediated endonucleolytic cleavages, or from a DNA replication fork hitting a single-stranded break. Both single-and double-stranded breaks can lead to additional mutagenic outcomes such as kataegis (A3B-catalyzed deamination of exposed single-stranded DNA) and insertions, deletions, amplifications, inversions, and translocations. LIG, DNA ligase; POL, polymerase; UDG, uracil DNA glycosylase. [53]. The frequency of the deletion allele is high in Oceanic populations (93%), intermediate in Asian and North American Indian populations (37% and 58%), and rare in European and African populations (6% and 1%). This natural deletion allele provides clear opportunities for both molecular and clinical studies. A recent study showed that 5'-TC biased mutations and kataegis are still prevalent in a small number of breast tumors in the complete absence of APOBEC3B [54]. At first glance, this observation calls into question the aforementioned work. However, only 14 APOBEC3B-null breast cancers could be identified for analysis and tumor ages were unknown, making mutation frequency comparisons difficult. Therefore, the most parsimonious explanation at this time is that APOBEC3B and at least one additional family member contribute to mutagenesis in breast cancer.
The most likely candidates based on intrinsic DNA deamination preferences for 5'-TC motifs are APOBEC1 and/or APOBEC3A/C/D/F/H [18]. In particular, APO-BEC3A may be dysregulated through fusion with the APOBEC3B 3' untranslated region, poly-A region, and potential 3' cis-regulatory sequences [54,55]. Consistent with this idea, several studies have shown that APOBEC3A overexpression can induce DNA damage responses resulting in the appearance of classical markers, double-stranded breaks, and elevated levels of mutation [17,[56][57][58][59][60][61]. All of these studies have relied upon APOBEC3A overexpression in heterologous cell types, and it should be noted that endogenous APOBEC3A is not genotoxic even upon 100fold upregulation by interferon-α [62,63]. Viral infections causing innate immune responses and/or splice variants may also be contributing factors. In support of this idea, recent studies on head/neck cancer have linked human papilloma virus infection to APOBEC3B upregulation, and one has shown that the viral E6 oncoprotein is sufficient to trigger APOBEC3B upregulation [46,64,65]. Hit-andrun infections causing transient APOBEC3 upregulation are also plausible but challenging to test. A current limitation is a lack of specific antibodies to many of the human APOBEC3 family members, including APOBEC3B, which if available would enable protein level studies to be compared with current mRNA-based work. In any event, additional work including clinical studies in appropriate ethnic populations will be necessary to generate larger cohorts of APOBEC3B-null individuals for expression studies, mutation pattern analyses, and possibly definitive functional experiments to unambiguously delineate the involvement of other family members in cancer mutagenesis.
Clinical impact of APOBEC3B mutagenesis in breast cancer -major differences between incidence and progression Incidence and progression are separate topics that do not necessarily have to share molecular mechanisms.
Incidence is the probability of getting cancer. Progression is what happens after a cancer occurs. A priori, one would predict that APOBEC3B status would not impact cancer incidence because at least one upstream event is needed to elevate APOBEC3B expression [17]. However, three independent studies have indicated that homozygosity for the APOBEC3B null allele associates with a higher incidence of breast cancer [66][67][68]. The largest study showed that the risk of developing breast cancer was higher for those with homozygous null alleles than for heterozygous carriers or individuals with two copies of the intact gene [67]. Given the innate immune functions of APOBEC3B with clear antiviral and antitransposon activities [18,25,28], it is possible that APOBEC3B-null individuals have less protection from an as-yet-unidentified virus or they have chronically higher levels of endogenous transposition (events that would otherwise be restricted), which is another way to endow cells with a mutator phenotype. Indeed, higher levels of L1 transposition have been documented in some types of cancer [69] but association with APOBEC3B nullizygosity has yet to be addressed.
Recent work has addressed the question of progression in a large cohort of estrogen receptor-positive breast cancer patients [70]. APOBEC3B mRNA levels in previously cryopreserved samples were measured by quantitative PCR and the overall cohort average was used to divide high and low expression groups. Kaplan-Meier analyses were done to compare the two groups, and high APOBEC3B levels were found to associate with shorter durations of both disease-free survival and overall survival. Moreover, because these specimens were procured some time ago, outcomes were not complicated by adjuvant treatments. Therefore, these data indicate that APOBEC3B levels may be a pure prognostic marker for estrogen receptor-positive breast cancers. Similar results were obtained by retrospectively analyzing independent cohorts. Additional work and possibly prospective trials will be needed to evaluate estrogen receptor-negative breast cancers and deduce potential interactions with adjuvant therapies. For instance, studies will be needed to address whether elevated levels of APOBEC3B-enabled mutation contribute to the development of drug resistance and metastases.

Conclusions
The DNA cytosine deaminase APOBEC3B is a newly defined source of DNA damage and mutation in breast cancer. It is already clear that APOBEC3B mutagenesis impacts approximately half of all breast cancers and may be responsible for large proportions of mutations in these tumors, ranging from modest to extremely high levels. Because APOBEC3B is an endogenous and likely stochastic mutagen, it is predicted to drive tumor evolution and contribute to all processes attributable to mutations (that is, tumor development, progression, heterogeneity, metastasis, and drug resistance). In fact, a recent study with head/neck cancer implicated APOBEC3B mutagenesis in activation of the kinase PIK3CA, which is also mutated in a large proportion of breast cancers [46].
To use UV light as an analogy, knowledge of its promutagenic activity led to the development of measures to limit exposure and reduce the risk of potentially detrimental long-term outcomes such as skin cancer (notably sunscreen). APOBEC3B may be of similar importance to breast cancer, and one can already envisage developing the equivalent of a 'sunscreen' to inhibit APOBEC3B mutagenesis. APOBEC3B may be a prime therapeutic target because it is non-essential, it is rarely expressed in most normal tissues, and it is a dominant-acting enzyme with an active site that may be druggable. Proof-ofconcept inhibitors have already been developed for the related enzyme APOBEC3G [71,72]. Additional work is likely to yield APOBEC3B inhibitors, and such molecules may have therapeutic merit in high-risk individuals and/ or as a post-operative adjuvant to limit further tumor evolution and improve the durability and efficacy of existing therapeutics.
APOBEC3B may also have diagnostic and prognostic utility. Direct measurements of APOBEC3B mRNA or protein levels, perhaps in combination with 'reading' its mutagenic signature, would allow tumors to be grouped into APOBEC3B risk categories (analogous to assays for microsatellite instability in MMR-defective cancers). Such studies may determine a threshold value for APOBEC3B expression that unambiguously distinguishes a rapidly evolving cancer from an indolent mass. False positive breast cancer diagnoses still occur and molecular information on APOBEC3B may help reduce the magnitude of this problem. Appropriate diagnostic assays will also be necessary to identify cancer cases that may benefit from APOBEC3B inhibition. In conclusion, APOBEC3B mutagenesis is a major factor in breast cancer and many other human malignancies and much additional basic, translational, and clinical work is needed to fully realize this discovery.