Introduction

The advent of massively parallel sequencing is driving an unprecedented explosion in genetic knowledge that is set to change the face of modern medicine. From a human health perspective, improved diagnostic power is the most immediate and readily achievable outcome, while realization of therapeutic benefit is more challenging and demands ongoing evolution of genomic technologies. Recent advances in genome editing technology, most notably user-designed nucleases, are creating tremendous excitement and ushering in what many believe will be a golden age of genome engineering [1]. This review focuses on another powerful genome engineering tool, recombinant adeno-associated virus (AAV)-mediated gene targeting [2, 3], and seeks to place this technology in the broader context of contemporary cutting-edge genome engineering technologies. The aim is to provide a broad overview and synthesis from a therapeutic perspective that is accessible to the generalist reader. Particular emphasis is given to the independent and synergistic utilities of AAV-mediated gene targeting in the context of challenges posed by specific organ and disease targets.

Historical Perspective

Shortly before sharing the 1968 Nobel Prize in Physiology or Medicine for deciphering the genetic code, Marshall Nirenberg authored an editorial in Science asking “Will society be prepared?” [4]. Here, he presciently predicted the use of “genetic surgery” to program heritable changes into bacteria in the near term and into mammalian cells, including human cells, within 25 years. His chief concern was the impending power of man “to shape his own biological destiny”. The first authorized gene marking and gene therapy trials in man took place just over 20 years later [5, 6], with underpinning advances in the intervening years including the recombinant DNA revolution in the 1970s [7] and the development of viral vectors capable of stable gene transfer by genomic integration in the 1980s [8, 9]. The strategy used in these early studies involved a relatively crude gene addition approach rather than the precise molecular repair of a mutant locus, which is most akin to the notion of genetic surgery. The technological chasm between these approaches is substantial, such that therapeutic success in the gene therapy field continues to rely heavily on gene addition despite impressive advances in the technologies required to achieve therapeutically relevant levels of targeted gene repair.

In the early 1980s, Mario Capecchi and colleagues recognized the capacity of mammalian cells to mediate homologous recombination (HR) between exogenously introduced DNA molecules and reasoned that the cellular HR machinery might be similarly harnessed to mediate HR between introduced exogenous DNA sequences and endogenous genomic loci [10]. This powerful notion proved correct and formed the basis of much subsequent research by Capecchi and others showing that almost any mammalian gene could be repaired, mutated or modified in permissive cells [11, 12]. The convergence of this technology with the development of mouse embryonic stem (ES) cells [13, 14] facilitated targeted manipulation of the mouse germ line and the generation of knock-in and knock-out mouse models [15–17]. This revolutionized the study of mammalian gene function and was paralleled by equivalent developments in other model organisms, including yeast [18]. A key component of these advances was the capacity to incorporate positive and negative selection cassettes into targeting constructs such that cells bearing rare HR events (typically <1 per 10⁶ cells) could be selectively expanded from a background population of unmodified cells and cells with random off-target genomic integration events [12, 19].

While low gene targeting efficiency is not limiting in cell types that can be readily cultured and selectively expanded without losing the desired phenotype, such as pluripotency, this is not the case for the majority of foreseeable therapeutic applications. Unsurprisingly, therefore, factors governing the efficiency of gene targeting by HR are the focus of considerable ongoing research effort. Critical early insights included the observation that linear DNA molecules are a preferred substrate for HR [20] and that the frequency of HR is influenced by the cell cycle [21]. Many of these early lessons have been recapitulated in the development of AAV-mediated gene targeting and other evolving genome editing technologies discussed below.

AAV Biology and Vector Development

AAV was first identified in the 1960s in an adenovirus isolate by electron microscopy and subsequently shown to be a nonpathogenic dependent parvovirus requiring helper functions from members of the adenovirus or herpes virus families for efficient completion of the viral life cycle [22, 23]. The viral genome consists of a single-stranded DNA molecule of 4680 base pairs (bp) including 145 bp inverted terminal repeats (ITRs) flanking two major open reading frames encoding viral proteins required for replication (rep) and encapsidation (cap). Both sense and antisense viral genomes are generated during viral replication and are packaged into separate virions with equal efficiency [24]. Importantly, the cis-acting sequences required for viral genome rescue, replication and packaging are located within the ITRs. This allowed recombinant AAV genomes to be generated that retained only the ITRs flanking user-defined heterologous sequences, most commonly an expression cassette [25, 26]. The viral rep and cap proteins can be supplied in trans along with essential adenovirus helper functions to produce stocks of recombinant virus for target cell transduction.

The majority of early virological studies and initial vector development focused on AAV2, the most prevalent human isolate [23]. Ongoing development of AAV vector technology since the first vector constructs were described in 1984 has been substantial and accompanied by an exponential increase in usage as a gene transfer tool for discovery and therapeutic purposes. Key events in the technological development of the AAV vector system have been (i) the discovery that recombinant AAV genomes can be cross-packaged into the capsids of other AAV isolates (pseudo-serotyping) [27], (ii) improved packaging strategies producing higher titre vector stocks [28] and (iii) broadening of the utility of AAV vectors through the development of AAV-mediated HR (see next section).

The transduction performance of AAV vectors on different cell types and equivalent cell types across species varies significantly such that the utility of AAV vectors in specific applications both in vitro and in vivo must be established empirically. Variables include the nature of interactions between the vector capsid and cell surface receptors, internalization, intracellular trafficking to the nucleus, the kinetics of vector genome uncoating and conversion of input single-stranded genomes to potentially transcriptionally active double-stranded DNA templates [29–32]. Transgene expression can occur from either episomal vector genomes or from genomes that have undergone integration into the host cell. Importantly, capsid choice affects many of these variables, with different capsids conferring distinctly different vector tropism, immuno-biology and transduction performance. Moreover, the repertoire of available capsids for defined applications is expanding rapidly through both the isolation of novel capsids from nature and a variety of capsid engineering strategies [33]. The net result is an increasingly powerful vector toolkit that can be configured for specific applications [34, 35].

AAV-Mediated Gene Targeting

Russell and Hirata first described the use of AAV-mediated gene targeting to modify homologous human chromosomal sequences in 1998 [2]. Using an integrated neomycin phosphotransferase gene and the native hypoxanthine phosphoribosyltransferase locus as targets, they reported HR efficiencies in HeLa cells, HT-1080 cells and normal human fibroblasts two to three logs higher than achievable by transfection, with targeting efficiency in fibroblasts approaching 1 % of the transduced cell population. These efficiencies were unprecedented despite earlier efforts to improve gene targeting efficiencies using retrovirus-mediated [36] and adenovirus-mediated [37] HR template delivery and implied that undefined aspects of AAV vector biology and genome configuration facilitate homologous recombination. In further support of this possibility, higher AAV vector doses resulted in higher gene targeting rates, an unexpected result given that earlier gene targeting studies using transfection showed no such correlation [17]. Properties of AAV vectors theorized to favour HR included the single-stranded nature of the genome, the capacity to deliver genomes to the target cell nucleus at high multiplicity, persistence of single-stranded genomes in the nucleus over time, the potential of the T-shaped hairpin vector termini to stabilize the genome and reduce random integration, and the potential of AAV genomes to stimulate DNA damage responses [2, 38]. Subsequent studies by the Russell laboratory and others have provided further insight into the types of genome modification made possible, the fidelity and specificity of the reaction, optimal design of AAV targeting constructs, the variables influencing efficiency and, more recently, the molecular basis of AAV-mediated HR.

Genome modifications achievable using AAV-mediated genome editing include small (<25 bp) insertions and deletions [2], all possible single base substitutions [39], intermediate sized deletions (~300 bp) [3] and large insertions up to 1.5 kb [40]. Insertion events occur at consistently higher efficiencies than deletions [41] and, irrespective of the nature of the modification, higher gene targeting rates can be achieved with increased lengths of homology between the AAV gene targeting construct and the target locus, placement of the target mutation at the centre of the AAV vector genome and increasing the multiplicity of infection (MOI) used [42]. Introduction of double-strand breaks (DSBs) at the target locus further enhances targeting efficiency [43]. Other factors influencing the efficiency of gene targeting include chromosomal position effects [44], the cell cycle [45], and target locus transcriptional and replicative activity [46•]. Finally, the presence of single nucleotide polymorphisms at the target locus can markedly reduce targeting efficiency and this effect can also be exploited to selectively edit a specific allele [47]. Detailed protocols for AAV-HR vector design, construction and production have been published [48].

The mechanism of AAV-mediated gene targeting remains incompletely understood. A model involving pairing of the input single-stranded vector genome with homologous chromosomal sequences and resolution by repair synthesis or recombination is consistent with most of the experimental data. These data include the lack of gene targeting observed with self-complementary vectors containing double-stranded genomes [42], directional preferences for targeting in relation to transcription and replication [46•], and the results obtained with related parvoviral vectors based on minute virus of mice (MVM) that package only one vector strand, yet still target efficiently [49]. Most convincingly, these MVM vectors showed a strong strand preference for targeting that is difficult to explain without invoking a single-stranded intermediate [49] but is consistent with asymmetrical pairing at the replication fork [46•]. A recent report disagreed with this model and concluded, based on colony sectoring analysis, that a double-stranded AAV vector genome recombines with the host chromosome [50]. However, this study used vectors with multiple heterologies known to reduce targeting frequencies and mismatch repair-deficient cell lines, so the general significance of its findings in other settings is unclear. These same authors also found that introduction of a DSB at the target locus inverted the mechanism to an ends-in process whereby the free chromosome ends invade the double-stranded AAV targeting intermediate [50]. The implications of these findings for gene targeting strategies involving the combined use of user-designed nucleases and supply of an exogenous template for HR are discussed in the next section.

User-Designed Nucleases

The development of endonucleases targeted to user-defined sites in the genome stemmed from studies on semi-independent nuclease domains and user-targetable DNA-binding proteins, giving rise sequentially to zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and engineered homing endonucleases (meganucleases) [1, 51]. Each of these technologies involves DNA sequence recognition by protein-binding domains for which the underlying design algorithms have been defined empirically. The pros and cons of each technology are application dependent with considerations including ease of DNA-binding domain design, efficiency, specificity (including off-target events) and capacity for vectorization [51]. The most recent development, arising from research into bacterial adaptive immunity, is the clustered regularly interspaced short palindromic repeat (CRISPR)-Cas nuclease system [52, 53]. This system represents a quantum advance, as endonuclease targeting is based on Watson-Crick base pairing between a readily designed guide RNA sequence and the target DNA sequence. The single guide RNA (sgRNA) binds the DNA target sequence immediately upstream of a requisite 5′-NGG motif, targeting Cas9 endonuclease activity to produce a DSB 3 bp upstream of the motif. To further enhance targeting specificity and favour precise repair by HR rather than the error-prone non-homologous end-joining (NHEJ) pathway, the Cas9 catalytic domain has been further engineered to produce single-stranded nicking activity (Cas9n). This nicking activity can be exploited using a double-nicking strategy to minimize off-target mutagenesis and improve the likelihood of homology-directed repair [54].
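The targeting rule described above (a guide-matched sequence lying immediately upstream of an NGG motif, with cleavage 3 bp upstream of that motif) can be illustrated with a minimal sketch. It assumes a 20-nucleotide guide, scans only the strand supplied, and ignores off-target scoring; the function name `find_cas9_sites` and the example sequence are hypothetical and are not taken from any cited work.

```python
import re

def find_cas9_sites(seq, guide_len=20):
    """Return candidate Cas9 target sites on the supplied strand.

    Assumptions (illustrative only): an NGG motif, a guide of
    `guide_len` nucleotides, and a blunt cut 3 bp upstream of the motif.
    The reverse strand is not scanned.
    """
    seq = seq.upper()
    sites = []
    # A lookahead is used so that overlapping NGG motifs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start < guide_len:
            continue  # not enough upstream sequence for a full-length guide
        sites.append({
            "protospacer": seq[pam_start - guide_len:pam_start],
            "pam": match.group(1),
            # Index of the first base downstream of the predicted cut.
            "cut_index": pam_start - 3,
        })
    return sites

# Hypothetical 27-nt sequence containing a single NGG motif.
example = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
for site in find_cas9_sites(example):
    print(site["protospacer"], site["pam"], site["cut_index"])
```

A practical design tool would also scan the reverse complement and rank candidates by predicted specificity; those steps are omitted here to keep the sketch short.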

This technology has stand-alone utility when the desired outcome is the introduction of a disruptive mutation at a defined genomic locus, as repair of DSBs occurs through the error-prone NHEJ pathway. From a therapeutic perspective, this approach could be used to knock out a disease-causing dominant allele, but more precise editing outcomes, such as repair of a mutant recessive disease locus, require the availability of a DNA template to facilitate repair of the DSB with gene correction by the HR pathway. At low efficiency, the repair template might be provided endogenously by the other allele, but higher efficiency and more sophisticated editing, such as the insertion of a selection cassette, necessitate the delivery of an exogenous template. Depending on the specific target cell type and therapeutic context, the efficiency of template delivery and gene targeting becomes a critically important variable. Herein lies the significance and special promise of AAV-mediated gene targeting in genome editing for therapeutic purposes.

AAV-Mediated Gene Targeting and Disease Specific Considerations

Potential gene therapy and regenerative medicine applications for genome editing are extensive and can be divided into those involving ex vivo genetic modification of a target cell population followed by delivery to the patient (with or without further differentiation) and those necessitating in vivo genetic modification of a defined target cell population in situ in an organ or tissue. Genome editing ex vivo is more technically amenable, particularly when the target cell type can be readily expanded in culture without loss of the required phenotype and cells carrying the desired gene targeting events can be selected and, if necessary, clonally expanded to therapeutically useful numbers. Promising examples include induced pluripotent stem cells [55–57], mesenchymal stem cells [58, 59], and fibroblasts [60]. In these instances, there is relatively little, from a targeting efficiency perspective, to distinguish between competing editing technologies, as challenges associated with the efficient delivery of repair templates and/or user-designed nucleases can be offset by selective expansion of corrected cells. Where individual clones can be expanded to therapeutically useful numbers, there is even the prospect of undertaking whole genome sequencing to preclude the presence of deleterious off-target events and other genomic damage acquired in culture. Other highly attractive target cell types, however, such as haematopoietic stem cells (HSC), while amenable to genome editing, are difficult to expand in culture without loss of stem cell properties and engraftment potential [61]. In such cell types, the primary efficiency of template delivery and targeted repair becomes critical, and here AAV-mediated gene targeting offers special promise compared with markedly less efficient physicochemical strategies for template delivery. There are also likely to be applications where highly efficient primary gene correction rates are desirable, as this could potentially obviate the need for selection and expansion and dramatically simplify otherwise complex therapeutic protocols.

Some of the most exciting applications for AAV-mediated gene targeting, and for which this system currently offers unique promise, are those requiring genome editing of defined target cell populations in vivo. The liver provides the best contemporary example of this promise. In mice, highly efficient liver-wide gene delivery is readily achieved with AAV expression vectors pseudo-serotyped with murine liver tropic capsids [62], and multiple mouse models of genetic liver disease have been successfully treated using conventional gene addition strategies [63]. This capacity of AAV vectors for highly efficient liver-targeted gene delivery in mice has also been exploited to achieve gene targeting in vivo. In the earliest study, using AAV gene targeting vectors pseudo-serotyped with the AAV type 2 and 6 capsids, which are now recognized to be relatively poorly liver tropic in mice, Miller and colleagues successfully corrected both a mutant lacZ cassette at the ROSA26 locus and a naturally occurring mutation in the GusB gene that is associated with mucopolysaccharidosis type VII [64]. While technologically important, the highest correction frequency achieved was in the order of 3 × 10⁻⁵ gene correction events per hepatocyte, which is well below the threshold required for therapeutic benefit, even for the most amenable disease targets. In a further study, Paulk and colleagues reported successful correction of a mouse model of tyrosinemia type 1, a metabolic liver disease in which gene-corrected hepatocytes have a selective growth advantage and can undergo replicative expansion to reach therapeutically useful numbers [65]. Maximal rates of targeted gene correction in excess of 1 × 10⁻³ events per hepatocyte were achieved using the highly murine hepatotropic type 8 capsid. In a subsequent study combining co-delivery of an AAV gene targeting construct and an AAV vector encoding a ZFN to induce DSBs at the target locus, gene repair frequencies up to 3 × 10⁻² events per hepatocyte were achieved [66]. In the context of factor IX (FIX) deficiency (haemophilia B), where as little as 1–3 % of physiological FIX expression is of therapeutic value, this level of gene correction was sufficient to correct prolonged blood clotting times in a mouse model. A positive correlation was also reported between gene targeting efficiency and vector dose. Finally, in an even more contemporary in vivo study, again using the haemophilia B mouse model, AAV-mediated gene targeting has been used to successfully ameliorate bleeding without co-delivery of a user-designed nuclease [67••]. In this study, a promoterless human FIX cDNA was knocked in under the highly transcriptionally active endogenous albumin promoter, with on-target integration reported at a frequency of 5 × 10⁻³ events per hepatocyte. The resultant FIX levels achieved were 7 to 20 % of normal. Collectively, these studies highlight progress towards human therapeutic application of AAV-mediated gene targeting and the potential for broader application in other therapeutically important target cell types both in vitro and in vivo. This potential for broader therapeutic application, however, requires further technological developments. In vivo gene targeting also has significant potential as a research tool even if relatively inefficient, as evidenced by a recent study in which it was used to introduce a specific oncogenic mutation into mouse liver in a model of hepatocellular carcinoma [68].
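To put the per-hepatocyte frequencies quoted above on a common scale, the following sketch expresses each as a fold change over the earliest study. The frequencies are those stated in the text (maximal or approximate values, as reported); the dictionary labels, the helper names and the cell count used in the usage line are hypothetical and serve only to illustrate the arithmetic. The studies differ in target locus, disease model and vector design, so the comparison is indicative rather than rigorous.

```python
# Frequencies (events per hepatocyte) as quoted in the text above.
reported = {
    "type 2/6 capsids, no nuclease [64]": 3e-5,
    "type 8 capsid, tyrosinemia model [65]": 1e-3,
    "targeting construct plus ZFN [66]": 3e-2,
    "promoterless FIX knock-in [67]": 5e-3,
}

def fold_over_baseline(freqs, baseline_key):
    """Express every frequency as a multiple of the chosen baseline."""
    base = freqs[baseline_key]
    return {label: value / base for label, value in freqs.items()}

def expected_edited_cells(freq, n_cells):
    """Expected number of edited cells among n_cells (n_cells is an assumption)."""
    return freq * n_cells

folds = fold_over_baseline(reported, "type 2/6 capsids, no nuclease [64]")
for label, fold in folds.items():
    print(f"{label}: {fold:.0f}-fold")

# Hypothetical population of one million hepatocytes, for illustration only.
print(expected_edited_cells(reported["promoterless FIX knock-in [67]"], 1_000_000))
```

On these figures, co-delivery of a nuclease corresponds to roughly a three-log increase over the earliest report, which is the scale of improvement the text describes.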

Recent Advances and Future Prospects

There remains much to be learnt about AAV vector biology, in particular how specific capsids, whether native or engineered, influence key aspects of vector performance in defined target cell types within and across species. For example, the highly murine liver tropic type 8 capsid performs markedly less well in human hepatocytes as judged by transduction efficiencies achieved in chimeric mouse-human livers [69] and inferred from human clinical trial data [70]. Accordingly, AAV transduction data obtained in mouse models, and even nonhuman primate models, cannot be reliably extrapolated to predict performance in humans. As a result, there is increasing effort to generate data in primary human cells, both in vitro and in vivo using humanized mouse models, and to identify novel native and engineered capsids that are highly tropic for clinically relevant human cell types [69]. Consequently, the power of the AAV vector toolkit can be expected to continue to increase, and higher cell type specific vector genome delivery efficiencies can be expected to confer increased gene targeting efficiencies.

As is already evident from earlier discussion of relatively tractable disease targets, such as haemophilia B, it is highly likely that the achievable genome editing efficiencies will alone be insufficient for many more challenging disease phenotypes. For a proportion of these diseases, largely dependent on the biology of the target cell populations involved, it is likely to become possible to selectively expand the subset of cells bearing desired gene targeting events. Strategies for achieving this are likely to include concurrent correction of mutations with nearby insertion of drug selectable expression cassettes. Again, this point is well illustrated by the developments of selection strategies in the mouse liver [71].

Conclusions

The technologies for genome editing are becoming increasingly powerful and have already been successfully used to achieve phenotype correction in a small number of murine models. The main limitation, however, for human therapy and for application to a broader spectrum of more challenging disease phenotypes, is efficiency. Gene-targeting vectors based on AAV are unique in that they simultaneously combine highly efficient delivery with precise genome-editing capacity. Moreover, they have special promise for direct in vivo applications, where AAV transduction vectors can be used to co-deliver user-designed nucleases if required to further enhance targeting efficiencies. Finally, when considered in concert with evolving strategies for selective expansion of successfully edited cells, the trajectory of genome editing technology is particularly promising such that the prospect of realizing therapeutic benefit is no longer a distant vision.