Introduction

Aquaculture continues to grow faster than any other major food production sector and is quickly becoming the main source of seafood in human diets. In this context, Norway is the largest producer of farmed Atlantic salmon (Salmo salar) worldwide. In recent years, salmon production in Norway has ceased to grow due to sustainability challenges linked to open sea-cage rearing. Genetic introgression of farmed salmon into wild stocks and the marine parasite salmon louse are recognized as the two major concerns1. The high prevalence of salmon lice in most Norwegian fjords, due to open sea-cage farming, causes high lethality in wild salmonids and is hindering expansion of sea-cage farming. The consequences of genetic introgression caused by escapees remain uncertain, but existing knowledge indicates that it may lead to changes in life‐history traits, with potential ecological impacts2,3,4,5. Sequencing of the salmon genome6 has permitted more detailed studies on the link between genes and key traits, and we and others have shown that single nucleotide polymorphisms (SNPs) can, to a certain degree, explain the time of maturity1 and disease resistance7,8. In this context, New Breeding Technologies (NBTs) based on gene editing may offer a solution to some of the problems in salmon farming, with the possible production of salmon displaying traits such as disease resistance and sterility9,10,11,12.

We have previously demonstrated the feasibility of double allelic KO in F0 salmon using CRISPR/Cas9, by targeting genes essential for pigmentation9, elongation of polyunsaturated fatty acids13 and reproduction10. At the same time, CRISPR/Cas9 KO-mutations targeting various phenotypes have been shown by others in several farmed fish species such as tilapia14,15,37,38,39,40,42,44,46. Only three studies in zebrafish have used NGS, showing lower perfect repair rates of 1–4%41, 1.7–3.5%45 and <1%43.

Comparing the different repair templates with respect to perfect repair and integration efficiency, our results showed no significant difference between 24, 48 and 84 bp homology arms using the S ODN. This is in contrast to Boel et al.41, who reported homology arm length to be the most influential factor and 60 bp homology arms to be the optimal length for symmetrical templates. Moreover, no apparent difference was detected between ss- and dsODNs (when used at 1.5 µM), which is in contrast to previous findings in Drosophila48. Our results indicate that the use of ss- vs. dsDNA template is not the main reason for the observed difference in efficiency between the plasmid and the ODNs. We hypothesized that the result could be explained by the fact that the plasmid was injected at a substantially lower molar concentration (10 ng/µl is equivalent to 0.004 µM) than the ODNs (1.5 µM). Taken together, these results suggest a concentration-dependent mechanism for ODN-mediated HDR in salmon embryos. Likewise, it has been reported that a 0.7 kb insert generated 75% edits when injected at 0.5 pmol/l, but only 9% edits when injected at 0.1 pmol/l, in C. elegans49. Ideally, we would have liked to test a range of different concentrations for all the templates in our study, but this has not been feasible. Although salmon experiments face several challenges in terms of availability of material and slow development, we still believe it is crucial to test HDR in salmon, as the outcome of the method seems to vary between species. We hypothesize that the cold rearing temperature and slow development of Atlantic salmon may be an advantage in the context of HDR, allowing a longer timeframe for the integration to occur. The early ontogeny of Atlantic salmon has been described in detail by Gorodilov50, who showed how the duration of the developmental stages from fertilization depends on temperature. For example, if the eggs are kept at 6 °C, it takes about three months until hatching, in stark contrast to two days for zebrafish. Interestingly, it has been reported that cold-shock treatment increases the frequency of HDR gene editing in induced pluripotent stem cells51.

In addition to perfect reads, we detected several reads showing erroneous repair. These reads contained the FLAG insert, but also indels within the homology arms. Most interestingly, we found that the location of these indels was strongly dependent on the polarity of the ODNs. Using the AS ODN, 89% of the indels were located on the 3′-side, and using the S ODN, 90% of the indels were located on the 5′-side. When using the dsODN, the indels were equally distributed on the 5′- and 3′-sides of the insert, indicating that the repair machinery has no preference regarding the template polarity (S vs. AS ODN). This, in combination with the similarity of the inserts to the ODN template sequence, also strongly indicates that DSB repair using ODNs initiates the SDSA pathway, as previously suggested for C. elegans49 and zebrafish41. Our findings suggest that the 3′-end pairing with the template and initial DNA synthesis occur with high fidelity, while the steps involving annealing, gap filling, and ligation are more prone to errors. The cause of these errors is unclear, but various mechanisms of template switching have been suggested41. Our data support the template switching theory, as the inserts predominantly have high similarity to the ODN template sequence (Supplementary Fig. 6). To our knowledge, the ODN (S vs. AS) dependent location of indels has not been reported by others. We suggest taking this information into account when designing ODN repair templates for HDR. To obtain a high rate of in-frame integration, 5′-end indels must be avoided, making AS ODNs the preferred template.

In this study, we have observed that ODNs (S, AS and ds) with 24, 48 and 84 bp homology arms integrate perfectly at a relatively high rate (up to 27%) in salmon embryos. These results were obtained by sequencing DNA from fin clips, which might not perfectly reflect germline transmission efficiency. However, considering the high fecundity of salmon females (8000–10000 eggs), quick integration into broodstock is possible by crossing F0s. For example, if parental F0 fish have 15% perfect integration, crosses will produce ~180–225 F1 offspring with double allelic KI. To increase the efficiency, further studies could focus on the concentration of ODN template, as this clearly affects the efficiency of integration (Fig. 2). However, focus could also be aimed at Cas9. Currently we are using Cas9 mRNA, which probably results in more variants compared to Cas9 protein, as observed previously52. Unfortunately, although we have performed multiple trials with Cas9 protein, we have not yet been able to use it successfully in salmon embryos. It is also possible to explore other nucleases53 to improve the efficiency and accuracy of the CRISPR KI protocol. Another possibility would be to use short-life Cas9 variants, which have been reported to reduce toxicity and off-target editing54,55.
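The F1 estimate above can be reproduced with a short calculation. The following is a minimal sketch, assuming (as the example in the text implicitly does) that the fraction of perfectly edited alleles measured in fin clips equals the fraction of edited gametes in both parents, that gametes combine independently, and that every egg is fertilized and survives. The function name and parameters are illustrative only.

```python
def expected_double_ki(frac_female: float, frac_male: float, eggs: int) -> float:
    """Expected number of F1 offspring carrying the knock-in on both alleles.

    Assumes independent gametes and that each parent's germline carries the
    perfect integration at the given fraction (an assumption, since fin-clip
    data may not reflect the germline).
    """
    if not (0.0 <= frac_female <= 1.0 and 0.0 <= frac_male <= 1.0):
        raise ValueError("fractions must lie between 0 and 1")
    return frac_female * frac_male * eggs


if __name__ == "__main__":
    # Fecundity range and 15% integration are taken from the text.
    for eggs in (8000, 10000):
        n = expected_double_ki(0.15, 0.15, eggs)
        print(f"{eggs} eggs: about {round(n)} double allelic KI offspring")
```

With 15% perfect integration in both parents, the expected fraction of double allelic KI offspring is 0.15 × 0.15 = 2.25%, which gives the range quoted in the text for 8000 to 10000 eggs.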

A challenge with the ODN technology is making the ODNs long enough for, for example, full gene integration. While synthesis of ODNs was previously restricted to fewer than 200 nucleotides, recent technologies now allow generation of longer sequences56,57, and simple ssDNA synthesis over 10 kb using asymmetric PCR has been demonstrated58. Commercial manufacturers also offer synthesis of long ssDNA, although at a relatively high cost. Nevertheless, this enables the insertion of longer sequences such as reporters, gene tags, regulatory elements or even genes. However, editing efficiency is sensitive to insert size, as elegantly shown by Paix and colleagues using the split-GFP system47.

We have compared various DNA repair templates for HDR in salmon, and our results show that ODN templates induce highly efficient HDR integration at the target site, much higher than previously observed in any fish species. Our results also indicate that the integration occurs via the SDSA repair pathway and is dependent on template concentration. Interestingly, our data also give further clues to how the SDSA repair pathway may work, as we show in detail, for the first time in any species, that the distribution of indels is dependent on ODN polarity.

Methods

Ethics statement

This experiment was approved by the Norwegian Animal Research Authority (NARA, permit number 5741) and the use of these experimental animals was in accordance with the Norwegian Animal Welfare Act.

Preparation of Cas9 RNA and gRNA

The slc45a2 CRISPR target sequence is described in Edvardsen et al.9. The target sequence was blasted against the salmon reference genome and showed no hits other than the gene in question. Preparation of gRNA and cas9 mRNA was performed as previously described9 with the following exceptions: for in vitro transcription of gRNA we used the HiScribe T7 Quick High Yield RNA Synthesis Kit (NEB) according to the protocol for short transcripts. The RNeasy MiniKit spin column (Qiagen) was used to purify the gRNA.

Design and preparation of donor DNA templates for slc45a2

S and AS ODNs were ordered from Integrated DNA Technologies (Leuven, Belgium). They were designed by copying 24, 48 or 84 nucleotides on each side of the CRISPR cut site, with a 29 bp insert composed of TT-FLAG-TAA. TT was included to keep the open reading frame of FLAG, and the STOP codon (TAA) was added to ensure an albino phenotype for slc45a2 CRISPR mutants, regardless of whether the KI event was successful. To compare ss- vs. dsDNA, we prepared a dsODN (with 24 bp homology arms) by annealing S and AS. The design is illustrated in Fig. 1. Another pair of S and AS ODNs (24 bp homology arms) was designed for cloning into a plasmid (pCR™4-TOPO vector). The design is identical to the one described above, with the addition of gRNA target sequences on each side for in vivo release of the template and A-overhangs at the 3′ ends. The S and AS ODNs were annealed, and cloning was performed according to the TOPO® TA Cloning® Kit for Sequencing. The different repair templates are described in Table 2.
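The template design described above can be sketched in a few lines of code. This is an illustration under stated assumptions, not the procedure actually used: the FLAG coding sequence shown is one common encoding of DYKDDDDK and is an assumption (the text gives only the insert length and composition), the reference sequence is a random placeholder rather than slc45a2, and the function names are hypothetical.

```python
import random

# Assumed insert: TT + one common FLAG encoding (24 nt) + TAA stop = 29 nt.
FLAG = "GACTACAAAGACGATGACGACAAG"  # assumption, not taken from the text
INSERT = "TT" + FLAG + "TAA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_odn(reference: str, cut_site: int, arm_length: int,
              insert: str = INSERT, antisense: bool = False) -> str:
    """Build a symmetrical repair template around a cut site.

    cut_site is the 0-based index of the first base 3' of the cut, so the
    left arm is reference[cut_site - arm_length:cut_site].
    """
    if cut_site - arm_length < 0 or cut_site + arm_length > len(reference):
        raise ValueError("reference too short for the requested arm length")
    left = reference[cut_site - arm_length:cut_site]
    right = reference[cut_site:cut_site + arm_length]
    sense = left + insert + right
    return reverse_complement(sense) if antisense else sense


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder reference; a real design would use the slc45a2 locus.
    reference = "".join(rng.choice("ACGT") for _ in range(200))
    for arm in (24, 48, 84):
        odn = build_odn(reference, cut_site=100, arm_length=arm)
        print(f"{arm} bp arms: {len(odn)} nt")
```

With a 29 nt insert, symmetrical arms of 24, 48 and 84 nt give total template lengths of 77, 125 and 197 nt.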

Table 2 Description of the different repair templates used. *All repair templates were symmetrical, with both left and right homology arms of the same length. **The polarity of the ssODNs is relative to slc45a2.

Microinjection

Salmon eggs and sperm were delivered by Aquagen (Trondheim, Norway). Fertilization and microinjections were carried out as described previously9 using 50 ng/µl gRNA and 150 ng/µl cas9 mRNA in nuclease-free water and a FemtoJet®4i (Eppendorf) microinjector. The ODNs (S, AS or ds) were added to the injection mix at a final concentration of 1.5 or 0.15 µM, and the plasmid at a final concentration of 2.5 or 10 ng/µl (corresponding to 0.001 and 0.004 µM, respectively).
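The conversion from mass concentration to molar concentration for the plasmid can be checked as follows. This is a rough sketch: the plasmid length is not stated in the text and is assumed here to be about 4100 bp (vector plus cloned template), and the average mass of 650 g/mol per base pair is a commonly used approximation for double-stranded DNA.

```python
AVG_MASS_PER_BP = 650.0      # g/mol per base pair of dsDNA (approximation)
ASSUMED_PLASMID_BP = 4100    # assumption; the text does not give the length


def ng_per_ul_to_micromolar(conc_ng_per_ul: float, length_bp: int) -> float:
    """Convert a dsDNA mass concentration (ng/µl) to micromolar."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    grams_per_litre = conc_ng_per_ul * 1e-3            # 1 ng/µl equals 1 mg/l
    molar = grams_per_litre / (length_bp * AVG_MASS_PER_BP)
    return molar * 1e6                                  # mol/l to µmol/l


if __name__ == "__main__":
    for conc in (2.5, 10.0):
        um = ng_per_ul_to_micromolar(conc, ASSUMED_PLASMID_BP)
        print(f"{conc} ng/µl is roughly {um:.4f} µM")
```

Under these assumptions the two plasmid concentrations round to 0.001 and 0.004 µM, several hundred-fold below the 1.5 µM used for the ODNs.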

Analysis of mutants

When kept at 6–8 °C, the salmon eggs will hatch approximately three months post fertilization. The slc45a2 CRISPR mutants are easily recognized in newly hatched embryos and in juveniles due to the lack of pigment, and these individuals (albinos) were selected for further DNA analyses. DNA was extracted from caudal fins using the DNeasy Blood & Tissue kit (Qiagen). To ensure complete homogenization, the tissue was incubated overnight at 56 °C using a thermomixer. DNA was eluted in 30 µl nuclease-free water. To identify FLAG-positive mutants, PCR was performed on genomic DNA, with the forward primer targeting the FLAG sequence (5′-CTACAAAGACGATGACGAC) and the reverse primer targeting slc45a2 (5′-CGCAACGACTACACATTAT). The PCR products were evaluated on 1% agarose gels. In order to verify insertion of FLAG and to assess the level of mosaicism, a fragment covering the entire target site was amplified in selected samples (n = 76) with a two-step fusion PCR to prepare for sequencing by Illumina MiSeq, as described previously30. The following primer sequences were used in the first PCR step: 5′-tctttccctacacgacgctcttccgatctCAGATGTCCAGAGGCTGCTGCT and 5′-tggagttcagacgtgtgctcttccgatctTGCCACAGCCTCAGAATGTACA (gene-specific sequence indicated in capital letters).

Analysis of MiSeq data

Fastq files were filtered and trimmed with Cutadapt59, and variants were called using a custom script (Supplementary Fig. 5). Finally, read counts were reported for the variants containing the inserted sequence, separating those with a perfect match to the entire target sequence (referred to as perfect HDR), and those with a correct insert sequence but various mismatches in the rest of the target sequence (referred to as perfect FLAG + indels) (Fig. 2). In addition, read counts were reported for variants containing degenerated insert sequences (≥50% intact insert sequence, referred to as degenerated FLAG), and wild type sequences (Supplementary Fig. 3).
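The read categories described above can be illustrated with a simple classifier. This is a sketch only and is not the custom script referred to in the text: the homology arm sequences are toy placeholders, the FLAG encoding is an assumption, and "≥50% intact insert sequence" is interpreted here as the longest contiguous match to the insert covering at least half of its length, which may differ from the criterion actually applied.

```python
from collections import Counter
from difflib import SequenceMatcher

INSERT = "TTGACTACAAAGACGATGACGACAAGTAA"   # assumed TT-FLAG-TAA (29 nt)
LEFT = "ACGTTGCAAGGCTTACCGATAGCT"           # hypothetical 24 nt left arm
RIGHT = "GGATCCTTAGCAATCGGTACCTGA"          # hypothetical 24 nt right arm
WILD_TYPE = LEFT + RIGHT
PERFECT = LEFT + INSERT + RIGHT


def longest_insert_match(read: str) -> int:
    """Length of the longest contiguous stretch of INSERT found in the read."""
    matcher = SequenceMatcher(None, INSERT, read, autojunk=False)
    return matcher.find_longest_match(0, len(INSERT), 0, len(read)).size


def classify(read: str) -> str:
    """Assign a trimmed read to one of the categories used in the text."""
    if read == PERFECT:
        return "perfect HDR"
    if read == WILD_TYPE:
        return "wild type"
    if INSERT in read:
        return "perfect FLAG + indels"
    if longest_insert_match(read) / len(INSERT) >= 0.5:
        return "degenerated FLAG"
    return "other"  # e.g. indels without any recognizable insert


if __name__ == "__main__":
    reads = [
        PERFECT,
        WILD_TYPE,
        LEFT[:-3] + INSERT + RIGHT,      # deletion in the left arm
        LEFT + INSERT[:20] + RIGHT,      # truncated insert
        LEFT + "AAAA" + RIGHT,           # small insertion, no FLAG
    ]
    print(Counter(classify(r) for r in reads))
```

In practice the reads would first be trimmed to the amplicon (for example with Cutadapt, as in the text) so that exact comparison against the expected sequences is meaningful.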

Analysis of indel locations in the “Perfect FLAG + indels” group

All the sequence variants were extracted (after filtration in the previous variant analysis) from the group called “Perfect FLAG + indels”. Using Geneious, the sequences were aligned to the reference gene containing the inserted sequence using the “Highest Sensitivity” option. The alignment was used to extract the information about indel positions, and for each deletion the location of the 5′ end of the deleted sequence was chosen to represent the position. The read count for each indel-containing variant was converted to a percentage of the total read count of variants from the category “Perfect FLAG” for all individuals. The percentages were plotted on the reference sequence with colors showing AS 24 (green), S 24 (blue), ds 24 (red), S 48 (pink) and S 84 (black) (Fig. 3). In order to analyze the variation in indel positions between the different templates, the percentages of indels located at either the 5′- or 3′-side of the inserted sequence were calculated for each group.
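The final step, computing the share of indels on each side of the insert, can be sketched as follows. The input format (a list of position and read-count pairs on the insert-containing reference) and the example numbers are hypothetical; positions are assumed to be 0-based, with each deletion represented by its 5′ end as described above.

```python
from typing import Iterable, Tuple


def side_shares(indels: Iterable[Tuple[int, int]],
                insert_start: int, insert_end: int) -> Tuple[float, float]:
    """Return (percent on 5' side, percent on 3' side), weighted by read count.

    indels: (position, read_count) pairs on the reference containing the insert.
    insert_start / insert_end: half-open interval [start, end) of the insert.
    """
    five_prime = 0
    three_prime = 0
    for position, count in indels:
        if position < insert_start:
            five_prime += count
        elif position >= insert_end:
            three_prime += count
        # Positions inside the insert are ignored: reads in this group
        # carry an intact insert by definition.
    total = five_prime + three_prime
    if total == 0:
        raise ValueError("no indels outside the insert")
    return 100 * five_prime / total, 100 * three_prime / total


if __name__ == "__main__":
    # Hypothetical variants for a template with 24 nt arms and a 29 nt insert.
    example = [(10, 60), (20, 30), (60, 10)]
    print(side_shares(example, insert_start=24, insert_end=53))
```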

Statistical analyses

The D’Agostino-Pearson normality test (column statistics) was used to assess normal distribution of the data. Non-parametric statistical analyses were performed using a Kruskal-Wallis test, followed by Dunn’s multiple comparison test. The tests were carried out using GraphPad Prism 8.0.1.
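For readers unfamiliar with the Kruskal-Wallis test, the H statistic can be computed from pooled ranks as sketched below. This is a generic, standard-library illustration with made-up data; it is not the GraphPad Prism implementation, and it omits the p-value (obtained from a chi-square distribution with k − 1 degrees of freedom for k groups) and Dunn's post hoc comparisons.

```python
from collections import Counter
from typing import Sequence


def kruskal_wallis_h(groups: Sequence[Sequence[float]]) -> float:
    """Tie-corrected Kruskal-Wallis H statistic for two or more groups."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        i = j

    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Correction for ties; equals 1 when all values are distinct.
    tie_sum = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_sum / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical")
    return h / correction


if __name__ == "__main__":
    # Made-up percentages of perfect HDR for three hypothetical templates.
    groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    print(f"H = {kruskal_wallis_h(groups):.2f}")
```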