Objective

Ongoing habitat loss, overexploitation, climate change and other factors have severely reduced the population sizes of many large mammals. In addition to preventing and reversing these drivers of biodiversity loss, ex-situ captive breeding programs can be an effective method for boosting small populations. Such programs are being used to augment the wild populations of several species and prevent their extinction [1, 2].

Genomic resources play an increasingly important role in conservation-related captive breeding projects [2]. For example, measurements of genetic diversity and of relatedness between individuals may inform the choice of breeding animals [1]. Genomic resources may also aid in managing the health and fertility of captive animals [2].

The Eastern black rhinoceros (Diceros bicornis) is critically endangered in the wild. The total number of individuals is estimated at 3,142, many of which live in captivity [3]. A captive breeding program aimed at boosting population size is currently ongoing [4]. Several aspects of this effort benefit from genomic resources, including the genetics-informed choice of breeding animals, microbiome research in relation to health issues, and research into the genetics of male fertility. Several genome assemblies are available for the black rhino, but these are either from a female or were produced using short reads, which prevents the study of Y-chromosomal regions that may affect male fertility. We sequenced the genome of a male black rhino using ONT long reads with the aim of generating long contigs that can be assigned to the sex chromosomes.

We present here the results of the sequencing, assembly and annotation. A preliminary identification of sex-chromosomal regions in this genome assembly is outlined in [5].

Data description

A blood sample was collected from the Eastern black rhino ‘Vungu’, held at Blijdorp Zoo (Rotterdam, The Netherlands), during a routine medical examination. DNA was extracted from whole blood using the Nanobind CBB kit (Circulomics) following the manufacturer’s instructions. ONT sequencing libraries were prepared using ligation sequencing kits v9 (SQK-LSK109) and v10 (SQK-LSK110) and sequenced on a MinION device using both R9 and R10 flow cells. FAST5 data were basecalled using the Guppy basecaller version 3.3.3. This resulted in 5.7 million reads with an average read length of 18 kb (Table 1, Data set 1, [6]). The sequence reads were assembled de novo using Flye with the parameters --nano-raw (regular ONT reads, pre-Guppy5) and -i 2 (two polishing iterations) [7]. This resulted in a draft genome assembly of 2.47 Gb, consisting of 834 contigs with a contig N50 of 29.5 Mb. The draft genome assembly was annotated by lifting over the annotation of accession GCA_013634535.1 [8], which is the only black rhino genome assembly for which an annotation is currently available. Over 99.9% of gene features were retrieved in the new genome assembly.
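For readers less familiar with the contig N50 statistic reported above, the following minimal Python sketch illustrates how it is commonly defined: the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. The function name contig_n50 and the toy contig lengths are illustrative assumptions; they are not part of the pipeline used here and do not represent the rhino assembly.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is taken here as the length of the contig at which the running
    total, over contigs sorted longest first, reaches at least half of
    the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not the rhino contigs):
# total = 150, half = 75; 50 + 40 = 90 reaches 75, so N50 is 40.
toy_lengths = [50, 40, 30, 20, 10]
print(contig_n50(toy_lengths))
```

In practice the input lengths would be read from the assembly FASTA file or from the summary table that the assembler writes alongside it.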

Table 1 Overview of data files/data sets

Limitations

The genome assembly is still fragmented and could be improved further using Hi-C, Bionano optical mapping or other scaffolding techniques.