Introduction

γδ T cells are the ‘third’ type of lymphocytes, besides αβ T cells and B cells, that can rearrange gene segments at the DNA level in order to generate variable antigen receptors1,2. These three cell lineages have been conserved seemingly since the emergence of jawed vertebrates, with the notable exception of squamate reptiles3, while a similar tripartite subdivision exists even in jawless vertebrates such as lamprey and hagfish4,5. Emerging evidence suggests that their role in early life immunity might be a critical factor for this striking evolutionary conservation2,6,7,8,9,10,11,12,13,14,15. Indeed, human γδ T cells have been shown to react vigorously to infections in utero10,16 and early environmental post-natal exposure14,15 and in mouse models γδ T cells confer protection against parasite and viral infections in early life and/or when the αβ T cell compartment is compromised6,17,18. Furthermore, besides protection against infection, mouse models indicate that fetal-derived γδ T cells may play crucial physiological roles such as thermoregulation and the development of brain/short-term memory19.

Translation of γδ T cell biology findings from mouse models to humans is complicated by the lack of conservation of the γ and δ loci20,21. For example, in contrast to the conservation of αβ TCR-expressing innate T cells (MR1/metabolite-reactive MAIT, CD1d/lipid-reactive iNKT), human phosphoantigen-reactive Vγ9Vδ2 T cells do not exist in mice, and, vice versa, no human homolog of mouse dendritic epidermal T cells (DETC, γδ T cells highly enriched in the mouse skin epidermis) has been identified5. In humans, it is becoming increasingly clear that the phosphoantigen-reactive γδ T cells are innate-like T cells, while nonVγ9Vδ2 T cells adopt an adaptive nature2,22,23,24. Despite this increasing knowledge about the effector functions and TCR recognition modalities, little is known about the thymic development of human γδ T cells.

Like αβ T cells, γδ T cells are generated in the thymus where rearrangement of V, D, and J gene segments takes place in order to form a TCR at their cell surface. Conventional CD4 and CD8 αβ T cells leave the thymus as naïve T cells that can develop into the right functional effector cells in the periphery, depending on the type of pathogen encounter, such as cytotoxic CD8 αβ T cells and type 1, 2 or type 3 CD4 αβ T helper cells. Recent single-cell analysis revealed, however, that this CD4+ effector T cell pool generated in response to various pathogens cannot be easily parsed into discrete T helper lineages but instead forms a continuum of polarized phenotypes that is shaped by the specific pathogens25,26. We have recently shown that human fetal γδ thymocytes are already functionally programmed and are highly enriched for several ‘human-specific’ invariant/public TCR sequences27,28. Whether human γδ T cells are pre-committed towards distinct functional effector programs and whether this is linked to the expression of specific invariant/public fetal thymic γδ TCR sequences is not known. In particular, the distinct features of Vγ9Vδ2 T cells suggest that they could follow different rules during their thymic development.

Here, we took advantage of combining single cell (sc) RNA gene expression (RNA-seq) with sc γδ TCR sequencing to unravel human γδ T cell development. As such we identify developmental stage-specific thymocyte effector clusters and their concomitant TCR repertoire and differentiation pathways.

Results

Experimental design

In order to obtain insight into the effector programming in the human fetal thymus, we performed sc RNA/TCR sequencing on γδ thymocytes from six fetal thymuses, in parallel with three pediatric thymuses (Fig. 1A), followed by confirmation of selected findings at the protein level by flow cytometry. We sorted γδTCR+CD3+ thymocytes (Supplementary Fig. 1A) before applying the scRNA/TCR sequencing protocol in order to unequivocally attribute to γδ T cells particular gene expression profiles that can be shared by other (innate-like) lymphocytes29,30,31,32. Flow cytometry results showed a negative correlation between the gestational age of the fetus and the frequency of γδ thymocytes (Fig. 1B) and of Vγ9Vδ2 T cells among γδ thymocytes (Fig. 1C). From this analysis we selected for the scRNA/TCR experiments a series of fetal thymuses ranging from 14 to 22 weeks of gestation, allowing the analysis of γδ thymocytes across these different ages, in particular the comparison of Vγ9Vδ2 and nonVγ9Vδ2 thymocyte development. The human pediatric γδ thymocytes, possessing only a low percentage of Vγ9Vδ2 T cells27 (Fig. 1C), did not show a correlation with post-natal age, and we selected the ages 4.0, 4.5, and 11.0 years (Supplementary Fig. 1B).

Fig. 1: Experimental approach.

A γδ T cells were sorted from human fetal and pediatric thymuses and were subjected to a modified version of the 10x Genomics 5′ protocol in order to amplify CDR3δ and CDR3γ regions. Flow chart created with BioRender.com. B Frequency (%) of γδ thymocytes out of total CD3+ living cells. C Frequency (%) of Vγ9Vδ2 thymocytes out of γδ+ thymocytes. R and p values (two-tailed) in B, C were obtained by Spearman correlation test in the gestational age graphs, while dot plots were analyzed by two-tailed unpaired t-test. B, C White dots indicate samples used in the sc RNA/TCR-seq experiments. “FT” group: fetal thymus. “PNT” group: post-natal thymus/pediatric thymus. Source data are provided as a Source Data file. See also Supplementary Fig. 1.

scRNA sequencing identifies heterogeneous immature and mature γδ thymocyte clusters in the human fetal thymus

After quality control and integration of the fetal γδ thymocyte datasets (n = 6), a total of 16,508 γδ thymocytes were retained for downstream analysis (average gene number = 1875; average Unique Molecular Identifier (UMI) count = 2851) (Supplementary Fig. 2A). Plotting the cells by Uniform Manifold Approximation and Projection (UMAP) led to the identification of 11 distinct clusters, some of which were more enriched in certain subjects (Fig. 2A, Supplementary Fig. 2B). Assessment of thymocyte maturation markers33,34,2,28 allowed us to obtain insight into the development of the three effector fates. Surprisingly, upon maturation all three fetal thymic effector types decreased their number of N additions, decreased their CDR3 length and increased the level of publicity of their TCRs. Interestingly, the Vγ9Vδ2 T cells of the small type 1/type 3 effector cluster in the post-natal thymus appeared to undergo similar enrichments upon maturation, in contrast to the Vδ1+NKp30+ post-natal cluster. During the transition from the DP (double positive) towards the SP (single positive) stage in αβ thymocyte maturation, the CDR3 of the α and β chain becomes shorter, which is related to MHC-imposed structural constraints131,132. The reason for the preference for short public TCRs upon MHC-independent γδ thymocyte maturation towards effector fates is unclear. Since these sequences are encoded by germline-encoded gene segments or only contain a very low number of N additions, the position of the amino acids within their CDR3 is less variable and may provide the ‘right’ TCR signal. A possible example is the presence of a hydrophobic amino acid at position 5 of phosphoantigen-responsive TRDV2-containing CDR3 sequences82,83, of which the frequency increased upon maturation towards effector fates in the fetal thymus and which showed preferential pairing with the public TRGV9-containing CALWEVQELGKKIKVF CDR3. 
When more N additions are present, this can lead to, besides an increase in CDR3 length as such, a displacement of the germline-encoded CDR3 residues thus decreasing the chance to have a germline-encoded hydrophobic residue at position 5 of the CDR3δ sequence. Finally, shorter CDR3 length may influence the position of regions outside the CDR3 (such as the hypervariable region 4, HV4) of the TCR and thus the interaction with butyrophilins in the human thymus5,99,133. The relative contribution to the TCR signal in this setting of the CDR3 versus non-CDR3 TCR regions remains to be determined, but the significant changes observed here at the level of the CDR3 upon thymic maturation highlight the importance of the CDR3 in the maturation towards effector fates in the human fetal thymus.

The identification of different CDR3 sequence enrichments in the three fetal thymic effector clusters, combined with their pseudotime developmental trajectories, strongly suggests the presence of three developmental pathways. The fact that the three effector clusters did not show differences in CDR3 N additions argues against a different timing in the generation of particular CDR3 sequences (as observed in the mouse model) as a possible explanation for their association with particular effector fates. We rather propose that the CDR3 sequence contributes to a difference in TCR signal strength during maturation and thus, together with signaling from other receptor types such as NKR and/or cytokine receptors and/or precursor frequencies59,74,75, to the type of effector fate. Type 1 and type 2/3 clusters split early on during the development of immature γδ thymocytes. Type 1 γδ thymocytes then went through several early stages of maturation associated with strong TCR and associated co-stimulation signaling, which is in line with the need for such signals for the development of type 1 γδ thymocytes in (genetically-modified) mouse models53,95,96,134. Furthermore, we identified genes (TNFRSF9, XCL1, TNFRSF13) across the human type 1 developmental pathway that are also highly expressed during the thymic Skint1-mediated and extra-thymic Btnl1-mediated TCR-dependent selection of mouse type 1 γδ T cells89,135. Of note, TNFRSF9 (4-1BB) has been shown to be induced on human Vγ9Vδ2 T cells by phosphoantigens136, consistent with a TCR-dependent and butyrophilin-dependent (BTN3A1, BTN2A1, BTN3A2)5,133 regulation of this co-stimulatory receptor. Thus, despite the known differences between mouse and human γδ T cells2,123,137 and the large difference in the timing of the development of the fetal/neonatal immune system138,139, our observations during human fetal type 1 γδ thymic development are strikingly similar to what has been shown in mouse models. 
The type 3 and type 2 developmental pathways were largely shared but split at the final maturation stage. Overall, based on the expression patterns of TCR-signaling related markers57,96,134,140 and of the transcription factor PLZF103,110,134,141,142, we propose that differences in timing and strength in TCR signaling result in associated differences of the transcription factor PLZF that then guides the final thymic γδ effector fate.

In summary, we have generated a cell atlas of human γδ thymocyte development across fetal and post-natal life, from the most immature stages until programmed effector fates. This combined database of gene expression and detailed TCR information at the single-cell level has provided insight into γδ T cell development in the human thymus and provides a resource for further study.

Methods

Human fetal and post-natal thymus

Human fetal thymus samples (n = 11) were obtained from 14 to 22 week estimated gestational age elective pregnancy terminations carried out for socio-psychological reasons with approval of the Singapore Singhealth Research Ethics Committee. Women gave written informed consent for the donation of fetal tissue to research nurses who were not directly involved in the research, or in the clinical treatments of women participating in the study. All the donors were informed about the purpose of the research and there was no compensation offered for donation. All fetuses were considered structurally normal on ultrasound examination prior to termination and by gross morphological examination following termination. Human pediatric thymus (9 donors aged between 1 and 11 years) samples were obtained from children that underwent cardiac surgery with approval of the Medical Ethical Commission of the Ghent University Hospital (Belgium). Samples from the previous sources were collected after all participants (when applicable, mothers/parents) gave written informed consent in accordance with the Declaration of Helsinki. Cell suspensions from fetal thymus and post-natal thymus samples were obtained as previously described28.

Flow cytometry and sorting of γδ thymocytes

For flow cytometry (assessment of percentage of γδ and Vγ9Vδ2 thymocytes) and associated cell sorting (FACS) of the samples used to generate the single-cell libraries, cells were thawed in complete medium, washed twice, labeled with Zombie NIR dye (0.5:100; BioLegend), and then stained with antibodies directed against CD3 (dilution 1:100 for flow cytometry and 2.5:100 for sorting; clone UCHT1; BV510 for flow cytometry or PB for sorting; BD Biosciences), TCRγδ (dilution 1:100 for flow cytometry and 15:100 for sorting; clone 11F2; APC (Miltenyi Biotec) for flow cytometry and PE (BD Biosciences) for sorting), TCRVγ9 (dilution 0.25:100 for flow cytometry and 0.625:100 for sorting; clone IMMU360; PE-Cy5; Beckman Coulter) and TCRVδ2 (dilution 4:100 for flow cytometry and 10:100 for sorting; clone IMMU389; FITC; Beckman Coulter). For sc experiments, CD3+γδTCR+ thymocytes were sorted (mean purity 98% of living cells) on a FACS Aria III (BD Biosciences). For bulk RNAseq experiments, the CD3+γδTCR+ thymocytes were further sorted into CD3+γδTCR+Vγ9+Vδ2+ as “Vγ9Vδ2” (mean purity 95% of living cells), and CD3+γδTCR+ non(Vγ9Vδ2) as “non-Vγ9Vδ2” γδ T cells (mean purity 95% of living cells)28; αβ T cells (CD3+TCRγδ−) were sorted as well in parallel (all around 10,000 cells) on a FACS Aria III cell sorter (BD Biosciences), snap-frozen in liquid nitrogen, and stored at −80 °C for later RNA extraction. 
To validate the presence of the distinct populations identified in the single cell data the following antibodies were used: CD3 (dilution 1:100, clone UCHT1; BV510; BD Biosciences), TCRγδ (dilution 1:100, clone 11F2; APC; Miltenyi Biotec), TCRVγ9 (dilution 0.25:100, clone IMMU360; PE-Cy5; Beckman Coulter) and TCRVδ2 (dilution 4:100, clone IMMU389; FITC; Beckman Coulter), CD4 (dilution 1:100, clone SK3; BUV395; BD Biosciences), CD26 (dilution 2:100, clone M-A261; BUV496; BD Biosciences), NKG2D (dilution 3:100, clone 1D11; BV421; Biolegend), CD196 (dilution 2:100, clone 11A9; BV650; BD Biosciences), CD1a (dilution 2:100, clone HI149; BV711; Biolegend), CD278 (dilution 0.5:100, clone C398.4A; BV785; Biolegend), CCR4 (dilution 2:100, clone 1G1; PE; BD Biosciences), CD94 (dilution 2:100, clone DX22; PE-Cy7; Biolegend), CD161 (dilution 2:100, clone DX12; R718; BD Biosciences), CD8a (dilution 2:100, clone RPA-T8; APC-Cy7; BD Biosciences), NKp30 (dilution 2:100, clone P30-15; PE-Dazzle; Biolegend). In these protein validation experiments, measurements were taken from 9 distinct fetal samples and 8 infant samples that were thawed in complete medium and washed twice prior to staining. iFluor860 (infrared fixable viability dye) (dilution 0.05:100; AAT Bioquest) was used to gate on live cells. In all cases (FACS or flow cytometry experiments), the data were analyzed using FlowJo software v10 (Tree Star). To generate the UMAP plots in Fig. 3 and Fig. 8, we used the FlowJo plugin “UMAP” (v3.1), and in both cases we computed it using Euclidean distances with 2 components, a value of 15 for the nearest neighbors parameter and a value of 0.5 as minimum distance. For the UMAP of Fig. 3, the dimensionality reduction used the following cell surface markers: CD1a, CD94, CD161, CD4, ICOS, CCR4, CCR6, CD26, and NKG2D. For the UMAP of Fig. 8, we used the values from the following markers: CD1a, CD94, CD161, CD4, ICOS, CCR4, CCR6, CD26, NKG2D, TCRVδ2, and TCRVγ9. 
In this last case, we decided to include TCRVδ2 and TCRVγ9 markers to facilitate the visualization of the small Vγ9Vδ2 effector cluster.

Single-cell RNA-seq and single-cell TCR (TRD/TRG)-seq libraries construction

Libraries for sc RNA and TCR sequencing were generated from 0.5–2 × 10⁴ FACS-sorted γδ thymocytes from six fetal subjects and three children using the Chromium Single Cell 5′ Library Gel Bead and Construction kit as well as the Chromium Single Cell V(D)J Enrichment Kit (10x Genomics, CA, USA) according to the user guidelines (v1 [PN-1000006] and v2 [PN-1000244] Chemistry, Single Cell V(D)J protocol number CG000086 and CG000331). Fetal sample selection included six fetuses with an estimated gestation time of 14 weeks, 15 weeks and 2 days, 16 weeks and 2 days, 17 weeks and 5 days, 21 weeks, and 22 weeks and 6 days, while post-natal thymuses were from patients aged 4, 4.5, and 11 years, respectively. Measurements were taken from these distinct samples.

Single-cell TCR and gene expression libraries were generated according to the Chromium Single Cell V(D)J protocol (10x Genomics). 2 μL of cDNA amplified and purified from GEMs (“Gel bead in EMulsion” droplets) were used to amplify γδTCR CDR3 sequences. Custom primers specific for TRDC and TRGC constant gene segments were designed for this purpose and were obtained from Eurogentec. In brief, for the first step in the enrichment of CDR3 sequences the custom primers TRGC: CAAGAAGACAAAGGTATGTTCCAG and TRDC: GTAGAATTCCTTCACCAGACAAG were used, while for the second target enrichment Cgamma ‘inner’: AATAGTGGGCTTGGGGGAAACATCTGCAT and Cdelta ‘inner’: ACGGATGGTTTGGTATGAGGCTGACTTCT were used. The remaining cDNA was used for gene expression library construction according to the 10x Genomics protocol instructions. Agilent Bioanalyzer High Sensitivity DNA chips were used to check quality control read-outs of sc RNA-seq and sc TCR-seq libraries using a Bioanalyzer 2100 machine (Agilent Technologies). Indexed libraries were pooled and sequenced on an Illumina NovaSeq 6000 at the BRIGHTcore (Brussels Interuniversity Genomics High Throughput core) platform.

Single-cell RNA-seq data processing

CellRanger (v3.0.2) software from 10x Genomics was used to demultiplex and map sequencing reads against the GRCh38 genome. Count matrices were loaded into R using the ‘Read10X’ function from the Seurat R package. All downstream analyses were implemented using R v4.0.3 and the package Seurat (v3.2.3)143. Low-quality cells were filtered using the cutoff nFeature_RNA ≥ 200, while the cutoff for maximal nFeature_RNA was set manually for each sample according to the sample's cell distribution in order to exclude doublets. Percentages of mitochondrial genes were plotted as well, and outliers were removed to filter out dead cells. The ‘CellCycleScoring’ function from the Seurat package was used to assign the cell cycle phase of cells in the datasets (G1, G2/M, or S). The integration vignette from Seurat v3.0 was followed to generate merged Seurat objects (FT: 6 fetal thymus samples; PNT: 3 pediatric thymi) using the ‘SCTransform’ function144 and regressing out mitochondrial genes, cell cycle genes and TRDV and TRGV genes. Principal components (PCs) were calculated using ‘RunPCA’ and, based on ‘ElbowPlot’ visualization, 20 dimensions were chosen as input for the ‘RunUMAP’ function. UMAP representation was used to generate bidimensional coordinates for each cell. The k-nearest neighbors of each cell were computed using the ‘FindNeighbors’ function and this knn graph was used to construct the shared nearest neighbor (SNN) graph by calculating the neighborhood overlap (Jaccard index) between every cell and its k.param nearest neighbors. Finally, the ‘FindClusters’ function was used to cluster cells using the Louvain algorithm based on the same PCs as the RunUMAP function (algorithm resolution FT = 0.3 and PNT = 0.5). The clustering resolution was chosen after analyzing the evolution of the clusters at different resolutions with the clustree R package (v0.4.3). 
Differential gene expression analysis comparing each cluster to all the others was performed with the ‘FindAllMarkers’ function using the Wilcoxon rank-sum test. DEGs were selected based on an average log2-fold change (logFC) ≥ 0.2, expression in more than 10% of cells in at least one of the compared groups (min.pct ≥ 0.1), a difference higher than 15% in the fraction of detection between the two groups (min.diff.pct ≥ 0.15) and an adjusted p-value below 0.05 (based on Bonferroni correction using all genes in the dataset). The dittoSeq (v1.4.1) R package was used extensively to visualize Seurat object data.
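The selection thresholds above can be illustrated with a minimal Python sketch. It is not the authors' R code: the record type, function name and toy values are hypothetical, and it assumes that per-gene statistics (log2 fold change, detection fractions, raw p-value) have already been computed.

```python
from dataclasses import dataclass


@dataclass
class MarkerResult:
    """One gene tested in one cluster versus all other cells (hypothetical record)."""
    gene: str
    avg_log2fc: float   # average log2 fold change, cluster vs. rest
    pct_in: float       # fraction of cluster cells in which the gene is detected
    pct_out: float      # fraction of remaining cells in which the gene is detected
    p_value: float      # raw Wilcoxon rank-sum p-value


def select_degs(results, n_genes_total, min_logfc=0.2, min_pct=0.1,
                min_diff_pct=0.15, alpha=0.05):
    """Return the names of genes passing the thresholds stated in the text."""
    selected = []
    for r in results:
        # Bonferroni correction over all genes in the dataset, capped at 1.
        p_adj = min(1.0, r.p_value * n_genes_total)
        passes = (
            r.avg_log2fc >= min_logfc
            and max(r.pct_in, r.pct_out) >= min_pct
            and abs(r.pct_in - r.pct_out) >= min_diff_pct
            and p_adj < alpha
        )
        if passes:
            selected.append(r.gene)
    return selected


# Toy values, invented for illustration only.
example = [
    MarkerResult("GENE_A", 0.8, 0.60, 0.20, 1e-9),  # passes every filter
    MarkerResult("GENE_B", 0.1, 0.60, 0.20, 1e-9),  # fold change too small
    MarkerResult("GENE_C", 0.8, 0.30, 0.25, 1e-9),  # detection difference too small
    MarkerResult("GENE_D", 0.8, 0.60, 0.20, 1e-4),  # not significant after Bonferroni
]
print(select_degs(example, n_genes_total=20000))  # prints ['GENE_A']
```

Only GENE_A survives because each of the other toy genes violates exactly one criterion, which makes the role of each threshold easy to see.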

Module scores

Single-cell gene signature enrichment scores were calculated using the ‘AddModuleScore’ function with the default parameters in Seurat. The egress score was manually curated using markers previously described to be involved in thymocyte egress to the periphery35,45,46,72. The type 1 score was named “CTL” score in the original paper where it was defined145 and the type 3 score was termed “γδ17” score in the original paper54.
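As a rough illustration of what a signature score measures, the Python sketch below subtracts the mean expression of a control gene set from the mean expression of the signature genes in a single cell. This is a deliberate simplification: Seurat's function draws control genes from expression-matched bins, which is not reproduced here, and all gene names and values are placeholders rather than the curated lists used in the study.

```python
import random
from statistics import mean


def module_score(cell_expr, signature, control_pool, n_control=5, seed=0):
    """Simplified signature score for one cell.

    cell_expr: dict mapping gene -> normalized expression in a single cell
    signature: genes of the signature (placeholder names in this sketch)
    control_pool: candidate control genes; sampled uniformly here, NOT from
        expression-matched bins as Seurat does.
    """
    rng = random.Random(seed)
    present = [g for g in signature if g in cell_expr]
    if not present:
        raise ValueError("no signature gene found in this cell")
    pool = [g for g in control_pool if g in cell_expr and g not in signature]
    controls = rng.sample(pool, min(n_control, len(pool)))
    control_mean = mean(cell_expr[g] for g in controls) if controls else 0.0
    return mean(cell_expr[g] for g in present) - control_mean


# Placeholder cell: two signature genes and three control genes.
cell = {"SIG_1": 2.0, "SIG_2": 3.0, "CTRL_1": 1.0, "CTRL_2": 1.0, "CTRL_3": 1.0}
score = module_score(cell, ["SIG_1", "SIG_2"],
                     ["CTRL_1", "CTRL_2", "CTRL_3"], n_control=2)
print(score)
```

A positive score indicates that the signature genes are expressed above the control background in that cell.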

GO and pathway enrichment analyses

Gene ontology (GO) analysis was performed with the clusterProfiler package (v4.0.5)146. The gene list was ranked by logFC (decreasing order) obtained after comparing effector fetal clusters with the rest of the cells using the ‘FindMarkers’ function from Seurat with a min.pct ≥ 10%. GSEA was run using the gseGO function with default parameters and the Benjamini–Hochberg method to obtain adjusted p-values. Enrichment results were plotted using the ggplot2 R package (v3.3.5).
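Two steps of this procedure lend themselves to a short worked example: ranking genes by decreasing logFC and adjusting p-values with the Benjamini–Hochberg method. The Python sketch below is a generic implementation of both, not the clusterProfiler code; function names and toy values are illustrative.

```python
def rank_genes(logfc_by_gene):
    """Order genes by decreasing log fold change, as input for a ranked-list test."""
    return sorted(logfc_by_gene, key=logfc_by_gene.get, reverse=True)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is the
    # minimum of p * n / rank over all ranks greater than or equal to its own.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy inputs, invented for illustration only.
ranked = rank_genes({"GENE_A": 0.5, "GENE_B": -1.0, "GENE_C": 2.0})
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(ranked)
print(adjusted)
```

The backward pass enforces monotonicity, so a term with a smaller raw p-value never receives a larger adjusted p-value than a term ranked below it.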

Single-cell TCR-seq analysis

Sc TCR libraries were processed using the CellRanger vdj pipeline (v3.0.2). Integrated FT and PNT Seurat objects (gene expression data) were combined with their respective sc TCR-seq (TCR sequence) data based on shared 10x cell barcodes and following the script provided here: https://www.biostars.org/p/384640/. Only those cells expressing productive TCR sequences (γ and/or δ chain) were retained for data integration in the Seurat objects using the ‘AddMetaData’ function. TRDV and TRGV sequences were used to check N nucleotides (N additions) and publicity levels. The number of N nucleotides was obtained using the junctional analysis tool from the IMGT® (international ImMunoGeneTics information system®) website (MP, 2003). Barcodes were kept as identifiers for the input of the website tool and later used to embed the junctional information in the metadata of the Seurat objects, and the results were subsequently plotted using the ggplot2 package (v3.3.2). Publicity of TRDV and TRGV sequences was established by comparing individually all the CDR3 sequences from each single-cell dataset against the CDR3 sequences of the other single-cell datasets (9 subjects in total, 6 fetal and 3 post-natal). In order to strengthen the analysis of publicity levels of CDR3 sequences, we decided to increase the number of subjects in the different comparisons by including CDR3 repertoire data obtained previously by bulk TCR repertoire sequencing. This bulk TCR data included previously published data27,28 and also unpublished data, resulting in a series of γδ thymocyte repertoires of 10 different subjects (3 fetal thymus samples and 7 pediatric thymus samples). The CDR3 data of these 10 bulk TCR repertoires was originally divided into 2 files: data from sorted Vγ9Vδ2 thymocytes14 and nonVγ9Vδ2 γδ thymocytes28. Because the goal of the publicity analysis is to check whether a specific sequence is present in a subject, we decided to merge the two files (Vγ9Vδ2 and nonVγ9Vδ2) into single combined files. 
Using the base and dplyr (v1.0.7) R packages, the amino acid CDR3 sequences of each of the thymi from the sc Seurat objects (6 subjects in the fetal thymus dataset and 3 subjects in the post-natal dataset) were interrogated individually against the bulk TCR data. The results of this analysis ranged from publicity values of 0 (present only in the interrogated sc TCR data) to 19 (present in the sc TCR data of the 6 fetal thymus samples and 3 pediatric thymus samples and the 10 bulk TCR repertoires). Results were added back to the Seurat objects as metadata and plotted using the ggplot2 package.
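The publicity count can be pictured with a small Python sketch. It is an assumption about the counting logic rather than the authors' R/dplyr code: for each CDR3 of the interrogated subject it counts how many other repertoires contain the same amino acid sequence. Repertoire labels and the placeholder sequence are hypothetical; CALWEVQELGKKIKVF is the public TRGV9 CDR3 mentioned in the text.

```python
def publicity_scores(query_cdr3s, other_repertoires):
    """Count, for each CDR3 of one subject, how many other repertoires contain it.

    query_cdr3s: CDR3 amino acid sequences of the interrogated subject
    other_repertoires: dict mapping a repertoire label to its CDR3 sequences;
        the interrogated subject itself must not be included.
    """
    repertoire_sets = {name: set(seqs) for name, seqs in other_repertoires.items()}
    return {
        cdr3: sum(cdr3 in seqs for seqs in repertoire_sets.values())
        for cdr3 in set(query_cdr3s)
    }


# Hypothetical labels and a placeholder private sequence.
others = {
    "sc_fetal_2": ["CALWEVQELGKKIKVF"],
    "sc_postnatal_1": ["OTHER_PLACEHOLDER_SEQ"],
    "bulk_fetal_1": ["CALWEVQELGKKIKVF", "OTHER_PLACEHOLDER_SEQ"],
}
scores = publicity_scores(["CALWEVQELGKKIKVF", "PRIVATE_PLACEHOLDER_SEQ"], others)
print(scores)
```

Converting each repertoire to a set first keeps every membership test constant-time, which matters when repertoires contain many thousands of sequences.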

Lineage inference

Pseudotime trajectory analysis of fetal γδ thymocytes was performed with the Slingshot R package (v2.0.0)147. In order to remove confounding factors, we excluded cycling cells (G2/M and S phase) and cells belonging to the type I IFN cluster, following the same reasoning described previously in the literature72. Then, principal components (PCs) were calculated using ‘RunPCA’ and, based on ‘ElbowPlot’ visualization, 20 dimensions were chosen as input for ‘RunUMAP’, which was performed for 5 dimensions (instead of the standard 2 dimensions, to reduce the distortion generated by the process of dimensionality reduction that can influence the lineage tracing results). Lineages were computed after selecting the cluster with immature features (based on gene expression) as a root. The calculated trajectories were overlaid onto the UMAP embeddings. Genes that varied across the Slingshot trajectories were investigated with the tradeSeq R package (v1.6.0)148, and were plotted as heatmaps of smoothed scaled gene expression using the ‘predictSmooth’ function from tradeSeq and the pheatmap R package (v1.0.12). The code used to generate the Slingshot object and to run the tradeSeq package was obtained from the following website: https://nbisweden.github.io/workshop-scRNAseq/labs/trajectory/slingshot.html#Finding_differentially_expressed_genes.

Bulk RNA sequencing

RNA derived from sorted cell populations (Vγ9Vδ2, nonVγ9Vδ2 γδ, αβ) was isolated using the RNeasy micro kit (Qiagen, Cat. No./ID: 74004). RNA quality was checked using a Bioanalyzer 2100 (Agilent Technologies). Indexed cDNA libraries were obtained using the Ovation Solo RNA-Seq System (NuGen) following the manufacturer’s recommendation. The multiplexed libraries were loaded on a NovaSeq 6000 (Illumina) using an S2 flow cell, and sequences were produced using a 200 Cycle Kit (Illumina, PN: 20028313). Paired-end reads were mapped against the human reference genome GRCh38 using STAR software (version 2.7.10a) to generate read alignments for each sample. Annotations (Homo_sapiens.GRCh38.90.gtf) were obtained from ftp.ensembl.org. After transcript assembly, gene-level counts were obtained using HTSeq software. Differential expression analysis was performed using the edgeR quasi-likelihood method running under the Degust platform. Only genes with a minimum count per million of 1 in each replicate were included. Volcano plots were generated using the EnhancedVolcano R package (v1.10).
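The expression filter (a minimum count per million of 1 in each replicate) can be sketched in Python as follows. This is a simplified illustration that divides raw counts by the total library size; it does not reproduce any library-size normalization that edgeR or Degust may apply, and the sample labels and counts are invented.

```python
def counts_per_million(counts_by_sample):
    """Convert raw gene counts to counts per million (CPM) for each sample.

    counts_by_sample: dict mapping sample -> dict mapping gene -> raw read count
    """
    cpm = {}
    for sample, counts in counts_by_sample.items():
        library_size = sum(counts.values())
        if library_size == 0:
            raise ValueError(f"sample {sample!r} has no reads")
        cpm[sample] = {g: c * 1e6 / library_size for g, c in counts.items()}
    return cpm


def genes_passing_cpm(counts_by_sample, min_cpm=1.0):
    """Keep genes whose CPM reaches min_cpm in every replicate."""
    cpm = counts_per_million(counts_by_sample)
    genes = set.intersection(*(set(c) for c in cpm.values()))
    return sorted(g for g in genes if all(cpm[s][g] >= min_cpm for s in cpm))


# Toy counts: G2 reaches 10 CPM in rep1 but only 0.5 CPM in rep2, so it is dropped.
toy_counts = {
    "rep1": {"G1": 999_990, "G2": 10, "G3": 0},
    "rep2": {"G1": 1_999_999, "G2": 1, "G3": 0},
}
kept = genes_passing_cpm(toy_counts)
print(kept)
```

Requiring the threshold in every replicate, rather than on average, removes genes that are detected in only some samples.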

Statistical analysis

All statistical analyses were performed using GraphPad Prism software (v8.0.2).

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.