Introduction

Wheat (Triticum aestivum L.) is one of the most important staple crops in the world, and estimates suggest that wheat production needs to increase 70% to fulfill food demand [1]. Therefore, maintaining food security and increasing yields to meet the higher demand of a growing global population has become a top priority for breeding programs [2].

Thousand gain weight (TGW), grain number per spike (GNS) and spike number per unit area are three major components of grain yield [3]. Among the above factors, the grain number per spike is affected by inflorescence architecture [4,5,6]. Plant inflorescences are small florets arranged on an axis, which can be divided into many types, such as racemes, spikes, corymbs, panicles etc. [7]. The wheat inflorescence is a spike, that usually consists of sessile spikelets arranged in opposite rows along the spike axis, each producing 3–5 florets [8]. Therefore, inflorescences, which determine the classification of Gramineae and the yield of food crops, have long been the subject of study of botanists and breeders [9]. In wheat germplasm resources, Supernumerary Spikelets (SS) is a special germplasm with many flowers and grains, which has the potential to improve the yield per unit area in wheat [17]. In addition, QFSN4B.4–17 responsible for SNS in multiple environments using a RIL population of 173 lines derived from a cross of the common winter wheat lines Shannong 01–35 and Gancheng 9411 was also identified previously [18].

To date, several genes affecting inflorescence structure and SNS have been identified and designated. For example, the HvMADS1 can directly regulate the expression of HvCKX3 by binding to the promoter through a CArG-box domain, which affect the cytokinin homeostasis and inflorescence structure in barley [19]. In rice, APO1 interacts with APO2 to synergistically control cell proliferation in meristem and regulates panicle formation, and the overexpression of APO1 or APO2 has been associated with significantly increasing the spikelet number [4, 20, 21]. Several genes related to SNS were identified in wheat, including WFZP [22], Ppd-1 [23], FT2 [24], WAPO1 [25], Q gene [26], and Grain Number Increase 1 [27].

Although there are many reports about QTL related to wheat spikelet number, only few of QTL have been genetically verified, offering a foundation for fine map** and map-based cloning. This has greatly restricted the dissection of the molecular mechanisms underlying spikelet number as well as improvement of spikelet number in wheat breeding. Thus, the identification and validation of novel QTL/genes for spikelet number is important. In this study, we used a RIL population (YC) derived from a cross between YuPi branching wheat (YP) and Chinese spring wheat (CS), to identify QTL associated with SS trait using a high-density genetic map and phenotypic data obtained from a multiple environmental trial, in different genetic backgrounds.

Materials and methods

Plant materials

The common parent YP is a bread wheat germplasm with the SS character [28, 29]. An RIL population generated by the single seed descent method from a cross YP × CS (YC population, 215 F6/F7 lines), was used for BSE-Seq and QTL map**. Three RIL populations derived from the crosses YP × CM107 (YCM population, 132 F6/F7 lines), YP × CM104 (YJ population, 60 F6/F7 lines) and YP × 11N21 (NY2 population, 73 F6/F7 lines), were used to validate the QTL in different genetic backgrounds.

Phenotypic evaluation and statistical analysis

YC RIL population were planted in three environments: Chongzhou (CZ, 103° 38′ E, 30° 32′ N) in 2021–2022 (2022CZ), Chongzhou and Wenjiang (WJ, 103° 41′ E, 30° 36′ N) in 2022–2023 (2023CZ and 2023WJ). A randomized complete block design was adopted in all environments. YCM, YJ and NY2 populations were planted at Chongzhou in 2022–2023. Fifteen seeds of each family were planted in 1.5 m rows spaced 30 cm apart. Field management was performed according to recommendations for wheat production. After maturity, five plants from the center rows were selected and used for phenotypic evaluation, including the number of SS per spike (NSS), SNS, TGW, GNS, grain weight per spike (GWS), grain length (GL), grain width (GW), flag leaf length (FLL) and flag leaf width (FLW). The SNS was determined by counting the total spikelets in the main spike, NSS was calculated by deducing the number of nodes per spike from SNS. GNS, GWS, GL and GW were measured by a Wanshan SC-G automatic seed test system (Hangzhou Wanshen Detection Technology Co., Ltd., CN). The TGW was calculated as 10-fold of the weight of 100 seeds measured with an electronic balance, in three replicates for each line [1].

Analysis of variance (ANOVA) for the NSS in each trial and Pearson correlations between variables were computed in SPSS v27 (IBM SPSS, Chicago, IL, USA). To minimize environmental effects, the best linear unbiased estimates (BLUE) for NSS in three replicates was calculated using SAS 8.1 (SAS Institute Inc., Cary, NC, USA). According to the method described by Smith, broad-sense heritability (h2) across environments was estimated, according to the equation h2 = VG/(VG + VGE/r + VE), where VG = genotypic variance, VGE = genotype × environment variance, r = number of replicates, and VE = environmental variance. Student’s t-tests (P < 0.05) were applied to compare lines classified by genotype [30, 31].

Bulked Segregant Analysis and Exome sequencing

Total genomic DNAs were isolated from young leaves of the tested cultivars and RILs using a CTAB method followed by RNase-A digestion [32]. Isolated DNA was quality checked by resolving on a 1% agarose gel electrophoresis and concentration was determined using a UV spectrometer [33, 34]. Two DNA pools were constructed by mixing equal amounts of DNA normalized to the concentration 100 ng/µL from 30 extreme SS (SS-pool) and 30 normal spike types (NS-pool), respectively. The DNA libraries were constructed through DNA fragmentation, end-repair, adaptor ligation, PCR and hybridization capture as previously described [35]. Curated sequence data was aligned to the CS reference genome sequence (RefSeq) v2.1 by BWA software [36]. The BCFtools software was used to detect and extract the single nucleotide polymorphism (SNP) and InDel [37]. The SNP and InDel were annotated using ANNOVAR, which mainly included different regions of the genome and different types of exon regions [38]. SNP-index methods were used to screen the SNP and InDel sites with candidate regions between the progeny mixed pools in this study [39,40,41,42]. To identify candidate regions associated with the SS trait, the ΔSNP-index of each locus was calculated by subtracting the SNP-index of the SS-pool from that of the NS-pool according to previous method [43]. To confirm the results of ΔSNP-index, an algorithm was further performed to identify the SNPs and InDels associated with the SS trait using the equation reported previously [31, 40], and the absolute value of Δ(SNP-index) was used for Locally Weighted Scatterplot Smoothing (LOESS) to obtain the correlation threshold [2]. The greater the ΔSNP-index, the more likely the SNPs and InDels contribute to the trait of SS or is linked to a locus that controls the trait.

Marker development, genetic map construction and QTL identification

To validate the BSE-seq results and further narrow down the region, primers were designed to the flank the sequences of the targeted SNPs using an online tool of the wheatomics platform (http://wheatomics.sdau.edu.cn/PrimerServer/). Polymorphic SNPs between YP and CS in the initial map** region were converted to KASP markers following previously described methods [44]. The PCR for InDel primers was conducted in a 20 µL reaction volume with 1.0 µM primer mix, 2.5 ng/µL DNA and 1× Taq Master Mix (Vazyme Biotech, CN). PCR was performed using routine procedures, and the polyacrylamide gel (8%) electrophoresis was used for resolving amplified products [34]. The polymorphisms of the markers were confirmed by parental liens and some progenies.

Linkage map** was conducted using JoinMap v4.0 [45]. The maximum likelihood map** algorithm and Kosambi’s function were used to determine the marker order and distance, respectively. QTL analysis of SS trait was performed by Interval Map** (IM) with the software MapQTL v5.0 [46]. For each trial, a test of 1,000 permutations was performed to identify the LOD threshold corresponding to a genome-wide false discovery rate of 1% (P < 0.01).

QTL validation

For QTL validation, the NSS from all homozygous lines in each of the three RIL populations (YCM, YJ and NY2) were counted. Based on marker profiles, individuals in each population were grouped into two classes as described above, and the difference in the average spikelet number between these classes was used for measuring the QTL effects within each validation population. The Student’s t-test was used to determine the significance in differences between the two groups in each population at P < 0.05.

Prediction of candidate genes for QTL

The physical interval of QTL was obtained by conducting a homolog search of the flanking makers against the Chinese Spring reference genome (IWGSC_Refseq v2.1), and the genes mapped to the region were identified. Further, information on the genes were obtained from the wheatomics platform (http://wheatomics.sdau.edu.cn/). The expression values as transcripts per million (TPM) in roots, stems, leaves, spikes, and grain were obtained from the GeneExpression of Wheatomics platform. Genes were annotated through BLAST against the corresponding protein sequences in rice and Arabidopsis thaliana on KOBAS v3.0 [47]. The total RNA was extracted by using the RNeasy plant mini kit (Qiagen, CN) from CS and YP at the correct stage, and the products after reverse transcription were used for quantitative RT-qPCR. The genomic DNA of parents was used to amplify candidate genes for sequence analysis.

Results

The NSS of YC RIL population in different environments

In the different environments, the NSS of CS remained stable at 0, while that of YP ranged from 56.4 to 65.8, with the NSS of YP being significantly (P < 0.01) higher than that of CS (Table 1; Fig. 1). The NSS trait in YC population was also observed in different environments, ranged from 0 to 63.6, and the broad-sense heritability was estimated as 0.96 (Table 1). The correlation coefficients between the different environments were all significant and ranged from 0.922 (P < 0.01) to 0.982 (P < 0.01) (Table 2). The BLUE values of SNS, GWS, TGW, GNS, GL, GW, FLL and FLW were shown in Table 1 and were used to assess the effect of SS QTL on these traits.

Table 1 Distribution of NSS in the YC population
Fig. 1
figure 1

Spike features of YuPi branching wheat (YP) and Chinese spring wheat (CS)

Table 2 Correlation coefficients for the NSS in the YC population

BSE-seq analysis

The results obtained from the exome capture sequencing of bulked segregants were compared with the CS reference genome v2.1. We use the ΔSNP-index algorithm to calculate the allele segregation of the SNPs and InDels between two extreme DNA pools. ΔSNP-index algorithm showed abundant candidate SNPs/InDels enriched on chromosome 2 A and 2D (Chr2A and Chr2D). ΔSNP-index greater than 95% confidence interval was selected as the threshold for screening [48]. A 16.3 Mb region of Chr2A (63.8–80.1 Mb) and a 9.1 Mb region of Chr2D (68.1–77.2 Mb) were identified as the candidate regions for SS trait (Figure S1).

Linkage map construction and QTL identification

To confirm the preliminarily identified genomic regions responsible for SS trait, SNPs and InDels in the target regions were converted into KASP and InDel markers, and were used for the construction of the genetic map. In total, 13 KASP markers and one InDel marker were used for the construction of the genetic maps. Six KASP markers developed for Chr2A generated a linkage map spanning 18.0 cM, while seven KASP markers and one InDel marker developed for Chr2D generated a linkage map spanning 25.2 cM (Fig. 2, Table S1). The phenotypic data of SS trait evaluated in the three environments were used for QTL map**. Two stable QTLs named QSS.sicau-2 A and QSS.sicau-2D were detected in three environments and BLUE. QSS.sicau-2 A located between A11 and A17, explained 13.7–15.9% of the phenotypic variance with the LOD values ranging from 6.9 to 8.1. QSS.sicau-2D located between INDEL and D3, explained 27.4–32.9% of the phenotypic variance with the LOD values ranging from 15.0 to 18.7 (Table 3). The favorable alleles of the two QTL were all contributed by YP.

Fig. 2
figure 2

Linkage map of chromosome 2 A and chromosome 2D. (a) Physical map of chr2A, with QSS.sicau-2 A mapped on a region of the 3.0 cM genetic map. (b) Physical map of Chr2D, with QSS.sicau-2D mapped on a region of the 1.8 cM genetic map

Table 3 QTL analysis for SS in different environments and the BLUE datasets

Effects of supernumerary spikelet QTL in different genetic backgrounds

Three KASP markers (A4, D3, 2DYC) and one InDel marker (INDEL) were used to assess the effects of QTL on NSS in validation populations (A4 is linked to QSS.sicau-2 A and exhibits polymorphism across the three validation populations. INDEL, 2DYC and D3 are linked to QSS.sicau-2D and exhibit polymorphism in YJ, G2 and NY2 populations, respectively). In YJ population, the average NSS in homozygous “AA, DD” genotype was 17.71, that was significantly (P < 0.01) increased than that in homozygous “aa, dd”, “AA, dd” and “aa, DD” genotypes. Similarly, in G2 and NY2 populations, the average NSS in homozygous “AA, DD” genotypes were 15.75 and 14.52, respectively, that significantly (P < 0.01) increased than that in other three homozygous genotypes (Fig. 3).

Fig. 3
figure 3

The effects of QSS.sicau-2 A and QSS.sicau-2D on NSS in the validation populations. (a) YP/CM104 RIL population. (b) YP/CM107 RIL population. (c) YP /11N21 RIL population. ** indicate significance level at P < 0.01 by the Student’s t-test. NSS, the number of SS per spike

Candidate gene prediction

According to the CS genome (IWGSC Refseq v2.1), QSS.sicau-2 A was located between 70.6 Mb and 78.2 Mb on Chr2A, and 91 high-confidence genes, including WFZP-A (TraesCS2A03G0239400) spanned the region. Sequence analysis of the WFZP-A (Table S2), revealed a 4-bp deletion in WFZP-A resulting a frame-shift in YP (Figure S2, Table S3), and the variation type is consistent with Zang734 [49]. These results suggest that WFZP-A is likely responsible for QSS.sicau-2 A.

QSS.sicau-2D was mapped between 72.7 Mb and 75.1 Mb on Chr2D, revealing 34 high-confidence genes in the region (Table S4). Eight spike/grain-specific genes were identified by gene expression of WheatOmics (http://202.194.139.32/expression/index.html), that might probably be involved in spike growth and development (Table S5, Figure S3). Exome sequencing data of YP reveal one T to A substitution which converts the codon (TAT) to a translational stop codon (TAA) in TraesCS2D03G0260700 (Fig. 4). Additionally, we examined the coding sequences of the known branching gene WZFP-D (TraesCS2D03G0248500, chr2D: 69,940,372.69,941,615). No sequence variation between YP and CS was identified for WZFP-D, and no significant difference (P > 0.05) in gene expression in YP and CS (Figure S4, Table S3).

Fig. 4
figure 4

Variation sequence and expression patterns of TraesCS2D03G0260700. (a) The nucleotide of YP and CS are shown in red and black, respectively. (b) The amino acid sequence encoded by the TraesCS2D03G0260700 in YP and CS. (c) TraesCS2D03G0260700 expression patterns in different tissues, the data (TPM value) were downloaded from the GeneExpression of WheatOmics (http://wheatomics.sdau.edu.cn/). Z10, first leaf through coleoptile. Z13, 3 leaves unfolded. Z23, main shoot and 3 tillers. Z30, pseudo stem erection. Z32, 2nd node detectable. Z39, flag leaf ligule/collar just visible. Z65, anthesis half-way. Z71, caryopsis water ripe. Z75, medium milk. Z85, soft dough

Effects of QSS.sicau-2 A and QSS.sicau-2D on yield-related traits

Two QTL, QSS.sicau-2 A and QSS.sicau-2D, were identified in YC population. Their effects on the yield-related traits were analyzed by linking markers. The YC population was divided into four genotypes of “aa, dd” “AA, dd”, “aa, DD” and “AA, DD” by molecular markers. The SNS, GNS and GWS of homozygous “AA, DD” genotypes were significantly higher than that of other three genotypes, the TGW, GL and GW of homozygous “AA, DD” genotypes were significantly lower than that of other three genotypes (Fig. 5a and f). Additionally, the FLL and FLW were not affected by QSS.sicau-2 A and QSS.sicau-2D (Fig. 5g and h).

Fig. 5
figure 5

Effects of QSS.sicau-2 A and QSS.sicau-2D on grain traits and flag leaf. Bar pattern and significance analysis of (a) SNS, (b) GNS, (c) GWS, (d) TGW, (e) GL, (f) GW, (g) FLL and (h) FLW. Each bar shows mean ± SD. Different letters indicate significance level at P < 0.05 by the Tukey’s test. SNS, spikelets number per spike. GNS, grain number per spike. GWS, grain weight per spike. TGW, thousand gain weight. GL, grain length. GW, grain width. FLL, flag leaf length. FLW, flag leaf width

Discussion

The complexity of the wheat genome combined with the fewer spikelet number mutants, hinder the identification of beneficial variants associated with spikelet number. To enhance the yield potential of wheat, breeders have tried to alter its sink capacity by modifying spike morphology [50]. The supernumerary spikelet (SS) character of bread wheat (Triticum aestivum L.) is an abnormal spike morphology with extra spikelets per rachis node. In some materials, the SS trait was unstable, and some environmental factors and photoperiod could affect the expression of bh gene for SS character [51, 52]. In this study, YP maintained SS trait in multiple environments, which provides an available germplasm resource for improving wheat spikelets.

Comparison of QTL identified for wheat spikelet number with previous studies

In recent years, many QTLs related to SNS have been widely reported and were mapped to all 21 chromosomes of wheat [14, 53,54,55,56]. QSns.Y4Y-2 A, a QTL related to SNS, was detected in multiple environments and mapped between 85.6 Mb and 91.1 Mb [57]. Li [58] found one local cultivar YM44 that has SS phenotype, which was linked to the marker Xwmc522 and cfd56 mapped on chromosome 2 A and 2D, respectively, and further confirmed that WFZP-A is the functional gene of chromosome 2 A that causes SS phenotype. In this study, we found that WFZP-A (Chr2A: 71,582,645.71,583,948) exists in the QSS.sicau-2 A (70.6–78.2 Mb) interval, and the similar variation existed in YP by the sequences analysis of WFZP-A in YP and CS (Figure S2). Therefore, we speculate that WFZP-A may be responsible for QSS.sicau-2 A.

In addition, several QTL controlling SNS trait have been reported on chromosome 2D in wheat. QSns.sau-2D was mapped on chromosome arm 2DS flanked by the markers AX-109,836,946 (32.8 Mb) and AX-111,956,072 (34.4 Mb) [59]. Gene Mrs1 on the short arm of chromosome 2D, that closely linked to Xgwm 484 (50.6 Mb), and further identified the candidate gene WFZP-D (chr2D: 69,940,372.69,941,615) by homologous function annotation [22, 60]. In this study, QSS.sicau-2D was mapped to the genomic region of Chr2D between 72.7- and 75.1 Mb, and no overlap between the QTL interval associated with SNS and indicating QSS.sicau-2D may be a novel QTL for SS trait.

Potential candidate genes for QSS.sicau-2D

To screen the candidate gene for QSS.sicau-2D, we amplified WFZP-D in the cDNA of both parents even though WFZP-D is not in the interval of QSS.sicau-2D. The results showed that WFZP-D had no sequence difference between YP and CS. Furthermore, there was no significant difference (P > 0.05) in the expression of WFZP-D in YP and CS by RT-qPCR analysis (Figure S4). One SNP was identified between CS and YP 60 bp downstream of the WFZP-D and a KASP marker named D21 was developed to assay the polymorphism. The D21 marker was mapped 9.4 cM away from QSS.sicau-2D in the reconstructed genetic map (Figure S5). These results indicated the WFZP-D is not the candidate gene for QSS.sicau-2D.

Eight spikelet/grain specific expressed genes were identified in the QSS.sicau-2D region, among which, one SNP converts the codon (TAT) to a translational stop codon (TAA) of TraesCS2D03G0260700 in YP by exome sequencing and Sanger sequencing analysis. In addition, TraesCS2D03G0260700 encoding cyclin-like F-box domain protein, and previous studies have shown that cyclin-like F-box domain is specifically expressed in wheat inflorescence and involved in the development of wheat inflorescence [61,62,63,64,65]. Overall, TraesCS2D03G0260700 was predicted to be a potential candidate gene for QSS.sicau-2D.

Effects of QSS.sicau-2 A and QSS.sicau-2D on yield-related traits

Consistent with previous studies, GL, GW and TGW have decreased while GNS has increased, with the increase of SNS (Fig. 5), such as a single mutation in WFZP-D can significantly increase the SNS and GNS [49, 58, 59]. GWS is an important trait influencing wheat yield [66]. In this study, we found that the GWS exhibited a significant increase in YC population when QSS.sicau-2 A and QSS.sicau-2D coexisted. This may be attributed to the compensatory increase in grain number per spike (GNS) offsetting the decrease in thousand-grain weight (TGW). In conclusion, the SS trait have the value of increasing the yield potential of wheat.

Conclusion

In this study, two major SS QTL were identified in multi-environments and validated in different genetic populations. QSS.sicau-2D is a new QTL of SS trait, and predicted TraesCS2D03G0260700 to be its candidate gene. At the same time, the combined effect of QSS.sicau-2 A and QSS.sicau-2D significantly increased GWS, which has the application value of increasing the yield potential of wheat.