Background

Auxin is an important phytohormone involved in a variety of processes, including apical dominance, vascular tissue differentiation, lateral root elongation, fruit development, flowering, and stress responses [1,2,3]. Auxins exert their effects via signal transduction pathways that involve many gene families, such as auxin response factor (ARF), auxin/indole-3-acetic acid (Aux/IAA), Gretchen Hagen 3 (GH3), and Small Auxin-Up RNA (SAUR) [4].

ARFs are transcription factors that exert their effect by binding to auxin response elements (AuxREs) located in the promoter regions upstream of auxin-responsive genes [5]. The basic structure of a typical ARF includes three conserved domains: a DNA-binding domain (DBD), an auxin responsive (aux_resp) region, and an Aux/IAA domain. These structures have been described in great detail [6, 7]. The conserved DBD is located near the N-terminus of the sequence and functions by recognizing AuxREs in promoter regions which allows the ARF to bind to the DNA sequence. The aux_resp region is a conserved region located in the middle of the sequence. This sequence sometimes has an amino acid composition bias that allows the ARF to function as a transcriptional activator or a repressor. Glutamine (Q)-rich middle regions are present in ARFs that are transcriptional activators, while serine (S)-rich, serine and glycine (SG)-rich, and serine and proline (SP)-rich middle regions are present in ARFs that are transcriptional repressors. The Aux/IAA domain is located at the ARF’s C-terminus, and it has a PB1 domain that is similar to that of Aux/IAA proteins which allows for dimerization between both proteins.
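The composition bias described above can be inspected with a simple residue count. The following is a minimal sketch rather than a procedure taken from this study: the helper name `residue_fractions` and the example sequence are hypothetical, and no classification threshold is implied.

```python
from collections import Counter


def residue_fractions(middle_region, residues="QSGP"):
    """Return the fraction of each residue of interest in a protein sequence.

    Hypothetical helper for illustration only. `middle_region` is the amino
    acid sequence of an ARF middle region in one-letter code.
    """
    seq = middle_region.upper()
    if not seq:
        # An empty sequence has no composition; report zeros.
        return {aa: 0.0 for aa in residues}
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in residues}


# Made-up sequence, not a real ARF middle region.
example = "QQQQSSGP"
fractions = residue_fractions(example)
```

A middle region in which the Q fraction dominates would be a candidate activator under the rule stated above, whereas S-, SG-, or SP-dominated regions would point towards a repressor; any cut-off would need to be chosen and justified separately.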

The Aux/IAA gene family in plants has been reviewed by Luo et al. [8]. The genes encode short-lived nuclear proteins, which inhibit ARFs by binding to them under low auxin concentrations. At higher auxin concentrations, Aux/IAA proteins bind to the transport inhibitor response 1/auxin signalling F-Box (TIR1/AFB) complex, causing the rapid ubiquitination and degradation of Aux/IAA and the subsequent release of ARFs, which can activate transcription. The GH3 gene family is responsible for maintaining auxin balance, but it does not seem to have a conserved domain [9]. The GH3 protein is responsible for forming conjugates between amino acids and the hormones auxin (IAA), jasmonic acid (JA), and salicylic acid (SA). These conjugates are not biologically active and are targeted for ubiquitin degradation. The SAUR gene family regulates plant development by acting as an effector of hormone signals, and its transcription can be rapidly induced within 2–5 min of auxin signalling [10].

Due to the importance of auxin signalling proteins in plant developmental responses, such proteins have been identified and functionally characterized in various plants. The ARF, Aux/IAA, GH3, and SAUR gene families have been characterized in the model plant Arabidopsis thaliana [11] and in several economically important crops such as castor bean [9], cucumber [12], cotton [10] and potato [13]. To date, the repertoire of auxin early response proteins in hexaploid sweet potato has not been fully characterized, despite their importance in the sweet potato tuberization process [14]. It is necessary to characterize these proteins to further understand their roles in sweet potato tuber initiation and development.

Sweet potato (Ipomoea batatas [L.] Lam.) is a hexaploid staple crop that is ranked sixth in importance worldwide among the food crops produced [15]. Consequently, decades of research have been conducted to investigate how this crop tuberizes, in order to improve yields. However, analysis of this crop at the molecular level is not as easy as with other economically important crops since its complex genome makes it difficult to obtain a complete reference genome [16]. In other tuberizing crops, such as Solanum tuberosum [17, 18], Ipomoea trifida [19], and Manihot esculenta [20], several auxin-responsive genes are up-regulated during tuberization.

This study seeks to characterize and investigate the expression of sweet potato IbARF, IbAux/IAA, IbGH3, and IbSAUR genes during tuberization. Phylogenetic, motif, and promoter analyses were performed. Expression data for these genes from public datasets were analysed and confirmed with qRT-PCR. Our results represent the first genome-wide characterization of the ARF, Aux/IAA, GH3, and SAUR genes in the hexaploid sweet potato. These results will facilitate better annotation of the sweet potato genome and provide insights on controlling the tuberization process, towards increasing crop production and food sustainability.

Results

Identification and characterization of IbARF, IbIAA, IbGH3, and IbSAUR genes

After HMMER searching, manual inspection of the domain organization via the NCBI CDD, removal of redundant sequences, and clustering of highly similar sequences, 29 IbARF sequences (IbARF1 – IbARF28), 39 IbAux/IAA sequences (IbIAA1 – IbIAA33), 13 IbGH3 sequences (IbGH3.1 – IbGH3.13), and 200 IbSAUR sequences (IbSAUR1 – IbSAUR200) were obtained. Their predicted biochemical characteristics are summarized in Table 1 and Table S2. The predicted novel isoforms of the genes are listed in Table S6 (Additional File 16).

Table 1 Summary of ARF, Aux/IAA and GH3 gene families in sweet potato

The IbARF, IbIAA, IbGH3, and IbSAUR genes were distributed unevenly across the chromosomes, with the majority of the IbSAURs located on Chromosome 14. The genes with similar intron–exon arrangements clustered together on a phylogenetic tree (Figs. 1, S1, S2, S3). The proteins encoded by these genes showed a wide range of molecular weights (MWs) and predicted isoelectric points (pIs). Analysis of the domain organization in the NCBI CDD indicated that 19 of the ARF proteins had the canonical structure consisting of the B3 DNA-binding domain, the conserved middle region, and the conserved C-terminus domain. CDD analysis indicated that not all the IbIAA proteins were complete matches to the canonical Aux_IAA domain, with 16 sequences being truncated.

Fig. 1

Map showing the intron–exon structure of the IbARF coding sequences (figure created on the GSDS server). The left panel illustrates a neighbour-joining (NJ) phylogenetic tree based on the aligned sequences with 1000 bootstrap replicates. Sequences with similar intron–exon structure cluster together in the NJ tree

Motif analysis of IbARF, IbIAA, IbGH3, and IbSAUR sequences

Investigation of the domain architecture of the protein sequences revealed several highly conserved motifs, many of which were functional domains present in the InterPro database.

ARF Of the 20 motifs searched, Motifs 1 and 5 represented the B3 DNA binding domain, as shown in Fig. S4. One or both of these motifs were found in all IbARF sequences. Motif 2 represented the aux_resp middle domain and this motif was observed in all the sequences. Motif 3 represented the PB1 domain (domains III and IV of Aux/IAA proteins). This motif was present in 22 proteins. Motifs 4, 11, 12, and 15 matched the ARF protein Interpro domain. Motif 15, which is rich in Q residues, was observed in the middle region of 7 proteins (ARF5, ARF6, ARF7, ARF8, ARF12, ARF21, and ARF22).

Aux/IAA Four motifs were observed, each corresponding to one of the four domains (I, II, III, and IV) found in canonical Aux/IAA proteins (Fig. 2, Table 1). Motif 4 was found in 25 sequences and corresponds to Domain I, which contains the “LxLxL” ethylene response factor (ERF)-associated amphiphilic repression (EAR) motif [8]. Motif 3, which corresponds to Domain II, was present in 33 sequences; this motif contains the “GWPPV” degron sequence, which controls the turnover of these proteins [8]. Motifs 2 and 1 corresponded to Domains III and IV, respectively, and both domains represent the Phox and Bem1p (PB1) domains (IPR000270), which allow Aux/IAA proteins to form homodimers with themselves or heterodimers with ARF proteins [8]. Motifs 2 and 1 were found in 36 and 35 proteins, respectively. Twenty-three of the 39 IbIAA sequences had motifs that corresponded to all four domains (Motifs 1, 2, 3, 4), as seen in canonical Aux/IAA proteins.
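The two short sequence signatures named above, the “LxLxL” EAR motif and the “GWPPV” degron, can be located with a plain pattern search. The following is a minimal sketch, not the MEME-based workflow used in this study; the function name `find_domain_signatures` and the example sequence are hypothetical.

```python
import re

# "LxLxL": leucine at positions 1, 3 and 5, any residue in between.
EAR_PATTERN = re.compile(r"L.L.L")
# Degron core sequence of Domain II.
DEGRON = "GWPPV"


def find_domain_signatures(protein_seq):
    """Return the 0-based start of the first EAR motif and degron, or None."""
    seq = protein_seq.upper()
    ear = EAR_PATTERN.search(seq)
    degron_pos = seq.find(DEGRON)
    return {
        "EAR": ear.start() if ear else None,
        "degron": degron_pos if degron_pos >= 0 else None,
    }


# Made-up sequence for illustration, not a real IbIAA protein.
hits = find_domain_signatures("MKTLRLGLPGAAQVVGWPPVRSYRK")
```

A sequence that returns `None` for either key would be a candidate for the truncated or non-canonical class, although a motif-discovery tool tolerates substitutions that an exact search does not.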

Fig. 2

Conserved motifs in IbIAA sequences identified by the MEME software (Figure created using TBTools). There are no duplicated motifs within a sequence, and the order of motifs in the sequences is the same

GH3 Motif analysis of the 13 GH3 protein sequences yielded 20 different motifs (Fig. S5). Motif 1 was found in all 13 sequences, and this motif corresponded to the GH3 family (IPR004993) in the Interpro database. All 13 sequences had a combination of Motifs 2–12 which also belonged to the GH3 family.

SAUR A maximum of ten motifs were observed upon examination of the SAUR protein sequences with MEME (Fig. S6). Motif I was found in 186 sequences and represented a conserved SAUR domain in the InterPro database. Motif II was found in 169 sequences, while Motif III (which corresponded to the SAUR domain, pfam02519, in the CDD) was found in 63 sequences. The remaining fourteen sequences lacked Motif I.

Phylogenetic analysis of IbARF, IbIAA, IbGH3, and IbSAUR sequences

The neighbour-joining phylogenetic tree (Fig. 3) displays the grouping of the ARF proteins into three distinct classes, A, B, and C. Class A contains 7 IbARFs with Q-rich middle regions. Class B and Class C contain 14 and 8 IbARFs, respectively. Classes B and C are rich in serine, proline, and threonine [21].

Fig. 3

Neighbour-joining phylogenetic tree showing the phylogenetic relationships between ARF sequences. The tree was constructed using 1000 bootstrap replicates in MEGA 11. The I. batatas sequences are marked with red circles. The numbers on each branch represent the percentage of replicate trees that clustered together in the bootstrap test. The branches are coloured into three classes according to the classification of Finet et al. [22], as used by Song et al. [13]

The IbIAA phylogenetic tree (Fig. S7) shows that the sequences are grouped into 6 clades which we labelled Clades A-E (as previously described by Wu et al. [23]) and Clade F, which contains non-canonical IAA sequences that were excluded from the study mentioned above. All the IbIAA proteins were distributed among all 6 clades.

The GH3 phylogenetic tree is illustrated in Fig. S7. The sequences are clustered into 3 groups as previously reported [24], with the IbGH3 sequences present in only two groups. There were no IbGH3 sequences in Group 3, which consisted of only AtGH3 proteins.

Fig. S7 illustrates the phylogenetic tree created from 79 A. thaliana sequences, 199 IbSAUR sequences (IbSAUR151 excluded), and 58 Oryza sativa protein sequences. The sequences were grouped into clades that were described by Zhang et al. [25]. Clade I and Clade V had the largest number of IbSAUR members, possibly arising from gene duplication events. All ten clades had members of the IbSAUR family. None of the DEGs belonged to Clade I, but IbSAUR33 and IbSAUR62 were part of Clade V.

In silico analysis of cis-acting regulatory elements (CREs)

PlantCARE analysis of the 2,000 bp region upstream of the start codon for each of the genes identified revealed a variety of core promoter elements (Table S4). All the ARF, IAA, and GH3 genes had light-responsive elements in their promoter sequences, with most genes having multiple types of light-responsive elements. Light responsive elements were found in 196 of the SAUR genes.
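Extracting the region upstream of the start codon has to account for gene orientation. The following is a minimal sketch of that step, assuming 1-based inclusive coding-sequence coordinates on the forward strand; the function names and the toy sequence are hypothetical and this is not the extraction script used in this study.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(chrom_seq, cds_start, cds_end, strand, length=2000):
    """Return up to `length` bp upstream of the start codon, read 5'->3'
    on the gene's own strand.

    cds_start and cds_end are 1-based inclusive forward-strand coordinates.
    The region is shorter than `length` if the chromosome end is reached.
    """
    if strand == "+":
        # Start codon begins at cds_start; upstream lies to the left.
        begin = max(0, cds_start - 1 - length)
        return chrom_seq[begin:cds_start - 1]
    if strand == "-":
        # Start codon ends at cds_end; upstream lies to the right.
        region = chrom_seq[cds_end:cds_end + length]
        return reverse_complement(region)
    raise ValueError("strand must be '+' or '-'")


# Toy 12 bp "chromosome" for illustration.
toy = "ACGTTGCAATGC"
plus_promoter = upstream_region(toy, 7, 12, "+", length=4)
minus_promoter = upstream_region(toy, 1, 6, "-", length=4)
```

The resulting sequences could then be submitted to a cis-element database; genes close to a scaffold end yield fewer than 2,000 bp, which is worth recording alongside the element counts.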

Most of the promoter sequences contained at least one CRE involved in responsiveness to various hormones such as auxin, gibberellin (GA), SA, ethylene, abscisic acid (ABA), and JA. Four types of auxin-responsive elements were found in the ARF, Aux/IAA, and GH3 promoter sequences: AuxRR-core, TGA-box, AuxRE, and TGA-element. Thirty-nine SAUR genes had auxin-responsive elements. These results are consistent with those of Feng et al. [9], who found that not all ARF, Aux/IAA, and GH3 promoter sequences from castor bean (Ricinus communis) had auxin-responsive elements.

Some of the promoter sequences had elements associated with responding to abiotic and biotic stresses such as low temperature (LTR), abiotic stress (as-1), wounding (WUN-motif, WRE3), drought (MBS), and low oxygen (GC-motif). Some sequences had cis-elements that are involved in plant development processes such as meristem expression (CAT-box), circadian control (circadian), seed-specific regulation (RY-element), endosperm expression (GCN4-motif), and negative regulation of phloem expression (AC-I, AC-II). Some promoter sequences had binding site-related elements such as AT-rich element, Myb-binding site, Box III, and Unnamed_1 (Table S4). A more detailed breakdown of the numbers of the different types of CREs found in the DEGs observed during tuber initiation is presented in Table S5.

In silico gene expression analysis of auxin signalling genes

The results of in silico gene expression analyses of each auxin signalling gene are presented (Figs. 4, 5 and 6 and Figs. S8, S9 and S10).

Fig. 4

Heatmap showing the gene expression of IbARF genes obtained from: a RNA-seq data [26] obtained from various tissues for both Xuzi3 and Yan252 sweet potato cultivars. The colour scale bar represents the log2(FPKM + 1) values. b RNA-seq data [16] obtained from FRs and SRs at various stages of development. The colour scale bar shows that blue indicates down-regulated expression and red represents up-regulated expression. Colours represent log2FC. The raw log2FC data is indicated for statistically significant (adj. p-val. < 0.05) differential expression only

ARF All 29 ARF genes were expressed across all the plant tissues studied (Fig. 4a). However, some genes (IbARF13, IbARF19, IbARF20, and IbARF27) had FPKM (fragments per kilobase per million) values that were less than one and were only expressed in non-root plant parts. The gene expression in both cultivars was generally similar, although some genes, such as IbARF22 (which has higher expression in Yan252 root tissues), have cultivar-specific expression. IbARF3 and IbARF12 had very high expression in the shoots and young leaves. IbARF4 had very high expression in the stems of both cultivars. IbARF17 had high expression in roots and green stems.

Since FPKM should not be used to make statistical comparisons across samples, a separate dataset was used to calculate fold changes based on DESeq2 normalization (Fig. 4b). No DEGs were obtained for the 20 DAT vs. 10 DAT comparison, so this comparison was not investigated further. The expression of the ARF8, ARF10, ARF12, and ARF26 genes was significantly higher in storage roots (SRs) than in fibrous roots (FRs) from 40 DAT onwards. ARF18 was significantly differentially expressed at the 30 DAT stage only. ARF4 was significantly up-regulated in SRs at the 50 DAT stage. Many of the ARF sequences were not significantly differentially expressed at any of the stages that were investigated. IbARF5 was significantly down-regulated in SRs compared to FRs. Most of the IbARFs had no significant change in expression or showed down-regulation in response to ABA, MeJA, SA, drought, salt, or cold (Fig. S9). The only exceptions were IbARF18 and IbARF23, which were up-regulated in leaves in response to cold (Fig. S9).
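The two quantities shown in the heatmaps, log2(FPKM + 1) for tissue expression and log2FC with an adjusted p-value cut-off of 0.05 for differential expression, can be sketched as follows. The function names are hypothetical, and the code only illustrates the transformation and the significance call; it does not reproduce the DESeq2 analysis itself.

```python
import math


def log2_fpkm(fpkm):
    """Heatmap scale used for tissue expression: log2(FPKM + 1)."""
    return math.log2(fpkm + 1.0)


def call_deg(log2fc, padj, alpha=0.05):
    """Label a gene 'up', 'down' or 'ns' (not significant).

    padj may be None or NaN when no adjusted p-value is available for the
    gene; such genes are treated as not significant.
    """
    if padj is None or math.isnan(padj) or padj >= alpha:
        return "ns"
    if log2fc > 0:
        return "up"
    if log2fc < 0:
        return "down"
    return "ns"


# Made-up values for illustration only.
labels = [call_deg(1.5, 0.01), call_deg(-2.0, 0.04), call_deg(3.0, 0.2)]
```

Adding 1 before taking the logarithm keeps genes with zero FPKM on the scale (they map to 0), which is why unexpressed genes appear at the bottom of the colour bar rather than being dropped.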

IAA The expression of the IAA genes is summarized in Fig. 5. IbIAA17 was strongly expressed in shoots, FRs, and initiating tuberous roots (ITRs). As shown in Fig. 5a, some genes had the highest FPKM values in shoots, leaves, and stems, while other genes, such as IbIAA27, had high expression in all the tissues. IbIAA16 had high FPKM values in expanding tuberous roots (ETRs), ITRs, shoots, and young leaves. Several genes (IbIAA5a, IbIAA6) had negligible expression in roots but were expressed in the other plant parts. Figure 5b illustrates the fold changes observed between SR and FR. Some genes were up-regulated at the 30 DAT time point only (IbIAA-2, -5a, -15, -24), while IbIAA16 and IbIAA17 were up-regulated at 40 DAT and 50 DAT. IbIAA-1, -26, and -31 were down-regulated at 40 DAT, while the other genes were not differentially expressed. Most of the IbIAAs had no significant change in expression or showed down-regulation in response to ABA, MeJA, SA, drought, salt, or cold (Fig. S9). IbIAA-1, -2, -4, -11, -29, -30, and -32 were up-regulated in one or more plant parts in response to cold treatment. IbIAA-12 and -26 were up-regulated in response to ABA, MeJA, drought, and salt treatments, while IbIAA18 was up-regulated in response to drought in FRs only.

Fig. 5

Heatmap showing the gene expression of IbIAA genes obtained from: a RNA-seq data [26] obtained from various tissues for both Xuzi3 and Yan252 sweet potato cultivars. The colour scale bar represents the log2(FPKM + 1) values. b RNA-seq data [16] obtained from FRs and SRs at various stages of development. The colour scale bar shows that blue indicates down-regulated expression and red represents up-regulated expression. Colours represent log2FC. The raw log2FC data is indicated for statistically significant (adj. p-val. < 0.05) differential expression only

GH3 The expression patterns of GH3 genes are summarized in Fig. 6. All the GH3 genes were expressed across the tissues that were investigated. Many of the GH3 genes had their highest FPKM values in shoots and young leaves, and the gene expression was similar across cultivars. Two genes (IbGH3.1, IbGH3.10) had high FPKM values in shoots, young leaves, FR and ITR. IbGH3.11 had higher FPKM in stem and root tissue than in shoots and leaves (Fig. 6a). The results of differential expression of GH3 genes between SR and FR are illustrated in Fig. 6b. Seven of the 13 GH3 genes were down-regulated at one or more time points, with GH3.2, GH3.3, and GH3.7 being significantly down-regulated at all three time points. These three genes belong to Group II of the phylogenetic tree (Fig. S7c). GH3.5 and GH3.8 were up-regulated at the 30 DAT stage only. GH3.12 and GH3.13 were down-regulated at 40 DAT only, and these were the only DEGs from the GH3 gene family that belonged to Group I of the phylogenetic tree. Of the 13 IbGH3 genes, only five showed up-regulated gene expression in response to one or more of the hormone or stress treatments in Fig. S9. IbGH3.1 was up-regulated in response to cold, while IbGH3.5 and IbGH3.11 were up-regulated in response to SA and MeJA, respectively (Fig. S9). IbGH3.2 and IbGH3.3 showed up-regulation under MeJA, drought, and cold treatments.

Fig. 6

Heatmap showing the gene expression of IbGH3 genes obtained from: a RNA-seq data [26] obtained from various tissues for both Xuzi3 and Yan252 sweet potato cultivars. The colour scale bar represents the log2(FPKM + 1) values. b RNA-seq data [16] obtained from FRs and SRs at various stages of development. The colour scale bar shows that blue indicates down-regulated expression and red represents up-regulated expression. Colours represent log2FC. The raw log2FC data is indicated for statistically significant (adj. p-val. < 0.05) differential expression only

SAUR The expression of the SAUR genes is summarized in Fig. S8. Of the 200 SAUR genes identified, 8 (IbSAUR-92, -107, -115, -132, -133, -155, -163, -190) were not expressed in any of the tissue types for either cultivar. As shown in Fig. S8a, the tissue-specific expression was similar for both cultivars, with many genes having their highest FPKM values in the shoots and mature leaves and lower expression in roots. Some genes (such as IbSAUR-10, -52, -60) had their highest expression in shoots and young leaves. More than 50% of the IbSAUR genes were not expressed in any of the root tissue types. There were some variations in gene expression between cultivars, such as for IbSAUR35, which was more highly expressed in Yan252 roots than in Xuzi3 roots. Fig. S8b shows the fold changes between SR and FR and several SAURs that were only expressed at certain time points. Some genes (IbSAUR-2, -3, -12, -13, -47, -48, -49, -61, -64) were significantly up-regulated (log2FC > 2) in SR vs. FR across one or more time points, while others (IbSAUR-9, -10, -34, -62) were down-regulated. IbSAUR32 was highly expressed across all tissues in both cultivars, but it was not differentially expressed between SR and FR. Most of the IbSAUR genes did not show differential expression or were down-regulated in response to ABA, MeJA, SA, drought, salt, or cold treatments (Fig. S10). IbSAUR8 was up-regulated in response to SA and drought, while IbSAUR9 was up-regulated in response to drought and salt. The remaining up-regulated IbSAURs were up-regulated in response to ABA (IbSAUR48 and IbSAUR61), MeJA (IbSAUR-29, -31, -63, and -168), SA (IbSAUR1 and IbSAUR65), and cold (IbSAUR-37, -71, -98, and -118).

qRT-PCR confirmation of expression analysis

In order to validate the expression observed in the in silico gene expression analysis, qRT-PCR experiments were conducted using tissue from 5 plant parts (GS—green stem, PR—pencil root, SR—storage root, FR—fibrous root, L—leaf). PCR efficiencies ranged from 1.85 to 2.02 (except for IbARF7 which had an efficiency of 1.8).

For most of the genes, the highest gene expression was observed in the stem and/or leaf, which was also observed in the in silico analysis (Fig. 7a). The expression of IbARF4 and IbARF7 was similar in SRs and FRs, while for the remaining genes (except IbGH3.2), expression was twofold higher (or more) in the SR relative to the FR (Fig. 7b). IbGH3.2 was twofold down-regulated in the SR vs. FR. Therefore, there is concordance between the expression observed in the public datasets and the results from this study. An interesting observation was that the PR and FR expression values were similar for all the genes that were investigated except IbARF7, for which the PR expression was much lower than that of the SR and FR.

Fig. 7

qRT-PCR confirmation of in silico gene expression. a qRT-PCR results showing expression of genes taken from RNA sampled from 3 pooled biological replicates of different sweet potato plant parts. The bars represent the mean ± standard error (SE) (n = 3). GS: green stem; PR: pencil root; SR: storage root; FR: fibrous root; L: leaf. Bars not sharing a common letter showed significant differences in gene expression using the Kruskal–Wallis H test (p < 0.05) b Comparison of log2FC data for SR vs. FR from in silico and qRT-PCR gene expression data. Asterisks indicate statistically significant (p-value < 0.05) log2FC (found using DESeq2 for in silico data and Student’s t-test for qRT-PCR data). A single asterisk indicates statistically significant log2FC from the RNA-seq data only, while 2 asterisks indicate statistically significant log2FC for both the RNA-seq and qRT-PCR data. The error bars for the qRT-PCR data represent the confidence intervals derived from the mean ± SE of the ΔΔCT values (n = 3)

Predicted PPI network of proteins encoded by DEGs

A protein–protein interaction network for the proteins encoded by all DEGs was constructed based on the known interactions of Arabidopsis homologs (Fig. 8). This network was constructed to examine whether the different DEGs belonging to the same gene family had unique interaction partners, which could then be used to elucidate their roles in tuberization. The majority of the proteins encoded by the DEGs interacted with one another; for example, the interactions with high confidence (score of 0.7 or more) occurred between ARF and IAA proteins (Fig. 8). Figure 8a shows that ARF5 (MONOPTEROS, MP), IbARF10 and IbARF12 were predicted to have high-confidence interactions with multiple IAA proteins: IbARF10 and IbARF12 (or IbARF8) were predicted to interact with IbIAA-5a, -15a, -16, while IbARF12 was also predicted to interact with IbIAA-2, -17, -24. IbARF5 (or MP) was not only predicted to interact with all the aforementioned up-regulated IbIAA proteins but was also predicted to interact with the down-regulated IbIAA-1, -26, -31 (Fig. 8b). The IbGH3 and IbSAUR proteins were predicted to interact with other auxin signalling proteins; for example, IbGH3.2 was predicted to interact with IbARF5, IbIAA31 and IbSAUR9, while IbGH3.8 was predicted to interact with IbARF12, IbIAA16, and IbSAUR196. Except for MP, there was no overlap between the first shell interactors for the proteins of the up-regulated and down-regulated DEGs; for example, TIR1 was observed in the up-regulated network only, whereas several Aux/IAA proteins (IAA1, IAA6, AXR3) were observed in the down-regulated network.

Fig. 8

STRINGdb protein–protein interaction diagram illustrating predicted interactions between the sweet potato auxin signalling proteins based on their homology to A. thaliana proteins and their interactions. The nodes represent the auxin signalling proteins and the edges represent predicted interactions. No more than five interactors were shown in the first shell and zero interactors were shown for the second shell. Coloured nodes are those enriched with a Gene Ontology (GO) term with an FDR < 0.01 (red: GO:0009734, Auxin-activated signaling pathway; blue: GO:0009723, Response to ethylene; green: GO:0010102, Lateral root morphogenesis; purple: GO:0016881, Acid-amino acid ligase activity). White nodes had no significant GO enrichment. The DEGs were prefixed with the species abbreviation, Ib. a Up-regulated DEGs. b Down-regulated DEGs. (inset) The thicknesses of the edges are proportional to the confidence of the prediction (only medium confidence score of 0.400 or higher shown)

Discussion

Characterization of sweet potato auxin signalling genes

In this study, we characterized the repertoire of the auxin-regulated genes (ARF, Aux/IAA, GH3, and SAUR) in the hexaploid I. batatas genome. The 29 characterized IbARF genes are similar in number to those described in other crops [13, 19]. Our identification of 39 IbAux/IAA genes is also in agreement with the numbers found in Populus trichocarpa, tomato, and potato [18, 27, 28]. Likewise, the number of IbGH3 genes is similar to that in other species, such as Arabidopsis [29], Zea mays [30], O. sativa ssp. japonica [31], and Solanum lycopersicum [32]. This trend was not seen for the SAUR gene family, which had more members in sweet potato than in the other species in the phylogenetic tree (Fig. S7d).

Biochemical characterization of the auxin signalling protein sequences gave insight into their domain organization. Of interest is that IbARF12 and IbARF26 do not have the canonical ARF domain structure, which indicates that non-canonical auxin signalling pathways may be involved during SR initiation. Additionally, if the ARF middle region is rich in glutamine (Q), serine (S), and leucine (L), it may function as an activator, but if it is serine (S), proline (P), leucine (L), and glycine (G)-rich, then it may function as a repressor [13]. The up-regulated expression of ARF8 and ARF12, together with their Q-rich domains, is consistent with them being transcriptional activators. Some of the Aux/IAA sequences are truncated, such as IbIAA31, so further work is required to understand how they can modulate auxin responses. The high conservation of GH3 amino acid residues indicates that orthologous GH3 genes between A. thaliana and I. batatas have similar specificity for the amino acids they conjugate. Additionally, most of the SAUR proteins had the highly conserved SAUR domain (represented by Motif I), which is likely involved in the mechanism of action of these genes [33]. Sun et al. [10] reported a highly similar Motif I that was conserved across 7 plant species, so Motif I is likely to be crucial for SAUR activity. The IbSAUR genes encoding proteins that lack Motif I are possibly pseudogenes.

Phylogenetic analysis also showed the evolutionary conservation of I. batatas auxin signalling sequences as they clustered together into distinct clades. The IbARF sequences clustered into 3 groups, which was consistent with the findings of Song et al. [13] and Zhang et al. [79]. No-template controls were also included, to ensure that samples did not have exogenous nucleic acid contamination.

The cDNA was obtained from three pooled biological replicates of tissue. The primers were used in qRT-PCR reactions each containing: 25 μL Power SYBR® Green Master Mix (Invitrogen), 200 nM forward primer, 200 nM reverse primer, 1 μL of cDNA and 23 μL of sterile nuclease-free water (Sigma-Aldrich) with a final reaction volume of 50 μL. Each reaction had three technical replicates and was run on a qTower3 thermal cycler (Analytik Jena, Jena, Germany) with the following cycling parameters: 10 min initial denaturation followed by 40 cycles of 15 s at 95 ℃ and 60 s at 55 ℃. Melting curve analysis was conducted afterwards (for 6 s in the range of 60 ℃ to 95 ℃) and the results were analysed using the qPCRSoft program (Analytik Jena, Jena, Germany). PCR efficiencies were determined from the raw amplification curve data using Real Time PCR Miner [80]. With this method, a calculated PCR efficiency of 2 corresponds to 100%. The Pfaffl method was used to normalize the expression levels to the housekeeping gene (COX) expression levels [81]. Statistical analyses (Kruskal–Wallis H test or Student’s t-test (p < 0.05)) were conducted with IBM SPSS Statistics version 28.
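The efficiency-corrected normalization described above can be sketched as follows. This is a minimal illustration of the Pfaffl ratio, assuming efficiencies are expressed as amplification factors (so 2 corresponds to 100%) and assuming one tissue serves as the calibrator; the function names and Ct values are hypothetical and are not data from this study.

```python
import math


def efficiency_percent(amplification_factor):
    """Convert an amplification factor (2.0 = perfect doubling) to percent."""
    return (amplification_factor - 1.0) * 100.0


def pfaffl_ratio(e_target, ct_target_control, ct_target_sample,
                 e_ref, ct_ref_control, ct_ref_sample):
    """Relative expression of a target gene in a sample versus a control,
    normalized to a reference gene and corrected for PCR efficiency.

    ratio = E_target ** (Ct_control - Ct_sample) for the target gene
            divided by the same term for the reference gene.
    """
    target_term = e_target ** (ct_target_control - ct_target_sample)
    reference_term = e_ref ** (ct_ref_control - ct_ref_sample)
    return target_term / reference_term


# Made-up Ct values: the target amplifies two cycles earlier in the sample,
# while the reference gene is unchanged.
ratio = pfaffl_ratio(2.0, 25.0, 23.0, 2.0, 20.0, 20.0)
log2fc = math.log2(ratio)
```

When both efficiencies equal 2, the ratio reduces to the familiar 2^(−ΔΔCt); using the measured efficiencies instead avoids overstating fold changes for primer pairs that amplify below 100%.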

Construction of Predicted Protein–Protein Interaction (PPI) network

To further understand the roles of the DEGs in tuber initiation, a PPI network was constructed. Interaction networks of DEGs from the same gene family were examined, to distinguish their roles during tuberization. The protein sequences for the DEGs were used as queries in the STRING database to obtain the PPI networks, based on their A. thaliana homologs [82].
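The confidence cut-offs reported for the network (0.400 for medium confidence and 0.7 for high confidence) can be applied to a list of scored interactions as in the sketch below. The function name, the tuple layout, and the example edges are hypothetical; this does not query the STRING database and is not the procedure used to build Fig. 8.

```python
MEDIUM_CONFIDENCE = 0.4   # minimum score displayed in the network
HIGH_CONFIDENCE = 0.7     # score treated as a high-confidence interaction


def classify_edges(edges, minimum=MEDIUM_CONFIDENCE, high=HIGH_CONFIDENCE):
    """Drop edges below `minimum` and label the rest 'medium' or 'high'.

    `edges` is an iterable of (protein_a, protein_b, combined_score) tuples
    with scores on a 0-1 scale.
    """
    kept = []
    for protein_a, protein_b, score in edges:
        if score < minimum:
            continue
        label = "high" if score >= high else "medium"
        kept.append((protein_a, protein_b, score, label))
    return kept


# Placeholder protein names and made-up scores for illustration only.
example_edges = [
    ("proteinA", "proteinB", 0.91),
    ("proteinA", "proteinC", 0.45),
    ("proteinB", "proteinD", 0.20),
]
filtered = classify_edges(example_edges)
```

Keeping the score alongside the label makes it straightforward to scale edge thickness by confidence, as done in the inset of the network figure.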