Background

The Green Revolution (GR) in rice production is attributed to high-yielding semi-dwarf cultivars. The miracle rice, IR8, inherited the sd1 (semidwarf 1) gene from the cultivar Dee-geo-woo-gen (DGWG) (Hargrove et al. 1979). This gene gave IR8 its short stature, which made it resistant to lodging and led to high grain yield. Less well known is that another cultivar also inherited the sd1 gene directly from DGWG: Taichung Native 1 (TN1), which was popular in the 1960s (Chandler 1992). Recently, the genome of TN1 was sequenced, assembled and annotated, helping to answer questions about the yield difference between TN1 and IR8 and why both are photoperiod-insensitive (Panibe et al. 2021).

A fundamental characteristic of TN1 is its short height due to the sd1 gene from DGWG. The deletion in the sd1 gene causes a loss of function of the gibberellin (GA) 20-oxidase 2 (Os20ox2), which is involved in the synthesis of the growth hormone gibberellin (Spielmeyer et al. 2002). A reduction in GA results in a shorter plant height (Itoh et al. 2002). However, the sequence of the sd1 gene is not well studied. The current definition of the sd1 gene in the literature is based on a comparison of DGWG-type sd1 mutants (Habataki, Milyang 23, and IR24) with the sd1 of Nipponbare, Sasanishiki, and Calrose (Monna et al. 2002). That comparison revealed a 383-bp deletion from the second half of Nipponbare’s exon 1 to the first half of exon 2 or, in terms of the expressed sequence, a 278-bp deletion (Monna et al. 2002). Another definition of the sd1 deletion is a 280-bp deletion, based on a comparison of the semidwarf Doongara with the tall Kyeema, whose sd1 sequence is similar to that of Nipponbare (Spielmeyer et al. 2002). Those studies were done before the full Nipponbare genome became available in 2005 (International Rice Genome Sequencing Project and Sasaki 2005); the assembly was later improved in 2013 (Kawahara et al. 2013). With the genomes of TN1 (Panibe et al. 2021) and IR8 (Stein et al. 2018) now available, we aim to compare the sd1 genes of these cultivars and redefine the semidwarf gene based on TN1 and IR8, the two direct descendants of DGWG.
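The two deletion sizes reported by Monna et al. (2002) can be related with simple arithmetic. The following is a minimal sketch, assuming that every base missing from the genomic sequence but not from the expressed sequence lies in the intron between exon 1 and exon 2. The function name and the resulting intronic figure are illustrative inferences from the numbers quoted above, not values reported in this text.

```python
def split_deletion(genomic_deleted_bp: int, expressed_deleted_bp: int) -> dict:
    """Split a genomic deletion into its exonic and intronic parts.

    Assumption: bases deleted from the genome but absent from the
    expressed-sequence count are intronic.
    """
    if expressed_deleted_bp > genomic_deleted_bp:
        raise ValueError("expressed deletion cannot exceed genomic deletion")
    return {
        "exonic_bp": expressed_deleted_bp,
        "intronic_bp": genomic_deleted_bp - expressed_deleted_bp,
    }


# Deletion sizes quoted from Monna et al. (2002) in the paragraph above.
monna = split_deletion(genomic_deleted_bp=383, expressed_deleted_bp=278)
print(monna)  # {'exonic_bp': 278, 'intronic_bp': 105}
```

Under this assumption, 105 bp of the 383-bp genomic deletion would be intronic, which is consistent with a deletion that begins in exon 1 and ends in exon 2.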

If the greatest strength of TN1 is its high yield due to the semi-dwarf stature conferred by the sd1 gene, its weakness is its high susceptibility to blast disease. Rice blast causes severe annual losses in rice production worldwide (Wang et al. 2014). However, plants have a natural defense against this and other pathogens in the form of resistance genes, or R genes. Most R genes encode a nucleotide-binding site (NBS) domain and a leucine-rich repeat (LRR) domain (Takken and Joosten 2000). A combination of R genes in a plant may lead to a wide range of immune responses (Fukuoka et al. 2015). Unfortunately, TN1 is susceptible to major rice diseases such as blast, caused by the fungus Pyricularia oryzae (syn. Magnaporthe oryzae) (Sabbu et al. 2016), and bacterial blight, caused by the bacterium Xanthomonas oryzae pv. oryzae (Kumar et al. 2012). Predicting the R genes in the genome of TN1 will help to understand its resistance profile and why it is highly susceptible to blast. For factors that affect plant sensitivity to blast disease, see Chen et al. (2019), Liu et al. (2021), Nugroho et al. (2021) and Zhang et al. (2015).

There are in total 37,526 predicted genes in the TN1 genome (Panibe et al. 2021). Of these thousands of genes, some could be under positive selection (PS), conferring on the cultivar certain advantages that could be related to TN1’s phenotypic characteristics, such as drought tolerance (Garg and Singh 1971; Garg et al. 2002). Mining the entire genome for genes that make TN1 unique is no longer highly challenging, thanks to bioinformatics tools such as PosiGene (Sahm et al. 2017) that automate the search for positively selected genes. By using an input of coding sequences from the genomes of GR-related cultivars like IR8 (Stein et al. 2018), MH63 (Zhang et al. 2013) ...

... by using the protein alignments of sd1 and the information from their GFF annotation (Nagano et al. 2005; Panibe et al. 2021). The range indicated by the light blue arrow represents the sequences of sd1 in TN1 and IR8 that were validated by our Sanger sequencing. The 382-bp deletion in TN1 can be derived by computing the difference between 981 and 599, the latter of which represents the length of the TN1 sd1 gene before its second intron.
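The subtraction described above can be written out as a short worked example. This is a sketch only: it assumes that 981 is the corresponding length, before the second intron, of the sd1 sequence that TN1 is being compared against, which the surviving text does not state explicitly. The constant and function names are hypothetical.

```python
def deletion_length(reference_bp: int, mutant_bp: int) -> int:
    """Return how many bases the mutant lacks relative to the reference."""
    if mutant_bp > reference_bp:
        raise ValueError("mutant segment is longer than the reference segment")
    return reference_bp - mutant_bp


# Assumption: 981 bp is the comparison sequence's length before intron 2.
REFERENCE_LEN_BP = 981
# Length of TN1 sd1 before its second intron, as stated in the text.
TN1_LEN_BP = 599

tn1_deletion_bp = deletion_length(REFERENCE_LEN_BP, TN1_LEN_BP)
print(tn1_deletion_bp)  # 382

# Difference from the 383-bp genomic deletion of Monna et al. (2002).
print(383 - tn1_deletion_bp)  # 1
```

The result differs by 1 bp from the 383-bp deletion reported by Monna et al. (2002).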