1 Soybean Production and Its Economic Value

Soybean is an important crop that provides substantial oil and protein nutrition for the growing human population. Soybean cultivation dates back c. 6000–9000 years in East Asia [1]. Large-scale production reached its highest levels over the last century with the help of improved breeding techniques. Since 1961, soybean production has increased from 20–30 million tons to 350 million tons per year [2].

Soybean is very rich in oil and protein compared to other members of the legume family. Therefore, it meets a considerable demand for animal feed and oil production. Over three-fourths of soybean by weight is used for feeding livestock, poultry, and aquaculture production, so some countries increasingly export soybean products while others import them to meet the demand for soybean-based animal feed. The rest is used for industrial oil, biofuel, food ingredients (lecithin, emulsifier, and proteins), and food (soy sauce, tempeh, soy milk, and tofu). Soybean is promoted as a rich protein source for plant-based diets, as protein makes up about 40% of its dry matter and includes the nine essential amino acids. It is therefore very important for vegetarian and vegan diets, to which its protein content contributes high nutritional value [3]. Soybean seeds are the most important part of the plant, so throughout the domestication process, traits improving soybean seed quality and yield have been artificially selected for efficient utilization in the food industry and agriculture.

Soybean domestication has led to a significant reduction in genetic diversity due to selective sweeps, resulting in the fixation of beneficial traits. Studies have shown that nucleotide fixation during soybean domestication and improvement has resulted in a reduction of genetic diversity compared to wild soybean populations. Furthermore, the fixation of key genes involved in the regulation of traits such as seed size, pod dehiscence, and photoperiodic flowering has played a crucial role in shaping the morphology and adaptation of soybean to different environments. These genetic changes have contributed to increased yield and better adaptation to a range of environmental conditions, making soybean a globally important crop. However, the reduced genetic diversity resulting from selective sweeps also raises concerns regarding the resilience and adaptability of soybean crops in the face of new and changing environmental challenges.

2 Genetically Modified Soybean

Genetic modification of organisms traces back to their domestication. However, the term is widely misinterpreted, and many people think that genetic modification began with the developments in biotechnology in the late twentieth century. Breeding practices have long been used by humans and are striking evidence of genetic modification. Following the discovery of recombination techniques in bacteria, genetic modification techniques gradually improved over the years and were used first to produce medicines and then crops.

Recombinant DNA technologies are the foundation of genetic modification in living organisms. Briefly, a vector containing a target gene cassette is transferred by a virus or a bacterium into living cells of an organism to insert a specific genetic sequence into that organism’s genome. The first genetically modified soybean was produced in the 1990s. Cultivating glyphosate-resistant soybean together with the glyphosate herbicide dramatically decreased the labour required for soil tillage; at the same time, this dual application increased genetically modified soybean production.

Although soybean is recalcitrant to regeneration in tissue culture, several studies have applied gene editing to soybean genes controlling flowering time, seed oil content, lateral root growth, and defence mechanisms [4] (Table 17.1).

Table 17.1 Soybean traits and associated genes modified by the gene editing techniques CRISPR/Cas9, zinc-finger nuclease (ZFN), and transcription activator-like effector nuclease (TALEN), and by Agrobacterium tumefaciens- and bean pod mottle virus-mediated transformations

Curtin et al. (2011) first reported hairy-root and whole-plant transformation mediated by Agrobacterium rhizogenes, using the zinc-finger nuclease (ZFN) method to target the DICER LIKE (DCL), RNA-DEPENDENT RNA POLYMERASE (RDR), and HUA ENHANCER1 (HEN1) genes in root cells [5]. They continued using the ZFN method to create double mutants of DCL1a and DCL1b through Agrobacterium rhizogenes-mediated whole-plant transformation [6]. ZFNs were also used to deliver multiple different DNA donors into the FAD2-1a locus (Glyma.10g278000) by using a biolistic bombardment technique on immature embryo explants [7]. This research successfully regenerated fertile plants and transmitted the insert to the next generation.

The FAD2-1a and FAD2-1b loci encode fatty acid desaturases that convert oleic acid into linoleic acid. Both loci were mutated by using transcription activator-like effector nucleases (TALENs) to shift the seed fatty acid profile away from polyunsaturated fatty acids and towards oleic acid. The study used the hairy-root transformation method mediated by Agrobacterium rhizogenes [8]. The same research group also targeted FAD2-1a, FAD2-1b, and FAD3 by using the TALEN technique to shift the profile further from linoleic towards oleic acid, and successfully transformed soybean immature embryo explants [9].

Du et al. (2016) compared two gene editing techniques, TALEN and CRISPR/Cas9, and in parallel tested transformation efficiency with the soybean-specific U6-10 and Arabidopsis-specific U6-26 promoters in soybean [10]. They targeted a gene encoding phytoene desaturase (PDS), a rate-limiting enzyme in the carotenoid biosynthesis pathway. Hairy-root transformation mediated by Agrobacterium rhizogenes successfully produced mutated buds. The study recommended CRISPR/Cas9 with species-specific promoters as a transformation technique that is highly efficient, cost-efficient, and easy to construct [10].
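
Applying CRISPR/Cas9 to a gene such as PDS starts with choosing a target site. The following is a minimal sketch of that step, assuming the commonly used Streptococcus pyogenes Cas9, which recognizes a 20-nt protospacer immediately followed by an NGG PAM. This assumption, the function name, and the toy sequence are illustrative only and are not taken from the cited study; the sequence is not a real soybean PDS fragment.

```python
def find_protospacers(seq, guide_len=20):
    """List forward-strand candidate sites: guide_len bases followed by an NGG PAM.

    Assumes SpCas9-style targeting (hypothetical helper, for illustration only).
    Returns tuples of (start index, protospacer, PAM).
    """
    seq = seq.upper()
    hits = []
    # The last valid start leaves room for the protospacer plus a 3-nt PAM.
    for start in range(len(seq) - guide_len - 2):
        pam = seq[start + guide_len:start + guide_len + 3]
        if pam[1:] == "GG":  # N can be any base, so only check the last two
            hits.append((start, seq[start:start + guide_len], pam))
    return hits


# Toy sequence: a 20-nt stretch followed by TGG (made up for this sketch).
TOY_TARGET = "ACGTACGTACGTACGTACGT" + "TGG"

for start, protospacer, pam in find_protospacers(TOY_TARGET):
    print(start, protospacer, pam)
```

A real design workflow would also scan the reverse strand and screen candidates for off-target matches across the genome, which this sketch leaves out.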

The CRISPR/Cas9 technique outperforms other precise gene editing techniques in cost efficiency and ease of application. It accelerates soybean breeding and supports soybean production. Cytoplasmic male sterility (GmAMS1) [11], flowering time (LNK2) [12], seed oil content (GmFAD2) [

3 Agronomically Important Soybean Traits and the Use of Gene Editing

Soybean breeding plays a crucial role in the production of soybeans around the world as it helps to develop soybean varieties that can adapt to different environmental conditions. By breeding soybean varieties that are of good quality and tolerant to various biotic and abiotic stresses, soybean production can be increased and stabilized. Additionally, breeding efforts have resulted in soybean varieties with desirable traits such as high yield, improved nutritional quality, and enhanced oil and protein content, which are important for meeting the increasing demand for soybean products worldwide. Overall, soybean breeding has been instrumental in improving soybean production by developing varieties that are better adapted to the diverse environmental conditions in different regions of the world. Rapid and precise gene editing might help to improve elite soybean varieties. Soybean improvement might be accelerated by introducing a non-synonymous mutation with the help of gene editing.

Although trade and migration routes disseminated certain types of cultivated soybean seeds towards Eastern Asia and North America, local landraces provided valuable genetic resources for breeding soybean adapted to the environment. The idea of a single origin of soybean domestication does not completely explain the existence of allelic variation among cultivated varieties, because local genetic diversity provided location-specific causal alleles associated with the traits of interest. Therefore, to improve yield capacity and seed quality in terms of oil and protein, several genes responsible for architectural, physiological, and morphological changes in plant organs have been functionally identified by using the CRISPR/Cas9 method.

3.1 Pod Shattering Resistance

Pod shattering resistance, which prevents seed dispersal and yield loss, is an important agronomic trait that emerged with domestication [20]. Angiosperms develop their seeds within the fruit and disperse them when there is an abscission between pedicel and lemma. Because shattering decreases harvest output, it was brought under control by artificially selecting pod-shattering-resistant plants. Four pod-shattering resistance-associated genes have been identified in soybean: GmSHAT1-5, Pdh1, NST1A, and Glyma09g06290.

Dong et al. (2014) identified a causal polymorphism in the GmSHAT1-5 gene and showed that the pod-shattering-resistant domesticated soybeans, which diverged from wild soybeans, derive from this single haplotype [21]. GmSHAT1-5 is responsible for the lignification of fiber cap cells in the pod ventral suture, which causes thickening in domesticated soybeans. The sample collection included both Glycine max and Glycine soja varieties gathered from the seed bank of the Chinese Academy of Agricultural Sciences (Beijing). The pod indehiscent allele from Glycine max showed 13-fold higher expression than that of Glycine soja. Domestication appears to have significantly affected pod-shattering traits; however, this research did not reveal the origin of the indehiscent allele.

Zhang and Singh (2020) identified a locus called NST1A, which showed epistasis with Pdh1. NST1A is a NAC family gene and a paralog of GmSHAT1-5 [22]. As with NAC family transcription factors in Arabidopsis thaliana, a premature stop codon was identified as a gain-of-function mutation that provides pod-shattering resistance despite the presence of the Pdh1 allele [22, 23]. The indehiscent NST1A allele was predominantly found in Southern China and Japan, which implies that local wild cultivars in those regions were selected for the indehiscent NST1A allele independently of low-humidity conditions.

A genome-wide association study genotyped 211 soybean accessions, including modern and wild cultivars collected from the National Center for Soybean Improvement of China, by using the NJAU 355K SoySNP array containing 282,469 SNPs. A quantitative trait locus was identified on chromosome nine, and within that locus a candidate gene, Glyma09g06290, was found to be homologous to an Arabidopsis thaliana basic helix-loop-helix gene responsible for silique dehiscence. Quantitative polymerase chain reaction analysis also indicated that the Glyma09g06290 gene was highly expressed in pod indehiscent varieties [24].

Another gene regulating pod shattering in domesticated soybean is Pdh1. Pdh1 shows high homology to dirigent family genes, whose products were first described as mediating the stereoselective bimolecular phenoxy radical coupling of (E)-coniferyl alcohol to produce lignan [25]. The functional Pdh1 is highly expressed in the lignified inner sclerenchyma cells of the seed pod [26]. When Pdh1 expression increases, the physical properties of the inner sclerenchyma change and pod shattering begins. Although the relation between Pdh1 and lignin is not yet clear, the gene might be responsible for lignin deposition in the seed pod. A non-synonymous nucleotide substitution that produces a stop codon results in pod-shattering-resistant varieties. Under low humidity, soybeans carrying the pdh1 allele showed significantly lower shattering scores than those with the Pdh1 allele. This resistance-associated allele was seen in more than 50% of Chinese landraces and in a considerable proportion of South Asian and North American landraces, whereas Japanese and Korean landraces showed a very low frequency of it. Selection of the indehiscent pdh1 allele during domestication might have originated in the Huang-Huai-Hai Valley [22]. This implies that low-humidity conditions exerted selective pressure in favour of the pdh1 allele to protect seeds from dispersal.

Zhang et al. (2022) provided a CRISPR/Cas9 gene editing solution for pod-shattering susceptibility in HC6, a summer-adapted soybean cultivar from Huang-Huai-Hai [27]. They performed QTL mapping by using a recombinant inbred line population derived from HC6 and the pod-shattering-resistant variety JD12, and found a reproducible major QTL at the Pdh1 locus carrying an A/T SNP (HC6/JD12) that causes a nonsense variant. The resistant T allele was associated with low-humidity regions in China, whereas the susceptible A allele was associated with high-humidity regions in China, Japan, and Korea. Knowing the causal allele and its contrasting effects in different haplogroups facilitated the application of precise CRISPR/Cas9 gene editing, which finally provided a genetic remedy for pod shattering in soybean cultivars.
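
How a single A-to-T substitution can produce a nonsense variant may be easier to see in a small worked example. The sketch below uses a made-up 15-nt coding sequence, not the real Pdh1 sequence; the position of the SNP and the helper names are hypothetical. It relies only on the standard genetic code, in which AAG encodes lysine and TAG is a stop codon.

```python
# Standard genetic code, built from the conventional T-C-A-G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def apply_snp(cds, index, ref, alt):
    """Return cds with the base at index changed from ref to alt."""
    if cds[index] != ref:
        raise ValueError(f"expected {ref} at index {index}, found {cds[index]}")
    return cds[:index] + alt + cds[index + 1:]


# Toy alleles (hypothetical): the A allele keeps the AAG codon, while the
# T allele turns it into the stop codon TAG.
TOY_A_ALLELE = "ATGGCTAAGGTTTGA"
TOY_T_ALLELE = apply_snp(TOY_A_ALLELE, 6, "A", "T")

print(translate(TOY_A_ALLELE))  # full-length toy peptide
print(translate(TOY_T_ALLELE))  # truncated toy peptide
```

In this toy case the T allele yields a truncated product, which mirrors the way the text describes the resistant T allele as a nonsense variant.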

3.2 Shoot Growth Habit

Planting and harvesting time remarkably affect soybean yield; therefore, farmers must choose the appropriate maturity type for the environmental conditions. Soybean determinacy is an important agronomic trait that identifies the maturity type. Determinacy is governed by genes and environmental signals, which control the generation of the shoot apical meristem and its transition to a floral meristem. Soybean can be classified into three determinacy groups: determinate, semi-determinate, and indeterminate. Indeterminate varieties, which are late maturing, show a prolonged vegetative phase with active stem and branch apices producing new nodes with leaves, whereas determinate varieties, which are early maturing, cease stem and branch apical growth upon photoperiodic floral induction.

Phenotypic variation amongst soybean landraces has provided a good genetic resource for soybean breeding. Soybean planting management aims to maximize yield capacity and quality. When indeterminate varieties are planted early, they maintain active vegetative growth for a long time and accumulate enough amino acids and nutrients to allocate towards seeds, which increases yield quality and capacity. On the other hand, when determinate varieties are planted late, yield capacity and moisture decrease. However, early planting of indeterminate soybean varieties carries some risks. For example, late frost or extended pathogen infection might decrease yield capacity and even delay harvesting. To avoid these risks, determinate and indeterminate varieties are planted accordingly to maximize soybean production in the field [28,29,30].

In cultivated soybean varieties, two genetic loci have been associated with the determinacy trait: Dt1 and Dt2. The Dt1 allele is dominant or incompletely dominant over the dt1 allele; the Dt2 allele is dominant over the dt2 allele. Soybean plants with the Dt1/Dt1 genotype are indeterminate in combination with dt2/dt2 and semi-determinate in combination with Dt2/Dt2. However, the dt1/dt1 genotype shows a determinate phenotype whether the Dt2 locus is homozygous recessive, homozygous dominant, or heterozygous. Therefore, the Dt1 locus has an epistatic effect on the Dt2 locus [31, 32]. Their antagonistic behavior regulates flowering time and plant stem growth.
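
The genotype-to-phenotype rules above can be sketched as a small lookup. This is a simplified illustration only: the function name is hypothetical, and it assumes complete dominance of Dt1 over dt1, although the text notes that Dt1 can also be incompletely dominant.

```python
def stem_growth_habit(dt1_genotype, dt2_genotype):
    """Classify determinacy from the two alleles carried at each locus.

    Simplifying assumption: Dt1 is treated as fully dominant over dt1.
    Each genotype is a pair of allele names, e.g. ("Dt1", "dt1").
    """
    has_dominant_dt1 = "Dt1" in dt1_genotype
    has_dominant_dt2 = "Dt2" in dt2_genotype

    # dt1/dt1 is determinate whatever the Dt2 genotype: Dt1 is epistatic to Dt2.
    if not has_dominant_dt1:
        return "determinate"
    # With a dominant Dt1 allele, the Dt2 locus decides between the other two classes.
    return "semi-determinate" if has_dominant_dt2 else "indeterminate"


print(stem_growth_habit(("Dt1", "Dt1"), ("dt2", "dt2")))  # indeterminate
print(stem_growth_habit(("Dt1", "Dt1"), ("Dt2", "Dt2")))  # semi-determinate
print(stem_growth_habit(("dt1", "dt1"), ("Dt2", "dt2")))  # determinate
```

Checking the Dt1 locus first is what encodes the epistasis: the Dt2 genotype is only consulted when a dominant Dt1 allele is present.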

Dt1 is induced by E3 and E4 under long-day conditions, interacts with the bZIP family transcription factor FDc1, and binds to the promoter of APETALA1 to delay flowering. Conversely, when APETALA1 binds to the promoter of Dt1, it inhibits Dt1 expression and thus promotes flowering [33]. The Dt1 locus encodes a phosphatidylethanolamine-binding protein (PEBP) family protein called GmTfl1 (or GmTfl1b), an ortholog of Arabidopsis TERMINAL FLOWER1 that controls plant height and internode length. GmTfl1b is expressed in the shoot apical meristem until flowering initiation [46].

Indeed, high seed water permeability facilitates water absorption and makes the seed easy to germinate. Jang et al. (2015) identified a seed hardness locus, qHS1, which encodes an endo-1,4-β-glucanase [45]. A single nucleotide substitution from A to G in this gene disrupts the substrate-enzyme cleft domain and causes a permeable seed coat in soybean. Likewise, a single nucleotide substitution from C to T in the GmHs1–1 gene, which encodes a calcineurin-like metallophosphoesterase transmembrane protein, was examined by using Agrobacterium tumefaciens-mediated transformation and increased the permeability of the soybean seed coat [44]. Chandra et al. (2020) made inter-specific crosses between Glycine max and Glycine soja and used 217 recombinant inbred lines to understand the genetic inheritance of seed coat impermeability [46]. They phenotyped seed coat impermeability by examining slow and rapid imbibition rates in the offspring. They identified three linked markers on chromosome 2, at a locus previously identified by Sun et al. (2015) and Jang et al. (2015) [44, 45]. Additionally, the phenotyping results revealed semi-permeable genotypes that might be caused by minor alleles, one of which was found to be associated with leaflet width, phytophthora resistance, and seed tocopherol. This implies that the genes determining seed coat traits are diverse in nature and that seed protection and coat-related alleles are maintained. Soybean breeding to improve seed coat impermeability should consider the minor alleles as a potential source of genetic gain.

Another seed trait that was subjected to artificial selection during the domestication process is seed oil content. Cultivated soybean seeds contain more oil than wild seeds, which shows that domestication selected for high seed oil capacity [47]. Soybean oil has great economic value in the market: between 2018 and 2023 it was the second most-produced vegetable oil in the world after palm oil. China is the world leader in the production and consumption of soybean oil (USDA, 2023). Soybean oil is used for human consumption and has notable health benefits. It consists of 15% saturated and 85% unsaturated fatty acids, a composition associated with lowered blood cholesterol levels and decreased coronary heart disease [15, 51].

A major QTL that affects seed oil, size, and protein content has been identified on chromosome 15, GmSWEET10a (Glyma.15G049200) [15, 47]. This locus encodes a member of the SWEET family of sugar transporters, which ensures sucrose efflux and allocates sugar from the maternal seed coat to the filial embryo. Wang et al. (2020) showed that the frequency of the allele significantly associated with increased seed oil content is higher among landraces and cultivated soybeans than among wild varieties [52], which indicates strong artificial selection during soybean domestication. Zhang et al. (2020) identified a two-base-pair CC deletion in exon 6 in cultivated, high-seed-oil soybean varieties [47]. They also showed that this gene has a pleiotropic effect on seed oil and protein contents, since varieties carrying the CC allele are significantly richer in protein. Additionally, Wang et al. (2020) identified GmSWEET10b, a homolog of GmSWEET10a; knock-out mutants generated through CRISPR/Cas9 showed a similar effect on seed oil and protein content while changing seed size. However, they found no significant artificial selection at this locus [52].
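
A two-base-pair deletion inside an exon shifts the reading frame of everything downstream, which is why such a small change can alter gene function. The sketch below illustrates this on a made-up 18-nt sequence; it is not the GmSWEET10a exon 6 sequence, and the helper names and deletion position are hypothetical.

```python
def codons(seq):
    """Split a sequence into complete codons, dropping any trailing bases."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def delete_bases(seq, index, deleted):
    """Remove the bases in 'deleted' starting at 'index', checking they are present."""
    if seq[index:index + len(deleted)] != deleted:
        raise ValueError(f"expected {deleted} at index {index}")
    return seq[:index] + seq[index + len(deleted):]


# Toy reference allele and the same allele with a CC deletion (both made up).
TOY_REFERENCE = "ATGGAACCTGGTCATTAA"
TOY_DELETION = delete_bases(TOY_REFERENCE, 6, "CC")

print(codons(TOY_REFERENCE))  # in-frame codons up to the stop codon TAA
print(codons(TOY_DELETION))   # codons after the deletion are all shifted
print(len(TOY_DELETION) % 3)  # non-zero: the length is no longer a multiple of 3
```

Every codon after the deletion site changes, and in this toy case the original stop codon is no longer read in frame.
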
Cai et al. (2023) showed a similar antagonistic effect between seed oil and protein contents: CRISPR/Cas9-mediated knock-out mutants of the GmMFT gene revealed contrasting synthesis of oil and protein under changing GmMFT expression [55].

Soybean production has been declining in Argentina, one of the major soybean producers, due to drought stress, which reduces production and, in turn, exports and crushing [56]. QTL studies have identified causal loci that can support marker-assisted breeding for abiotic and biotic stress resistance. However, only a few of them have been validated through gene editing; most remain candidates.

An improved root system, enhanced water uptake, effective stomatal conductance, and slow wilting are some of the drought-avoidance strategies in soybean. Soybean breeding informed by genetic mechanisms can provide drought-resistant soybean crops and safeguard soybean production under drought conditions. Several drought resistance-conferring genes have been functionally identified. Crosstalk between plant hormones and transcription factors regulates the plant response to stress conditions. NAC (NAM, ATAF, and CUC) [57, 58], MYB [59], WRKY [59], AREB [60], DREB [61, 62], and AP2/ERF [63] transcription factors were found to be involved in this crosstalk. Overexpression of soybean GmNFYA13, a nuclear-localized protein, was found to confer resilience to salt and drought stress in transgenic soybean plants. Abscisic acid (ABA) is one of the plant hormones that control the physiological adaptations of a plant under stress conditions. For example, increasing ABA induces stomatal closure to prevent water loss. When soybean was treated with ABA, GmNFYA13 expression increased. This suggests that the GmNFYA13 gene is involved in the abscisic acid-mediated stress response in soybean plants [64].

Soybean is a salt-sensitive plant: increased Na+ ions change the cellular ion balance and damage cells. Cation Diffusion Facilitator 1, Arabidopsis K+ Transporter 1, and some transcription factors that are generally involved in abiotic and biotic stress conditions, such as MYB, WRKY, AP2/ERF, and NAC, are associated with salt stress resistance in soybean [65]. Wang et al. (2021) identified GmAITR, an ABA- and salt-induced transcription repressor in soybean, as a target for reducing salinity stress-related phenotypes without a fitness cost [66]. The gmaitr knock-out mutants generated with CRISPR/Cas9 showed altered ABA-related responses and tolerance to salt stress.

Moreover, flooding is another abiotic stress that affects soybean production in poorly drained soils. Soybean roots are the primary organs affected by flooding stress: limited oxygen uptake causes hypoxia and reduced energy production. To overcome this stress, plants switch to alternative energy-producing metabolic activities. Transcriptomic and proteomic studies have revealed a group of proteins that are involved in cell wall modification, methylglyoxal detoxification, hypoxia reduction, pathogen defence, reactive oxygen species scavenging and chaperone activity, and energy production through glycolysis induction and alcohol fermentation [67,68,69].

Soybean production around the world is challenged by the increasing negative impacts of fungal, bacterial, phytoplasma, nematode, and viral infections. Natural and artificial selection have improved soybean resistance over the years and sustained its development and reproduction despite dynamic spatial and temporal conditions. Soybean breeding for biotic stress resistance is a very active process. As climate conditions change, pathogen populations shift and new races are introduced to host plants. This activates new protection mechanisms, and beneficial mutations in resistance-conferring genes provide endurance to plants. The selection pressure on beneficial mutations can occur both naturally and artificially. Zhao et al. (2015) investigated nucleotide fixation of pathogen resistance in wild and cultivated varieties, and their study revealed that Glyma20g08290 (a homolog of the Arabidopsis thaliana RPM1 gene) is a naturally selected locus that is associated with resistance to Pseudomonas syringae in soybean (Ashfield et al., 1995) and is found in wild soybean varieties [70, 71].

Marker-assisted selection draws on causal QTLs for vertical and horizontal resistance. It has revealed major R genes, which maintain vertical resistance in soybean to soybean cyst nematode (Rhg), Phytophthora root and stem rot (Rps), soybean rust (Rpp), frog eye leaf spot (Rcs), bacterial blight (Rpg), and soybean mosaic virus (Rsv and Rsc) [65, 70, 72, 73]. R genes provide full protection in a race-specific manner. Horizontal resistance is controlled by multiple minor-effect genes and confers resistance against many soybean diseases such as sudden death syndrome, Sclerotinia stem rot, root-knot nematode, and most Pythium species [73]. Because this type of protection is not pathotype-specific, it is more durable than vertical resistance. In vertical resistance, environmental conditions might cause genetic changes in the avirulence proteins that are recognized by pathotype-specific R genes, or shifts in pathogen populations, either of which can undermine the protection. On the other hand, using molecular markers associated with R genes is more feasible than pursuing a soybean breeding strategy for minor allelic resistance.

4 Conclusion

QTLs have been identified for a number of traits in soybean. These QTLs can be used to develop marker-assisted breeding programs to improve the resistance of soybean cultivars to abiotic and biotic stresses. Gene editing is a newer technology that can be used to rapidly and efficiently introduce and edit specific genes. Gene editing is a complementary approach to marker-assisted breeding, and the two technologies can be used together to accelerate the development of improved soybean cultivars. The advantages of using gene editing for soybean improvement:

  • Gene editing is a precise technology that can be used to target specific genes.

  • Gene editing is a rapid technology that can be used to develop new cultivars in a shorter time frame than conventional breeding methods.

The challenges of using gene editing for soybean improvement:

  • Gene editing is a regulated technology, and there are a number of regulatory hurdles that must be overcome before gene-edited soybeans can be commercialized.

  • There is some public opposition to the use of gene editing in food crops.

Despite the challenges, gene editing is a promising technology that has the potential to revolutionize soybean improvement. By combining gene editing with marker-assisted breeding, we can develop soybean cultivars that are more resistant to abiotic and biotic stresses, have improved yield and nutritional quality, and are better suited to the changing climate.