Main

The cultivated garden strawberry (Fragaria × ananassa), an allo-octoploid (2n = 8x = 56), has a unique natural and domestication history, originating as an interspecific hybrid between wild octoploid progenitor species approximately 300 years before present1. The genomes of the progenitor species, Fragaria virginiana and Fragaria chiloensis, are the products of polyploid evolution: they were formed by the fusion of and interactions among genomes from four diploid progenitor species (that is, subgenomes) approximately 1 million years before present2. Whereas two of the diploid progenitor species have been identified3, the other two diploid progenitor species have remained unknown. Moreover, the history of events leading to the formation of the octoploid lineage and the evolutionary dynamics among the four subgenomes that restabilized cellular processes after ‘genomic shock’4 in allopolyploids remain poorly understood. Here, we present what is, to our knowledge, the first chromosome-scale assembly of an octoploid strawberry genome, the identities of the extant diploid progenitor species of each subgenome, and novel insights into the collective evolutionary processes involved in establishing a dominant subgenome in this highly polyploid species.

The Rosaceae are a large eudicot family including a rich diversity of crops with major economic importance worldwide, such as nuts (for example, almonds), ornamentals (for example, roses), pome fruits (for example, apples), stone fruits (for example, peaches), and berries (for example, strawberries)5. Strawberries are prized by consumers, largely because of their complex array of flavors and aromas. The genus Fragaria was named by the botanist Carl Linnaeus, on the basis of the Latin word ‘fragrans’, meaning ‘sweet scented’, describing its striking, highly aromatic fruit6. A total of 22 wild species of Fragaria have been described, ranging from diploid (2n = 2x = 14) to decaploid (2n = 10x = 70)7. The genus Fragaria is highly interfertile between and within ploidy levels, thus leading to the natural formation of higher-polyploid species8,9.

Polyploid events, also known as whole-genome duplications, have been an important recurrent process throughout the evolutionary history of eukaryotes and have probably contributed to novel and varied phenotypes10,11,12,13. Polyploids are grouped into two main categories: autopolyploids and allopolyploids, involving either a single or multiple diploid progenitor species, respectively14,15. Many crop species are allopolyploids16, and their polyploid origin has contributed to the emergence of important agronomic traits such as spinnable fibers in cotton17, diversified morphotypes in Brassica18, and varied aroma and flavor profiles in strawberry19. Allopolyploids face the challenge of organizing distinct parental subgenomes that reside within a single nucleus, each with a unique genetic and epigenetic makeup shaped by independent evolutionary histories15. Previous studies have proposed, as part of the ‘subgenome dominance’ hypothesis20, that the establishment of a single dominant subgenome may resolve various (epi)genetic conflicts in allopolyploids21,22,23,24. However, understanding of the underlying mechanisms and ultimate consequences of subgenome dominance remains largely incomplete25.

Subgenome-level analyses in most allopolyploid systems are greatly hindered by the inability to confidently assign parental gene copies (that is, homoeologs) to each subgenome, owing to both large-scale chromosomal changes and homoeologous exchanges that shuffle and replace homoeologs among parental chromosomes26, [...]

[...]37 genomes, with each homoeologous chromosome set colored according to its diploid progenitor species (F. vesca in red, F. nipponica in purple, F. iinumae in blue, and F. viridis in green). Details are provided in Supplementary Table 8. F. vesca and F. × ananassa chromosomes are shown on the y axis and x axis, respectively. b, Gene-retention patterns among the four homoeologous copies of chromosome 1, with color coding as in a. The relative distance along the F. vesca chromosome is shown on the x axis with the total number of analyzed genes. The percentage of genes retained is shown on the y axis, as estimated with sliding windows of 100 genes. The chromosomes of F. vesca37 are named Fvb1 through Fvb7. c, A microsyntenic comparison of a region on chromosome 1 between diploid F. vesca and the four homoeologous regions in Fragaria × ananassa. Gray lines indicate shared syntenic gene pairs, and relative orientation is shown in blue (forward) or orange (reverse). The four subgenomes of Fragaria × ananassa are labeled with corresponding diploid species names of potential origins.
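The sliding-window estimate of gene retention described for panel b can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `retention_profile`, the boolean input encoding (one flag per F. vesca gene, in chromosomal order, marking whether a syntenic copy is present on a given homoeologous chromosome), and the `step` parameter are assumptions introduced here for illustration. Only the window size of 100 genes comes from the text.

```python
from typing import List


def retention_profile(retained: List[bool], window: int = 100, step: int = 1) -> List[float]:
    """Percentage of genes retained in each sliding window along a chromosome.

    `retained` holds one flag per reference gene, in chromosomal order
    (hypothetical encoding: True if a syntenic copy is present on the
    homoeologous chromosome being profiled).
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    n = len(retained)
    if n < window:
        # Not enough genes to fill a single window.
        return []
    # Prefix sums let each window total be computed in constant time.
    prefix = [0]
    for flag in retained:
        prefix.append(prefix[-1] + (1 if flag else 0))
    return [
        100.0 * (prefix[start + window] - prefix[start]) / window
        for start in range(0, n - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy input: 150 retained genes followed by 50 lost genes.
    toy = [True] * 150 + [False] * 50
    print(retention_profile(toy, window=100, step=50))
```

Running the function once per homoeologous chromosome and plotting the four resulting profiles against window position would give curves comparable in form to those described for panel b.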