Introduction

The gastrointestinal system contains countless microorganisms spanning multiple kingdoms performing diverse functions. Archaea, bacteria, viruses, and fungi work in concert and competition to acquire nutrients and space [1]. The focus of previous gut microbiome research has predominantly been on the identification and function of bacteria [2, 3]. However, archaea have been demonstrated to be equally important members of the gastrointestinal microbiome [4]. Methanogenic archaea, or archaea which carry out methanogenesis, perform crucial roles in the gut [4, 5]. Yet, current research has not indicated how methanogenic gut functions change throughout the lifetime of monogastric hosts [6, 7]. With limited research on archaea, and even less analysis of methanogenic functions, we lack an in-depth understanding of gastrointestinal-associated methanogens, especially of how methanogens influence gut and host health throughout the host's stages of life. By investigating monogastric-associated methanogens with a longitudinal approach, we are adding essential knowledge to the limited understanding of monogastric methanogens.

While some beneficial and detrimental associations of archaea to host health have been reported, overall the role of archaea in health and disease is still under investigation [5]. To date, archaea have been associated with a few illnesses, primarily gastrointestinal disorders such as constipation [5, 8, 9] and obesity [5, 10]. Conversely, archaea have also been associated with beneficial attributes. For example, archaea metabolize trimethylamine (TMA), which is thought to decrease cardiovascular disease [4, 5]. This research has prompted further evaluation of archaea members as a probiotic for cardiovascular health [5, 11]. Moreover, archaea allow continued microbial metabolism, growth and action by lowering hydrogen gut levels [5]. Archaea’s role of hydrogen utilization is especially important in the gut where microorganisms work in concert within the shared gut-microbiome system. However, with limited prior research, there is a critical need to understand the role of gastrointestinal archaea in health and sickness via hydrogen metabolism.

Overall, archaea are classified into four superphyla: Euryarchaeota, Asgard, TACK (Thaumarchaeota, Aigarchaeota, Crenarchaeota and Korarchaeota), and DPANN (Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanoarchaeota, and Nanohaloarchaeota) [12]. To date, Asgard archaea have not been indicated as methanogens [12], and TACK and DPANN have only been identified in non-host associated environmental sites [12,13,14]. Therefore, currently known host-associated gut methanogens fall within the seven orders of Euryarchaeota: Methanobacteriales, Methanococcales, Methanomicrobiales, Methanosarcinales, Methanocellales, Methanopyrales, Methanomassiliicoccales [15,16,17,18]. These Euryarchaeota orders are obligate anaerobes which perform methanogenesis to conserve energy for ATP production, where methane is a byproduct [15, 19]. Actions immediately following methanogenesis generate an ion gradient which is coupled with ATP production [20, 21].

Given the necessity for ATP production, it is unsurprising that historically, studies have primarily relied on the methanogenic gene methyl-coenzyme M reductase A (mcrA) or 16S rRNA for identification of gut-associated methanogens [7, 22,23,24]. McrA has been identified in all methanogens to date, as the protein performs a critical role in the final methane production step of methanogenesis [24, 25]. Because prior research relied heavily on targeted PCR methodologies, we largely lack a gene-centric understanding of gut-associated methanogens derived from complete genome sequencing [26]. Functional methanogen studies become even more informative when conducted longitudinally, especially when following the same hosts. In doing so, we can determine lifetime gut methanogen dynamics and host implications. Currently, studies which evaluate longitudinal methanogen dynamics typically involve ruminant hosts, such as cows, sheep, goats, and deer [27]. At the time of publication, we could not find a longitudinal study of methanogen genomes (i.e. not marker studies such as 16S rRNA or mcrA) following the same monogastric hosts throughout their lifetime, highlighting the crucial need for such metagenomic longitudinal evaluations [6, 7]. Without this knowledge, we cannot determine lifetime dynamics of archaea, and how their methanogenic function may be related to age-associated factors, such as diet and host development.

Host-associated archaeal methanogens have been linked to various conditions of health and disease. Most archaea-centric intestinal microbiome studies have been conducted on a single time point in the lifetime of the host. Using molecular and cultural approaches, intestinal archaea have been identified in many hosts, including: humans, swine, horses, rats, birds, fish, and kangaroos [28]. Overall, these analyses reported that the most common methanogens in the gut are members of the Methanobacteriales and Methanomassiliicoccales orders [28]. However, little is known about the presence and distribution of archaea through the lifetime of the swine. There is also a lack of data on the functions of the archaea in the swine gut. Overall, this knowledge gap has hindered the identification of factors that influence the diversity, abundance, and functions of archaea in the swine. In this study, we evaluated methanogen abundance and functions of 7 monogastric swine hosts over their lifetime at 22 timepoints from birth through adulthood (ages 1–156 days). We recovered 8 methanogenic archaea metagenome-assembled genomes (MAGs) that exhibited differential colonization patterns in the host at different ages. While distribution of methanogens across multiple hosts has been previously demonstrated, we recovered the first US swine Methanobrevibacter UBA71 sp006954425 and Methanobrevibacter gottschalkii MAGs [28]. Moreover, we attributed methanogenic functional potential to our age-associated archaea, and identified the first evidence of acetoclastic methanogenesis in monogastric-associated archaea, found in our Methanomassiliicoccales MAGs, indicating a previously unknown capability of monogastric methanogens to utilize acetate in energy acquisition. Alternatively, we attributed hydrogenotrophic methanogenesis, where carbon dioxide (CO2) is utilized, in the Methanobacteriales. 
We surmised that the age-associated detection patterns were due to differential substrate availability, which was highly influenced by diet. Altogether, we provided a comprehensive, genome-centric investigation of monogastric-associated archaea to further our understanding of microbiome development and function.

Results and discussion

Taxonomic classification of gut metagenome-assembled genomes

To broadly sample gut-associated microorganisms of the swine host across different age-associated growth stages, we obtained ~ 5.8 × 10^9 paired-end reads from Illumina NovaSeq sequencing data of 112 swine fecal samples (Fig. 1 and Additional file 1: Table S3). After quality trimming, we retained ~ 5.2 × 10^9 paired-end reads. The resulting 3 co-assemblies contained ~ 9.4 × 10^6 contigs that described ~ 3.6 × 10^10 nucleotides and ~ 3.7 × 10^7 genes. Using a combination of automatic and manual binning strategies, with thresholds of > 70% completion and < 10% redundancy, we obtained 4,556 metagenome-assembled genomes (MAGs). We further removed redundancy by selecting a single representative for each set of genomes that shared an average nucleotide identity (ANI) of greater than 95%, resulting in 1,130 final non-redundant MAGs (nr-MAGs) (Additional file 1: Table S3). Among the nr-MAGs, we recovered an average of 203 ± 187 contigs, with an average N50 of 32,737 ± 35,205. The resolved nr-MAGs had completion values of 87.9% ± 8.6% and redundancy values of 3.2% ± 2.6%. The genomic lineages for archaeal and bacterial nr-MAGs based on domain-specific single-copy core genes resolved to 20 phyla (2 archaea phyla and 18 bacterial phyla) and 588 species (5 archaea species and 582 bacterial species). We could also assign 88.4% of the bacterial and archaeal nr-MAGs to their genera.
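The quality filtering and dereplication steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline used in the study: the `Bin` record, the greedy strategy, the ranking by completion minus redundancy, and every example value are assumptions made for the sketch. Only the thresholds (> 70% completion, < 10% redundancy, ANI > 95%) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bin:
    """Hypothetical summary record for one genome bin."""
    name: str
    completion: float  # percent
    redundancy: float  # percent


def passes_quality(b, min_completion=70.0, max_redundancy=10.0):
    """Thresholds stated in the text: > 70% completion and < 10% redundancy."""
    return b.completion > min_completion and b.redundancy < max_redundancy


def dereplicate(bins, ani, threshold=0.95):
    """Greedily keep one representative per group of genomes sharing ANI > threshold.

    `ani` maps frozenset({name_a, name_b}) to ANI as a fraction; pairs that are
    missing are treated as unrelated. Ranking candidates by completion minus
    redundancy is an assumption of this sketch, not a documented choice.
    """
    ranked = sorted(bins, key=lambda b: b.completion - b.redundancy, reverse=True)
    representatives = []
    for candidate in ranked:
        is_redundant = any(
            ani.get(frozenset((candidate.name, rep.name)), 0.0) > threshold
            for rep in representatives
        )
        if not is_redundant:
            representatives.append(candidate)
    return representatives


# Hypothetical bins and pairwise ANI values, for illustration only.
bins = [
    Bin("bin_A", 95.0, 2.0),
    Bin("bin_B", 88.0, 1.0),   # near-identical to bin_A
    Bin("bin_C", 75.0, 4.0),
    Bin("bin_D", 65.0, 3.0),   # fails the completion threshold
    Bin("bin_E", 90.0, 12.0),  # fails the redundancy threshold
]
ani = {
    frozenset(("bin_A", "bin_B")): 0.98,
    frozenset(("bin_A", "bin_C")): 0.80,
}

good = [b for b in bins if passes_quality(b)]
nr = dereplicate(good, ani)
print([b.name for b in nr])
```

In this toy input, `bin_D` and `bin_E` are removed by the quality filter, and `bin_B` is collapsed into `bin_A` because the pair exceeds the ANI threshold.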

Fig. 1

Study schematics of 7 swine hosts including fecal sampling ages and developmental stages

Resolved archaeal MAGs are genetically and phylogenetically similar to archaea from diverse hosts and dispersed geographic locations

Among the 1,130 nr-MAGs that we resolved, our genomic collection also included 8 archaea nr-MAGs (hereafter known as archaea-MAGs; Ar-1 through Ar-8; Table 1; Additional file 1: Table S3). We observed that our resolved archaea-MAGs harbored genes which encoded the critical methyl-coenzyme M reductase (mcrABG) proteins required for methanogenesis, including mcrA, which is typically utilized for methanogen classification [29, 30] (Additional files 1: Tables S3 and 2: Table S4). To the best of our knowledge, these MAGs represent the first genomic evidence of differential colonization patterns of putative methanogens in the monogastric gut. The resolved methanogen MAGs had an average genome size of 1.4 Mbp, 1,573 KEGG gene annotations, 1,535 COG gene annotations, and a GC content ranging from 31 to 56% (Table 1; Additional file 1: Table S3). We resolved 7 of the methanogen MAGs to the species level with one archaea-MAG resolving to the genus level (Table 1; Additional file 1: Table S3). Our resolved archaea-MAGs were assigned to the following orders: Methanomassiliicoccales (5) and Methanobacteriales (3). Moreover, the genera were as follows: UBA71 (3), Methanomethylophilus (1), MX-02 (1), and Methanobrevibacter (3).

Table 1 Anvi’o results, including taxonomic assignment, of 8 archaea-MAGs

We downloaded 95 Methanomassiliicoccales and 97 Methanobacteriales genomes to investigate the phylogenetic relationship of our resolved archaea-MAGs (Fig. 2; Additional file 3: Fig. S1). We showed that our methanogen populations had close phylogenetic relationships with archaea from geographically distinct mammalian hosts, suggesting high similarities in gene functions in archaea among diverse host species. Given similarities amongst such diverse host species with diverse digestive systems, we hypothesize these close genetic relatives of our resolved archaea-MAGs might be more ubiquitous in a wider range of hosts than are currently discussed. We noticed Ar-4 clustered, as expected, with 6 Methanomethylophilus alvus strains: 5 from human gut samples and 1 from swine (MAG221) (Fig. 2A) [31,32,33,34,35,36]. Ar-7 clustered with 4 MX-02 sp006954405 genomes. United Kingdom strain 10 [37] and Chinese strain MAG014 [34] have been identified as swine-originating, whereas B5_69.fa and B45_maxbin.030.fa were from humans [38]. Interestingly, the archaea clustering on the same branch (B5_69.fa and B45_maxbin.030.fa) were isolated from South African adult humans [38]. Ar-1 was in the same branch with archaea from Tibetan pig MAG098 [34]; Ar-2 with Chinese roe deer RGIG3983 [28].

From our study, we recovered novel archaeal genomes that were previously unidentified in US swine. To the best of our knowledge, we resolved and obtained the genomic information of the first swine-associated Methanobrevibacter UBA71 sp006954425 and Methanobrevibacter gottschalkii MAGs. The methanogenic archaea order Methanobacteriales has been identified in many previous swine studies, with the majority of these studies utilizing 16S sequencing and/or real-time PCR identification [34, 37, 52,53,54,55,56,57,58,59,60,61]. Still, genomic studies of the Methanobacteriales are lacking, and the Methanomassiliicoccales order as a whole remains poorly understood. To date, only three swine Methanomassiliicoccales MAGs (Methanomethylophilus alvus, MX-02 sp006954405, and Methanobrevibacter smithii) have been identified [34, 37, 61]. Thus, adding our highly resolved novel archaea-MAGs to the repertoire of swine-associated microbial populations will aid in understanding swine archaea, including functions, host associations (such as age, health status, sex, etc.), and global distribution.

Prevalence of archaeal MAGs and variants at distinct host ages

Assessing the abundance of the methanogens in different growth stages of the swine host provided an opportunity to investigate the association between the host-associated methanogens and the different conditions faced by the swine as they grow. Our genome-centric metagenome analyses revealed two dominant orders of archaea—Methanobacteriales and Methanomassiliicoccales. We showed that resolved methanogen MAGs were differentially detected at different growth stages of the swine, but does the environment affect the functional niche specificity between these two orders of archaea?

The heatmap shown in Fig. 3A provides a graphical summary of the changes in detection for the archaea-MAGs. Detection was defined as the proportion of a given contig in the MAG that is covered at least 1X. Hierarchical clustering grouped the archaea-MAGs into three clusters based on detection: A (top cluster; Ar-1 through Ar-4), B (middle cluster; Ar-5 and Ar-6), and C (bottom cluster; Ar-7 and Ar-8). We observed that Cluster A contained only Methanomassiliicoccales MAGs, while Cluster B contained 2 Methanobacteriales MAGs, and Cluster C one of each order. Cluster A archaea-MAGs were primarily identified in the final stage of growth adult hosts. Conversely, Cluster B methanogens were primarily identified in preweaning hosts. Finally, Cluster C archaea were identified throughout the host lifetime. The relative abundances of the archaea-MAGs further supported these distinct detection patterns (Additional file 4: Fig. S2).
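The detection metric defined above can be made concrete with a short sketch. This is an illustration under stated assumptions rather than anvi'o's implementation: the function names are hypothetical, the per-base coverage vectors are invented, and treating a MAG as one concatenated coverage vector is a simplification.

```python
def detection(per_base_coverage, min_coverage=1):
    """Fraction of positions covered by at least `min_coverage` reads (1X by default)."""
    if not per_base_coverage:
        return 0.0
    covered = sum(1 for depth in per_base_coverage if depth >= min_coverage)
    return covered / len(per_base_coverage)


def mean_coverage(per_base_coverage):
    """Average read depth across all positions."""
    if not per_base_coverage:
        return 0.0
    return sum(per_base_coverage) / len(per_base_coverage)


# Hypothetical coverage vectors for two 100-bp toy genomes.
evenly_covered = [2] * 80 + [0] * 20   # reads spread across most of the genome
pileup_only = [0] * 90 + [50] * 10     # reads stacked on one short region

for label, cov in (("even", evenly_covered), ("pileup", pileup_only)):
    print(label, detection(cov), mean_coverage(cov))
```

The second toy genome has the higher mean coverage but the lower detection, which is why detection is a useful guard against calling a population present when reads recruit only to a small, possibly conserved, region.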

Fig. 3

A Detection (portion of MAG with at least 1X read coverage) heatmap of archaea-MAGs (rows) across all individual sample metagenomes (columns) with MAG taxonomy and stage annotation (Preweaning [P]; nursery [N]; growth adult [G]). B Single-nucleotide variant (SNV) analysis of Ar-7 and Ar-8 where box colors indicate competing nucleotides and stage is indicated along the bottom

We noticed the majority of archaea-MAGs detection values increased shortly after a stage transition (preweaning to nursery and nursery to growth adult), suggesting that stage transition changes, including diet, housing, and stress, can lead to changes in microbiome composition [62]. However, exactly how these changes impact archaea is relatively understudied, as most research evaluates bacteria; archaea-stage dynamics are therefore a topic for future research [63, 64].

We investigated methanogen variants, and found the majority of variation occurred in periods when other archaea were dominating (preweaning and growth adult; Additional file 5: Table S5). We performed single-nucleotide variant (SNV) analysis on our two archaea-MAGs that showed continuous detection throughout the host lifetime (Cluster C MAGs: Ar-7 and Ar-8; Fig. 3B). We attributed the majority of variants to the preweaning host, and fewer variants were identified in the growth adult. Interestingly, the number of variant populations was highest during times when other archaea-MAGs were predominantly identified (Fig. 3B). We hypothesized that the variation found in the growth adult host could indicate a competitive microbial environment, while fewer variants as compared to the earlier growth stages could be due to a largely already developed gut microbiome. In a competitive gut microbiome system, it is beneficial to have genetic diversity which translates to increased functional diversity [65]. A similar competitive environment and SNV diversity was demonstrated in the human gut bacterial community [65]. Comparatively, as the gut developed and microbes established with focused functions, the variation decreased when humans reached 2 years of age [65]. Human preweaning gut development is similar to the development of the swine preweaning gut, albeit swine development is relatively faster [66]. The conditions which encouraged the increased variation in the growth adult in our study could have been a change of diet, host stress, or other host-associated and environmental conditions [67].
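To illustrate the kind of per-position summary behind a competing-nucleotide display such as Fig. 3B, the sketch below counts variable positions per stage from base counts. It is a simplified stand-in and not the variant caller used in the study: the function names, the 10% minor-allele cutoff, the positions, and the counts are all assumptions for illustration.

```python
def competing_nucleotides(base_counts, min_minor_fraction=0.10):
    """Return the two most frequent bases at a position (e.g. 'AG'), or None.

    A position is reported only when the second most frequent base reaches
    `min_minor_fraction` of the reads; this cutoff is a hypothetical choice.
    """
    counts = {base: base_counts.get(base, 0) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        return None
    # Sort by descending count, breaking ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    (major, _), (minor, minor_count) = ranked[0], ranked[1]
    if minor_count / total < min_minor_fraction:
        return None
    return "".join(sorted((major, minor)))


def count_variable_positions(pileup):
    """Number of positions in `pileup` that show two competing nucleotides."""
    return sum(
        1 for counts in pileup.values() if competing_nucleotides(counts) is not None
    )


# Hypothetical base counts at three positions of one MAG, in two stages.
preweaning = {101: {"A": 60, "G": 38, "C": 2}, 250: {"T": 55, "C": 45}, 900: {"G": 99, "A": 1}}
growth_adult = {101: {"A": 97, "G": 3}, 250: {"T": 70, "C": 30}, 900: {"G": 100}}

print(competing_nucleotides(preweaning[101]))
print(count_variable_positions(preweaning), count_variable_positions(growth_adult))
```

Comparing such counts between stages is one simple way to express the observation that more variant positions appear in some host stages than in others.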

While we demonstrated differing archaea and SNV association with age, we were primarily interested in methanogen function. We hypothesized methanogenic function influenced our resolved methanogen MAGs’ ability to establish in the microbiome at different host stages through energy acquisition via host diet. Phylogenetic similarity in archaea across geography and hosts prompted an investigation into whether our archaea-MAGs were identified in other hosts of similar developmental ages, and therefore similar archaeal functions.

Methanogens span host species, millennia, and geographic distance

We wanted to demonstrate not only the global and host distribution of our methanogens, but also their identification across time, beyond the genetic similarity illustrated in our phylogenetic analyses. We mapped metagenomic sequencing reads from young and aged individuals of the following hosts to our archaea-MAGs: swine (n = 16) [68], humans (n = 429) [69, 70], mice (n = 60) [71], chicken (n = 71) [72], and cattle (n = 34) [73] (Fig. 4; Additional files 6: Tables S6 and 7: Table S7). Our archaea-MAGs were identified in older humans and varying aged swine metagenomes, but not in the chicken, mice or cattle metagenomes. We also demonstrated evidence of our archaea-MAGs in the ancient human gut and global distribution. Altogether, we determined that within a host species, archaeal age-association appeared to be similar, but some archaea span multiple host species and have done so for millennia [28]. We hypothesized differential archaeal function may be essential to the gut microbiome of many modern and ancient monogastric hosts.

Fig. 4

A Detection (portion of MAG with at least 1X read coverage) heatmap of previously published swine metagenomes [68] mapped to this publication’s archaeal MAGs (Preweaning [P]; nursery [N]; growth adult [G]). B Detection box plots of previously published human metagenomes [69, 70] mapped to our archaeal MAGs (“Adult” from Mexican humans; “Paleo” from present day US and Mexico; all remaining groups from Sweden)

We determined our swine-associated methanogens were not present in poultry, mice and ruminant host metagenomes (Additional file 7: Table S7). Presence of methanogens was determined when detection (portion of MAG with at least 1X read coverage) was greater than 0.25. This threshold aligns with previous genome-resolved metagenomic research and eliminated false-positive signals in read recruitment results [74,75,76]. Given the drastic differences in the ruminant digestive system compared to the monogastric gut, we were not surprised that our swine archaea-MAGs were not found in cattle from the United States (US) State of Pennsylvania. Although not identified consistently in all cattle, Methanobrevibacter smithii [77,78,79,80] and Methanobrevibacter gottschalkii [81,82,83,84] have been associated with the cow digestive tract. Similarly, UBA71 has been identified in adult chickens previously [85]. Given that similar taxonomic methanogens are present in cattle and chickens, we hypothesized the methanogens of these hosts were genetically distinct from the methanogens we identified in swine. Additionally, since the methanogens we identified were not consistently detected across our aging hosts, it was very probable that other metagenomes from these host populations could contain our methanogens. Future research is necessary to evaluate how distinct methanogen members function individually and collectively within the microbiome system to influence gut health in different host species.
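The presence rule stated above (detection greater than 0.25) amounts to a simple threshold over a MAG-by-sample detection table. The following sketch assumes a nested-dictionary layout and uses invented sample names and detection values; only the 0.25 threshold comes from the text.

```python
DETECTION_THRESHOLD = 0.25  # presence is called when detection exceeds this value


def call_presence(detection_table, threshold=DETECTION_THRESHOLD):
    """Map each MAG to the sorted list of samples in which it is called present.

    `detection_table` is a hypothetical {mag: {sample: detection}} structure.
    The comparison is strict (>), matching "greater than 0.25" in the text.
    """
    return {
        mag: sorted(sample for sample, value in per_sample.items() if value > threshold)
        for mag, per_sample in detection_table.items()
    }


# Hypothetical detection values, for illustration only.
detection_table = {
    "Ar-7": {"swine_d7": 0.62, "swine_d120": 0.81, "cattle_01": 0.03},
    "Ar-5": {"swine_d7": 0.47, "swine_d120": 0.25, "cattle_01": 0.00},
}

presence = call_presence(detection_table)
for mag, samples in presence.items():
    print(mag, samples)
```

Note that a value of exactly 0.25 does not count as present under the strict comparison used here.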

Interestingly, we could only find a singular example of archaea attributed to the mouse gut: Methanomassiliicoccaceae DTU008 [86]. Remaining attempts, encompassing more than 1,000 metagenomes, proved unsuccessful in identifying mice gut archaea [87,88,

Materials and methods

Study design, sample collection and DNA extraction

Our study design and sample collection occurred as previously described [136]. We collected fecal samples from 7 swine over 22 timepoints, ranging in swine age from 1 to 156 days across three developmental stages: preweaning (P), nursery (N), and growth adult (G) (Fig. 1, Additional file 8: Table S1). Swine were born and raised at the Kansas State University Swine Teaching and Research Center. Swine originated from the same farrowing group, and were weaned between 18 and 20 days of age, depending on day of birth.

We stored fecal samples at − 80 °C until DNA extraction. We extracted total genomic DNA from fecal samples utilizing the E.Z.N.A.® Stool DNA Kit (Omega Bio-tek Inc.; Norcross, GA), following the manufacturer protocols. We then quantified the extracted genomic DNA with a Nanodrop and Qubit™ (dsDNA BR Assay Kit [Thermo Fisher; Waltham, MA]) for DNA quality and concentration. We stored extracted DNA at − 80 °C until library preparation and sequencing.

Metagenomic sequencing and ‘omics workflow

DNA libraries were generated for a total of 112 samples with Nextera DNA Flex (Illumina, Inc.; San Diego, CA). Resulting libraries were then visualized on a Tapestation 4200 (Agilent; Santa Clara, CA) and size-selected using the BluePippin (Sage Science; Beverly, MA). The final library pool of 112 samples was quantified on the Kapa Biosystems (Roche Sequencing; Pleasanton, CA) qPCR protocol, and sequenced on the Illumina NovaSeq S1 chip (Illumina, Inc.; San Diego, CA) with a 2 × 150 bp paired-end sequencing strategy.

We utilized the ‘anvi-run-workflow’ program to run a combined bioinformatics workflow in anvi’o v.7.1 (https://anvio.org/install/) [137, 138], with a co-assembly strategy. The workflow used Snakemake to implement numerous tasks including: short-read quality filtering, assembly, gene calling, functional annotation, hidden Markov model search, metagenomic read-recruitment and binning [139]. Briefly, we processed sequencing reads using anvi’o’s ‘iu-filter-quality-minoche’ program, which removed low-quality reads following criteria outlined in Minoche et al. [140]. The resulting quality-controlled reads were termed “metagenome” per sample. We organized the samples into 3 metagenomic groups based on the developmental stages (P, N, G), and used anvi’o’s MEGAHIT v1.2.9 to co-assemble quality-filtered short reads into longer contiguous sequences (contigs) [137, 141]. The following methods were then utilized in anvi’o to further process the contigs: (1) ‘anvi-gen-contigs-database’ to compute k-mer frequencies and identify open reading frames (ORFs) using Prodigal v2.6.3 [137, 142]; (2) ‘anvi-run-hmms’ to annotate bacterial and archaeal single-copy, core genes with default single-copy genes and taxonomy of genomes [143] as defined by the Genome Taxonomy Database (GTDB) [144] (Archaea_76, Bacteria_71, Protista_83, and Ribosomal RNAs) [145] using HMMER v.3.2.1 [137, 146]; (3) ‘anvi-run-ncbi-cogs’ to annotate ORFs with NCBI’s Clusters of Orthologous Groups (COGs; https://www.ncbi.nlm.nih.gov/research/cog) [147]; and (4) ‘anvi-run-kegg-kofams’ to annotate ORFs from KOfam HMM databases of KEGG orthologs (https://www.genome.jp/kegg/) [148].
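For readers who want to see the four contig-processing steps as explicit commands, the sketch below builds them in Python without executing anything. In the study these steps were run through ‘anvi-run-workflow’, so this is only an assumed, hand-written equivalent: the helper name, file names, and project names are hypothetical, and only the basic input and output flags are shown.

```python
import shlex


def annotation_commands(contigs_fasta, contigs_db, project_name):
    """Return the four per-assembly commands in the order listed in the text."""
    return [
        # (1) build the contigs database (k-mer frequencies, gene calling)
        ["anvi-gen-contigs-database", "-f", contigs_fasta, "-o", contigs_db, "-n", project_name],
        # (2) search single-copy core gene HMM collections
        ["anvi-run-hmms", "-c", contigs_db],
        # (3) annotate genes with NCBI COGs
        ["anvi-run-ncbi-cogs", "-c", contigs_db],
        # (4) annotate genes with KEGG KOfams
        ["anvi-run-kegg-kofams", "-c", contigs_db],
    ]


# One co-assembly per developmental stage (P, N, G); file names are hypothetical.
for stage in ("P", "N", "G"):
    commands = annotation_commands(
        f"{stage}_contigs.fa", f"{stage}_contigs.db", f"swine_{stage}"
    )
    for command in commands:
        print(shlex.join(command))
```

The sketch only prints the commands; running them would additionally require an anvi'o installation with the COG and KEGG reference data already set up.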

We mapped metagenomic short reads to contigs in anvi’o with Bowtie2 v2.3.5 [149], and we then converted mappings to BAM files with samtools v1.9 [137, 150, 151]. We used the anvi’o ‘anvi-profile’ program to profile BAM files with a minimum contig length of 1000 bp. Next, we combined profiles with ‘anvi-merge’ into a single anvi’o profile for downstream analyses. We grouped contigs into bins with ‘anvi-cluster-contigs’ and CONCOCT v1.1.0 [152]. We manually processed bins with ‘anvi-refine’ using bin tetranucleotide frequency and coverage across samples [137,

Data analyses

We used the “detection” criterion (> 0.25) for downstream statistical analyses. We downloaded metagenomes from swine [68], humans [69, 70], mice [71], chicken [72], and cattle [73], and performed mapping to the non-redundant archaea-MAGs according to specifications above (Additional file 9: Table S2). We used RStudio v1.3.1093 [158] (https://www.rstudio.com/products/rstudio/) to visualize MAG detection patterns using: pheatmap (pretty heatmaps) v1.0.12 [159], ggplot2 v3.3.5 (https://ggplot2.tidyverse.org/) [160], forcats v0.5.1 (https://forcats.tidyverse.org/) [161], dplyr v1.0.8 (https://dplyr.tidyverse.org/) [162], and ggpubr v0.4.0 (https://CRAN.R-project.org/package=ggpubr) [163].

We utilized the RASTtk Genome Annotation Service on PATRIC v3.6.12 (https://patricbrc.org/) and anvi’o COG annotations for metabolic function analyses [164, 165]. We used the comparative pathway tool in PATRIC to predict the metabolic pathways of our resolved non-redundant MAGs. We used ‘anvi-compute-genome-similarity’ to calculate average nucleotide identity (ANI) with a subset of samples (n = 21) from previously published human-associated archaea [16]. We obtained similar genomes that were deposited in public databases and performed phylogenetic analyses of our non-redundant MAGs in PATRIC [165]. Parameters were set as follows: 100 genes, 10 max allowed deletions, and 10 max allowed duplications. We constructed phylogenetic trees for our MAGs with 192 closely related genomes, using the amino acid and nucleotide sequences from the global protein families database. The RAxML program was used to construct the trees based on pairwise differences between the aligned protein families of the selected sequences.

Our final figures were edited in Inkscape v1.2.1 [166].