Introduction

In the last decade massive Sargassum events have become a serious predicament for the Caribbean region (Chávez et al. 2020; Skliris et al. 2022). In these events, millions of tonnes of pelagic Sargassum (Phaeophyceae, Fucales) wash up on the shores of the Caribbean Sea, wreaking havoc on coastal communities each year (Chávez et al. 2020; Robledo et al. 2021; Olguin-Maciel et al. 2022). The Sargassum biomass arriving at the Mexican Caribbean has been identified mainly as three morphotypes (S. natans I, S. natans VIII, and S. fluitans III), whose proportion in the total biomass changes seasonally (Amaral-Zettler et al. 2017; Rodríguez-Muñoz et al. 2021; Vázquez-Delfín et al. 2021). The effects caused by such massive Sargassum events range from social and economic problems to public health concerns for the coastal human populations (Robledo et al. 2021; Fraga and Robledo 2022). This phenomenon has multiple causes, such as biological factors, higher nutrient concentrations from seawater, and global warming (sea surface temperature increase and changes in oceanic currents), among other known causes (Skliris et al. 2022; Alleyne et al. 2023).

Multiple international collaborations, sharing a common goal, are underway to resolve and control the Sargassum problem, including cooperation to find proper disposal of the Sargassum biomass or to find ways to take advantage of it as a new raw material source (Chávez et al. 2020; Robledo et al. 2021; Rosellón-Druker et al. 2022). Therefore, there is a silver lining in these Sargassum events through using its biomass, for example, as raw material for construction blocks (Rossignolo et al. 2022), biomethane (Azcorra-May et al. 2022), polysaccharides (alginate and fucoidan) (Mohammed et al. 2020; Bojorges et al. 2022), fertilizer (Ammar et al. 2022), or livestock feed (Choi et al. 2020). Likewise, other compounds of interest in the Sargassum genus are fucoxanthin, a carotenoid with antioxidant effects sold as a dietary supplement (Khaw et al. 2021), and phloroglucinol, a phenolic monomer in brown algae that is used to relieve spasmodic pain in the gastrointestinal and urinary tracts (Blanchard et al. 2018).

However, food or biotechnological applications require quality control of the Sargassum raw material because it must be in good enough condition to serve its purpose (Toscano et al. 2022). The material may have spent time washed up on the coast under harsh weather, which degrades it (Andrady et al. 2023). Therefore, the Sargassum biomass may arrive at the processing centers in poor condition for successful material transformation (Azcorra-May et al. 2022). We asked whether we could detect differences in chemical composition (changes in functional groups) caused by leaving the biomass in the open air (solar radiation and humidity, among other factors), and how our procedure for transporting Sargassum biomass to our laboratory (at low temperature and in the dark) modified its chemical composition.

To answer our questions, we selected Fourier transform infrared spectroscopy with attenuated total reflectance (FTIR-ATR), an analytical technique widely used to characterize complex mixtures quickly by providing data on the chemical bonds present in the tested sample (Ibrahim et al. 2005; Lohumi et al. 2015). IR spectroscopy techniques are commonly used for pure compound identification, but combined with chemometrics they can be extended to complex biological samples (Ferro et al. 2019; Deore et al. 2021; Holden et al. 2022). For example, there are numerous reports of IR spectroscopy in food technology for the quantification of nutrients, quality control of raw materials, and species identification (Toscano et al. 2022; Cebi et al. 2023; Derksen et al. 2023). The ATR method in the FTIR-ATR equipment has several advantages over conventional transmission technology: it requires minimal sample preparation (and therefore allows faster spectral acquisition), it is non-destructive, and it does not lose quantitative capacity. However, increased background noise and minor peak distortion can occur because of the different optical technology.

Chemometrics applies statistical methods to large chemical datasets to extract more and better information from them. It enables data-driven, rather than intuition-driven, decision-making in processes like sample classification (Biancolillo et al. 2020). It is helpful in tasks like visualizing trends in the data and building unsupervised or supervised learning models for sample classification. An unsupervised learning method does not require a hypothesis to find patterns in the data; an example is clustering analysis, which assigns samples with similar data to the same cluster. A supervised learning method needs labeled data to find differences between the labeled groups. Both kinds of methods also assign an importance to the individual variables in the classification model (Alloghani et al. 2020).

An increasing number of studies focus on the isolation, detection, and characterization of many compounds of fundamental interest from the Sargassum species arriving in the southern Mexican Caribbean. Few studies point out the chemical differences between the two pelagic Sargassum species and between the pelagic and benthic species (Rosado-Espinosa et al. 2020; Hernández-Bolio et al. 2021). However, we have yet to find information about how the Sargassum chemical composition changes during prolonged exposure to the open air and during transport from the shore to the laboratory, both of which can dramatically influence the quality of the chemical fingerprinting of samples.

Sample transportation is a fundamental yet overlooked step in any analytical process. Any sample transportation modifies the chemical composition, some methods more than others. One of the best methods is to flash-freeze the samples as soon as possible, store them in an ultra-low temperature (ULT) freezer (< −80 °C), and then freeze-dry them. Unfortunately, it is not easy to transport a large amount of algal material to the lab in this way. Consequently, a frequent preservation technique in Sargassum studies is to transport the samples from the shore to the lab by keeping them in a cooler on ice (~ 4 °C). We therefore analyzed what happens to the chemical composition of Sargassum when it is flash-frozen in the lab within 24 h instead of at the shore. We named this evaluation the flash-freeze comparison. We also evaluated the capacity of the FTIR-ATR data to identify the three morphotypes found in the pelagic Sargassum.

Another key factor in taking biotechnological advantage of Sargassum is knowing how its chemical composition changes over time after stranding on the coast, under UV radiation and solar heat. Thus, we complemented the study by evaluating the damage produced by leaving Sargassum in the open air over a period of 56 days. In all the analyses, we focused on understanding which IR peaks differ between treatments or morphotypes.

Therefore, our study provides a baseline for developing a rapid FTIR-ATR test to assess the chemical composition quality of Sargassum biomass collected in the field. This will allow the development of more efficient management and pretreatment strategies to access the usable compounds derived from the Sargassum species that arrive in the southern Mexican Caribbean.

Materials and methods

Species and collection sites

We identified and collected samples from the two pelagic Sargassum species, Sargassum natans (including morphotypes I and VIII) and Sargassum fluitans morphotype III, following the method described by Parr (1939) and Rosado-Espinosa et al. (2020). The sample names came from these three morphotypes: S. fluitans III (flu3), S. natans I (nat1), and S. natans VIII (nat8). We selected three collection sites corresponding to three coastal towns on the southern Mexican Caribbean coast in the municipality of Othon P. Blanco: Mahahual (18°43′11.1"N 87°42′22.3" W), Xahuayxol (18°30′42.0"N 87°45′26.8" W), and Xcalak (18°16′08.9"N 87°50′05.8" W). The collection date was 2 April 2023. We only collected green-golden sea-floating Sargassum biomass further from the shoreline to ensure algal freshness (Rodríguez-Muñoz et al. 2021). The alga decomposes rapidly after being washed ashore, changing its color into reddish and darker tones. These collections were to evaluate 1) the flash-freeze method and 2) the open-air test. Both experiments are detailed below.

Sample preparation with the flash-freeze tests

After identification and epiphyte removal in the field, we divided the algal material into two groups based on its storage while being transported to the laboratory. This was to assess how storage during transport influences the chemical composition of the collected material. Cold sample transportation is a common practice to avoid fast decomposition in Sargassum research (Ragan 1984; Hernández-Bolio et al. 2021; Olguin-Maciel et al. 2022).

The first treatment group was named "field" because the samples were flash-frozen in the field. First, based on the literature, we removed the saltwater by rinsing with tap water (Hernández-Bolio et al. 2021). Then, we filled 50 mL centrifuge tubes (NEST Biotechnology, China) with ∼40–50 g of Sargassum material. After closing the centrifuge tubes, we submerged the complete tube in liquid nitrogen (-196 °C). The samples were transported to our facility in a cryogenic storage Dewar filled with liquid N2, then kept at -80 °C until freeze-drying. All the flash-frozen samples were freeze-dried and ground to a fine powder in an agate mortar. The samples in this treatment group were labeled d00 in their sample ID.

The sample ID for the flash-freeze comparison dataset is in Table 1. The samples in this experiment have three labels in their name ID: the site, morphotype, and flash-freeze treatment. For example, the sample Ma23_nat8_d00 means that the sample was collected in Mahahual, was identified as S. natans VIII, and flash-frozen in the field. A complete list of the sample IDs is in Table SI-1.
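The naming convention described above can be sketched as a small parser. This is only an illustration, not code from the study: the function name `parse_sample_id`, the `SampleID` record, and the lookup table are hypothetical, and the site code is returned unchanged because only the Mahahual example (`Ma23`) is given in the text.

```python
import re
from typing import NamedTuple

# Morphotype abbreviations as defined in the text (flu3, nat1, nat8).
MORPHOTYPES = {
    "flu3": "S. fluitans III",
    "nat1": "S. natans I",
    "nat8": "S. natans VIII",
}


class SampleID(NamedTuple):
    site: str        # site code kept as written, e.g. "Ma23"
    morphotype: str  # full morphotype name
    day: int         # number after "d": 0 = field, 1 = lab, 2..56 = days in open air


# Assumed layout: <site>_<morphotype>_d<two digits>
_PATTERN = re.compile(r"^(?P<site>[^_]+)_(?P<morph>flu3|nat1|nat8)_d(?P<day>\d{2})$")


def parse_sample_id(sample_id: str) -> SampleID:
    """Split a sample ID such as 'Ma23_nat8_d00' into its three labels."""
    match = _PATTERN.match(sample_id)
    if match is None:
        raise ValueError(f"Unrecognized sample ID: {sample_id!r}")
    return SampleID(
        site=match["site"],
        morphotype=MORPHOTYPES[match["morph"]],
        day=int(match["day"]),
    )


print(parse_sample_id("Ma23_nat8_d00"))
# SampleID(site='Ma23', morphotype='S. natans VIII', day=0)
```

The same pattern would also accept the open-air labels (d02 to d56), since they follow the same `dNN` convention.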

Table 1 Sample ID and size for the flash-freeze comparison dataset

The second treatment group was named "lab" because the sample was flash-frozen after transport to our laboratory. For taxonomic identification and epiphyte removal, the Sargassum material was stored in a re-sealable bag filled with seawater and transported in a closed cooler box with ice for further processing in the laboratory. Their transportation time from the field to the laboratory was 24 h. Then, after arriving in the laboratory (at Cinvestav-IPN, Mérida) the saltwater was removed with tap water. Then, the algae were flash-frozen by submerging in liquid N2. The rest of the sample processing consisted of freeze-drying and grinding them into a fine powder using an agate mortar. These samples were labeled d01 to indicate cold transportation from the shore to the lab (see Table 1).

Sample preparation in the open-air comparison

In the open-air comparison we used only S. natans VIII from the Mahahual collection site to eliminate other factors driving the data variation, such as the collection site or the species and its abundance at the site. The sample was transported to our lab in the same way as the d01 samples. For the open-air decomposition experiment, we placed 40–50 g of S. natans VIII in each of 45 conical tubes, each closed with a disposable cleaning cloth to avoid complete air-drying and to simulate aerobic conditions at the top and anaerobic decomposition at the bottom. Finally, the tubes were stored in a rack and left in a greenhouse.

The evaluation times were based on a review of previous research on the topic (Azcorra-May et al. 2022). The sample labels correspond to the time in the open air before flash-freezing: d02 for two days, d04 for four days, d07 for seven days, d10 for ten days, d14 for fourteen days, d21 for 21 days, d28 for 28 days, d42 for 42 days, and d56 for 56 days. After each open-air period, we chose five conical tubes at random for flash-freezing. All flash-frozen samples were freeze-dried and ground to a fine powder in an agate mortar.

The sample ID and size for the open-air comparison dataset are indicated in Table 2. The time label corresponds to the number of days in the open air. A complete list of the sample IDs is found in Table SI-2.

Table 2 Sample ID and size for the open-air comparison dataset

FTIR-ATR equipment and analysis

All FTIR-ATR analyses were performed on a Nicolet iS5 from Thermo Scientific (USA) with an iD7 accessory and a ZnSe crystal plate, using the Thermo Scientific OMNIC software. Background noise was removed from all spectra. The spectral acquisition range was 500 to 4000 cm−1, with 32 scans per sample and a spectral resolution of 4 cm−1, giving a data spacing of 0.482 cm−1. The samples were read in triplicate.

FTIR-ATR data preprocessing

We imported the FTIR-ATR spectra into RStudio (Posit team 2023), an integrated development environment for the R software (R Core Team 2023), using ChemoSpec (Hanson 2023), an R package for the preprocessing and 2D exploratory analysis of spectral data. ChemoSpec can process any spectroscopic data file in which the y-scale is the signal intensity and the x-scale corresponds to the frequency unit. After data importation, we corrected the spectral baseline using the 4S Peak Filling method (Liland 2015). The next preprocessing step was an intensity calibration (normalization) with the probabilistic quotient normalization (PQN) method (Dieterle et al. 2006). Subsequently, we binned the data to a spacing of 0.964 cm−1 and eliminated the first 25 cm−1 of values, following the manufacturer's recommendation. Thus, the data have 3604 values (number of columns). The number of rows (observations or samples) was 44 for the d00 subset and 56 for the d01 subset in the species comparison, 90 for the flash-freeze comparison (d00 versus d01), and 147 for the open-air decomposition experiment (S. natans VIII samples from d00 to d56).
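As an illustration of the normalization and binning steps, the sketch below implements a basic form of probabilistic quotient normalization and a two-point binning in plain Python. It is not the ChemoSpec code used in the study. The function names are hypothetical, the spectra are assumed to be lists of positive intensities on a common wavenumber grid, and averaging adjacent points is only one possible way to bin (a package may instead sum the intensities in each bin).

```python
from statistics import median


def pqn_normalize(spectra):
    """Probabilistic quotient normalization of a list of spectra (one per row).

    Assumes positive intensities; baseline-corrected data with zero or
    negative sums would need extra handling.
    """
    # Step 1: scale every spectrum to unit total intensity.
    scaled = []
    for row in spectra:
        total = sum(row)
        scaled.append([value / total for value in row])

    # Step 2: reference spectrum = median across samples at each variable.
    reference = [median(column) for column in zip(*scaled)]

    normalized = []
    for row in scaled:
        # Step 3: quotient of each variable against the reference
        # (variables with a zero reference are skipped).
        quotients = [v / r for v, r in zip(row, reference) if r != 0]
        # Step 4: the median quotient estimates the dilution factor.
        factor = median(quotients)
        normalized.append([value / factor for value in row])
    return normalized


def bin_pairs(row):
    """Average adjacent pairs of points, doubling the data spacing.

    A trailing unpaired point (odd-length input) is dropped.
    """
    return [(row[i] + row[i + 1]) / 2 for i in range(0, len(row) - 1, 2)]
```

With this kind of normalization, two spectra that differ only by a constant multiplicative factor (for example, more or less powder pressed on the crystal) become identical, which is the purpose of the intensity calibration step.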

Chemometric analysis of FTIR-ATR data

We performed a univariate analysis of variance (ANOVA) for each variable to obtain its probability value (p-value). A lower p-value means that, under the null hypothesis, the observed difference in that variable is less likely to have arisen by chance (Biancolillo et al. 2020). We plotted the negative log p-values as a spectrum to visually identify the IR spectral zones with differences confirmed by a statistical test. The classical principal component analysis (PCA) was computed with the 2D and 3D functions in the ChemoSpec package. The columns of the numeric matrix were preprocessed by centering and Pareto scaling. The clustering analysis was implemented using the mclust R package (Scrucca et al. 2016). The confidence ellipses and ellipsoids presented are robust ones.
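The column preprocessing mentioned above (centering followed by Pareto scaling) can be sketched as follows. This is a minimal illustration rather than the ChemoSpec implementation: the function name is hypothetical, the sample standard deviation is assumed, and constant columns are set to zero to avoid dividing by zero.

```python
from math import sqrt
from statistics import mean, stdev


def center_and_pareto_scale(matrix):
    """Center each column and divide it by the square root of its standard deviation.

    `matrix` is a list of rows (samples); columns are wavenumber variables.
    """
    scaled_columns = []
    for column in zip(*matrix):
        center = mean(column)
        spread = sqrt(stdev(column))  # Pareto scaling uses sqrt(SD), not SD
        if spread == 0:
            # Constant column: no variance to scale, so report zeros.
            scaled_columns.append([0.0] * len(column))
        else:
            scaled_columns.append([(value - center) / spread for value in column])
    # Transpose back to one row per sample.
    return [list(row) for row in zip(*scaled_columns)]
```

Compared with unit-variance scaling, dividing by the square root of the standard deviation reduces the dominance of intense bands while keeping large peaks more influential than baseline noise.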

The supervised learning models were built with the caret R package (short for Classification And REgression Training), which helps to tune the parameters of classification or regression models and to evaluate their performance (Kuhn 2008). We built PLS-DA (Partial Least Squares-Discriminant Analysis) models for the flash-freeze comparison dataset and the species comparison. We also generated regression models for the open-air decomposition experiment, including PLS-R (Partial Least Squares-Regression) and LASSO (Least Absolute Shrinkage and Selection Operator) models. These models have an advantage over others in that they permit the calculation of VIP (Variable Importance in the Projection) scores, which quantify how relevant the original variables are for the assumed labels (in classification models) or decomposition days (in regression models). For each comparison, the data were divided into a training set (75% of the original dataset) for parameter tuning, which gives us optimized models, and a test set (the remaining 25%) to evaluate the performance. Performance was evaluated through confusion matrix metrics for the classification models or the RMSE (Root Mean Square Error) on the test data for the regression models (Gromski et al. 2015; Markoulidakis et al. 2021). All the computed models produced VIP scores, which estimate the importance of each variable in the final model by generating a ranked list of the original variables (Kuhn 2008). We plotted the VIP scores as a spectrum to determine which IR bands are modified by the treatments.
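The evaluation scheme described above (a 75/25 split, then RMSE for regression or confusion-matrix metrics for classification) can be sketched in plain Python. This is not the caret workflow used in the study: all function names are hypothetical, and the split shown is a simple random shuffle, whereas a modeling package may stratify the partition by class.

```python
import random
from math import sqrt


def train_test_split(samples, train_fraction=0.75, seed=0):
    """Randomly split samples into a training set and a test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))


def confusion_matrix(true_labels, predicted_labels):
    """Count (true, predicted) label pairs."""
    counts = {}
    for pair in zip(true_labels, predicted_labels):
        counts[pair] = counts.get(pair, 0) + 1
    return counts


def accuracy(counts):
    """Fraction of predictions on the diagonal of the confusion matrix."""
    correct = sum(n for (true, pred), n in counts.items() if true == pred)
    return correct / sum(counts.values())
```

For the regression models, `observed` and `predicted` would be the decomposition days of the test samples; for the classification models, the labels would be the treatment (field or lab) or the morphotype.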

Other R packages implemented in this study for visualization in 2D and 3D are the following: ggplot2 (Wickham 2016), Plotly (Sievert 2020), rgl (Murdoch and Adler 2022), and car (Fox and Weisberg 2019).

Results

Flash-freeze comparison

First, we visualized the variance of the flash-freeze comparison dataset using the interquartile range (IQR) spectrum (Fig. 1). The IQR spectrum in Fig. 1A shows which IR regions vary most in absorbance among all samples. Overall, the peak at 1414 cm−1 had the greatest variation, followed by the peak at 1023 cm−1. The absorbance at most wavenumbers is nearly constant among all samples.

Fig. 1

IR spectral comparison with the interquartile range for the flash-freeze comparison. The IR absorbance spectra are normalized and baseline-corrected. Red lines correspond to Q1 and Q3, while the median (Q2) is black. Fig. 1A corresponds to all spectra of the samples from both flash-freeze treatments (field and lab). Fig. 1B zooms in on the non-constant region of 2000–525 cm−1 and compares both flash-freeze treatments (d00 and d01) for the morphotype S. fluitans III (flu3), Fig. 1C does so for S. natans I (nat1), and Fig. 1D for S. natans VIII (nat8). The most relevant IR peaks are labeled
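An interquartile-range spectrum of the kind shown in Fig. 1 can be computed per wavenumber as sketched below. The function name and the returned dictionary are hypothetical, and the quartiles use Python's "inclusive" method (linear interpolation between data points), which may differ slightly from the quantile definition used to draw the figure.

```python
from statistics import quantiles


def iqr_spectrum(spectra):
    """Return per-variable Q1, median, Q3 and IQR for a list of spectra (rows)."""
    q1, q2, q3, iqr = [], [], [], []
    for column in zip(*spectra):  # one column per wavenumber
        low, mid, high = quantiles(column, n=4, method="inclusive")
        q1.append(low)
        q2.append(mid)
        q3.append(high)
        iqr.append(high - low)  # wide IQR = absorbance varies among samples
    return {"q1": q1, "median": q2, "q3": q3, "iqr": iqr}
```

Wavenumbers where the IQR is close to zero are the constant regions of the spectrum, while the largest IQR values point to the bands that differ most among samples.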

The IQR spectra by groups in Fig. 1B-D show the differences among the three morphotypes identified and their two flash-freeze treatments (Fig. 1B for S. fluitans III, Fig. 1C for S. natans I, and Fig. 1D for S. natans VIII). These spectra show that most of the spectral variance was related to the flash-freeze timing. Thus, a prolonged sample transportation time to the laboratory modifies the chemical composition of sea-floating Sargassum spp. more than the morphotype does.

We performed a principal component analysis (PCA) (Fig. 2 and Figure SI 1), a method suited to exploring data with many highly correlated variables, such as spectral data. It is useful for finding trends, outliers, and the original variables that drive the data variation (Biancolillo et al. 2020; Hernández-Bolio et al. 2021). The PCA revealed that the samples clustered not only by species or collection site but also by where the flash-freezing happened, in the field or in the lab. We observed a grouping of field flash-frozen samples (d00) with negative PC1 scores, while the lab flash-frozen samples (d01) had positive PC1 scores, reflecting a lower preservation state.

Fig. 2

The PCA 3D on the Sargassum species FTIR-ATR data for flash-freeze comparison. The PCA revealed that the samples clustered not only by the species but also where the flash freeze happened, in the field or the lab. Negative values for the PC1 score correspond to field flash-freeze samples (d00), while positive scores in the PC1 reflect a lower preservation state, being those of the lab flash-freeze sample (d01). The discontinuous yellow line separates both Flash-Freeze groups. Each group (flu3_d00, nat1_d00, nat8_d00, flu3_d01, nat1_d01, and nat8_d01) has its respective robust confidence ellipsoid

The PCA is well complemented by a clustering analysis that classifies the samples without a priori bias. The clustering analysis with the mclust R package (Figure SI 2) found the optimal clustering model for the flash-freeze comparison dataset. The results show three clusters: one for S. fluitans III from d00, another for S. natans I from d01, and a last cluster grouping together most samples. S. natans morphotypes I and VIII cluster together when flash-frozen fresh but separate when flash-frozen in the lab.

There was no evidence of the collection site being a crucial factor driving the data variance rather than the sample treatment (flash-frozen in the field or until arriving at the lab) or morphotype (see Figure SI 3). A plausible explanation is that the Sargassum samples were collected on the same date and grew in the same oceanic current with almost the same nutritional and environmental factors.

The freeze-dried Sargassum shares a similar pattern to the previously reported IR spectrum on the genus or other algae (Gómez-Ordóñez and Rupérez 2011; Tanniou et al. 2015; Rosado-Espinosa et al. 2020; Alzate-Gaviria et al. 2021; Deore et al. 2021; Derksen et al. 2023). It helps to find the chemical identity of the IR bands and their biological relevancy, which we will describe next. We start by explaining those with the larger wavenumber. The IR peak of 3254 cm−1 corresponds to the O–H stretch from the cell wall and reserve carbohydrates, the most abundant compounds in the brown algae composition (Gómez-Ordóñez and Rupérez 2011; Mohammed et al. 2020).

The IR peaks at 2935 and 2847 cm−1 are typical of C-H vibrational stretching from aliphatic compounds (Ferro et al. 2019). They have more relevance in samples with a high lipid composition, which is not the case for brown algae, which share a low percentage of lipids because they accumulate carbohydrates for energy reserve, like laminarin and mannitol, instead of acylglycerols (Michel et al. 2010). Thus, those IR bands did not reveal any change among different treatments. Therefore, we will focus on IR bands lower than 2000 cm−1.

The IR peak at 1607 cm−1 corresponds to a carboxylate (RCOO−) asymmetric stretch that comes from the uronic acids, guluronic and mannuronic acids, which are the monomers of the alginate polymer (Ibrahim et al. 2005; Gómez-Ordóñez and Rupérez 2011; Rosado-Espinosa et al. 2020; Måge et al. 2021). The peak at 1414 cm−1 corresponds to a carboxylate (RCOO−) symmetric stretch but also overlaps with a C-N stretch band. The PCA loading plot for principal component 1 (PC1) (Fig. 3A and Figure SI 4) shows that samples that underwent cold transportation before being flash-frozen had higher absorbance in the IR band of 1425–1385 cm−1, likely due to the formation of compounds related to oxidation and proteolytic degradation. This can be observed by comparing the reference spectra for d00 (Fig. 3C) and d01 (Fig. 3D), where the latter has increased absorbance in the IR band of 1425–1385 cm−1.

Fig. 3

The loadings plot on the Sargassum species FTIR-ATR data for flash-freeze comparison. The higher covariance values (y-scale) correspond to the signals driving the chemical difference among the samples, and the x-scale corresponds to the wavenumber in cm−1. Fig. 3A shows the loading plot for PC1, which reveals that IR bands 1425–1385 and 880–860 cm−1 have higher absorbance in the lab samples (reference spectrum in Fig. 3D) than in the field flash-frozen samples, while the IR bands 1100–1000, 1060–1010, and 940–920 cm−1 have a higher absorbance in the field samples (reference spectrum in Fig. 3C). Fig. 3B shows the loading plot for PC2, in which IR bands 3580–3460 and 1585–1570 cm−1 have higher absorbance in the lab samples than in the field flash-frozen samples, while the IR bands 1100–1070, 1060–1010, 750–700, and 630–600 cm−1 have a higher absorbance in the field samples

The peaks between 1100–1000 cm−1 (1081, 1055, and 1023 cm−1) come from the C-O stretch characteristic of cell wall carbohydrates (alginates, fucoidans, and cellulose) and reserve carbohydrates (laminarin and mannitol) (Michel et al. 2010; Gómez-Ordóñez and Rupérez 2011; NIST Mass Spectrometry Data Center, William E. Wallace 2023). Those peaks have the highest positive covariance in the PC2 loading plot (Fig. 3B) and diminish their absorbance in the samples flash-frozen in the lab after cold storage (compare reference spectrum between Fig. 3C and D). This decrease in absorbance means that reserve carbohydrate catabolism continues at low temperatures (~ 4 °C) in Sargassum spp., but those conforming to the cell wall should conserve their integrity.

The IR peaks lower than 1000 cm−1 are difficult to assign to just one functional group. However, the PCA results (Fig. 3) revealed that those peaks have large magnitude values in the covariance of PC1 and PC2 loading plots. The one at 932 cm−1 that diminishes through time could be from the consumption of mannitol. The 872 cm−1 peak increased in group d01 (flash-frozen in the lab) should arise from the carbonate of Sargassum epiphytes (Salter et al. 2020) that do not decompose over time, thus increasing its absorbance on the d01 samples (Paraguay-Delgado et al. 2020). The 818, 717, and 614 cm−1 peaks have a higher absorbance in the d00 group and are difficult to assign to a functional group. A probable origin contributing to those peaks could be the C-H bond vibrational bending from aromatic rings from the phenolic compounds (Margoshes and Fassel 1955), crucial metabolites in algae physiology (Shen et al. 2021). These minor compounds in the brown algae chemical composition (Zhang and Thomsen 2019) can also contribute to absorbance in other spectral zones like the 1600–1585 cm−1 (-C = C- stretch) range.

To confirm the IR regions with statistical relevance between treatments, we performed a univariate analysis of variance on all wavenumber variables and calculated the VIP scores from the multivariate PLS-DA (Figure SI 5). The resulting p-values and VIP scores were plotted in Fig. 4 to show the IR bands associated with lab flash-freezing (degradation during cold transportation, d01) or field flash-freezing (fresh samples, d00).

Fig. 4

Spectra comparison for the flash-freeze treatments (field and lab) and their ANOVA p-values and VIP scores. The x-scale corresponds to the cm−1 values. The values in the y-scale were scaled for comparison purposes. The Q2_field and Q2_lab spectra correspond to the median absorbance of the respective groups. The p.value_aov spectrum corresponds to the negative logarithm of the p-value from the ANOVA test. Finally, the VarImp_PLSDA values come from the PLS-DA VIPs. The highlighted bands with higher z-scores in the spectra correspond to the IR bands with the most differences in their absorbance among flash-freeze groups. The 1590–1480 cm−1 band increases in the lab flash-freeze samples, while the contrary happens to the IR bands at 1101–1070, 1062–1050, 1025–1020, 930–910, and 625–600 cm−1

In Fig. 4, the bands with higher z-scores (> 2) in the spectra p.value_aov and VarImp_PLSDA correspond to the IR bands with the most differences in their absorbance among flash-freeze groups. Both tests presented similar z-score values for the same IR bands, suggesting that, for this dataset, a simple univariate analysis can perform similarly to a more complex multivariate analysis such as the PLS-DA classification model. The IR band at 1570–1475 cm−1 likely corresponds to carboxylate groups formed by oxidation, which increased in the lab flash-freeze samples, while the contrary happens to the IR bands at 1100–1020, 935–910, and 625–600 cm−1, whose absorbance diminished as reserve carbohydrates were depleted. The summary of the most relevant IR spectral zones to classify between flash-freeze treatments is presented in Table 3.
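The z-score criterion used to highlight bands can be illustrated with a short sketch. It assumes that the scores (negative log p-values or VIP scores) are standardized across all wavenumbers with the sample standard deviation; the function names are hypothetical, and the exact scaling used for Fig. 4 may differ.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values to zero mean and unit (sample) standard deviation."""
    center = mean(values)
    spread = stdev(values)
    if spread == 0:
        raise ValueError("All values are identical; z-scores are undefined.")
    return [(value - center) / spread for value in values]


def flag_wavenumbers(wavenumbers, scores, threshold=2.0):
    """Return the wavenumbers whose standardized score exceeds the threshold."""
    return [w for w, z in zip(wavenumbers, z_scores(scores)) if z > threshold]
```

Applying the same threshold to both the ANOVA-based and the PLS-DA-based scores gives two lists of flagged wavenumbers that can be compared directly, which is the comparison made in the text.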

Table 3 Summary of relevant IR spectral zones in the flash-freeze comparison

Comparison among morphotypes

Our results showed that most of the variation in the data is due to sample transportation. We implemented the supervised learning method from the PLS-DA model (Figures SI 6–11) to find the most relevant IR peaks that permit classification among the three morphotypes, S. fluitans III, S. natans I, and S. natans VIII, by evaluating a model for each of the two flash-freezing treatments (d00 and d01). It indicated that the IR peaks that permit classification among the species change between the flash-freeze treatments at the lab or field. The d00 sub-dataset shows a clear separation between S. fluitans and S. natans. Still, it fails to distinguish between the S. natans morphotypes according to the metrics in the confusion matrix (Figure SI 12). The IR regions important for the classification in the d00 sub-dataset are 1580–1560, 940–920, 895–885, and 750–735 cm−1. All these IR regions were peaks in the IR spectrum for S. fluitans III (see Fig. 1B) but did not appear in the S. natans samples. Furthermore, they even decreased on the S. fluitans III samples after cold transportation (d01).

For the species comparison in the d01 sub-dataset, the FTIR spectra of the S. natans VIII and S. fluitans III were more similar than those of the S. natans I, a result confirmed by the clustering analysis (see Figure SI 2). The IR spectra differences between the morphotypes of S. natans are the same IR bands that got the best VIP scores for the classification, which are 1415–1410, 985–945, and 915–890 cm−1, having higher absorbance in the S. natans VIII and S. fluitans III. The confusion matrix did not reveal any prediction error in the sub-dataset d01 as opposed to sub-dataset d00 (Figure SI 12). Therefore, the results in the flash-freeze comparison give us a hint as to how the IR spectra composition changes over time because the present data has shown its limits to classify among species or collection sites.

Open-air comparison

The IQR spectra of the open-air decomposition comparison (Fig. 5) showed high-variance absorbance peaks similar to those of the flash-freeze comparison (Fig. 1). The difference is that this comparison has more treatments (days left in the open air) and used only one species, S. natans VIII. The IQR spectra show some peaks whose absorbance continues to decrease or increase over the whole experiment, from the best-conserved group (least decomposition) to the group with the longest time in open-air conditions, and others that do not, since their change seems to be limited to a shorter period than the entire experiment. The latter include the peak at 614 cm−1, from the 7-day to the 28-day treatment, and the peaks at 595 and 664 cm−1, which appeared in the last two evaluations at 42 and 56 days.

Fig. 5

Spectral comparison with the interquartile range for the treatments in the open-air decomposition comparison. The IR absorbance spectra are normalized, baseline-corrected, and zoomed. Red lines correspond to Q1 and Q3, while the median is in black. Fig. 5A corresponds to the whole spectra, while Fig. 5B zooms in on the non-constant region of 2000–525 cm−1 and compares the time groups: d00 is flash-frozen in the field, d01 is flash-frozen in the lab, and d02 to d56 correspond to days in the open air (d02 is two days, d04 is four days, d07 is seven days, d10 is ten days, d14 is fourteen days, d21 is 21 days, d28 is 28 days, d42 is six weeks, and d56 is eight weeks). The most relevant IR peaks are labeled

The PCA for the open-air decomposition in Fig. 6A shows a loading plot for PC1, revealing that IR bands 1028–1020 and 830–810 cm−1 have higher absorbance in samples kept in better condition (Fig. 6C) over the ones with a prolonged time in the open air (Fig. 6D), while the IR bands at 1575–1475 and 1425–1375 cm−1 have a lower absorbance in the best-conserved samples. The PC3 (Fig. 6B) separates the samples over time, in which the IR bands like 1425–1375 cm−1 with a higher negative covariance value correspond to samples that lasted more time under the sun in the open air. Thus, it is related to decomposition by anaerobic fermentation, forming carboxylic acid compounds like succinate and lactate (Ibrahim et al. 2005; Gómez-Ordóñez and Rupérez 2011). Their formation was documented in an NMR study on Sargassum spp (Hernández-Bolio et al. 2021). The loading plot for PC3 shows that IR bands with a high positive covariance in 620–600 cm−1 have a higher absorbance in samples before being left to rot under UV sunlight radiation.

Fig. 6

The loading plots for PC1 and PC3 of the S. natans VIII FTIR-ATR data for the open-air decomposition experiment. The higher covariance values (y-scale) correspond to the signals driving the chemical difference among the samples, and the x-scale corresponds to the wavenumber in cm−1. The loading plot for PC1 (Fig. 6A) reveals that the IR bands at 1080–1020 and 830–810 cm−1 have higher absorbance in samples kept in better condition (Fig. 6C) than in those with a prolonged time (Fig. 6D) in the open air, while the IR bands at 1575–1475 and 1425–1375 cm−1 have a lower absorbance in the best-conserved samples. PC3 (Fig. 6B) separates the samples over time: the IR band at 1425–1375 cm−1 had the highest negative covariance value, corresponding to samples that spent more time under the sun in the open air. The loading plot for PC3 shows that the IR bands with higher positive covariance, 1180–1100 and 625–624 cm−1, have a higher absorbance in samples without UV sunlight radiation
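How a loading plot singles out the bands that change with decomposition can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the band indices and variable names are hypothetical, and the loadings are unit-length singular vectors rather than the covariance values plotted in Fig. 6.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 12 samples x 5 bands, ordered by days in the open air.
days = np.repeat([0.0, 7.0, 28.0, 56.0], 3)
spectra = 0.5 + 0.005 * rng.standard_normal((days.size, 5))
spectra[:, 0] -= 0.004 * days  # band that fades as the sample decomposes
spectra[:, 3] += 0.004 * days  # band that grows as the sample decomposes

# PCA through the singular value decomposition of the mean-centered matrix.
centered = spectra - spectra.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)

loadings = vt                      # row k: weight of each band on PC(k+1)
scores = u * s                     # sample coordinates on each component
explained = s**2 / np.sum(s**2)    # fraction of total variance per component

# Bands with the largest absolute PC1 loading drive the separation.
pc1 = loadings[0]
top_bands = np.argsort(np.abs(pc1))[::-1][:2]
```

The two bands with the largest absolute PC1 loadings are the ones built to change with time, and their loadings carry opposite signs, mirroring the contrast between fading and growing bands. The overall sign of a principal component is arbitrary, so only the relative signs are meaningful.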

The exploratory analysis (Figure SI 13) showed that the regions of the infrared spectra related to the data variance were similar in both comparison datasets. Whether the algae sample was stored in cold and darkness or left in an environment with heat and UV radiation, the IR pattern showed a similar decomposition. In summary, in the first stages of decomposition, the depletion of reserve carbohydrates (mainly mannitol) coincides with the disappearance of the IR peaks at 1084, 1023, 932, and 890 cm−1 (NIST Mass Spectrometry Data Center, William E. Wallace 2023). The IR peaks whose absorbance increases over time, 1513 cm−1 and 1414 cm−1, are stronger in samples with more time under open-air conditions due to the accumulation of oxidized compounds possessing a carboxyl moiety in their structure (Ibrahim et al. 2005; Fajardo et al. 2012). These compounds are generated in metabolic and non-enzymatic processes that oxidize hydroxyl and aldehyde functional groups into carboxylic acids (Su et al. 2022).

We also fitted multivariate regression models to confirm, with other statistical methods, the IR peaks that best explain the degradation phenomenon in this dataset. According to the variance analysis in Fig. 7, the peaks with the most relevant differences between the time treatments were those at 1513, 932, 890, and 818 cm−1. These peaks agree with the other results presented here for the flash-freeze comparison (Fig. 7).

Fig. 7

Spectra for the open-air comparison groups d00 and d56, with their ANOVA p-values and VIP scores. The x-scale corresponds to the cm−1 values. The values in the y-scale were scaled for comparison purposes. The Q2_d00 and Q2_d56 spectra correspond to the median absorbance of the respective groups. The highlighted bands with higher z-scores in the plot correspond to the IR bands with the largest differences in absorbance among the open-air comparison groups. The p-value_aov spectrum corresponds to the negative logarithm of the p-value from the ANOVA test. The 1550–1500 cm−1 band has a higher absorbance in the samples that stayed longer rotting in the open air, while the contrary happens for the IR bands at 935–925, 900–885, and 820–800 cm−1. Finally, the VarImp_PLSR values from the PLS-R VIP scores show that the IR bands at 1776–1759, 766–727, 670–651, and 597–590 cm−1 had the highest contribution to the regression model
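The band-by-band variance analysis behind a curve such as p-value_aov can be sketched as a one-way ANOVA applied to each wavenumber. The Python example below assumes synthetic data and a hypothetical helper, `anova_f`, and it stops at the F statistic. A p-value and its negative logarithm would follow from the F distribution with (k − 1, n − k) degrees of freedom, for example through `scipy.stats.f.sf`, which is not used here to keep the sketch dependent on NumPy only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy data: 3 time groups, each 4 replicates x 4 bands.
groups = [0.5 + 0.01 * rng.standard_normal((4, 4)) for _ in range(3)]
groups[1][:, 2] += 0.1  # band 2 changes between time groups
groups[2][:, 2] += 0.2


def anova_f(groups):
    """One-way ANOVA F statistic for every band (column) at once."""
    all_rows = np.vstack(groups)
    grand_mean = all_rows.mean(axis=0)
    k, n = len(groups), all_rows.shape[0]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        g.shape[0] * (g.mean(axis=0) - grand_mean) ** 2 for g in groups
    )
    # Variation of the replicates around their own group mean.
    ss_within = sum(((g - g.mean(axis=0)) ** 2).sum(axis=0) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


f_values = anova_f(groups)
most_different_band = int(np.argmax(f_values))
```

A large F value at a band means its absorbance differs between time groups by much more than the replicate scatter, which is what a tall peak in a negative-log p-value spectrum expresses.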

In the most advanced stages of decomposition (42 days and 56 days), the calcium carbonate from the calcareous epiphytes increased its IR signal at 875 cm−1, indicating that the sample presents an advanced decomposition of the organic matter.

Multivariate regression models are a powerful tool for FTIR-ATR data and are widely used to estimate the concentration of proteins or fats in algae or other foods (Lohumi et al. 2015; Måge et al. 2021; Cebi et al. 2023). Thus, we decided to evaluate whether they could find a linear correlation between IR peak absorbance and decomposition time. The VIP scores of the PLS-R and LASSO regression models (see Figures SI 14 and 15) reveal that the IR bands at 3767–3750, 3708–3707, 1776–1759, 886–879, 766–727, 670–651, 597–590, and 537–536 cm−1 explain the linear regression of Sargassum decomposition from time zero to 56 days left to rot under open-air conditions.
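To make the idea of a VIP score concrete, the following Python sketch fits a single-response PLS regression with the NIPALS algorithm and computes variable importance in projection for each band. It is an illustration on synthetic data, not the implementation used in this study. The function `pls1_vip`, the number of components, and the toy spectra are assumptions.

```python
import numpy as np


def pls1_vip(x, y, n_components=2):
    """Fit PLS1 with NIPALS and return one VIP score per column of x."""
    n_bands = x.shape[1]
    x = x - x.mean(axis=0)  # center predictors (copy, input is untouched)
    y = y - y.mean()        # center response
    weights, explained = [], []
    for _ in range(n_components):
        w = x.T @ y
        w /= np.linalg.norm(w)        # unit-length weight vector
        t = x @ w                     # sample scores
        tt = t @ t
        p = x.T @ t / tt              # predictor loadings
        q = (y @ t) / tt              # response loading
        x = x - np.outer(t, p)        # deflate predictors
        y = y - q * t                 # deflate response
        weights.append(w)
        explained.append(q**2 * tt)   # response variation explained
    w_mat = np.column_stack(weights)
    ss = np.array(explained)
    return np.sqrt(n_bands * (w_mat**2 @ ss) / ss.sum())


rng = np.random.default_rng(2)

# Hypothetical toy data: 11 time points x 6 bands; only band 4 tracks time.
days = np.array([0, 1, 2, 4, 7, 10, 14, 21, 28, 42, 56], dtype=float)
spectra = 0.5 + 0.01 * rng.standard_normal((days.size, 6))
spectra[:, 4] += 0.005 * days

vip = pls1_vip(spectra, days)
```

By construction, the squared VIP scores average to one, so bands with a score above one contribute more than average to the regression. Here the band that grows linearly with time receives the highest score.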

The IR bands selected by the regression models differ from those of the other univariate and multivariate results. This happens because the compound producing an IR band may degrade linearly over only part of the time evaluated in this experiment (56 days). For example, mannitol has IR peaks in the 1100–1000 cm−1 zone but decomposes faster than macromolecules like alginate or fucoidan. The IR peaks at 664 and 595 cm−1 had good VIP scores in the regression models. Their absorbance increased over time, becoming visible after prolonged sunlight UV damage, which promotes non-enzymatic oxidative coupling of phenolic compounds and the characteristic sample browning (Wasikiewicz et al. 2005; Su et al. 2022). We called them advanced decomposition IR bands in Table SI 2.

Discussion

Multivariate analysis of the IR spectra revealed that the chemical composition of Sargassum changes rapidly due to the consumption of mannitol, the main reserve metabolite. Low light stress, water deprivation, and cold stress stop photosynthate biosynthesis and activate anaerobic metabolism, which increases the carboxylic acids (detected as their conjugate bases) in Sargassum. Mannitol consumption produces an absorbance decrease in the 1080 and 1020 cm−1 peaks and an increase in the bands due to the carboxylate (RCOO−) functional groups (1575–1475 and 1425–1375 cm−1) of short-chain carboxylic acids. Anaerobic metabolism forms end products such as formate, acetate, lactate, and succinate (Sterry et al. 1985; Nedergaard et al. 2002).

The analysis of the morphotypes revealed that IR peaks between 985 and 885 cm−1 are important for classifying the morphotypes, although the IR peaks change between the laboratory and field flash-frozen samples. Nevertheless, our samples of S. fluitans had high-absorbance IR peaks (such as the mannitol IR peaks) that also corresponded to high-absorbance IR peaks for the fresh samples in the flash-freeze experiment. S. natans I was the morphotype whose chemical composition changed the most during transport from the coast to the laboratory. This could indicate that the other morphotypes tolerate low light and cold stress better.

The open-air experiment aimed to simulate what happens in the field when the Sargassum biomass washes ashore. The upper part of the sample had greater exposure to air and solar radiation. Thus, it dried and was quickly oxidized by the UV radiation. In contrast, the lower part of the tube had a humid environment in which anaerobic bacterial metabolism predominated. Therefore, the open-air exposure generated a greater change than the low-temperature sample transportation, with a pronounced decrease in the carbohydrate peaks and a greater increase in anaerobic metabolism end products. Furthermore, after the depletion of mannitol, other Sargassum macromolecules, like proteins and cell wall polysaccharides (alginate and fucoidan), are decomposed by bacteria, producing short-chain carboxylic acids from anaerobic metabolism (Zhang et al. 2021).

Under these stress conditions, brown algae, like Sargassum species, produce a family of phenolic compounds, the phlorotannins, which are oligomers and polymers of phloroglucinol (Emeline et al. 2021; Catarino et al. 2022). These metabolites protect against oxidative stress and strengthen the cell wall. Thus, the Sargassum algae deplete the monomer source under abiotic stress and convert it into polymeric forms. Also, they pigment the Sargassum, giving it a dark color after exposure.

In a previous FTIR analysis of S. fluitans and S. natans, the authors assigned peaks in the fingerprint area to these phlorotannins but named them lignin-like due to their presence in the cell wall (Alzate-Gaviria et al. 2021). However, there is no confirmation of lignin or its biosynthetic genes in brown algae species (Ragan 1984; Xue et al. 2022). Furthermore, the lack of lignin facilitates obtaining good-quality cellulose from brown algae (Bogolitsyn et al. 2020). These phlorotannins, plus some inorganic matter such as calcium carbonate, could contribute to the IR peaks in the fingerprint area of the late-stage samples of the open-air experiment.

Conclusion

This study provides a better understanding of the importance of preserving collected samples and of how the decomposition dynamics of Sargassum species affect key compounds like alginate or mannitol. The chemometric FTIR-ATR study was essential to discover which spectral features in the Sargassum spectra correspond to damage caused by cold transport or prolonged exposure to sunlight. These results are promising for further investigations into mannitol or alginate extraction, into the use of the biomass for other purposes (construction and biomethane, for example), and into how its FTIR-ATR spectral information changes, in order to understand the quality requirements for using the available Sargassum biomass.

With the recent advances in FTIR-ATR technology, quality control of the Sargassum biomass could become fast and easy, so that the reliability of the biomass could be assessed on the shore simply by checking it with affordable hand-held equipment. Thus, there would be no need to transport Sargassum biomass to the lab to assess its quality. This information might contribute to better management strategies for the improved subsequent use of products derived from these species of Sargassum from the southern Mexican Caribbean.