Introduction

In recent years, the adoption of multiomics approaches in biomedical research and clinical application has increased significantly (Hasin et al. 2017; Hoadley et al. 2018). The integration of multiomics or molecular phenomics data (including genomics, epigenomics, transcriptomics, proteomics, and metabolomics) with deep phenotypic data enables the discovery of correlations between diverse levels of genetic and regulatory information and distinct phenotypic traits, fostering a more comprehensive understanding of biological processes and facilitating the identification of disease mechanisms, potential therapeutic targets, and disease biomarkers (Jiang et al. 2023b). In 2015, NIST released its first human genome DNA reference material, RM 8398, derived from HG001/NA12878, a healthy female of European ancestry. To improve the representation of human genetic diversity, NIST further developed DNA reference materials from different ethnic populations, including an Ashkenazi Jewish family trio (RM 8392) and a Han Chinese son (RM 8393) (Zook et al. 2018). The Quartet Project, led by Fudan University in close collaboration with the National Institute of Metrology of China and other organizations, established four immortalized lymphoblastoid cell lines from a Chinese Quartet family comprising a father, a mother, and two monozygotic daughters (Ren et al. 2019). The Genetic Testing Reference Materials Coordination Program (GeT-RM) has characterized DNA RMs for a wide range of genetic disorders, such as cystic fibrosis (Pratt et al. 2009), Duchenne and Becker muscular dystrophy (Kalman et al. 2011), fragile X syndrome (Amos Wilson et al. 2008), and Huntington disease (Kalman et al. 2007), as well as 11 human leukocyte antigen loci (Bettinotti et al. 2018) and pharmacogenetic loci (Gaedigk et al. 2019; Pratt et al. 2016).
These reference materials represent specific mutations associated with diseases and are available for research, clinical test development, quality assurance and control, and proficiency testing to ensure the accuracy of clinical testing.

Somatic variants are genetic mutations that occur in non-germline cells. They are typically detected in tumors from sequencing datasets of paired tumor and normal samples, with normal samples used to remove germline variants. Accurate and reliable detection of somatic variants is crucial for gaining insights into cancer biology, guiding targeted therapies and improving patient outcomes in cancer treatment. DNA RMs used to benchmark somatic variants usually consist of matched tumor and normal genomes.

The MicroArray and Sequencing Quality Control (MAQC-IV/SEQC2) consortium recently completed its fourth project, which aimed to develop standard analysis protocols and quality control metrics for the use of high-throughput DNA sequencing data in regulatory science research and precision medicine (MAQC Consortium 2021). The Somatic Mutation Working Group (WG1) of SEQC2 established paired tumor-normal DNA RMs and corresponding whole-genome reference datasets for small variants and structural variants (Fang et al. 2016), creating DNA RMs from a metastatic melanoma cell line (COLO829) and its paired B-lymphoblastoid normal cell line (COLO829BL).

While whole-genome sequencing (WGS) and whole-exome sequencing (WES) provide a more comprehensive view of the genome, targeted sequencing, also known as oncopanel sequencing, offers a more cost-effective and efficient approach by focusing on a limited number of cancer hotspot variants, and can detect variants with a variant allele frequency (VAF) as low as 0.5%. The Oncopanel Sequencing Working Group (WG2) of SEQC2 established two DNA RMs for oncopanel benchmarking (Jones et al. 2006, 2014), covering as many clinically relevant variants as possible to increase variant density in coding regions. Sample B is derived from a non-cancer male cell line (Agilent OneSeq Human Reference DNA, PN 5190-8848). To emulate the range of VAFs typically encountered in targeted sequencing and ctDNA sequencing, tumor Sample A was diluted with normal Sample B at different ratios to create a series of tumor DNA reference materials with even lower VAFs. The SEQC2 WG2 employed these DNA RMs to conduct cross-platform, multi-laboratory evaluations of commercially available oncopanels, and developed actionable guidelines to improve the performance and consistency of oncopanel sequencing across laboratories and platforms (Deveson et al. 2023).
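The dilution arithmetic behind these low-VAF reference materials can be sketched under two simplifying assumptions: the variant is absent from the normal genome, and tumor and normal DNA contribute alleles in proportion to the mixing ratio (copy-number effects are ignored).

```python
def expected_vaf(tumor_vaf, tumor_fraction):
    """Expected variant allele frequency after diluting tumor DNA with
    normal DNA. Assumes the variant is absent from the normal genome and
    that both samples contribute alleles in proportion to the mix ratio
    (copy-number effects are ignored in this sketch)."""
    return tumor_vaf * tumor_fraction

# e.g. a heterozygous variant (VAF 0.5) in a 1% tumor / 99% normal mix
low_vaf = expected_vaf(0.50, 0.01)  # ~0.005, i.e. 0.5%
```

This illustrates why serial dilution of a tumor RM against its matched normal yields a ladder of progressively lower VAFs suitable for probing an oncopanel's detection limit.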

RNA-seq can be used to sequence long RNAs, such as messenger RNAs, as well as short RNAs, such as microRNAs (miRNAs). The Extracellular RNA Communication Consortium led a benchmark study of miRNA quantification across multiple protocols and laboratories using small RNA-seq (Giraldez et al. 2018). They used diverse combinations of synthetic RNAs to evaluate sequence-specific biases and accuracy. An equimolar pool consisting of over 1000 chemically synthesized RNA oligonucleotides (15–90 nt) mixed at equal concentrations was used to assess the reproducibility of absolute RNA abundance at the counts per million (CPM) level. Two synthetic small RNA pools in which RNAs varied in defined relative amounts were used to assess concordance in relative quantification. Synthetic pools containing unedited and edited miRNA variants at different ratios were used to determine the accuracy of quantifying miRNA editing.
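The CPM normalization used to compare abundances of the equimolar pool across libraries can be sketched as follows; the counts shown are illustrative, not taken from the study.

```python
import numpy as np

def counts_per_million(raw_counts):
    """Scale raw read counts to counts per million (CPM), making
    abundances comparable across libraries of different sequencing depth."""
    raw_counts = np.asarray(raw_counts, dtype=float)
    return raw_counts / raw_counts.sum() * 1e6

# Illustrative counts for five synthetic oligonucleotides in one library
cpm = counts_per_million([120, 30, 850, 0, 4000])
```

For a truly equimolar pool measured without bias, every oligonucleotide would have roughly the same CPM, so the spread of observed CPM values directly reflects sequence-specific bias.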

Protein Reference Materials

The proteome refers to the entire set of proteins expressed by a cell, tissue or organism at a particular time. Proteomics is the systematic, high-throughput study of the composition, functions, and interactions of all proteins. In proteomics research, where the sheer multitude of proteins presents a formidable challenge, mass spectrometry (MS) is commonly used for both qualitative and quantitative protein analysis. This process involves comparing detected peptide maps with protein sequences sourced from databases. However, the complexity of MS-based proteomics experiments and their potential for considerable variability can hinder the achievement of accurate and reproducible results. To enhance the reliability and reproducibility of proteomics, numerous initiatives have been actively working for decades to establish community standards and guidelines. These efforts aim to ensure consistency, promote rigorous experimental practices, and facilitate the generation of reliable and comparable proteomic data across different laboratories and studies. For example, the Proteomics Standards Initiative (PSI) of the Human Proteome Organization (HUPO) standardized practices and guidelines for data reporting formats (Deutsch et al. 2017), a data quality control framework (Bittremieux et al. 2017), and data interpretation (Omenn 2021). The Clinical Proteomic Tumor Analysis Consortium (CPTAC), launched by the US National Cancer Institute (NCI), intends to improve MS-based proteomics measurement quality for biomarker discovery in cancer research (Tabb et al. 2016; Zhou et al. 2017). The Proteomics Standards Research Group (sPRG) of the Association of Biomolecular Resource Facilities (ABRF) develops and implements standards to reflect the accuracy and consistency of proteomics (Tabb et al. 2010).

The limitations of traditional methods have driven the development of new technologies tailored for highly sensitive protein biomarker discovery while demanding minimal quantities of biological materials (Eldjarn et al. 2023; Sun et al. 2023). Examples of this innovation include Olink's Proximity Extension Assay (PEA) (Petrera et al. 2021; Wik et al. 2021), SomaLogic's SomaScan Assay (Candia et al. 2017, 2022), and Seer's Proteograph (Blume et al. 2023).

Synthetic Protein Reference Materials

Synthetic protein reference materials have been extensively used by large consortia in benchmark studies of proteomic measurements to determine experimental and analytical variation (Paulovich et al. 2010; Tabb et al. 2010). Notable examples of standard protein mixtures include: the Universal Proteomics Standards (UPS1 and UPS2), a mixture of 48 human recombinant proteins jointly developed by ABRF's sPRG and Sigma-Aldrich (Andrews et al. 2006); the HUPO Gold MS Protein Standard, a mixture of 20 human proteins developed jointly by HUPO and Invitrogen (Bell et al. 2009); and a mixture of 20 purified human proteins (NCI-20), produced by NIST and employed by CPTAC in intra- and inter-laboratory studies aimed at evaluating the repeatability and comparability of qualitative proteomics (Tabb et al. 2010; Wang et al. 2014).

Chemically synthesized or modified peptide mixtures are also utilized as RMs. In comparison to protein mixtures, peptide mixtures have a simpler composition. However, they cannot fully capture the variability introduced during enzymatic digestion, as different laboratories may employ diverse proteolytic enzymes, chemicals, and digestion conditions. Several synthetic peptide reference materials are commercially available, such as a mixture of 1000 heavy-labeled proteotypic peptides for proteins conserved across three species (human, mouse, and rat), established by ABRF and JPT Peptide Technologies (2023). Synthetic peptides are especially important for evaluating the performance of targeted quantitative proteomic measurements, such as multiple reaction monitoring (MRM) and parallel reaction monitoring (PRM). They are often used to predict retention times (RTs) for large-scale scheduled liquid chromatography multiple reaction monitoring (LC-MRM) measurements with a single calibration run before the analytical runs. Biognosys has developed a mixture of 11 artificial synthetic peptides (iRT) to determine peptide RT values and calibrate chromatographic systems for increased throughput (Escher et al. 2012). Additionally, well-defined synthetic protein reference materials can be spiked into biological protein reference materials or test samples to provide additional information on qualitative accuracy. An important consideration when spiking synthetic peptides into other samples is that these peptides should not overlap with the original sample content.
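The iRT calibration idea reduces to a simple linear fit: one calibration run maps the dimensionless iRT values of the standard peptides to observed retention times, and the fitted mapping then predicts RTs for scheduling acquisition windows. The iRT values and retention times below are fabricated for illustration only.

```python
import numpy as np

# Hypothetical calibration run: dimensionless iRT values of standard
# peptides and their observed retention times (minutes) on this system.
irt_values  = np.array([-24.9, 0.0, 12.4, 19.8, 28.7, 42.3, 54.6, 70.5])
observed_rt = np.array([  5.1, 12.2, 15.8, 17.9, 20.5, 24.4, 27.9, 32.5])

# Fit the linear mapping RT = a * iRT + b from the calibration run ...
a, b = np.polyfit(irt_values, observed_rt, 1)

def predict_rt(irt):
    """Predict the retention time of a target peptide from its iRT value,
    e.g. to schedule a narrow MRM acquisition window around it."""
    return a * irt + b
```

Because the mapping is re-fitted from a single short calibration run on each system, the same iRT values transfer across chromatographic setups with different gradients.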

Metabolite Reference Materials

Metabolomics encompasses the extensive investigation of small molecules, known as metabolites, within cells, biological fluids, tissues, or organisms. It integrates the influences of factors from genomics, transcriptomics, proteomics, as well as environmental elements like diet and lifestyle. Since metabolites serve as indicators of the downstream effects of these factors on cellular functions, they closely represent the actual phenotypes of cells, tissues, or organisms, offering novel insights into metabolism and its regulation in physiological and pathological processes, including health, aging, and diseases. Metabolomics involves the simultaneous identification and quantification of various small molecule types, including amino acids, fatty acids, carbohydrates, and other products of cellular metabolic functions. In comparison to genomics, transcriptomics, and proteomics, the reliable identification and quantification of the metabolome are significantly more complex due to the chemical complexity and the presence of isomers—compounds with the same molecular formula but different structural arrangements—introducing challenges for precise identification and quantification.

To promote the advancement of metabolomics toward higher quality, several large research consortia have emerged in the field, aiming to enhance the reproducibility of metabolomics research results through comprehensive quality assurance and quality control measures. These consortia have undertaken various efforts, including the establishment of best practices, promotion of communication and education, and the advancement of the field toward higher-quality standards. The Metabolomics Quality Assurance and Quality Control Consortium (mQACC), consisting of experts in quality assurance and quality control, is focused on developing universal best practices and reporting standards to ensure the robustness and reproducibility of untargeted metabolomics research (Beger et al. 2019; Evans et al. 2020). The Metabolomics Society Data Quality Task Group (DQTG) aims to enhance the robustness of quality assurance and quality control in the metabolomics community through communication, advocacy, education, and the promotion of best practices (Kirwan et al. 2022). The Standard Metabolic Reporting Structures (SMRS) group is dedicated to standardizing metabolomics analysis and provides comprehensive reports and summaries on relevant key issues (Beckonert et al. 2007; Lindon et al. 2005). The ABRF Metabolomics Research Group aims to study the reproducibility of metabolomics research and propose best data analysis strategies by comparing analysis groups using the same dataset (Turck et al. 2020). Additionally, the ABRF plays a role in improving the core competencies of biotechnology laboratories through research, communication, and education (Cheema et al. 2015; Turck et al. 2020). The Metabolomics Consortium has proposed guidelines for achieving high-quality reporting of LC–MS-derived metabolomics data, including the identification and prioritization of test materials, assessment of useful indicators of data quality, and descriptions of common practices and variations in quality assurance and quality control workflows (Broadhurst et al. 2018).

Quality control samples can be categorized into three primary types based on their intended purposes. System suitability test samples serve as a quality assurance measure applied before data acquisition to instill confidence in the eventual high-quality results (Broadhurst et al. 2018; Kirwan et al. 2022). These samples typically consist of solutions containing a small number of authentic chemical standards, usually five to ten analytes, with known concentrations. They play a critical role in instrument calibration and in the assessment of critical system parameters, including the mass-to-charge (m/z) ratio and chromatographic characteristics such as retention time, peak area, and peak shape.
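A minimal system suitability check might compare the observed m/z and retention time of each standard against expected values within lab-chosen tolerances. All analytes, expected values, and tolerances below are hypothetical; real acceptance criteria depend on the instrument and method.

```python
# Hypothetical expected values: analyte -> (m/z, retention time in min)
EXPECTED = {
    "caffeine":  (195.0877, 3.2),
    "reserpine": (609.2812, 8.7),
}
MZ_TOL_PPM = 5.0   # assumed mass accuracy tolerance (parts per million)
RT_TOL_MIN = 0.25  # assumed retention time drift tolerance (minutes)

def suitability_check(analyte, observed_mz, observed_rt):
    """Return True if the observed m/z and retention time of a standard
    fall within tolerance of their expected values."""
    exp_mz, exp_rt = EXPECTED[analyte]
    ppm_error = abs(observed_mz - exp_mz) / exp_mz * 1e6
    return ppm_error <= MZ_TOL_PPM and abs(observed_rt - exp_rt) <= RT_TOL_MIN
```

Running such a check on every standard before a batch provides an objective gate: if any standard fails, the instrument is recalibrated before precious study samples are injected.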

Blank quality control samples and matrix-matched quality control samples are essential components of quality control measures to ensure that the quality management process is fulfilled. Blank quality control samples consist of samples devoid of metabolites, serving to identify potential sample contamination or instrument-related background signals, thereby eliminating interference from external contaminants or the instrument itself (Kirwan et al. 2022). By comparing data from the actual samples to that from the blank samples, researchers can distinguish genuine metabolite signals from potential interferences or background noise. Within the category of matrix-matched quality control samples, the most commonly used are pooled samples. These samples are created by pooling a small amount of each analyzed biological sample within a study, representing both the sample matrix and metabolite composition. Pooled QC samples play a multifaceted role, conditioning the analytical platform, enabling intra-study reproducibility measurements, and mathematically correcting for systematic changes in parameter values (Broadhurst et al. 2018). A specific type of pooled QC sample, termed long-term reference (LTR) QC samples, can be used to assess data quality across different studies within the same laboratory (Broadhurst et al. 2018). These samples are obtained either through the commercial purchase of the required sample types or by collecting representative samples from various studies within the laboratory. In this review, we focus on the use of external RMs, created and distributed by certified providers, for assessing performance across different laboratories.
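One common way blank QC data are applied is a fold-ratio filter: a feature is retained only if its signal in the study sample sufficiently exceeds its signal in the blank. A sketch follows, with the threshold shown as an assumed default rather than a universal standard.

```python
def passes_blank_filter(sample_intensity, blank_intensity, min_ratio=3.0):
    """Keep a feature only if its signal in the study sample exceeds the
    signal in the blank QC by at least min_ratio (a lab-chosen threshold;
    3x is an assumed, commonly cited default, not a universal rule)."""
    if blank_intensity == 0:
        return sample_intensity > 0
    return sample_intensity / blank_intensity >= min_ratio
```

Features failing the filter are treated as contamination or instrument background and excluded before downstream statistics.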

Biological Metabolite Reference Materials

SRM 1950, released by NIST, is one of the first metabolite reference materials developed; it is intended for quality control in the identification and quantification of metabolites in human plasma, such as fatty acids, electrolytes, vitamins, hormones, and amino acids (Phinney et al. 2013). It is a mixture of human plasma samples from 100 individuals reflecting the racial distribution of the US population at the time of production (77% white, 12% African-American or black, 2% American Indian or Alaska Native, 4% Asian, 5% other, with about 15% of Hispanic origin). A total of 90 metabolites are assigned high-confidence absolute concentration values obtained by integrating several different analytical methods. SRM 1950 was initially designed for targeted metabolomics, and has been extensively used to benchmark platforms, protocols and workflows (McGaw et al. 2010; Misra and Olivier 2020; Siskos et al. 2017; Thompson et al. 2019). Recently, it has also been used in benchmark studies of untargeted metabolomics and lipidomics (Azab et al. 2019; Bowden et al. 2017; Cajka et al. 2017). NIST also released other standalone natural-matrix reference materials for organic contaminants from an assortment of biological materials, including frozen non-fortified human milk (SRM 1953), fortified human milk (SRM 1954), non-fortified human serum (SRM 1957), fortified human serum (SRM 1958) (Schantz et al. 2013), lyophilized human serum (SRM 909b and SRM 909c) (Aristizabal-Henao et al. 2021), smokers' human urine (SRM 3672), and non-smokers' urine (SRM 3673).

Like other quantitative omics, such as transcriptomics and proteomics, identifying differentially expressed metabolites between sample groups is one of the main purposes of metabolomics-based biomarker research. RMs consisting of two or more sample groups can be used to assess the performance of distinguishing sample groups. The NIST Metabolomics Quality Assurance and Quality Control Materials (MetQual) Program released a suite of pooled plasma materials (RM 8231) comprising four different metabolic health states, including type 2 diabetes plasma, hypertriglyceridemia plasma, normal African-American plasma and normal human plasma (SRM 1950) (Met Qual Program Coordinators 2023). The MetQual Program is planning to conduct an inter-laboratory study to obtain consensus characterization of RM 8231 and assess measurement variability within the metabolomics community. NIST also developed several multi-sample metabolite reference materials from other biological resources. RM 971a consists of two serum mixtures: one from a pool of healthy, premenopausal adult females, and the other from a pool of healthy adult males. It is intended to evaluate the accuracy of identifying and quantifying hormones in human serum (Aristizabal-Henao et al. 2021). SRM 1949 Frozen Human Prenatal Serum is a four-level material pooled from non-pregnant women and women during each trimester of pregnancy, aimed at quality control for the measurement of hormones and nutritional elements throughout pregnancy (Boggs et al. 2021; Sempos et al. 2022). A suite of human urine reference materials (RM 8232) is under development. The suite will consist of four pooled urine samples from female non-smokers, female smokers, male non-smokers and male smokers. Relative metabolite fold changes, percent differences for the top 20 metabolites, and the identified top 30 abundant metabolites of the urine samples will be characterized by both LC–MS and nuclear magnetic resonance. RM 8462 Frozen Human Liver Suite, mentioned in the protein reference materials section, can also be used for metabolomics (Lippa et al. 2022).

The Quartet Project also developed a multi-sample metabolite RM suite by extracting metabolites from the four immortalized lymphoblastoid cell lines. To assess the performance of detecting biological differences between sample groups, reference datasets for fold changes of absolute abundance values between sample groups were constructed by consensus across platforms, laboratories, and replicates. The performance of quantitative metabolomics can be assessed not only by the consistency between fold changes of differentially expressed metabolites in query datasets and the reference datasets, but also by the signal-to-noise ratio (SNR), which measures the ability to discriminate the intrinsic biological differences between the four sample groups.

Synthetic Metabolite Reference Materials

Synthetic metabolite reference materials are artificial substances that have identical chemical properties to naturally occurring metabolites in biological systems. They play an important role as calibration standards for analytical methods to allow accurate identification and quantification of metabolites. Synthetic metabolite RMs contain known concentrations of chemical components, which can be run separately or used as internal standards to perform system suitability tests, calibration, and metabolite quantification. These RMs can be prepared in individual laboratories to fit specific purposes for each study or can be purchased from vendors. They can be produced using chemical synthesis or enzymatic reactions, and they can be used for a range of applications, including targeted and untargeted metabolomics, and in the development and validation of new analytical methods. Synthetic metabolite RMs can also be used to assess the accuracy and precision of different analytical platforms and to facilitate inter-laboratory comparisons.

One example of a synthetic metabolite RM is the deuterated internal standards that are frequently used in MS-based metabolomics. These internal standards are made by incorporating deuterium into the metabolite of interest, allowing for accurate quantification of the metabolite in biological samples. Commercially available synthetic metabolite reference materials are typically mixtures of isotopically labeled or U-13C labeled metabolites that span a broad range of molecular weights, possess varied ionization propensities, and cover a distribution in class and retention time. Examples of commercially available synthetic metabolite reference materials include the QReSS kit from Cambridge Isotope Laboratories (CIL) (Cambridge Isotope Laboratories, Inc. 2023), the IROA-Long-Term Reference Standard (IROA-LTRS) from IROA Technologies (Evans et al. 2020), the Lipidyzer Platform kits from SCIEX (Lippa et al. 2022), and quantitative metabolic profiling kits from Biocrates (Biocrates 2023).
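In the simplest single-point case, quantification against a labeled internal standard reduces to a peak-area ratio. The sketch below assumes the standard co-elutes with the analyte and has a known response factor; real workflows typically use multi-point calibration curves rather than a single ratio.

```python
def quantify_by_internal_standard(analyte_peak_area, is_peak_area,
                                  is_concentration, response_factor=1.0):
    """Single-point isotope-dilution quantification: estimate the analyte
    concentration from the ratio of its peak area to that of a co-eluting
    labeled internal standard of known concentration. response_factor
    corrects for any difference in ionization response; 1.0 assumes an
    identical response, which is the usual approximation for
    heavy-isotope-labeled analogs."""
    return analyte_peak_area / is_peak_area * is_concentration / response_factor
```

Because the analyte and its deuterated or U-13C analog experience the same extraction losses and matrix effects, the area ratio is far more robust than the raw analyte signal alone.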

Multiomics Reference Materials

Multiomics integrates diverse omics data to better cluster and classify sample (sub)groups, and to more comprehensively understand the mechanisms underlying biological processes by investigating molecular interactions across omics layers (Karczewski and Snyder 2018; Price et al. 2017; Schussler-Fiorenza Rose et al. 2019). Multiomics analysis inherits challenges from the single omics datasets and confronts new challenges in data harmonization and integration across different omics layers with varying numbers of features and statistical properties (Athieniti and Spyrou 2023; Sonia Tarazona 2021). Multiomics RMs derived from the same source that incorporate multiple omics types and provide unbiased ground truth serve as crucial tools for assessing the performance of methods for normalizing and integrating multiomics datasets, conducting cross-omics validation, and imputing missing data (Krassowski et al. 2020; Zheng et al. 2021). Quality control metrics computed at each step of data generation reflect different aspects of performance: for example, the uniformity of the MS1 intensity distribution reflects the consistency of chromatographic spray and mass spectrometry sensitivity, while the uniformity of the MS2 intensity distribution reflects the consistency of fragment ion detection sensitivity. There is no universally accepted standard for pre-analytical performance metrics; appropriate thresholds depend on the specific library preparation protocols, sequencers or instruments, and algorithms used. While these metrics can be calculated for samples of interest, the use of widely adopted reference materials facilitates a better understanding of performance across different assays and laboratories.

Fig. 3

Schematic overview of multiomics profiling and quality control workflows showing the use of reference materials. Omics reference materials can evaluate all steps of sequencing workflow, including sample or library preparation, sequencing, raw reads, and profiling. Illustrated is the workflow of sequencing and key performance metrics for each step

In cases where reference datasets are either unavailable or do not contain the features of interest, alternative methods can be employed to assess the performance. One such method involves evaluating the reproducibility of replicates, which compares the results of multiple measurements conducted on the same sample. Another approach is to utilize built-in truth from multi-sample reference materials, whereby a known standard is employed to evaluate the accuracy of the experiment.

The performance of variant calling can be assessed by the repeatability and reproducibility of technical replicates or by the Mendelian consistency ratio among family members. Technical replicates share the same variant calls, and de novo mutations are rare; therefore, the majority of discordant variants are likely to represent genotyping errors (Veltman and Brunner 2012). The advantage of these reference dataset-independent metrics is that they can evaluate the precision of variant calling across the whole genome without being restricted to benchmark regions. However, they cannot indicate how many true variants should be identified, or what the recall rate is.
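A minimal sketch of the Mendelian consistency check for trio genotypes follows, using VCF-style "0/1" genotype strings; phasing, multi-allelic sites, and de novo mutations are ignored for simplicity.

```python
from itertools import product

def mendelian_consistent(father, mother, child):
    """True if the child's diploid genotype (e.g. '0/1') can be formed by
    inheriting one allele from each parent; de novo mutations ignored."""
    f, m, c = (g.split("/") for g in (father, mother, child))
    return any(sorted([a, b]) == sorted(c) for a, b in product(f, m))

def mendelian_consistency_ratio(trio_calls):
    """Fraction of sites whose trio genotypes are Mendelian consistent;
    discordant sites likely reflect genotyping errors."""
    return sum(mendelian_consistent(*t) for t in trio_calls) / len(trio_calls)
```

Applied genome-wide to a family-based RM such as the Quartet, this ratio flags error-prone regions even where no benchmark call set exists.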

To assess the accuracy of quantitative omics, three levels of reference dataset-independent metrics can be employed based on the number of available reference materials (Fig. 2). If a single reference material is available, the reproducibility between technical replicates is used to assess the performance of profiling results. However, a high correlation between two replicates of the same sample is not sufficient to ensure accuracy in detecting differences between sample groups, because the replicates may share the same technical biases. If a pair of RMs is available, fold changes of features between the sample pair are expected to match the designed expression signal ratios. If three or more RMs are available in a suite, PCA-based metrics can be used to assess the performance of distinguishing the intrinsic biological differences between sample groups.
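Two of these levels can be sketched directly: replicate correlation for a single RM, and a PCA-score-based SNR for a multi-sample suite. The SNR definition below (between-group vs. within-group variance, in dB) follows the spirit of the Quartet-style metric rather than any exact published formula.

```python
import numpy as np

def replicate_correlation(rep1, rep2):
    """Level 1: Pearson correlation between two technical replicates."""
    return np.corrcoef(rep1, rep2)[0, 1]

def snr_db(scores, labels):
    """Level 3 (sketch): signal-to-noise ratio over PCA scores, defined
    here as 10*log10(between-group variance / within-group variance).
    Higher values mean sample groups separate more cleanly."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    centroids = {g: scores[labels == g].mean(axis=0) for g in set(labels)}
    grand = scores.mean(axis=0)
    between = np.mean([np.sum((c - grand) ** 2) for c in centroids.values()])
    within = np.mean([np.sum((x - centroids[g]) ** 2)
                      for x, g in zip(scores, labels)])
    return 10 * np.log10(between / within)
```

In practice the scores would be the leading principal components of the profiled RM suite; well-separated groups yield a much higher SNR than overlapping ones.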

Utilization of Reference Materials

Identifying reliable biomarkers that can accurately predict disease risk or response to treatment is a critical goal of omics-based cohort studies. Large cohort studies that involve collecting samples over a long period of time and profiling the samples with multiple platforms at multiple labs may suffer from issues related to data incomparability and batch effects, which add difficulties for biomarker discovery. In this section, we discuss how omics RMs can be integrated in large cohort studies to enhance the rigor and reproducibility of biomarker discovery (Fig. 4).

Fig. 4

Utilization of reference materials in large cohort study to enhance the rigor and reproducibility of biomarker discovery

To ensure accurate and reliable results from large-scale analysis of precious cohort samples, it is important to assess the suitability of experimental and analytical pipelines using reference materials prior to initiating the data generation process. The first important step is to choose the suitable RMs based on study design and instruments available. Points of consideration include the availability of RMs, their comparability to the test material, and whether the assigned property values and their confidence levels include the features of interest. The matrix composition of RMs is a critical consideration in the QC process of LC–MS. The performance indicators, such as calibration effectiveness, extraction efficiency, column performance, and ion suppression level, are directly influenced by the composition of the sample matrix. To ensure accurate and reliable performance assessment, it is recommended to employ RMs with a matrix composition as similar as possible to that of the study samples.

At each omics level, a variety of sample preparation methods, data generation platforms, and bioinformatic tools are available. By utilizing RMs in benchmark studies and proficiency tests, researchers can gain insights into the strengths and limitations of various methods and technologies. This knowledge facilitates the selection of appropriate experimental and analytical procedures tailored to the specific goals, sample types, and available resources.

RMs can also be effectively used to optimize protocols and parameters by identifying and troubleshooting potential issues. For example, in genomics and transcriptomics by NGS, sequencing performance is influenced by the insert fragment size, which is associated with DNA shearing time. Longer shearing times produce shorter DNA fragments, and insert fragment sizes must be measured to ensure that they fall within the expected molecular weight range (Fang et al.).

Challenges and Future Directions

High-throughput profiling technologies have revolutionized omics studies by enabling the generation of vast amounts of data in a relatively short period of time, allowing researchers to comprehensively study complex biological systems at an unprecedented level of resolution. However, performing high-throughput profiling is a highly complex and challenging process, and there are many potential sources of variability that can impact the results and reproducibility. Therefore, rigorous QA/QC is crucial to ensure confidence in the resulting data and biological discoveries. The use of RMs is an important aspect of QA/QC in high-throughput technologies to ensure accurate and reliable results. In this review, we aim to offer a comprehensive overview of the significance of utilizing well-characterized RMs across different levels of omics research, including genomics, transcriptomics, proteomics, and metabolomics. We provide insights into the characteristics, advantages, and limitations of RMs in each omics field, which are summarized in Table 2. Our goal is to assist researchers in making informed decisions when selecting suitable RMs for their specific research questions and analytical methods. Ultimately, the utilization of appropriate RMs can greatly enhance the accuracy and reliability of omics research outcomes.

Table 2 Characteristics of genomic, transcriptomic, proteomic, and metabolomic reference materials

By incorporating well-characterized RMs into omics research, researchers can overcome various challenges and limitations. RMs provide a standardized reference point that enables calibration and quality control throughout the experimental workflow. They serve as valuable tools for method optimization, validation, and troubleshooting, allowing researchers to assess the performance of their analytical methods and identify any potential biases or errors. Furthermore, the use of RMs facilitates inter-laboratory comparisons and promotes data harmonization, enabling the integration and comparison of results across different studies and platforms. Although the profiling of RMs may entail additional costs, implementing a thorough QA/QC methodology is important for evaluating and monitoring the performance of data generation processes. This upfront investment contributes to the long-term reliability and accuracy of the results, minimizing potential errors and ensuring the accuracy and reliability of the omics research.

The careful selection of RMs is crucial to ensure their relevance and applicability to the study at hand. Researchers should consider the intended use of the study and choose RMs that closely resemble the properties of the samples being investigated. Additionally, the selected RMs should be qualitatively and quantitatively representative of the entire collection of samples included in the study. This ensures that the RMs effectively mimic the characteristics of the biological samples, enabling accurate and meaningful comparisons and interpretations. When studying specific genetic or phenotypic features that vary among different ethnic groups, it is important to choose RMs that match the ethnicity of the study samples. This approach ensures that the RMs accurately reflect the characteristics of the study population, enabling the assessment of the detection performance of those specific genetic or phenotypic features (Hardwick et al. 2017).

As profiling methods continue to advance and new technologies emerge, the reference datasets for existing RMs will undergo continuous updates and refinements. One example is the use of long reads in genomic sequencing. Long reads are particularly valuable for profiling repetitive and complex regions, which are difficult to map with short reads (Wenger et al. 2019). By incorporating long reads, benchmark variants in these regions can be better characterized (Wagner et al. 2022). Additionally, long-read technologies enable precise detection of transcripts and RNA modifications (Leger et al. 2021; Soneson et al. 2019). In proteomics, MS techniques are extensively used to study post-translational modifications (PTMs) of proteins (Zecha et al. 2022). As these technologies develop, reference materials will expand to encompass more omics types. For example, reference datasets of DNA epigenomics can be developed for DNA RMs, RNA RMs can include small RNA profiling and RNA modification reference datasets, and protein RMs can incorporate PTM reference datasets.

Challenges persist in the global promotion and adoption of reference materials and reference datasets. First, regulatory challenges, especially across different regions of the world, can pose additional obstacles to adopting a universal RM (Guerrier et al. 2012; Krogstad et al. 2010). Biological RMs, especially those intended for human genomics and transcriptomics, which are frequently derived from human specimens, require stricter adherence to informed consent principles and governmental controls. Currently, there is no single, comprehensive international model for governing human genetic resources. Informed consent requirements differ across countries, shaped by diverse cultures and social traditions, which necessitates addressing the legal, ethical, and logistical aspects of genetic material and data utilization while respecting each nation's sovereignty and cultural norms. International collaboration and agreements are imperative in addressing these challenges and ensuring the conscientious and equitable utilization of human genetic resources worldwide (Gainotti et al. 2016; van Belle et al. 2015).

Second, we strongly recommend that QC data be made available alongside the study samples in databases or repositories that adhere to the FAIR principles (Findable, Accessible, Interoperable, and Reusable), which is crucial for enhancing data management and sharing (Conesa and Beck 2019; Wilkinson et al. 2016). Currently, QC information is often omitted from scientific publications, leading to uncertainty about the performance of the methodology used. In the future, guidelines may be developed to mandate the inclusion of QC metrics in data submissions to public repositories, similar to existing guidelines for other aspects of data reporting. Coupling comprehensive QC information to the experimental data will allow for quick assessment of the reliability of an experiment, which is crucial in light of recent reports of a general reproducibility crisis in various scientific fields (Anonymous 2021; Baker 2016; Shi et al. 2017). It is essential to prioritize and formalize QC practices to ensure the quality and reproducibility of high-throughput multiomics profiling results by fully utilizing well-characterized RMs and appropriate QC metrics.
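One lightweight way to couple QC information to a submission is a machine-readable metadata record deposited alongside the raw data. The sketch below illustrates this idea with a JSON record; the field names, accession, and metric values are purely illustrative assumptions, as no repository currently mandates such a schema.

```python
# Hypothetical sketch: bundling QC metrics with a data submission as
# machine-readable metadata, in the spirit of the FAIR principles.
# All field names and values are illustrative, not a real schema.
import json

qc_record = {
    "study_accession": "EXAMPLE-0001",     # placeholder identifier
    "reference_material": "NIST RM 8398",  # RM profiled alongside samples
    "assay": "whole-genome sequencing",
    "qc_metrics": {
        "snv_f1_vs_reference": 0.996,      # illustrative values only
        "mean_coverage_x": 35.2,
        "duplication_rate": 0.08,
    },
}

# Serialize so the QC information travels with the experimental data
# and remains parseable by downstream tools and repositories.
qc_json = json.dumps(qc_record, indent=2, sort_keys=True)
print(qc_json)
```

Because the record is structured rather than buried in free-text methods sections, a reader or repository can programmatically check whether a dataset's QC metrics meet a study's requirements before reusing it.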

Conclusion

In this review, we have summarized reference materials across all levels of omics, including (epi-)genomics, transcriptomics, proteomics, and metabolomics, and offered a comprehensive overview of leveraging omics reference materials to enhance data quality. This effort is geared toward promoting robust scientific research and advancing our understanding of complex biological systems through the thoughtful application of omics technologies.