New software tools, databases, and resources in metabolomics: updates from 2020

Misra, Biswapriya B.

doi:10.1007/s11306-021-01796-1

Review Article
Published: 11 May 2021

Volume 17, article number 49, (2021)

Metabolomics

Biswapriya B. Misra ORCID: orcid.org/0000-0003-2589-6539¹

Abstract

Background

From precision medicine, space exploration, and drug discovery to characterization of the dark chemical space of habitats and organisms, metabolomics takes centre stage in providing answers to diverse biological, biomedical, and environmental questions. With technological advances in mass-spectrometry and spectroscopy platforms that generate information-rich, complex big-data datasets, data analytics tends to co-evolve to match the pace of analytical instrumentation. Software tools, resources, databases, and solutions help in harnessing the concealed information in the generated data for eventual translational success.

Aim of the review

In this review, ~ 85 metabolomics software resources, packages, tools, databases, and other utilities that appeared in 2020 are introduced to the research community.

Key scientific concepts of review

In Table 1 the computational dependencies and downloadable links of the tools are provided, and the resources are categorized based on their utility. The review aims to keep the community of metabolomics researchers updated with all the resources developed in 2020, collated in one place for further referencing and use, in line with similar efforts from 2015 onwards.

1 Introduction

The year 2020 has seen an enormous rise in applications of ion mobility mass-spectrometry (IMS) and data-independent acquisition (DIA) methods of analysis in both metabolomics and lipidomics. In terms of application, mass spectrometry as a technology promises advances ranging from care for cancer patients in clinical and intraoperative use (J. Zhang, Ge, et al., 2020; Zhang, Sans, et al., 2020), imaging mass spectrometry (MSI) based natural products (NPs) discovery (Spraker et al. 2020), nanoscale secondary ion mass spectrometry (nanoSIMS) usage in subcellular MS imaging and quantitative analysis in organelles (Thomen et al. 2020), and capturing urban sources of contamination with high resolution mass spectrometry (HRMS) (Bowen et al., 2020) to detection of COVID-19 disease signatures (Mahmud & Garrett, 2020).

From an analytical method development standpoint, interesting developments such as a plasma pseudotargeted metabolomics method using ultra-high-performance liquid chromatography–mass spectrometry (UHPLC-MS) (Zheng et al. 2020) and the need for combined use of nuclear magnetic resonance spectroscopy and mass spectrometry approaches in metabolomics (Letertre et al. 2020) are notable. For volume-limited samples, solutions for sub-nanoliter metabolomics via LC–MS/MS, such as the pulsed MS ion generation method known as triboelectric nanogenerator inductive nanoelectrospray ionization (TENGi nanoESI) MS (Li et al. 2020), were introduced. Flow-injection Orbitrap mass spectrometry (FI-MS) enabled reproducible detection of ~ 9,000 and ~ 10,000 m/z features in metabolomics and lipidomics analysis of serum samples, respectively, with a sample scan time of ~ 15 s and duty time of ~ 30 s; a ~ 50% increase versus current spectral-stitching FI-MS methods (Sarvin et al. 2020). A spatial metabolomics pipeline (metaFISH) that combined fluorescence in situ hybridization (FISH) microscopy and high-resolution atmospheric-pressure matrix-assisted laser desorption/ionization mass spectrometry to image host–microbe symbioses and their metabolic interactions (Geier et al. 2020) was also reported. Another study compared the full-scan, data-dependent acquisition (DDA), and data-independent acquisition (DIA) methods in HR LC–MS/MS based metabolomics and revealed that spectra quality is better in DDA, with an average dot product score 83.1% higher than in DIA, whereas the number of MS² spectra (spectra quantity) is larger in DIA (Guo & Huan, 2020a). Furthermore, it was shown that DDA mode consistently generated fewer uniquely found significant features than full-scan and DIA modes (Guo & Huan, 2020b).
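To make the "dot product score" used above as a measure of spectral quality more concrete, the following is a minimal Python sketch of a normalized dot product (cosine) score between two centroided MS/MS spectra. It is only an illustration: the function name `cosine_score`, the greedy peak pairing, and the 0.01 Da tolerance are assumptions made here, and the cited study may have used a different matching or weighting scheme.

```python
import math


def cosine_score(spec_a, spec_b, tol=0.01):
    """Normalized dot product between two centroided spectra.

    Each spectrum is a list of (mz, intensity) pairs. Peaks are paired when
    their m/z values differ by at most `tol` (an assumed tolerance, in Da).
    """
    # Collect every candidate peak pair within tolerance.
    candidates = []
    for i, (mz_a, int_a) in enumerate(spec_a):
        for j, (mz_b, int_b) in enumerate(spec_b):
            if abs(mz_a - mz_b) <= tol:
                candidates.append((int_a * int_b, i, j))

    # Greedy assignment: largest intensity products first, each peak used once.
    candidates.sort(reverse=True)
    used_a, used_b = set(), set()
    dot = 0.0
    for product, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += product

    norm_a = math.sqrt(sum(inten ** 2 for _, inten in spec_a))
    norm_b = math.sqrt(sum(inten ** 2 for _, inten in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    query = [(100.0, 10.0), (150.0, 5.0)]
    reference = [(100.005, 10.0), (200.0, 5.0)]
    print(round(cosine_score(query, reference), 3))
```

A score near 1 means the two spectra share most of their intense fragment peaks, while a score near 0 means they share almost none.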

Raman spectroscopy, followed by stimulated Raman scattering (SRS) microscopy and Raman-guided subcellular pharmaco-metabolomics in metastatic melanoma cells, revealed intracellular lipid droplets that helped identify a previously unknown susceptibility of lipid mono-unsaturation within de-differentiated mesenchymal cells with innate resistance to BRAF inhibition (Du et al. 2020). Application of ³¹P NMR was shown to hold the potential of expanding the coverage of the metabolome by detecting phosphorus-containing metabolites (Bhinderwala et al. 2020).

The effectiveness of the flow injection analysis-continuous accumulation of selected ions Fourier transform ion cyclotron resonance mass spectrometry (FIA-CASI-FTMS) workflow utilizing isotopic fine structure (IFS) for molecular formula assignment was demonstrated for metabolomics applications (Thompson et al. 2020). A buffer modification workflow (BMW), in which the same sample is run by LC–MS both in liquid chromatography solvent with ¹⁴NH₃–acetate buffer and in solvent with the buffer modified with ¹⁵NH₃–formate, was also demonstrated; it resulted in characteristic mass and signal intensity changes for adduct peaks, facilitating their annotation (Lu et al. 2020). In other innovative applications, use of short columns and direct solvent switches allowed for fast screening (3 min per polarity), where a total of 50 commonly reported diagnostic or explorative biomarkers were validated with a limit of quantification that was comparable with conventional LC–MS/MS (van der Laan et al. 2020).

From the standpoint of data analysis, metabolomics as a field is starting to benefit from applying machine learning (ML) (Liebal et al. 2020) and deep learning (DL) (Pomyen et al. 2020; Sen et al. 2020) approaches to address diverse challenges from data preprocessing to biological interpretation. In the context of systems and personalized medicine, LIONESS (Linear Interpolation to Obtain Network Estimates for Single Samples) and ssPCC (single sample network based on Pearson correlation) were evaluated and compared for metabolite–metabolite association networks (Jahagirdar & Saccenti, 2020). In the annotation domain, for small molecule identification from low resolution GC–MS data, a deep learning ranking model outperformed other approaches and reduced the fraction of wrong answers (at rank-1) by 9–23%, depending on the data set used (Matyushin et al. 2020). In the age of artificial intelligence, spatial metabolomics and IMS promise to revolutionize biology and healthcare (Alexandrov, 2020). Approaches such as an integrated strategy of fusing features and removing redundancy based on graph density (FRRGD) were proposed that greatly enhanced the detection coverage of low-abundance metabolites (Ju et al. 2020).
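The single-sample network idea behind LIONESS mentioned above can be illustrated with a short Python sketch. It assumes the commonly stated LIONESS interpolation, edge(q) = N × (edge_all − edge_without_q) + edge_without_q, with Pearson correlation as the aggregate network measure; the function names and the toy data are hypothetical and are not taken from the cited comparison.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long lists (0.0 if undefined)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def lioness_edge(x, y, q):
    """Single-sample edge weight for sample index q between metabolites x and y.

    x and y hold the abundances of two metabolites across the same N samples.
    """
    n = len(x)
    r_all = pearson(x, y)                                  # aggregate network
    r_without_q = pearson(x[:q] + x[q + 1:], y[:q] + y[q + 1:])  # leave q out
    return n * (r_all - r_without_q) + r_without_q


if __name__ == "__main__":
    metabolite_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    metabolite_b = [1.0, 2.0, 3.0, 4.0, -5.0]  # last sample breaks the trend
    for q in range(len(metabolite_a)):
        print(q, round(lioness_edge(metabolite_a, metabolite_b, q), 3))
```

In this toy example the last sample, which deviates from the otherwise linear relationship, receives a much lower edge weight than the other samples, which is the kind of per-sample signal such networks are meant to expose.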

For a software survey of other mass-spectrometry derived omics tools, packages, resources, software and databases, readers can consult other treatises on metaproteomics (Sajulga et al. 2020), data‐independent acquisition mass spectrometry‐based proteomics (F. Zhang, Ge, et al., 2020; Zhang, Sans, et al., 2020), and single cell and single cell-type metabolomics (B. B. Misra, 2020a), among others.

Diverse online resources such as OMICtools (http://omictools.com/) (Henry et al. 2014), Fiehn laboratory pages (http://fiehnlab.ucdavis.edu/ and http://metabolomics.ucdavis.edu/Downloads), the International Metabolomics Society’s resource pages, and software repositories such as the Comprehensive R Archive Network (CRAN) (https://cran.r-project.org/web/packages/available_packages_by_name.html), Bioconductor (https://www.bioconductor.org/), the Python Package Index (PyPI) (https://www.pypi.org), GitLab (https://www.gitlab.com), and GitHub (https://www.github.com/) are excellent places to obtain software tools, databases and resources for metabolomics research. The Metabolomics Tools Wiki, claimed to be an updated resource for metabolomics tools, databases and software resources, has not been updated since 2017 (Spicer et al. 2017). Whilst there exists a plethora of programming languages, modern interpreted scripting languages such as R, Python, Raku, Ruby, and MATLAB are evidently popular in metabolomics.

Building on the review structure established in previous overviews of major tools and resources in metabolomics spanning 2015–2019 (B. Misra & van der Hooft, 2015; O’Shea & Misra, 2020), this review is organized into the following sections: (1) Platform-specific tools, (2) Preprocessing and QC tools, (3) Annotation tools, (4) Multifunctional tools, (5) Tools for statistical analysis and visualization, (6) Databases, and (7) Other specialized tools.

Table 1 provides a summary of all reviewed resources and their availability. Furthermore, Table 2 highlights unpublished tools that can be found in the CRAN and PyPI software repositories and are deemed useful for the metabolomics research community, but are not associated with a published scholarly article.

Table 1 The entire list of reviewed tools is organized by important analytical steps in metabolomics data analysis and includes details regarding their platform dependency, implementation, e.g., programming language (R, Python, Java, C/C++, etc.) or web browser based, and their availability

Table 2 List of useful R/ Bioconductor packages that surfaced/ were improved in 2020

2 Platform-specific tools

Metabolomics as a discipline depends on mass spectrometry and spectroscopy analytical platforms to generate high-throughput omics-scale data. These include, but are not limited to, liquid chromatography-mass spectrometry (LC–MS), gas chromatography-mass spectrometry (GC–MS), capillary electrophoresis-mass spectrometry (CE-MS), and spectroscopic methods such as ¹H-NMR, ¹³C-NMR, Raman, and Fourier transform infrared (FTIR), among others. In this section, I discuss all the tools that appeared in 2020 for analyses of datasets that are specific to a metabolomics platform or technology, i.e., LC–MS, GC–MS, and NMR.

Automated spectraL processing system for NMR (AlpsNMR) is an R-package that provides automated signal processing for untargeted NMR metabolomics datasets by performing region exclusion, spectra loading, metadata handling, automated outlier detection, spectra alignment and peak-picking, integration and normalization (Madrid-Gambin et al. 2020). The tool can load Bruker and JDX samples and preprocess them for downstream statistical analysis.

Signature mapping (SigMa) was developed as a standalone tool using MATLAB dependencies for processing raw urine ¹H-NMR spectra into a metabolite table (Khakimov et al. 2020). SigMa relies on the division of the urine NMR spectra into Signature Signals (SS), Signals of Unknown spin Systems (SUS) and bins of complex unresolved regions (BINS), thus allowing simultaneous detection of urinary metabolites in large-scale NMR metabolomics studies using a SigMa chemical shift library and a new automatic peak picking algorithm.

NMR filter is a stand-alone interactive software for high-confidence NMR compound identification that runs NMR chemical shift predictions and matches them with the experimental data. It defines the identity of compounds using a list of matching rates and correlating parameters of accuracy, together with figures for visual validation (Kuhn et al. 2020).

MSHub/electron ionisation (EI)-Global Natural Product Social (GNPS) Molecular Networking analysis is a platform that enables users to store, process, share, annotate, compare and perform molecular networking of both unit/low resolution and GC–HRMS data (Aksenov et al. 2020). GNPS-MassIVE is a public data repository for untargeted MS² data and EI-MS data, with sample information (metadata) and annotated MS² spectra (Aron et al. 2020). MSHub performs the auto-deconvolution of compound fragmentation patterns via unsupervised non-negative matrix factorization and quantifies the reproducibility of fragmentation patterns across samples, followed by GNPS molecular networking analyses.

RGCxGC toolbox is an R-package that aids in analysis of two dimensional gas chromatography-mass spectrometry (2D GC–MS) data by offering pre-processing algorithms for signal enhancement, such as baseline correction based on asymmetric least squares, smoothing based on the Whittaker smoother, and peak alignment based on 2D Correlation Optimized Warping, as well as multiway principal component analysis (Quiroz-Moreno et al. 2020).
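As an illustration of the kind of smoothing named above, the following Python sketch implements a one-dimensional Whittaker smoother with a first-order difference penalty, i.e., it solves (I + λDᵀD)z = y for the smoothed signal z. This is a simplified sketch written for this overview, not the RGCxGC implementation (which is in R and operates on 2D chromatograms); the function name, the choice of first-order differences, and the default λ are assumptions.

```python
def whittaker_smooth(y, lam=10.0):
    """First-order Whittaker smoother for a 1D signal.

    Minimizes sum((y - z)^2) + lam * sum((z[i+1] - z[i])^2), which leads to
    the tridiagonal system (I + lam * D'D) z = y, solved here with the
    Thomas algorithm. Larger `lam` gives a smoother result.
    """
    n = len(y)
    if n < 2:
        return list(y)

    # Main diagonal of I + lam * D'D; every off-diagonal entry equals -lam.
    diag = [1.0 + lam * (1.0 if i in (0, n - 1) else 2.0) for i in range(n)]
    off = -lam

    # Forward elimination.
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = off / diag[0]
    d_prime[0] = y[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c_prime[i - 1]
        c_prime[i] = off / denom
        d_prime[i] = (y[i] - off * d_prime[i - 1]) / denom

    # Back substitution.
    z = [0.0] * n
    z[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        z[i] = d_prime[i] - c_prime[i] * z[i + 1]
    return z


if __name__ == "__main__":
    noisy = [0.0, 10.0, 0.0, 10.0, 0.0, 10.0]
    print([round(v, 2) for v in whittaker_smooth(noisy, lam=5.0)])
```

With this first-order penalty the smoother preserves the total signal (the sum of z equals the sum of y) while damping point-to-point fluctuations, which is why it is a convenient preprocessing step before peak alignment.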

3 Preprocessing and quality control (QC) tools

Untargeted metabolomics workflows that use LC–MS/MS, GC–MS or NMR depend heavily on pre-processing of the acquired raw datasets prior to statistical analyses and interpretation. Preprocessing typically involves tools that aid in the detection of masses (as m/z values) from mass spectra (i.e., feature detection), construction and display of extracted ion chromatograms, detection of chromatographic peaks, deconvolution, peak alignment, data matrix curation steps ranging from batch and blank corrections to filtration and normalization, and quality assessments. Though decade-old popular preprocessing tools are available to the community in the form of xcms (Tautenhahn et al. 2008), MZmine 2 (MZmine Development Team 2015), and MS-DIAL (Tsugawa et al. 2015), there is a consistent effort to improve the workflows, from reducing computational time and developing graphical user interfaces (GUIs) that render them user friendly to addressing challenges associated with interpretation of data from advanced platforms such as HRMS, IMS, or MSI. In fact, a recent comparative effort (among the software packages MZmine 2, enviMass, Compound Discoverer™, and XCMS Online) demonstrated a low coherence between the four processing tools: the overlap of features between all four programs was only about 10%, and for each software between 40 and 55% of features did not match with any other program (Hohrenk et al. 2020). Moreover, quality control (QC) tools are important for handling systematic and random variations/errors induced during experimental and analytical workflows. Batch effects can pose many challenges, i.e., introduction of experimental artifacts that can interfere with the measurement of phenotype‐related metabolome changes in metabolomics data (Han & Li, 2020); data normalization strategies, tools, and software solutions available to circumvent some of these challenges have been reviewed (B. B. Misra, 2020b). In this section, I cover the preprocessing and the QC tools that appeared in 2020.

Correlation-based removal Of multiPlicities (CROP), implemented as an R-package, is a visual post-processing tool that removes redundant features from LC–MS/MS based untargeted metabolomic data sets (Kouřil et al. 2020). It groups highly correlated features within a defined retention time (RT) window without imposing a condition on specific m/z differences, making it a second-tier strategy for multiplicities reduction. The output is a graphical representation of the correlation network that gives a good understanding of cluster composition and can aid in further parameter tuning.
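The grouping principle described above (high correlation across samples combined with a narrow RT window) can be sketched in a few lines of Python. This is not the CROP code, which is an R-package; the function names, the data layout, and the threshold values `rt_window` and `min_corr` are hypothetical choices made for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long lists (0.0 if undefined)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def group_correlated_features(features, rt_window=0.02, min_corr=0.75):
    """Cluster features that co-elute and correlate across samples.

    `features` is a list of dicts with keys 'id', 'rt' (minutes) and
    'intensities' (one value per sample). Returns a sorted list of id groups.
    """
    parent = list(range(len(features)))

    def find(i):
        # Union-find lookup with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            co_eluting = abs(features[i]["rt"] - features[j]["rt"]) <= rt_window
            if co_eluting and pearson(
                features[i]["intensities"], features[j]["intensities"]
            ) >= min_corr:
                parent[find(i)] = find(j)  # merge the two clusters

    groups = {}
    for i, feature in enumerate(features):
        groups.setdefault(find(i), []).append(feature["id"])
    return sorted(groups.values())


if __name__ == "__main__":
    table = [
        {"id": "F1", "rt": 5.00, "intensities": [10, 20, 30, 40]},
        {"id": "F2", "rt": 5.01, "intensities": [1, 2, 3, 4.1]},
        {"id": "F3", "rt": 5.01, "intensities": [40, 10, 30, 20]},
        {"id": "F4", "rt": 9.00, "intensities": [10, 20, 30, 40]},
    ]
    print(group_correlated_features(table))
```

Each resulting group with more than one member is a candidate set of multiplicities (adducts, isotopes, fragments) from which one representative feature could be kept.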

neighbor-wise compound-specific Graphical Time Warping (ncGTW) is an integrated reference-free profile alignment method, implemented as an R-package and available as a plugin for xcms, that aids in detecting and fixing bad alignments (misaligned feature groups) in LC–MS data to render accurate grouping and peak-filling (Wu et al. 2020).

TidyMS is a Python package for preprocessing of untargeted LC–MS/MS derived metabolomics data that reads raw data from the .mzML file format, generates spectra and total ion chromatograms (TICs), allows peak picking and feature detection, reads processed data from xcms and MZmine 2, among others, and offers functionalities for data matrix curation, normalization, imputation, scaling, quality metrics, QC-based batch corrections and interactive visualization of results (Riquelme et al. 2020).

AutoTuner, available as an R-package, is a parameter optimization algorithm that obtains parameter estimates from raw data in a single, data-specific step, as opposed to many iterations, to generate robust features from untargeted LC–MS/MS runs (McLean & Kujawinski, 2020). As input, AutoTuner requires at least 3 samples of raw data converted from proprietary instrument formats to open formats (e.g., .mzML, .mzXML, or .CDF).

remove unwanted variation in a hierarchical structure (hRUV) is an R-package (also available as a Shiny app) that aids in removal of unwanted variation from large scale LC–MS metabolomics studies, which it accomplishes by progressively merging the adjustments in neighboring batches (Kim et al. 2020). The package uses sample replicates to integrate data from several batches for removal of intra-batch signal drift and inter-batch unwanted variation, and outperforms existing tools while retaining biological variation. For assessment of the results, a user can visualize three kinds of diagnostic plots, i.e., principal component analysis (PCA) plots, relative log expression (RLE) plots, and metabolite run plots.

MetumpX is an Ubuntu-based R-package that facilitates easy download and installation of 103 tools spread across the standard untargeted MS-based metabolomics pipeline (Wajid et al. 2020). The package can aid in automatic installation of software pipelines, shortening the learning curve for building software workstations.

MeTaQuaC is an R-package that implements concepts and methods for Biocrates kits and their application in targeted LC–MS metabolomics workflows; it creates a QC report containing visualizations and informative scores, and provides summary statistics and unsupervised multivariate analysis methods, among others (Kuhring et al. 2020).

dbnorm is an R-package that allows visualization and removal of technical heterogeneity from large scale metabolomics datasets, after allowing inspection at both macroscopic and microscopic scales, at the sample batch and metabolic feature levels, respectively (Bararpour et al., 2020). dbnorm includes several statistical models, such as the ComBat (parametric and non-parametric) model from the sva package, which are already in use for metabolomics data normalization, and the ber function.

MetaClean, available as an R-package, uses 11 peak quality metrics and 8 diverse ML algorithms to build a classifier for the automatic assessment of peak integration quality in untargeted metabolomics datasets (Chetnik et al. 2020). It was shown that the AdaBoost algorithm with a set of 11 peak quality metrics gave the best performing classifier, and that applying this framework to peaks retained after filtering by 30% relative standard deviation (RSD) across pooled QC samples was able to further distinguish poorly integrated peaks that were not removed by filtering alone.
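The 30% RSD pre-filter across pooled QC samples mentioned above (the step applied before the classifier, not the MetaClean classifier itself) amounts to a simple calculation, sketched here in Python. The function names and the example intensities are hypothetical.

```python
import statistics


def rsd_percent(values):
    """Relative standard deviation (sample SD / mean) in percent."""
    mean = statistics.mean(values)
    if mean == 0:
        return float("inf")
    return 100.0 * statistics.stdev(values) / mean


def filter_by_qc_rsd(qc_intensities, max_rsd=30.0):
    """Keep feature ids whose RSD across pooled QC injections is <= max_rsd.

    `qc_intensities` maps a feature id to its intensities in the pooled QC
    samples (at least two injections per feature are required).
    """
    return [
        feature_id
        for feature_id, values in qc_intensities.items()
        if rsd_percent(values) <= max_rsd
    ]


if __name__ == "__main__":
    pooled_qc = {
        "stable_feature": [100.0, 105.0, 95.0, 102.0],
        "noisy_feature": [100.0, 10.0, 250.0, 40.0],
    }
    print(filter_by_qc_rsd(pooled_qc))
```

A feature can pass this reproducibility filter and still be poorly integrated, which is the gap that a peak-quality classifier is meant to close.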

NeatMS is a Python package for untargeted LC–MS signal labelling and filtering, which enables automated filtering out of false positive MS1 peaks reported by routine LC–MS data processing pipelines. It relies on neural network-based classification, and can process outputs from MZmine 2 and xcms analyses.

4 Annotation tools

Metabolite annotation remains a critical step that defines the success or failure of untargeted metabolomics efforts. With newer technologies such as collision cross section (CCS) data for ion mobility, high resolution mass spectra from Orbitrap, direct injection data, data independent acquisition (DIA)/all ion fragmentation (AIF), imaging MS and multi-dimensional chromatography, annotation has gained additional impetus in compound identification, but these methods have posed new challenges of their own for tool development. False discovery rates (FDRs) of annotations indicate that low FDRs yield a low number of reliable annotations, whereas higher FDRs yield a high number of annotations, but of poorer quality. Though metabolite annotation efforts can benefit from RT as orthogonal information, efforts to combine RT predictions with MS/MS data are currently lacking (Witting & Böcker, 2020). Clearly reference spectra and spectral DBs/libraries are not enough to annotate roughly 5–30% of the total features captured (depending on the environmental/biological matrices in question) in a given mass spectrometry-based metabolomics dataset. Though experimentally obtained MS/MS data and NMR data on pure standards are precious, and aid in development of computational solutions for compound identification, they do not suffice at their current numbers, accessibility, and availability. Moreover, in 2020, the Metabolite Identification Task Group of the International Metabolomics Society assessed and proposed a set of revised reporting standards for metabolite annotation/identification and requested community feedback for levels from A to G, from defining an enantiomer or a chiral metabolite (level A) to unknown molecular formula with specific spectral features (level G). Once formalized, these would positively affect and improve reporting standards in studies and the publication landscape in metabolomics research. Figs. 1, 2, and 3 show the software interfaces and analysis outputs for some of the annotation tools discussed in the following sections.

MEtabolite SubStructure Auto-Recommender (MESSAR) is a web-based tool that provides an automated method for substructure recommendation guided by association rule mining; it captures potential relationships between spectral features and substructures, as learned from public spectral libraries, to suggest substructures for any unknown mass spectrum (Y. Liu, Mrzic, et al., 2020; Liu, Nellis, et al., 2020). Though the interface does not currently perform batch processing, it provides an open-source approach to annotate substructures.

Small Molecule Accurate Recognition Technology (SMART 2.0) is an artificial intelligence (AI)-based ML tool for mixture analysis in NMR data analysis workflows that aids in accelerated discovery and characterization of new NPs. SMART 2.0 generates structure hypotheses from two dimensional NMR data [¹H-¹³C heteronuclear single quantum coherence (HSQC) spectra] by comparing a query HSQC spectrum against a library of > 100,000 NPs, and provides outputs such as simplified molecular-input line-entry system (SMILES) strings, structures, cosine similarity, and molecular weights for a given compound of interest.

MetFID is a tool that uses an artificial neural network (ANN) trained for predicting molecular fingerprints based on experimental MS/MS data (Fan et al. 2020). MetFID retrieves candidates from metabolite databases using the molecular formula or m/z value of the precursor ion of the analyte, and the candidate whose fingerprint is most analogous to the predicted fingerprint is used for metabolite annotation. However, no code or accessible tools/repositories are provided with the published scholarly article.
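The candidate-ranking step described above, i.e., choosing the database candidate whose fingerprint is closest to the predicted one, can be sketched as follows in Python. Since no code accompanies the article, everything here is an assumption made for illustration: the use of the Tanimoto coefficient as the similarity measure, the representation of fingerprints as sets of "on" bit indices, and the candidate names.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two fingerprints.

    Fingerprints are given as collections of the indices of bits set to 1.
    """
    bits_a, bits_b = set(fp_a), set(fp_b)
    union = bits_a | bits_b
    if not union:
        return 0.0
    return len(bits_a & bits_b) / len(union)


def rank_candidates(predicted_fp, candidates):
    """Rank candidate structures by similarity to a predicted fingerprint.

    `candidates` maps a candidate name to its fingerprint. Returns a list of
    (similarity, name) tuples, best match first; ties are broken by name.
    """
    scored = [(tanimoto(predicted_fp, fp), name) for name, fp in candidates.items()]
    return sorted(scored, key=lambda item: (-item[0], item[1]))


if __name__ == "__main__":
    predicted = {1, 4, 7, 9}  # hypothetical bits predicted from an MS/MS spectrum
    database_hits = {
        "candidate_A": {1, 4, 7, 9, 12},
        "candidate_B": {1, 2, 3},
    }
    for score, name in rank_candidates(predicted, database_hits):
        print(name, round(score, 3))
```

The top-ranked candidate would then be reported as the putative annotation, ideally together with its score so that weak matches can be flagged.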

CPVA is a web-based tool aimed at the analysis of untargeted LC–MS/MS generated metabolomics data for visualization and annotation of LC peaks; the tool performs functions such as annotation of adducts, isotopes and contaminants, and allows visualization of peak morphology metrics (Luan et al. 2020). Further, the tool aids in capturing potential noise and contaminants encountered in chromatographic peak lists generated from LC–MS/MS data, thus reducing false positive peak calling in order to improve data quality and downstream data processing.

NRPro is a web-based application dedicated to dereplication and characterization of peptidic natural products (PNPs) from LC–MS/MS datasets that performs automatic peak annotation through a statistically validated scoring system (Ricart et al. 2020). An example NRPro dereplication effort revealed that the software was able to identify 169 PNPs in a dataset of 352 spectra with an FDR of 3.55.

MetENP/MetENPWeb is available as an R-package on the Metabolomics Workbench repository and is also deployed as a web-based application. It extends metabolomics data enrichment analysis to include Kyoto Encyclopedia of Genes and Genomes (KEGG)-based species-specific pathway analysis, pathway enrichment scores, gene-enzyme data, and enzymatic activities of the significantly altered metabolites for any studies/datasets submitted to the Metabolomics Workbench (Choudhary et al. 2020). Various plots and visualizations, such as volcano plots and bar graphs, are available to the user after the analyses.

Class Assignment aNd Ontology Prediction Using mass Spectrometry (CANOPUS), available as a part of SIRIUS (Dührkop et al. 2019) suite of software, is a computational tool for systematic compound class annotation from fragmentation spectra (Dührkop et al. 2020). CANOPUS uses a deep neural network to predict 2,497 compound classes from fragmentation spectra, including all biologically relevant classes, and explicitly targets compounds for which neither spectral nor structural reference data are available in addition to predicting compound classes lacking MS/MS training data. Recently, CANOPUS was made available for analysis of MS/MS spectra obtained from both positive and negative mode ionization datasets.

molDiscovery is a mass spectral database search method that improves both efficiency and accuracy of small molecule identification by (i) using an efficient algorithm to generate mass spectrometry fragmentations, and (ii) learning a probabilistic model to match small molecules with their mass spectra (Mohimani et al. 2020). A search of over 8 million spectra from the GNPS molecular networking infrastructure demonstrated that this probabilistic model can correctly identify nearly six times more unique compounds than other previously reported methods.

MetIDfyR is an R-package that aids in in silico drug phase I/II biotransformation prediction and mass-spectrometric data mining from untargeted LC-HRMS/MS datasets to help with feature annotation (Delcourt et al. 2020). With the ability to predict drug metabolism products from in vitro and in vivo studies, this tool holds potential for annotation workflows in drug discovery programs.

Qemistree is a cheminformatics tool, available as an advanced analysis workflow on the GNPS infrastructure, that allows mass spectrometry data to be represented in the context of sample metadata and chemical ontologies (Tripathi et al. 2020). This tree-guided data exploration tool allows comparison of metabolomics samples across different experimental conditions such as chromatographic shifts. The Qemistree software pipeline is also freely available to the microbiome and metabolomics communities in the form of a QIIME2 plugin.

Ion identity molecular networking (IIMN) is a workflow available within the GNPS ecosystem that complements feature based molecular networking (FBMN) by aiding in annotating and connecting related ion species in feature-based molecular networks (Schmid et al. 2020). Though MS1-based ion identity networks (IIN) are well-known, IIMN helps to integrate IIN into MS2-based molecular networks in the GNPS environment, thus adding MS/MS information on top of MS1 characteristics of ions.
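The MS1-level idea of connecting related ion species can be illustrated with a small Python sketch that tests whether two co-eluting features are consistent with being different adducts of the same neutral molecule. This is a deliberately reduced illustration, not the IIMN workflow: real ion identity networking also uses criteria such as peak shape correlation, and the adduct list, the function name, and the 10 ppm tolerance below are assumptions. The mass shifts are the standard values for these positive-mode adducts.

```python
# Mass added to the neutral monoisotopic mass for common positive-mode adducts.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
}


def matching_adduct_pairs(mz_a, mz_b, ppm=10.0):
    """Find adduct assignments under which two m/z values share a neutral mass.

    Returns a list of (adduct_of_a, adduct_of_b, neutral_mass) tuples. The two
    features are assumed to co-elute; that check is not performed here.
    """
    matches = []
    for name_a, shift_a in ADDUCT_SHIFTS.items():
        for name_b, shift_b in ADDUCT_SHIFTS.items():
            if name_a == name_b:
                continue
            neutral_a = mz_a - shift_a
            neutral_b = mz_b - shift_b
            # Compare the implied neutral masses within a ppm tolerance.
            if abs(neutral_a - neutral_b) <= ppm * 1e-6 * neutral_a:
                matches.append((name_a, name_b, round(neutral_a, 4)))
    return matches


if __name__ == "__main__":
    # Two features whose m/z difference matches sodium versus proton adducts.
    print(matching_adduct_pairs(181.0707, 203.0526))
```

Features linked this way can be collapsed into one ion identity, so that a molecular network built from MS/MS similarity does not treat the protonated and sodiated forms of one compound as unrelated nodes.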

Food-Biomarker Ontology (FOBI) is a tool developed in the R language as a web-based analysis and visualization package that is focused on interactive visualization of the FOBI structure (Castellano-Escuder et al. 2020). FOBI is a new ontology that describes foods and their associated metabolite entities and is composed of two interconnected sub-ontologies: the ‘Food Ontology’, consisting of raw foods and ‘multi-component foods’, and the ‘Biomarker Ontology’, containing food intake biomarkers classified by their chemical classes. These two sub-ontologies are conceptually independent but interconnected by different properties. Functionalities of the tool include static and dynamic network visualization, downloadable tables, compound ID conversions, and classical and food enrichment analyses.

BioDendro is a Python package for feature analysis of LC–MS/MS metabolomics data, offered as a workflow that enables users to flexibly cluster and interrogate thousands of MS/MS spectra and quickly identify the core fragment patterns causing groupings. This leads to identification of the core chemical backbones of a larger class, even when the individual metabolite of interest is not found in public databases (Rawlinson et al. 2020).

AllCCS is a freely accessible database/CCS atlas that covers a vast range of chemical structures with > 5000 experimental CCS records and ~ 12 million calculated CCS values for > 1.6 million small molecules, with median relative errors of 0.5–2% for a broad spectrum of small molecules (Zhou et al. 2020). The tool offers several modules to perform PCA, differential expression analysis, pathway analysis, and network analysis.

metPropagate is a network-based approach that uses untargeted metabolomics data from a single patient and a group of controls to prioritize candidate genes in patients with suspected inborn errors of metabolism (IEMs) (Graham Linck et al. 2020). This approach determines whether metabolomic evidence can be used to prioritize the causative gene from a list of candidate genes: each gene in a patient’s candidate gene list is ranked using a per-gene metabolomic score termed the “metPropagate score”, which represents the likely metabolic relevance of a particular gene to each patient.

9 Summary of current tools

In this section, I summarize the observed trends for the tools reported in 2020, which are:

a. The largest share of the software tools and packages focuses on ‘annotations’, i.e., almost 35% of the total 72 tools reported for the year deal with untargeted metabolomics data annotation.
b. 82% of the total tools reported are concerned with data analysis challenges with “LC–MS/MS”, mostly untargeted LC–HRMS/MS efforts.
c. The tools are mostly implemented as R packages (28 tools), Python packages (11 tools), or in Java (5 tools), or are web-servers/web-based tools (23 tools).
d. 48% of the reported tools are ‘easy to use’ (click to start, web-based, or plug-and-play type tools) from a user standpoint for the community of biologists and chemists who are not computationally savvy.
e. Of the total tools reported here, 57% have a GitHub repository associated with them.
f. A couple of tools are improved versions, suggesting that these are active tools that are being developed/maintained.
g. Many of the tools reported in the year deal with specialized applications, ranging from data integration (i.e., metabolomics data with proteomics/transcriptomics data) and epidemiological metabolomics data to lipidomics and MSI data.

10 Concluding remarks

In summary, one can observe that numerous tools were either developed from scratch or evolved from their previous versions in 2020 alone. Some tools and approaches found new applications, such as GNPS in the domain of GC–MS-based metabolomics (Aksenov et al. 2020), or were released as a beta/advanced version, e.g., MS-DIAL for lipidomics workflows (Tsugawa et al. 2020). Only the coming years will tell which of these 2020 tools live on in terms of utility/application, stay maintained and available, get improved, and get adopted by the metabolomics research community. Irrespective, all these tools help in understanding metabolomics data from diverse standpoints and are welcome additions to the community going forward into the big data-driven precision medicine era. In general, the trend is to develop fast, computationally less intensive, robust, open-source, user-friendly tools that adhere to the findable, accessible, interoperable, and reusable (FAIR) guidelines. Undoubtedly, the metabolomics research community needs more of these improved tools, and in the coming years the tools, resources, and databases will keep coming and getting better.

Abbreviations

AIF:: All ion fragmentation
ANOVA:: Analysis of variance
ANN:: Artificial neural network
CE:: Capillary electrophoresis
DDA:: Data dependent acquisition
DIA:: Data independent acquisition
DB:: Database
FDR:: False discovery rate
FIA:: Flow injection analysis
GC:: Gas chromatography
GNPS:: Global Natural Product Social molecular networking
GUI:: Graphical user interface
HRMS:: High-resolution mass spectrometry
HR MS/MS:: High-resolution tandem mass spectrometry
Q-ToF:: Hybrid quadrupole orthogonal time-of-flight
IMS:: Ion-mobility mass spectrometry
KEGG:: Kyoto encyclopedia of genes and genomes
LC:: Liquid chromatography
ML:: Machine learning
MSI:: Mass spectrometry imaging
MS:: Mass spectrometry
m/z:: Mass-to-charge
DL:: Deep learning
MSI:: Imaging mass spectrometry
MRM:: Multiple reaction monitoring
NMR:: Nuclear magnetic resonance
PLS-DA:: Partial least-squares-discriminant analysis
PCA:: Principal component analysis
CCS:: Collision cross section
QA:: Quality assurance
QC:: Quality control
RSD:: Relative standard deviation
RT:: Retention time
R:: R-Statistical programming
S/N:: Signal to noise ratio
SRM:: Selected reaction monitoring
MS/MS:: Tandem mass spectrometry
QqQ:: Triple quadrupole
UPLC-TOF:: Ultra-performance liquid chromatography-time-of-flight mass spectrometry
XCMS:: Various forms (X) of chromatography mass spectrometry

References

Adams, K. J., Pratt, B., Bose, N., Dubois, L. G., St. John-Williams, L., Perrott, K. M., et al. (2020). Skyline for small molecules: A unifying software package for quantitative metabolomics. Journal of Proteome Research, 19(4), 1447–1458. https://doi.org/10.1021/acs.jproteome.9b00640
Aksenov, A. A., Laponogov, I., Zhang, Z., Doran, S. L. F., Belluomo, I., Veselkov, D., et al. (2020). Auto-deconvolution and molecular networking of gas chromatography–mass spectrometry data. Nature Biotechnology. https://doi.org/10.1038/s41587-020-0700-3
Alexandrov, T. (2020). Spatial metabolomics and imaging mass spectrometry in the age of artificial intelligence. Annual Review of Biomedical Data Science, 3, 1. https://doi.org/10.1146/annurev-biodatasci-011420-031537
Aron, A. T., Gentry, E. C., McPhail, K. L., Nothias, L. F., Nothias-Esposito, M., Bouslimani, A., et al. (2020). Reproducible molecular networking of untargeted mass spectrometry data using GNPS. Nature Protocols, 15(6), 1954–1991. https://doi.org/10.1038/s41596-020-0317-5
Bararpour, N., Gilardi, F., Carmeli, C., Sidibe, J., Ivanisevic, J., Caputo, T., Augsburger, M., Grabherr, S., Desvergne, B., Guex, N., Bochud, M., Thomas, A. (2020). Visualization and normalization of drift effect across batches in metabolome-wide association studies. bioRxiv, 914051. https://doi.org/10.1101/2020.01.22.914051
Berg, J. A., Zhou, Y., Cameron Waller, T., Ouyang, Y., Nowinski, S. M., Van Ry, T., George, I., Cox, J. E., Wang, B., Rutter, J. (2020). Gazing into the Metaboverse: Automated exploration and contextualization of metabolic data. bioRxiv, 171850. https://doi.org/10.1101/2020.06.25.171850
Beuchel, C., Kirsten, H., Ceglarek, U., & Scholz, M. (2020). Metabolite-Investigator: An integrated user-friendly workflow for metabolomics multi-study analysis. Bioinformatics. https://doi.org/10.1093/bioinformatics/btaa967
Bhinderwala, F., Evans, P., Jones, K., Laws, B. R., Smith, T. G., Morton, M., & Powers, R. (2020). Phosphorus NMR and its application to metabolomics. Analytical Chemistry, 92(14), 9536–9545. https://doi.org/10.1021/acs.analchem.0c00591
Bonini, P., Kind, T., Tsugawa, H., Barupal, D. K., & Fiehn, O. (2020). Retip: Retention time prediction for compound annotation in untargeted metabolomics. Analytical Chemistry, 92(11), 7515–7522. https://doi.org/10.1021/acs.analchem.9b05765
Du, B., Tian, Z., Peter, K. T., & Kolodziej, E. P. (2020). Developing unique nontarget high-resolution mass spectrometry signatures to track contaminant sources in urban waters. Environmental Science & Technology Letters, 7(12), 923–930. https://doi.org/10.1021/acs.estlett.0c00749
Capellades, J., Junza, A., Samino, S., Brunner, J. S., Schabbauer, G., Vinaixa, M., & Yanes, O. (2020). Exploring the use of gas chromatography coupled to chemical ionization mass spectrometry (GC-CI-MS) for stable isotope labeling in metabolomics. Analytical Chemistry. https://doi.org/10.1021/acs.analchem.0c02998
Castellano-Escuder, P., González-Domínguez, R., Wishart, D. S., Andrés-Lacueva, C., & Sánchez-Pla, A. (2020). FOBI: An ontology to represent food intake data and associate it with metabolomic data. Database: The journal of biological databases and curation, 2020, 2020. https://doi.org/10.1093/databa/baaa033
Charris-Molina, A., Riquelme, G., Burdisso, P., & Hoijemberg, P. A. (2020). Consecutive queries to assess biological correlation in NMR metabolomics: Performance of comprehensive search of multiplets over typical 1D 1H NMR database search. Journal of Proteome Research, 19(8), 2977–2988. https://doi.org/10.1021/acs.jproteome.9b00872
Chetnik, K., Petrick, L., & Pandey, G. (2020). MetaClean: A machine learning-based classifier for reduced false positive peak detection in untargeted LC–MS metabolomics data. Metabolomics, 16(11), 117. https://doi.org/10.1007/s11306-020-01738-3
Choudhary, K. S., Fahy, E., Coakley, K., Sud, M., Maurya, M. R., & Subramaniam, S. (2020). MetENP/MetENPWeb: An R package and web application for metabolomics enrichment and pathway analysis in Metabolomics Workbench. bioRxiv, 2020.11.20.391912. https://doi.org/10.1101/2020.11.20.391912
Choudhury, R., Beezley, J., Davis, B., Tomeck, J., Gratzl, S., Golzarri-arroyo, L., et al. (2020). Viime: Visualization and Integration of Metabolomics Experiments. Journal of Open Source Software 5, 1–13. https://doi.org/10.21105/joss.02410
Delcourt, V., Barnabé, A., Loup, B., Garcia, P., André, F., Chabot, B., et al. (2020). MetIDfyR: An open-source R package to decipher small-molecule drug metabolism through high-resolution mass spectrometry. Analytical Chemistry. https://doi.org/10.1021/acs.analchem.0c02281
Du, J., Su, Y., Qian, C., Yuan, D., Miao, K., Lee, D., et al. (2020). Raman-guided subcellular pharmaco-metabolomics for metastatic melanoma cells. Nature Communications, 11(1), 4830. https://doi.org/10.1038/s41467-020-18376-x
Dudek, C.-A., Reuse, C., Fuchs, R., Hendriks, J., Starck, V., & Hiller, K. (2020). MIAMI––a tool for non-targeted detection of metabolic flux changes for mode of action identification. Bioinformatics, 36(12), 3925–3926. https://doi.org/10.1093/bioinformatics/btaa251
Dührkop, K., Fleischauer, M., Ludwig, M., et al. (2019). SIRIUS 4: A rapid tool for turning tandem mass spectra into metabolite structure information. Nature Methods, 16, 299–302. https://doi.org/10.1038/s41592-019-0344-8
Dührkop, K., Nothias, L.-F., Fleischauer, M., Reher, R., Ludwig, M., Hoffmann, M. A., et al. (2020). Systematic classification of unknown metabolites using high-resolution fragmentation mass spectra. Nature Biotechnology. https://doi.org/10.1038/s41587-020-0740-8
Eicher, T., Kinnebrew, G., Patt, A., Spencer, K., Ying, K., Ma, Q., et al. (2020). Metabolomics and multi-omics integration: A survey of computational methods and resources. Metabolites, 10(5), 202. https://doi.org/10.3390/metabo10050202
Ekholm, J., Ohukainen, P., Kangas, A. J., Kettunen, J., Wang, Q., Karsikas, M., et al. (2020). EpiMetal: An open-source graphical web browser tool for easy statistical analyses in epidemiology and metabolomics. International Journal of Epidemiology, 49(4), 1075–1081. https://doi.org/10.1093/ije/dyz244
Fan, Z., Alley, A., Ghaffari, K., & Ressom, H. W. (2020). MetFID: artificial neural network-based compound fingerprint prediction for metabolite annotation. Metabolomics, 16(10), 104. https://doi.org/10.1007/s11306-020-01726-7
Fiehn, O. (2002). Metabolomics—The link between genotypes and phenotypes. Plant Molecular Biology, 48(1–2), 155–171. https://doi.org/10.1023/A:1013713905833
Fraisier-Vannier, O., Chervin, J., Cabanac, G., Puech, V., Fournier, S., Durand, V., et al. (2020). MS-CleanR: A feature-filtering workflow for untargeted LC–MS based metabolomics. Analytical Chemistry, 92(14), 9971–9981. https://doi.org/10.1021/acs.analchem.0c01594
Geier, B., Sogin, E. M., Michellod, D., Janda, M., Kompauer, M., Spengler, B., et al. (2020). Spatial metabolomics of in situ host–microbe interactions at the micrometre scale. Nature Microbiology, 5(3), 498–510. https://doi.org/10.1038/s41564-019-0664-6
Graham Linck, E. J., Richmond, P. A., Tarailo-Graovac, M., Engelke, U., Kluijtmans, L. A. J., Coene, K. L. M., et al. (2020). metPropagate: network-guided propagation of metabolomic information for prioritization of metabolic disease genes. npj Genomic Medicine, 5(1), 25. https://doi.org/10.1038/s41525-020-0132-5
Guo, J., & Huan, T. (2020a). Comparison of full-scan, data-dependent, and data-independent acquisition modes in liquid chromatography-mass spectrometry based untargeted metabolomics. Analytical Chemistry, 92(12), 8072–8080. https://doi.org/10.1021/acs.analchem.9b05135
Guo, J., & Huan, T. (2020b). Evaluation of significant features discovered from different data acquisition modes in mass spectrometry-based untargeted metabolomics. Analytica Chimica Acta, 1137, 37–46. https://doi.org/10.1016/j.aca.2020.08.065
Han, W., & Li, L. (2020). Evaluating and minimizing batch effects in metabolomics. Mass Spectrometry Reviews. https://doi.org/10.1002/mas.21672
Helmus, R., ter Laak, T. L., van Wezel, A. P., de Voogt, P., & Schymanski, E. L. (2021). patRoon: Open source software platform for environmental mass spectrometry based non-target screening. Journal of Cheminformatics, 13(1), 1. https://doi.org/10.1186/s13321-020-00477-w
Henning, J., & Smith, R. (2020). A web-based system for creating, viewing, and editing precursor mass spectrometry ground truth data. BMC Bioinformatics, 21(1), 418. https://doi.org/10.1186/s12859-020-03752-7
Henry, V. J., Bandrowski, A. E., Pepin, A.-S., Gonzalez, B. J., & Desfeux, A. (2014). OMICtools: an informative directory for multi-omic data analysis. Database, 2014, bau069–bau069. https://doi.org/10.1093/database/bau069
Hohrenk, L. L., Itzel, F., Baetz, N., Tuerk, J., Vosough, M., & Schmidt, T. C. (2020). Comparison of software tools for liquid chromatography–high-resolution mass spectrometry data processing in nontarget screening of environmental samples. Analytical Chemistry, 92(2), 1898–1907. https://doi.org/10.1021/acs.analchem.9b04095
Mohimani, H., Cao, L., Guler, M., & Tagirdzhanov, A. (2020). MolDiscovery: Learning Mass Spectrometry Fragmentation of Small Molecules. Research Square. https://doi.org/10.21203/rs.3.rs-71854/v1
Huang, L., Currais, A., & Shokhirev, M. N. (2020). SUMMER, a shiny utility for metabolomics and multiomics exploratory research. Metabolomics, 16(12), 126. https://doi.org/10.1007/s11306-020-01750-7
Iakab, S. A., Sementé, L., García-Altares, M., Correig, X., & Ràfols, P. (2020). Raman2imzML converts Raman imaging data into the standard mass spectrometry imaging format. BMC Bioinformatics, 21(1), 448. https://doi.org/10.1186/s12859-020-03789-8
Jahagirdar, S., & Saccenti, E. (2020). Evaluation of single sample network inference methods for metabolomics-based systems medicine. Journal of Proteome Research. https://doi.org/10.1021/acs.jproteome.0c00696
Jarmusch, A. K., Wang, M., Aceves, C. M., Advani, R. S., Aguirre, S., Aksenov, A. A., et al. (2020). ReDU: a framework to find and reanalyze public mass spectrometry data. Nature Methods. https://doi.org/10.1038/s41592-020-0916-7
Ju, R., Liu, X., Zheng, F., Zhao, X., Lu, X., Lin, X., et al. (2020). A graph density-based strategy for features fusion from different peak extract software to achieve more metabolites in metabolic profiling from high-resolution mass spectrometry. Analytica Chimica Acta, 1139, 8–14. https://doi.org/10.1016/j.aca.2020.09.029
Kachman, M., Habra, H., Duren, W., Wigginton, J., Sajjakulnukit, P., Michailidis, G., et al. (2019). Deep annotation of untargeted LC-MS metabolomics data with Binner. Bioinformatics. https://doi.org/10.1093/bioinformatics/btz798
Khakimov, B., Mobaraki, N., Trimigno, A., Aru, V., & Engelsen, S. B. (2020). Signature mapping (SigMa): An efficient approach for processing complex human urine 1H NMR metabolomics data. Analytica Chimica Acta, 1108, 142–151. https://doi.org/10.1016/j.aca.2020.02.025
Klåvus, A., Kokla, M., Noerman, S., Koistinen, V. M., Tuomainen, M., Zarei, I., et al. (2020). “Notame”: Workflow for non-targeted LC–MS metabolic profiling. Metabolites, 10(4), 135. https://doi.org/10.3390/metabo10040135
Kockmann, T., & Panse, C. (2020). rawR - Direct access to raw mass spectrometry data in R. bioRxiv, 2020.10.30.362533. https://doi.org/10.1101/2020.10.30.362533
Koelmel, J. P., Li, X., Stow, S. M., Sartain, M. J., Murali, A., Kemperman, R., et al. (2020). Lipid annotator: Towards accurate annotation in non-targeted liquid chromatography high-resolution tandem mass spectrometry (LC-HRMS/MS) lipidomics using a rapid and user-friendly software. Metabolites, 10(3), 101. https://doi.org/10.3390/metabo10030101
Kostyukevich, Y., Zherebker, A., Orlov, A., Kovaleva, O., Burykina, T., Isotov, B., & Nikolaev, E. N. (2020). Hydrogen/deuterium and 16 O/ 18 O-exchange mass spectrometry boosting the reliability of compound identification. Analytical Chemistry, 92(10), 6877–6885. https://doi.org/10.1021/acs.analchem.9b05379
Kouřil, Š, de Sousa, J., Václavík, J., Friedecký, D., & Adam, T. (2020). CROP: correlation-based reduction of feature multiplicities in untargeted metabolomic data. Bioinformatics, 36(9), 2941–2942. https://doi.org/10.1093/bioinformatics/btaa012
Krassowski, M., Das, V., Sahu, S. K., & Misra, B. B. (2020). State of the field in multi-omics research: From computational needs to data mining and sharing. Frontiers in Genetics. https://doi.org/10.3389/fgene.2020.610798
Kuhn, S., Colreavy-Donnelly, S., de Andrade Silva Quaresma, L. E., de Andrade Silva Quaresma, E., & Borges, R. M. (2020). Applying NMR compound identification using NMRfilter to match predicted to experimental data. Metabolomics, 16(12), 123. https://doi.org/10.1007/s11306-020-01748-1
Kuhring, M., Eisenberger, A., Schmidt, V., Kränkel, N., Leistner, D. M., Kirwan, J., & Beule, D. (2020). Concepts and software package for efficient quality control in targeted metabolomics studies: MeTaQuaC. Analytical Chemistry, 92(15), 10241–10245. https://doi.org/10.1021/acs.analchem.0c00136
Kutuzova, S., Colaianni, P., Röst, H., Sachsenberg, T., Alka, O., Kohlbacher, O., et al. (2020). SmartPeak automates targeted and quantitative metabolomics data processing. Analytical Chemistry, 92(24), 15968–15974. https://doi.org/10.1021/acs.analchem.0c03421
Letertre, M. P. M., Dervilly, G., & Giraudeau, P. (2020). Combined nuclear magnetic resonance spectroscopy and mass spectrometry approaches for metabolomics. Analytical Chemistry. https://doi.org/10.1021/acs.analchem.0c04371
Li, Y., Bouza, M., Wu, C., Guo, H., Huang, D., Doron, G., et al. (2020). Sub-nanoliter metabolomics via mass spectrometry to characterize volume-limited samples. Nature Communications, 11(1), 5625. https://doi.org/10.1038/s41467-020-19444-y
Liang, D., Liu, Q., Zhou, K., Jia, W., Xie, G., & Chen, T. (2020). IP4M: an integrated platform for mass spectrometry-based metabolomics data mining. BMC Bioinformatics, 21(1), 444. https://doi.org/10.1186/s12859-020-03786-x
Liebal, U. W., Phan, A. N. T., Sudhakar, M., Raman, K., & Blank, L. M. (2020). Machine learning applications for mass spectrometry-based metabolomics. Metabolites, 10(6), 243. https://doi.org/10.3390/metabo10060243
Liu, K. H., Nellis, M., Uppal, K., Ma, C., Tran, V., Liang, Y., et al. (2020a). Reference standardization for quantification and harmonization of large-scale metabolomics. Analytical Chemistry. https://doi.org/10.1021/acs.analchem.0c00338
Liu, Y., Mrzic, A., Meysman, P., De Vijlder, T., Romijn, E. P., Valkenborg, D., et al. (2020b). MESSAR: Automated recommendation of metabolite substructures from tandem mass spectra. PLoS ONE, 15(1), e0226770. https://doi.org/10.1371/journal.pone.0226770
Lloyd, G. R., Jankevics, A., & Weber, R. J. M. (2020). struct: An R/Bioconductor-based framework for standardized metabolomics data analysis and beyond. Bioinformatics. https://doi.org/10.1093/bioinformatics/btaa1031
Lu, W., Xing, X., Wang, L., Chen, L., Zhang, S., McReynolds, M. R., & Rabinowitz, J. D. (2020). Improved annotation of untargeted metabolomics data through buffer modifications that shift adduct mass and intensity. Analytical Chemistry, 92(17), 11573–11581. https://doi.org/10.1021/acs.analchem.0c00985
Luan, H., Jiang, X., Ji, F., Lan, Z., Cai, Z., & Zhang, W. (2020). CPVA: A web-based metabolomic tool for chromatographic peak visualization and annotation. Bioinformatics, 36(12), 3913–3915. https://doi.org/10.1093/bioinformatics/btaa200
Madrid-Gambin, F., Oller-Moreno, S., Fernandez, L., Bartova, S., Giner, M. P., Joyce, C., et al. (2020). AlpsNMR: An R package for signal processing of fully untargeted NMR-based metabolomics. Bioinformatics, 36(9), 2943–2945. https://doi.org/10.1093/bioinformatics/btaa022
Mahmud, I., & Garrett, T. J. (2020). Mass spectrometry techniques in emerging pathogens studies: COVID-19 Perspectives. Journal of the American Society for Mass Spectrometry, 31(10), 2013–2024. https://doi.org/10.1021/jasms.0c00238
Manjarin, R., Maj, M. A., La Frano, M. R., & Glanz, H. (2020). %polynova_2way: A SAS macro for implementation of mixed models for metabolomics data. PLoS ONE, 15(12), e0244013. https://doi.org/10.1371/journal.pone.0244013
Matyushin, D. D., Sholokhova, A. Y., & Buryak, A. K. (2020). Deep learning driven GC-MS library search and its application for metabolomics. Analytical Chemistry, 92(17), 11818–11825. https://doi.org/10.1021/acs.analchem.0c02082
McLean, C., & Kujawinski, E. B. (2020). AutoTuner: High fidelity and robust parameter selection for metabolomics data processing. Analytical Chemistry, 92(8), 5724–5732. https://doi.org/10.1021/acs.analchem.9b04804
Misra, B. B. (2020). Open-source software tools, databases, and resources for single-cell and single-cell-type metabolomics. In B. Shrestha (Ed.), Single cell metabolism. Methods in molecular biology. (Vol. 2064, pp. 191–217). New York: Humana. https://doi.org/10.1007/978-1-4939-9831-9_15
Misra, B. B. (2020). Data normalization strategies in metabolomics: Current challenges, approaches, and tools. European Journal of Mass Spectrometry, 26, 165–174
Misra, B. B. (2020). The connection and disconnection between microbiome and metabolome: A critical appraisal in clinical research. Biological Research For Nursing, 22, 561
Misra, B. B., & Olivier, M. (2020). High resolution GC-orbitrap-MS metabolomics using both electron ionization and chemical ionization for analysis of human plasma. Journal of Proteome Research, 19(7), 2717–2731. https://doi.org/10.1021/acs.jproteome.9b00774
Misra, B., & van der Hooft, J. (2015). Updates in metabolomics tools and resources: 2014–2015. Electrophoresis, 37(1), 86–110
Mohamed, A., Molendijk, J., & Hill, M. M. (2020). lipidr: A software tool for data mining and analysis of lipidomics datasets. Journal of Proteome Research, 19(7), 2890–2897. https://doi.org/10.1021/acs.jproteome.0c00082
MZmine Development Team. (2015). MZmine 2 manual, (c), 14.
Naylor, B. C., Catrow, J. L., Maschek, J. A., & Cox, J. E. (2020). QSRR automator: A tool for automating retention time prediction in lipidomics and metabolomics. Metabolites, 10(6), 237. https://doi.org/10.3390/metabo10060237
Ni, Z., & Fedorova, M. (2020). LipidLynxX: A data transfer hub to support integration of large scale lipidomics datasets. bioRxiv, 033894. https://doi.org/10.1101/2020.04.09.033894
O’Shea, K., & Misra, B. B. (2020). Software tools, databases and resources in metabolomics: Updates from 2018 to 2019. Metabolomics, 16(3), 1–23
Peng, B., Kopczynski, D., Pratt, B. S., Ejsing, C. S., Burla, B., Hermansson, M., et al. (2020). LipidCreator workbench to probe the lipidomic landscape. Nature Communications, 11(1), 2057. https://doi.org/10.1038/s41467-020-15960-z
Phapale, P., Palmer, A., Gathungu, R. M., Kale, D., Brügger, B., & Alexandrov, T. (2021). Public LC-orbitrap tandem mass spectral library for metabolite identification. Journal of Proteome Research. https://doi.org/10.1021/acs.jproteome.0c00930
Pietzke, M., & Vazquez, A. (2020). Metabolite AutoPlotter—an application to process and visualise metabolite data in the web browser. Cancer & Metabolism, 8(1), 15. https://doi.org/10.1186/s40170-020-00220-x
Pomyen, Y., Wanichthanarak, K., Poungsombat, P., Fahrmann, J., Grapov, D., & Khoomrung, S. (2020). Deep metabolome: Applications of deep learning in metabolomics. Computational and Structural Biotechnology Journal, 18, 2818–2825. https://doi.org/10.1016/j.csbj.2020.09.033
Quiroz-Moreno, C., Furlan, M. F., Belinato, J. R., Augusto, F., Alexandrino, G. L., & Mogollón, N. G. S. (2020). RGCxGC toolbox: An R-package for data processing in comprehensive two-dimensional gas chromatography-mass spectrometry. Microchemical Journal, 156, 104830. https://doi.org/10.1016/j.microc.2020.104830
Rawlinson, C., Jones, D., Rakshit, S., Meka, S., Moffat, C. S., & Moolhuijzen, P. (2020). Hierarchical clustering of MS/MS spectra from the firefly metabolome identifies new lucibufagin compounds. Scientific Reports. https://doi.org/10.1038/s41598-020-63036-1
Ricart, E., Pupin, M., Müller, M., & Lisacek, F. (2020). Automatic annotation and dereplication of tandem mass spectra of peptidic natural products. Analytical Chemistry. https://doi.org/10.1021/acs.analchem.0c03208
Riquelme, G., Zabalegui, N., Marchi, P., Jones, C. M., & Monge, M. E. (2020). A python-based pipeline for preprocessing LC–MS data for untargeted metabolomics workflows. Metabolites, 10(10), 416. https://doi.org/10.3390/metabo10100416
Schmid, R., Petras, D., Nothias, L-F., Wang, M., Aron, A. T., Jagels, A., Tsugawa, H., Rainer, J., Garcia-Aloy, M., Dührkop, K., Korf, A., Pluskal, T., Kameník, Z., Jarmusch, A. K., Andrés Mauricio Caraballo-Rodrígu, P. C. D. (2020). Ion Identity Molecular Networking in the GNPS Environment. bioRxiv, 088948. https://doi.org/10.1101/2020.05.11.088948
Rosa, T. R., Folli, G. S., Pacheco, W. L. S., Castro, M. P., Romão, W., & Filgueiras, P. R. (2020). DropMS: Petroleomics data treatment based in web server for high-resolution mass spectrometry. Journal of the American Society for Mass Spectrometry, 31(7), 1483–1490. https://doi.org/10.1021/jasms.0c00109
Ross, D. H., Cho, J. H., Zhang, R., Hines, K. M., & Xu, L. (2020). LiPydomics: A python package for comprehensive prediction of lipid collision cross sections and retention times and analysis of ion mobility-mass spectrometry-based lipidomics data. Analytical Chemistry. https://doi.org/10.1021/acs.analchem.0c02560
Sajulga, R., Easterly, C., Riffle, M., Mesuere, B., Muth, T., Mehta, S., et al. (2020). Survey of metaproteomics software tools for functional microbiome analysis. PLoS ONE, 15(11), e0241503. https://doi.org/10.1371/journal.pone.0241503
Sarvin, B., Lagziel, S., Sarvin, N., Mukha, D., Kumar, P., Aizenshtein, E., & Shlomi, T. (2020). Fast and sensitive flow-injection mass spectrometry metabolomics by analyzing sample-specific ion distributions. Nature Communications, 11(1), 3186. https://doi.org/10.1038/s41467-020-17026-6
Schum, S. K., Brown, L. E., & Mazzoleni, L. R. (2020). MFAssignR: Molecular formula assignment software for ultrahigh resolution mass spectrometry analysis of environmental complex mixtures. Environmental Research, 191, 110114. https://doi.org/10.1016/j.envres.2020.110114
Sen, P., Lamichhane, S., Mathema, V. B., McGlinchey, A., Dickens, A. M., Khoomrung, S., & Orešič, M. (2020). Deep learning meets metabolomics: A methodological perspective. Briefings in Bioinformatics. https://doi.org/10.1093/bib/bbaa204
Sorokina, M., & Steinbeck, C. (2020). Review on natural products databases: where to find data in 2020. Journal of Cheminformatics, 12(1), 20. https://doi.org/10.1186/s13321-020-00424-9
Southam, A. D., Pursell, H., Frigerio, G., Jankevics, A., Weber, R. J. M., & Dunn, W. B. (2020). Characterization of monophasic solvent-based tissue extractions for the detection of polar metabolites and lipids applying ultrahigh-performance liquid chromatography-mass spectrometry clinical metabolic phenotyping assays. Journal of Proteome Research. https://doi.org/10.1021/acs.jproteome.0c00660
Spicer, R., Salek, R. M., Moreno, P., Cañueto, D., & Steinbeck, C. (2017). Navigating freely-available software tools for metabolomics analysis. Metabolomics, 13(9), 106. https://doi.org/10.1007/s11306-017-1242-7
Spraker, J. E., Luu, G. T., & Sanchez, L. M. (2020). Imaging mass spectrometry for natural products discovery: A review of ionization methods. Natural Product Reports, 37(2), 150–162. https://doi.org/10.1039/C9NP00038K
Kim, T., Tang, O., Vernon, S. T., Kott, K. A., Koay, Y. C., Park, J., James, D., Speed, T. P., Yang, P., O’Sullivan, J. F., Figtree, G. A., & Yang, J. Y. H. (2020). hRUV: Hierarchical approach to removal of unwanted variation for large-scale metabolomics data. bioRxiv, 423723. https://doi.org/10.1101/2020.12.21.423723
Tarazona, S., Balzano-Nogueira, L., Gómez-Cabrero, D., Schmidt, A., Imhof, A., Hankemeier, T., et al. (2020). Harmonization of quality metrics and power calculation in multi-omic studies. Nature Communications, 11(1), 3092. https://doi.org/10.1038/s41467-020-16937-8
Tautenhahn, R., Böttcher, C., & Neumann, S. (2008). Highly sensitive feature detection for high resolution LC/MS. BMC Bioinformatics, 9(1), 504. https://doi.org/10.1186/1471-2105-9-504
Teo, G., Chew, W. S., Burla, B. J., Herr, D., Tai, E. S., Wenk, M. R., et al. (2020). MRMkit: Automated data processing for large-scale targeted metabolomics analysis. Analytical Chemistry. https://doi.org/10.1021/acs.analchem.0c03060
Thomen, A., Najafinobar, N., Penen, F., Kay, E., Upadhyay, P. P., Li, X., et al. (2020). Subcellular mass spectrometry imaging and absolute quantitative analysis across organelles. ACS Nano, 14(4), 4316–4325. https://doi.org/10.1021/acsnano.9b09804
Thompson, C. J., Witt, M., Forcisi, S., Moritz, F., Kessler, N., Laukien, F. H., & Schmitt-Kopplin, P. (2020). An enhanced isotopic fine structure method for exact mass analysis in discovery metabolomics: FIA-CASI-FTMS. Journal of the American Society for Mass Spectrometry, 31(10), 2025–2034. https://doi.org/10.1021/jasms.0c00047
Tripathi, A., Vázquez-Baeza, Y., Gauglitz, J. M., Wang, M., Dührkop, K., Nothias-Esposito, M., et al. (2020). Chemically informed analyses of metabolomics mass spectrometry data with Qemistree. Nature Chemical Biology. https://doi.org/10.1038/s41589-020-00677-3
Tsugawa, H., Cajka, T., Kind, T., Ma, Y., Higgins, B., Ikeda, K., et al. (2015). MS-DIAL: Data-independent MS/MS deconvolution for comprehensive metabolome analysis. Nature Methods, 12(6), 523–526. https://doi.org/10.1038/nmeth.3393
Tsugawa, H., Ikeda, K., Takahashi, M., Satoh, A., Mori, Y., Uchino, H., et al. (2020). A lipidome atlas in MS-DIAL 4. Nature Biotechnology, 38(10), 1159–1163. https://doi.org/10.1038/s41587-020-0531-2
van der Laan, T., Dubbelman, A.-C., Duisters, K., Kindt, A., Harms, A. C., & Hankemeier, T. (2020). High-throughput fractionation coupled to mass spectrometry for improved quantitation in metabolomics. Analytical Chemistry, 92(21), 14330–14338. https://doi.org/10.1021/acs.analchem.0c01375
Wajid, B., Iqbal, H., Jamil, M., Rafique, H., & Anwar, F. (2020). MetumpX—a metabolomics support package for untargeted mass spectrometry. Bioinformatics, 36(5), 1647–1648. https://doi.org/10.1093/bioinformatics/btz765
Wang, M., Jarmusch, A. K., Vargas, F., Aksenov, A. A., Gauglitz, J. M., Weldon, K., et al. (2020). Mass spectrometry searches using MASST. Nature Biotechnology, 38(1), 23–26. https://doi.org/10.1038/s41587-019-0375-9
Wang, M., Leber, C., Nothias, L., Reher, R., Kang, K. B., van der Hooft, J. J., et al. (2020). NPClassifier: A deep neural network-based structural classification tool for natural products. ChemRxiv. https://doi.org/10.26434/chemrxiv.12885494.v1
Weber, P., Pauling, J. K., List, M., & Baumbach, J. (2020). BALSAM—An interactive online platform for breath analysis visualization and classification. Metabolites, 10(10), 393. https://doi.org/10.3390/metabo10100393
Witting, M., & Böcker, S. (2020). Current status of retention time prediction in metabolite identification. Journal of Separation Science, 43(9–10), 1746–1754. https://doi.org/10.1002/jssc.202000060
Wolthuis, J. C., Magnusdottir, S., Pras-Raves, M., Moshiri, M., Jans, J. J. M., Burgering, B., et al. (2020). MetaboShiny: Interactive analysis and metabolite annotation of mass spectrometry-based metabolomics data. Metabolomics, 16(9), 99. https://doi.org/10.1007/s11306-020-01717-8
Wörheide, M. A., Krumsiek, J., Kastenmüller, G., & Arnold, M. (2021). Multi-omics integration in biomedical research—A metabolomics-centric review. Analytica Chimica Acta, 1141, 144–162. https://doi.org/10.1016/j.aca.2020.10.038
Wu, C.-T., Wang, Y., Wang, Y., Ebbels, T., Karaman, I., Graça, G., et al. (2020). Targeted realignment of LC-MS profiles by neighbor-wise compound-specific graphical time warping with misalignment detection. Bioinformatics, 36(9), 2862–2871. https://doi.org/10.1093/bioinformatics/btaa037
Xing, S., Hu, Y., Yin, Z., Liu, M., Tang, X., Fang, M., & Huan, T. (2020). Retrieving and utilizing hypothetical neutral losses from tandem mass spectra for spectral similarity analysis and unknown metabolite annotation. Analytical Chemistry. https://doi.org/10.1021/acs.analchem.0c02521
Xue, J., Guijas, C., Benton, H. P., Warth, B., & Siuzdak, G. (2020). METLIN MS2 molecular standards database: a broad chemical and biological resource. Nature Methods. https://doi.org/10.1038/s41592-020-0942-5
Yang, Q., Wang, Y., Zhang, Y., Li, F., Xia, W., Zhou, Y., et al. (2020). NOREVA: Enhanced normalization and evaluation of time-course and multi-class metabolomic data. Nucleic Acids Research, 48(W1), W436–W448. https://doi.org/10.1093/nar/gkaa258
Zhang, F., Ge, W., Ruan, G., Cai, X., & Guo, T. (2020). Data-independent acquisition mass spectrometry-based proteomics and software tools: A glimpse in 2020. Proteomics, 20(17–18), 1900276. https://doi.org/10.1002/pmic.201900276
Zhang, J., Sans, M., Garza, K. Y., & Eberlin, L. S. (2020). Mass spectrometry technologies to advance care for cancer patients in clinical and intraoperative use. Mass Spectrometry Reviews. https://doi.org/10.1002/mas.21664
Zhao, S., & Li, L. (2020). Chemical derivatization in LC-MS-based metabolomics study. TrAC Trends in Analytical Chemistry, 131, 115988. https://doi.org/10.1016/j.trac.2020.115988
Zheng, F., Zhao, X., Zeng, Z., Wang, L., Lv, W., Wang, Q., & Xu, G. (2020). Development of a plasma pseudotargeted metabolomics method based on ultra-high-performance liquid chromatography–mass spectrometry. Nature Protocols, 15(8), 2519–2537. https://doi.org/10.1038/s41596-020-0341-5
Zhou, Z., Luo, M., Chen, X., Yin, Y., Xiong, X., Wang, R., & Zhu, Z.-J. (2020). Ion mobility collision cross-section atlas for known and unknown metabolite annotation in untargeted metabolomics. Nature Communications, 11(1), 4334. https://doi.org/10.1038/s41467-020-18171-8

Acknowledgements

I acknowledge the efforts of the informatics and computational resource developers who drive the field forward with their code, packages, tools, and resources, enabling metabolomists, biologists, and analytical chemists to keep pace with the volume and complexity of the metabolomics data generated. I also apologize to all investigators whose tools and resources may have been inadvertently missed in this review. I thank the independent reviewers and the editor for their comments, which helped improve this manuscript.

Funding

None.

Author information

Authors and Affiliations

Enveda Biosciences, Inc., Boulder, CO, 80301, USA
Biswapriya B. Misra

Corresponding author

Correspondence to Biswapriya B. Misra.

Ethics declarations

Conflict of interest

None.

Ethical approval

This article does not contain any studies with human and/or animal participants performed by the authors.

Research involving human and/or animal participants

None.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Misra, B.B. New software tools, databases, and resources in metabolomics: updates from 2020. Metabolomics 17, 49 (2021). https://doi.org/10.1007/s11306-021-01796-1

Received: 01 January 2021
Accepted: 17 April 2021
Published: 11 May 2021
DOI: https://doi.org/10.1007/s11306-021-01796-1
