Background

The human gut harbours trillions of microbes that play essential roles in many physiological and pathological processes. Disturbances of gut microbiome homeostasis can cause many diseases. For example, enrichment of Fusobacterium nucleatum can induce colorectal cancer metastasis [1]. Many other diseases, such as Parkinson's disease and diabetes, are also affected by microbes, yet the underlying mechanisms remain unclear [2, 3]. Thanks to advances in technologies such as sequencing, we can now observe and analyse the composition and status of gut microbes. This can help to diagnose diseases early and even to propose new treatments based on the increase or decrease of particular microbial populations in disease [4]. Although numerous studies have uncovered relationships among diseases, microbes, and metabolites, these associations remain scattered across the literature [5].

In recent decades, a growing number of studies have shown that microbes affect human health through their metabolites [6]. Metabolites are among the key factors that drive the interaction between human gut microbes and diseases. For example, trimethylamine N-oxide (TMAO), a metabolite derived from the gut microbiota, has been widely reported to be associated with cardiovascular disease [7]. Recently, some studies have also reported that TMAO might be a key activator of antitumor immunity.

Validation of predicted disease-metabolite association

To prioritize and filter meaningful disease-metabolite associations and highlight the most interesting results, we defined an association strength score and a confidence score. As a rough evaluation of the accuracy of our method, we collected 138 experimentally supported disease-metabolite associations from 49 published studies. Among them, 36 associations had meaningful scores, and for 26 of these (72.2%) the direction of the disease-metabolite association was consistent (Supplementary table 1). We do not aim to recover every disease-metabolite association; our main purpose is to ensure that the associations we select are genuine with high probability and deserve further study as drug or marker candidates.
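The consistency rate quoted above is a simple proportion over the scored associations. The following is a minimal sketch of that arithmetic, assuming each validated association is reduced to a pair of signs (the direction reported in the literature and the sign of the association strength score). The function name and the toy pairs are hypothetical; they reproduce only the reported counts (26 of 36) and are not the entries of Supplementary table 1.

```python
def direction_consistency(pairs):
    """Count pairs whose literature and predicted directions agree.

    `pairs` is an iterable of (literature_sign, predicted_sign) tuples,
    each sign being +1 or -1. Returns (n_consistent, n_total, rate).
    """
    pairs = list(pairs)
    n_total = len(pairs)
    n_consistent = sum(1 for lit, pred in pairs if lit == pred)
    rate = n_consistent / n_total if n_total else 0.0
    return n_consistent, n_total, rate


# Toy data (hypothetical): 26 agreeing and 10 disagreeing pairs,
# matching only the counts reported in the text.
toy_pairs = [(+1, +1)] * 26 + [(+1, -1)] * 10

n_ok, n_all, rate = direction_consistency(toy_pairs)
print(f"{n_ok}/{n_all} = {rate:.1%}")  # prints "26/36 = 72.2%"
```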

For example, when “Diabetes Mellitus, Type 2” (T2DM) is entered in the “Search (Disease-Metabolite)” query box, GMMAD returns 2137 T2DM-associated metabolites. Each entry has an association strength score and a confidence score, and the user can browse the entries in order of score value. When all “Diabetes Mellitus, Type 2” entries are sorted by association strength score, “nadide” has the third strongest negative association with T2DM (Sas = -0.625 and Sac = 3.13). Nadide (also called NAD+, nicotinamide adenine dinucleotide) is an indispensable coenzyme in the human body, notably in the tricarboxylic acid cycle. It plays a crucial role in various biological processes such as metabolism, stress responses, and cell differentiation. In patients with T2DM, NAD+ synthesis is severely impaired [28, 29]. Many researchers are trying to promote NAD+ synthesis as a treatment for T2DM, and several drugs based on this mechanism have been developed [30, 31]. Similarly, “Melatonin” has the fifth strongest negative association with T2DM (Sas = -0.500 and Sac = 2.00). Melatonin is well known for its sleep-promoting effects. In recent years, studies have found that melatonin has a hypoglycaemic effect, and its mechanism might be related to improving insulin resistance, protecting pancreatic β-cells, and regulating the hypothalamus–pituitary–adrenal axis. These findings highlight melatonin and melatonin-related bacteria and metabolites as potential therapeutic targets for T2DM [32, 33]. Together, these examples suggest that our method can effectively surface useful disease-metabolite associations.
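The browsing step described above, ordering a disease's entries by association strength score and reading off the strongest negative associations, can be sketched as follows. This is an illustration only and not GMMAD's implementation: the `Association` record and the `strongest_negative` helper are hypothetical names, the two real entries carry only the scores quoted in the text, and the third entry is a labelled placeholder.

```python
from typing import NamedTuple


class Association(NamedTuple):
    metabolite: str
    sas: float  # association strength score; the sign gives the direction
    sac: float  # confidence score


def strongest_negative(entries, top_n=None):
    """Return negative associations, most negative Sas first."""
    negatives = [e for e in entries if e.sas < 0]
    ranked = sorted(negatives, key=lambda e: e.sas)
    return ranked if top_n is None else ranked[:top_n]


entries = [
    Association("Melatonin", -0.500, 2.00),              # scores quoted in the text
    Association("placeholder_metabolite", 0.250, 1.00),  # hypothetical, illustration only
    Association("nadide", -0.625, 3.13),                 # scores quoted in the text
]

for e in strongest_negative(entries):
    print(f"{e.metabolite}\tSas={e.sas:.3f}\tSac={e.sac:.2f}")
```

With only these three records, nadide is listed before Melatonin; the ranks quoted in the text (third and fifth) refer to the full set of 2137 entries, which is not reproduced here.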