Machine learning of serum metabolic patterns encodes early-stage lung adenocarcinoma

Huang, Lin; Wang, Lin; Hu, **aomeng; Chen, Sen; Tao, Yunwen; Su, Haiyang; Yang, **g; Xu, Wei; Vedarethinam, Vadanasundari; Wu, Shu; Liu, Bin; Wan, **nze; Lou, Jiatao; Wang, Qian; Qian, Kun

doi:10.1038/s41467-020-17347-6

Article
Open access
Published: 16 July 2020

Volume 11, article number 3556, (2020)

Abstract

Early cancer detection greatly increases the chances for successful treatment, but available diagnostics for some tumours, including lung adenocarcinoma (LA), are limited. An ideal early-stage diagnosis of LA for large-scale clinical use must address quick detection, low invasiveness, and high performance. Here, we conduct machine learning of serum metabolic patterns to detect early-stage LA. We extract direct metabolic patterns by the optimized ferric particle-assisted laser desorption/ionization mass spectrometry within 1 s using only 50 nL of serum. We define a metabolic range of 100–400 Da with 143 m/z features. We diagnose early-stage LA with a sensitivity of ~70–90% and a specificity of ~90–93% through sparse regression machine learning of the patterns. We identify a biomarker panel of seven metabolites and relevant pathways to distinguish early-stage LA from controls (p < 0.05). Our approach advances the design of metabolic analysis for early cancer detection and holds promise as an efficient test for low-cost rollout to clinics.

Introduction

Early diagnosis improves the survival rates of many types of cancer. For lung adenocarcinoma (LA), which accounts for almost half of all lung cancers and has a mortality rate up to 80%, early diagnosis can increase the 5-year survival rate to 52% and reduce the costs of management of the disease¹. However, conventional diagnostics using proteomic/genomic biomarkers or in vivo imaging are limited considering the detection throughput, diagnosis accuracy, analysis speed, and sampling invasiveness, particularly for early-stage LA^2,3.

Serum analysis holds promise for early diagnosis of LA⁴ and is superior to traditional biopsy and computed tomography (CT) methods⁵, because serum analysis is non-invasive and low-cost for point-of-care testing (POCT)^6,7 and has the desirable adaptability for universal applications. Most current serum analysis for the diagnosis of LA relies on selected genomic^8,9 or proteomic¹⁰ biomarkers with limited sensitivity and specificity.

Metabolic analysis of serum is more distal than genomic and proteomic approaches for precision diagnostics^11,12,13, but it has rarely been reported or studied for complex diseases such as LA, owing to the lack of efficient metabolite detection tools and systematically designed patient sub-groups. Changes in metabolism are associated with diverse diseases including LA^6,14. Specifically, malignant transformations are associated with altered metabolic pathways for biosynthetic and bioenergetic processes, which are reflected in the blood metabolome. Serum metabolite-guided approaches have been applied to detect blood metabolic fingerprints and to identify biomarkers in various diseases, including pancreatic adenocarcinoma¹⁵, acute myeloid leukaemia¹⁶, and hepatic steatosis¹⁷. These changes can be used for diagnostic purposes, hence the intense interest in extracting and deciphering serum metabolic information. An advanced analytical tool is therefore urgently needed for the metabolic screening of early-stage diseases, including LA.

Spectrometry methods, including nuclear magnetic resonance (NMR)¹⁸ and mass spectrometry (MS), particularly laser desorption/ionization (LDI) MS, enable high-throughput extraction and measurement of metabolomic information, while tandem MS allows accurate identification of metabolites¹⁹. However, the metabolite abundance and sample complexity affect MS analysis, and rigorous pre-treatment procedures are required for enrichment and separation of metabolites from complex bio-mixtures.

Substrates decide the efficacy of LDI MS. The tailoring of material interfaces optimizes designed interactions between molecules and substrate materials for analytical use^20,21. For LDI MS, there have been global efforts, including ours, to engineer substrate materials^22,23,24. An ideal substrate material for LDI MS-based metabolic analysis should have the following properties: (1) nanoscale surface roughness with stability for the selective LDI of metabolites²⁵; (2) favourable surface charge for ion formation and conductivity for electron transfer²⁶; and (3) easy preparation with low costs for mass production aimed at large-scale clinic use. The current materials being used, including noble metals^27,28, silicon²⁶, carbon²⁹, metal oxides²³, and their hybrids, only have some of these properties, so novel material-based platforms combining all of the above merits are a pressing need for the practical use of LDI MS in clinics.

A further challenge is the processing of MS big data in serum samples to obtain the necessary accuracy. Machine learning of imaging and omic information has enjoyed huge success for diagnostic use in clinics³⁰. Compared with in vivo imaging and biopsy methods^31,32 that require expensive and invasive equipment, in vitro omics diagnostic methods are advantageous, although they require big data. As one of the major tools for omic information collection, MS techniques^33,34 (such as MasSpec Pen for cancer tissues) have afforded big data for processing and interpretation by machine learning. Notably, the selection and optimization of algorithms are required to apply machine learning in disease diagnostics.

Given the biological significance of small metabolites (molecular weight (MW) <1000 Da) as end products of pathways and the limited performance of LDI MS in complex bio-mixtures, tackling the major problems in sample treatment, substrate materials, and data analysis for MS will lead to insights into metabolic pathways and identify effective diagnostic metabolic biomarkers. Here, we optimize the LDI MS approach by improving the substrate used, so that a large range of metabolites (including biologically relevant metabolites) can be analysed as metabolic patterns from serum samples without pretreatment. Once encoded by a machine-learning algorithm, the serum metabolic patterns enable diagnosis of early-stage LA with high specificity and sensitivity, and allow large-scale and low-cost rollout for use in clinics. Our approach contributes to the design of advanced metabolic analysis protocols for use in the development of precision medicine, and will lead to the development of personalized diagnostic tools for diverse diseases including but not limited to LA in the near future.

Results

Optimization of substrate material for selective LDI MS

To enable efficient extraction of serum metabolic patterns by LDI MS, we first prepared ferric particles using a modified low-cost solvo-thermal method, yielding ~0.5 g of product from a single experiment (Fig. 1 and Supplementary Fig. 1a). The ferric particles consisted of nanocrystals (~5 nm diameter), as shown by transmission electron microscopy (TEM) (Fig. 1a). High-resolution TEM (HR-TEM) demonstrated the polycrystalline structure of the ferric particles (Supplementary Fig. 1b), as did the diffraction pattern obtained by selected area electron diffraction (SAED, inset of Fig. 1a). By scanning electron microscopy (SEM), we observed a raspberry-like morphology of the ferric particles, which were of uniform size (~300 nm diameter, polydispersity index (PDI) of 0.155) and had a rough surface (Fig. 1b and inset), in agreement with the TEM and dynamic light scattering (DLS) results (Supplementary Fig. 1c). These particles exhibited a large surface area of 154 m² g⁻¹ (Supplementary Fig. 1d), validating the existence of crevices on the rough surface that selectively accommodate metabolites rather than proteins, and could undergo simple and fast (~45 s) separation with a magnet owing to their superparamagnetic property (Supplementary Fig. 1e). We investigated the laser absorption properties of the particles and found strong absorption in the ultraviolet–visible region of 270–1100 nm (Supplementary Fig. 1f). We concluded that these ferric particles with designer structure might be ideal as a matrix for LDI MS.

**Fig. 1: Substrate material characteristics and schematics of extraction and machine-learning workflow.**

Optimizing the surface charge of substrate particles is critical for the LDI MS process of extracting serum metabolic patterns to allow ion formation and conductivity for electron transfer (Fig. 1c). We controlled the surface charge of the ferric particles during synthesis (Supplementary Fig. 2a), demonstrating that negatively charged particles with a zeta potential of –11.5 ± 2.65 mV produced by 0.4 g trisodium citrate afforded the optimized serum metabolite profile in LDI MS (Supplementary Fig. 2b) due to the enhanced formation of a positive metal ion layer on the surface to produce cation-adducted species. From 0 to 0.4 g of trisodium citrate, the metabolite signals with a signal-to-noise ratio (S/N) > 3 increased in number. Further increasing the amount of trisodium citrate resulted in no improvement in the number of metabolite signals. In addition, the ferric particles we produced had a specific band gap of <3 eV, with specific ultraviolet absorption that could be easily excited (from ground state E₀ to excitation state E₁) by a 355 nm laser for facile electron transfer during ionization (Fig. 1c).
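As a quick plausibility check of the excitation argument, the photon energy of the 355 nm laser can be compared with the stated band gap. This is a worked estimate using the standard relation E = hc/λ with hc ≈ 1239.84 eV nm; it is not a calculation taken from the paper.

$$E_{\mathrm{photon}} = \frac{hc}{\lambda} \approx \frac{1239.84\ \mathrm{eV\,nm}}{355\ \mathrm{nm}} \approx 3.49\ \mathrm{eV} > 3\ \mathrm{eV}$$

A single 355 nm photon therefore carries more energy than a band gap of <3 eV, which is consistent with the particles being readily excited from E₀ to E₁.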

We also compared LDI MS results obtained with the conventional organic matrix (α-cyano-4-hydroxycinnamic acid, CHCA) and inorganic matrices (silica and carbon nanoparticles), together with blank controls using no matrix. These alternatives showed either strong interference in the low mass range or limited sensitivity/selectivity in the analysis of bio-samples, demonstrating the superiority of our approach (Supplementary Fig. 3). Specifically, in the control experiments, we observed no signals by LDI MS without any matrix due to low LDI efficiency (Supplementary Fig. 3a). We obtained overwhelming background noise with few peaks from small metabolites using the organic matrix (CHCA) and carbon particles (Supplementary Fig. 3b, c) and could only recognize the glucose signal using silica nanoparticles (Supplementary Fig. 3d), all of which demonstrated the advantages of ferric particles over current matrices. Notably, the rough surface of the particles offered abundant cavities for the selective and sensitive LDI of small metabolites in the presence of salts and proteins (Supplementary Fig. 4a–c), while the stable crystalline structure prevented unwanted fragmentation under laser irradiation. The features of the ferric particles that we designed promised the efficient extraction of metabolic patterns from complex fluids (e.g. serum) based on selective LDI that would enable subsequent data analysis (Fig. 1d).

Four major aspects motivated the selection of ferric particles as the substrate for our method: photo-thermal properties, preparation process, structural stability, and experimental cost. For photo-thermal properties, ferric particles show strong laser absorption (absorption coefficient at 355 nm of ~3.6 × 10⁵ cm⁻¹) and low thermal conductivity (heat capacity of 653 J (kg K)⁻¹). Thus, ferric particles can be heated to a high temperature by the laser irradiation, promoting efficient molecular desorption^35,36. For the preparation process, the solvo-thermal synthesis of the ferric particles is facile, and the yield of ~0.5 g of product is enough to detect ~10⁶ samples for large-scale clinical use. For comparison, the preparation of various types of silicon substrates requires complicated devices and procedures, such as micro-electro-mechanical systems (MEMS)³⁷. For structural stability, ferric particles with a stable polycrystalline structure prevented unwanted fragmentation under laser irradiation, in contrast to carbon nanomaterials (Supplementary Fig. 3c), which produced unavoidable carbon cluster peaks in the low MW region at high laser fluence^38,39. For experimental cost, the ferric particles (~£0.05 g⁻¹) are much cheaper than noble metals (~£36.36 g⁻¹ for gold), silicon (~£3.59 g⁻¹), and carbon (~£0.30–43.72 g⁻¹).

Extraction of serum metabolic patterns

Having optimized the substrate, we tested the ability of ferric particle-assisted LDI MS to extract serum metabolic patterns from patients. A total of 481 serum samples from 200 patients with early-stage LA, 200 healthy controls, 36 patients with other lung cancer, and 45 with benign lung diseases were included. The blood was drawn at initial diagnosis, without surgery or anaesthesia. Blood collection followed the same protocol for every subject enrolled in this project. We also performed a power analysis (a universal method to derive the optimal sample size by estimating statistical power in a hypothesis test) on a dataset from a pilot study of 12 samples (6/6, LA/control) to compute the minimum sample number required for meaningful machine learning (Supplementary Fig. 5). Based on the power analysis, the minimum number of samples was 200 (100/100, LA/control) with a predicted power of ~0.8 at a false discovery rate (FDR) of 0.1, which gives a sufficient confidence level for statistically meaningful results according to previous reports^40,45,46. In the double-blind test, the discriminant performance in diagnosis (AUC of 0.915) was consistent with the cross-validation results obtained in classifier building (AUC of 0.921). Notably, the double-blind test cohort was independently enrolled, decreasing the risk of model overfitting and guarding against overly optimistic results. The consistency between double-blind test and cross-validation further indicated a robust model without overfitting, according to previous reports^46,47. Recently reported proteomic and genomic approaches (with AUC of ~0.6–0.9) require time-consuming (~hours) reactions (e.g. immunoassay and polymerase chain reaction) that are not ideal for routine clinical use^4,48. For comparison, our metabolic approach provided desirable analytical performance (speed of ~seconds) and diagnostic performance (AUC of ~0.9) for early-stage LA detection in serum, demonstrating that computer-aided diagnosis based on serum metabolic patterns detects early-stage LA.

Construction of the metabolic biomarker panel

We further set out to find metabolic biomarkers (which are also potential therapeutic targets) in the patterns and to characterize the relevant pathways. We identified a biomarker panel of seven metabolites (<400 Da) based on accurate mass measurement (for both Na⁺- and K⁺-adducted signals) and tandem MS (Fig. 4a, Supplementary Figs. 9–15, Supplementary Table 5), accounting for an AUC of 0.894 (Supplementary Fig. 16a). The panel consisted of: uracil (Ura), histamine (His), cysteine (Cys), 3-hydroxypicolinic acid (HPA), uric acid (UA), indoleacrylic acid (IA), and fatty acid (FA) (18:2). Notably, a strong Pearson correlation between Na⁺-adducted and K⁺-adducted signals (>0.5) for the seven metabolites validated the presence and role of these metabolites as biomarkers (Fig. 4b, Supplementary Fig. 17). Specifically, we computed the odds ratios of the metabolic biomarkers in a logistic regression model (referred to as the basic model) and adjusted them for age and sex, according to previous reports⁴⁹. Age and sex were not significant covariates for any metabolic biomarker, and the seven metabolites retained significant odds ratios (≠ 1) when adjusted for age and sex (Supplementary Table 6). The localized mass spectra and scatter plots for serum metabolic patterns showed significant differences (p < 0.05, Supplementary Figs. 18 and 19) between early-stage LA and healthy controls for each biomarker.
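The Na⁺/K⁺ adduct consistency check described above can be illustrated with a minimal sketch. It assumes that, for one metabolite, the Na⁺-adducted and K⁺-adducted intensities are available as two equal-length lists, one value per sample; the numbers below are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical normalized intensities of one metabolite across five samples.
na_adduct = [0.10, 0.22, 0.31, 0.38, 0.52]
k_adduct = [0.08, 0.18, 0.33, 0.35, 0.49]
r = pearson(na_adduct, k_adduct)
consistent = r > 0.5  # threshold quoted in the text
```

If both adducts arise from the same metabolite, their intensities should rise and fall together across samples, which is what a correlation above the quoted 0.5 threshold indicates.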

**Fig. 4: Construction of metabolic biomarker panel.**

The breadth of detectable metabolites depends on both chemical (molecular structure) and physical (molecular size) properties. For molecular structure, metabolites containing polar functional groups (such as the hydroxyl group) can be cationized on the surface of ferric particles through dipole–dipole interaction^50,51. Therefore, our approach exploits the ability to produce cation (Na⁺, K⁺)-adducted metabolite species for polar compounds (e.g. amino acids, polyamines, carbohydrates, organic acids, nucleosides, etc.). For molecular size, only small metabolites (MW < 1000 Da) can be selectively accommodated and trapped by the nano-crevices (~nm) of ferric particles, owing to the size-exclusion effect demonstrated in the literature^22,52. Therefore, the surrounding alkali metal ions in the nano-crevices may facilitate efficient LDI of small metabolites, typically with MW < 1000 Da. Notably, we did not observe H⁺-adducts by using ferric particle-assisted LDI MS, which was validated by the detection of standard molecules (Supplementary Fig. 20) and is consistent with previous reports^35,53. Importantly, to further investigate the ion adduction process and characterize the competing adduction effect regarding H⁺/Na⁺/K⁺, we performed quantum simulation with density functional theory (DFT) calculations on the exposed surface [1,1,1] of ferric particles (Supplementary Fig. 21). H⁺ binds to the surface of the ferric particles with an energy of −13.6 eV (Fig. 4c), much more strongly than Na⁺ (−4.7 eV, Fig. 4d) and K⁺ (−4.0 eV, Fig. 4e), which hinders the transfer of H⁺ to analytes and the coupled cationization.

Notably, we found that uracil (3.36-fold increase) and UA (2.95-fold increase) were the most highly altered species with over-expression, while HPA was the most highly altered species with down-expression (Fig. 4f). Principal component analysis (PCA) of these seven metabolites (Supplementary Fig. 16b) displayed enhanced clustering between early-stage LA and healthy controls, compared with that of all 143 m/z features (Supplementary Fig. 16c). No single biomarker was very useful in discriminating disease from control samples: univariate receiver operating characteristic (ROC) curve analysis gave only a poor AUC (<0.7) for each individual biomarker (Supplementary Table 5). Importantly, the combination of the seven biomarkers gave an enhanced AUC of 0.894 by multivariate ROC curve analysis in differentiating early-stage LA from healthy controls (Supplementary Fig. 16a). Therefore, we concluded that the panel of seven biomarkers was useful in discriminating disease from control samples. This success can be attributed to the superiority of multivariate analysis with combined biomarkers over univariate analysis with a single biomarker, which is well established and recognized in the literature^4,54. The construction of the biomarker panel facilitated the simple analysis and large-scale use of our approach in clinics.

We also performed further data analysis to demonstrate the metabolic differences and similarities among early-stage LA and other lung cancers/benign diseases (Supplementary Table 1). For metabolic differences, we identified another two new panels of metabolites based on the metabolic patterns, to differentiate early-stage LA from other lung cancers/benign diseases. Notably, the two panels showed superior diagnostic performance, due to the metabolic differences related to disease phenotypes (Supplementary Fig. 22, Supplementary Tables 7 and 8). For metabolic similarities, we identified the overlapping metabolites that were differentially expressed among early-stage LA and other lung cancers/benign diseases. Specifically, in the differentiation of early-stage LA and other lung cancers from healthy controls, we observed that Ura and IA were the overlapping metabolites. In parallel, in the differentiation of early-stage LA and other lung diseases from healthy controls, we observed that IA was the overlapping metabolite. Due to the pathological process of lung diseases and altered metabolic pathways, the metabolic similarities reflected the systematic response to diseases.

Potentially altered metabolic pathways were interrogated in silico (Fig. 4g, Supplementary Table 9) by pathway topology analysis in MetaboAnalyst (http://www.metaboanalyst.ca/), which revealed the major metabolic contributions from nucleotides (Ura and UA), FA, organic acids (Cys, HPA, and IA), and active amine (His). Specifically, the differential expression of Ura and UA (intermediate metabolites of nucleotide metabolism) reflected metabolic adaptation to the increased transcriptional activity and differential regulation of purine and pyrimidine metabolism due to cancer cell proliferation^11,18. The abnormal expression of FA fit with the current theory that FA degradation is reduced in tumour cells^12,34, which was the pathway with the most significant impact (0.656). Among the organic acids correlated with protein and energy metabolism disorders, the changes in Cys, HPA, and IA suggested differential regulation of cysteine and methionine metabolism and of sulfur metabolism, caused by the greatly increased biosynthesis of proteins and abnormal activation of degradation enzymes during tumour growth^12,55. Finally, active amine (His) is involved in allergy and inflammation, which play a part in the cancer initiation process^56,57. Moreover, we found that six metabolic pathways were shared by early-stage LA and other lung cancers, including (1) beta-alanine metabolism, (2) pyrimidine metabolism, (3) pantothenate and CoA biosynthesis, (4) glycine, serine, and threonine metabolism, (5) taurine and hypotaurine metabolism, and (6) histidine metabolism (Supplementary Fig. 23a). Similarly, we found that (1) histidine metabolism and (2) pyrimidine metabolism were shared by early-stage LA and benign lung diseases (Supplementary Fig. 23b). Together, we concluded that commonly altered metabolic pathways were observed in lung diseases, as also demonstrated in the literature^58,59.

Pathway topology analysis has been widely applied in biomedical research and depends on the metabolite importance and metabolite number. For metabolite importance, the importance of one compound is estimated by its centrality measure (node or edge) in a given metabolic network, according to the literature^40,68. Chromatography was performed on a Waters Acquity UPLC system. Mass spectrometric detection was carried out using a Waters Xevo G2-XS QTOFMS mass spectrometer equipped with an ESI source.

Preparation of clinical samples

A total of 481 subjects were consecutively recruited from 2014 to 2019 in Shanghai Chest Hospital, including 200 patients suffering early-stage LA and 200 healthy controls undergoing routine health care maintenance, 36 patients with other lung cancers (including squamous cell carcinoma and small cell carcinoma), and 45 patients with benign lung diseases (including pneumonia, hamartoma, pulmonary tuberculosis, granuloma, and others). All patients were diagnosed jointly by a panel of pathologists and the tumours were staged according to the international standards for TNM staging of lung cancer. The pathologists were blind to any information acquired from the MS analysis. Patients were excluded from the study if they had evidence of autoimmune syndromes or drugs. The blood was drawn at initial diagnosis without surgery or anaesthesia. All blood samples were drawn by venepuncture and clotted at room temperature within 40 min¹⁶. Serum samples were obtained by centrifuging at 5100×g and 4 °C for 10 min. After centrifugation, the precipitate was discarded and the supernatant serum was stored at −80 °C immediately (within 15 min). The elapsed time between blood draw, centrifugation, and ultimate storage at −80 °C was within 1 h⁶⁹.

To validate the classification of early-stage LA and healthy controls, we recruited an independent double-blind test cohort from Shanghai Chest Hospital, with serum samples from 58 subjects (23/35, early-stage LA/healthy controls). The blood-draw conditions were the same for all subjects.

All the investigation protocols in this study were approved by the institutional ethics committees of the Shanghai Chest Hospital and School of Biomedical Engineering, SJTU (KS1736). All subjects provided written informed consent to participate in the study and approved the use of their biological samples for analysis, according to the Helsinki Declaration.

Machine learning and computer-aided diagnosis

Considering the large size of the MS data, a sparse learning and regression model was employed for the diagnosis of subjects. The resulting models are simpler to interpret because they are “sparse” (involving only a subset of the features). Given a set of training subjects, we defined the matrix X = {⋯, x_i,⋯}, where each row recorded the serum metabolic patterns (mass spectra) of the corresponding subject. The disease labels (i.e., ‘+1’ for early-stage LA, ‘−1’ for healthy control, as required by the logistic loss below) of the training subjects were known already and were vectorized into the column vector $\overrightarrow {\mathbf{y}} = \left( { \cdots ,\overrightarrow {{\mathbf{y}}_{\mathbf{i}}} , \cdots } \right)\prime$ accordingly. The l₁-norm (and the squared l₂-norm) regularized logistic regression model could thus be acquired by solving the following:

$$\min _{\overrightarrow \beta ,c}\mathop {\sum }\limits_{i = 1}^m \ln \left( {1 + {\mathrm{{e}}}^{ - \overrightarrow {\mathbf{y}} _i\left( {x_i\overrightarrow {\beta} + c} \right)}} \right) + \frac{{\lambda _1}}{2}\left\| {\overrightarrow {\beta} } \right\|_{l_1} + \lambda _2\left\| {\overrightarrow {\beta} } \right\|_{l_2}^2$$

(1)

where λ₁ was the l₁-norm regularization parameter enforcing the sparsity constraint, and λ₂ was the regularization parameter for the squared l₂-norm. The model chose a limited number of m/z features by adjusting l₁-norm to attenuate the coefficients of the less significant features to 0, and fit the disease labels of the training subjects according to the selected m/z features. A mathematical weight for each statistically informative feature was calculated depending on the importance of the mass spectral feature in differentiating early-stage LA versus healthy control. The regression model was applicable to infer the disease label of a new test subject and provided a prediction score for each pattern of a test sample. Specifically, we detected x_test and computed $\overrightarrow {\mathbf{y}} _{{\mathrm{test}}} = {\mathbf{x}}_{{\mathrm{test}}}^\prime \cdot \vec \beta + c$. The outcome was thresholded and converted to a diagnosis.
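The objective in Eq. (1) can be minimized in several ways; the paper does not state which solver was used. The following is a minimal sketch that assumes proximal gradient descent (soft-thresholding for the l₁ term), labels coded as +1/−1 as the logistic loss requires, and a squared l₂ penalty as described in the text. The function names, step size, iteration count, and toy data are all hypothetical.

```python
from math import exp


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + exp(-t))
    e = exp(t)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrinks v towards zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_sparse_logistic(X, y, lam1, lam2, step=0.05, n_iter=500):
    """Minimize sum_i ln(1 + exp(-y_i (x_i.beta + c)))
    + (lam1 / 2) * ||beta||_1 + lam2 * ||beta||_2^2 by proximal gradient."""
    beta = [0.0] * len(X[0])
    c = 0.0
    for _ in range(n_iter):
        grad_beta = [2.0 * lam2 * b for b in beta]  # gradient of the l2 term
        grad_c = 0.0
        for x_i, y_i in zip(X, y):
            z = sum(b * v for b, v in zip(beta, x_i)) + c
            w = -y_i * sigmoid(-y_i * z)  # d/dz of ln(1 + exp(-y z))
            for j, v in enumerate(x_i):
                grad_beta[j] += w * v
            grad_c += w
        # Gradient step on the smooth part, then shrinkage for the l1 part.
        beta = [soft_threshold(b - step * g, step * lam1 / 2.0)
                for b, g in zip(beta, grad_beta)]
        c -= step * grad_c
    return beta, c


def predict(x, beta, c, threshold=0.0):
    """Return (label, score); the score is x.beta + c as in the text."""
    score = sum(b * v for b, v in zip(beta, x)) + c
    return (1 if score > threshold else -1), score


# Toy data: feature 0 separates the classes, feature 1 is uninformative.
X = [[2.0, 0.1], [1.5, -0.2], [1.8, 0.05],
     [-2.0, 0.0], [-1.7, 0.15], [-1.9, -0.1]]
y = [1, 1, 1, -1, -1, -1]
beta, c = fit_sparse_logistic(X, y, lam1=1.0, lam2=0.1)
```

On this toy data the coefficient of the uninformative feature is driven exactly to zero, which is the feature-selection behaviour the l₁ term is meant to provide.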

For a typical machine-learning-based diagnosis, five mass spectra obtained for each sample were used to build molecular databases. Pre-processing of the raw mass spectra data, including baseline correction, peak detection, extraction, alignment, normalization, and standardization, was carried out by MATLAB (R2016a, The MathWorks, Natick, MA) prior to pattern recognition analysis. The total number of metabolite signals for each mass spectrum was detected, and then, m/z features were selected based on the Otsu algorithm and utilized in the subsequent analysis.
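The text does not say which quantity the Otsu algorithm was applied to. The sketch below assumes it is applied to one score per m/z feature (for example a mean intensity) and keeps the features above the cut that maximizes the between-class variance. The function name and the scores are hypothetical.

```python
def otsu_threshold(values):
    """Return the cut that maximizes between-class variance (Otsu's criterion)."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    total = sum(data)
    best_score, best_cut = -1.0, data[0]
    running = 0.0
    for k in range(1, n):  # the first k sorted values form the low class
        running += data[k - 1]
        if data[k] == data[k - 1]:
            continue  # no cut can separate two equal values
        w0, w1 = k / n, (n - k) / n
        mu0, mu1 = running / k, (total - running) / (n - k)
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score = score
            best_cut = (data[k - 1] + data[k]) / 2.0
    return best_cut


# Hypothetical per-feature scores: five background-level, three strong features.
scores = [0.8, 1.1, 0.9, 1.0, 1.2, 9.5, 10.2, 11.0]
cut = otsu_threshold(scores)
selected = [i for i, s in enumerate(scores) if s > cut]
```

The cut is placed midway between the two neighbouring values that best split the scores into a low and a high group, so no threshold has to be chosen by hand.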

To build the classifier model and evaluate its performance, five-fold cross-validation was used to estimate the performance of the predictor in both the inner loop and the outer loop (20 rounds for each fold, thus 100 models for outer cross-validation in total). The performance of the classifiers was measured by the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, which equals the proportion of concordant pairs among all pairs of observations, with 1 indicating perfect prediction accuracy.
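The concordant-pair reading of the AUC and the repeated five-fold splitting can be sketched as follows. This assumes that "20 rounds for each fold" means 20 reshuffled repetitions of a five-fold split, which is an interpretation rather than something the paper spells out; the function names are hypothetical, and ties are counted as half-concordant by convention.

```python
import random


def auc_concordance(scores, labels, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == positive]
    neg = [s for s, l in zip(scores, labels) if l != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5  # ties count as half
    return concordant / (len(pos) * len(neg))


def repeated_kfold(n_samples, k=5, rounds=20, seed=0):
    """Yield (train, test) index lists: k folds per round, reshuffled each round."""
    rng = random.Random(seed)
    for _ in range(rounds):
        order = list(range(n_samples))
        rng.shuffle(order)
        for fold in range(k):
            test = order[fold::k]
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            yield train, test


# Hypothetical prediction scores and true labels for five samples.
auc = auc_concordance([0.9, 0.8, 0.4, 0.35, 0.1], [1, 1, 0, 1, 0])
splits = list(repeated_kfold(10))
```

With the default arguments, `repeated_kfold` yields 5 × 20 = 100 train/test splits, matching the 100 outer models mentioned in the text.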

To validate the discriminant performance of the built classifier on an external double-blind test cohort for differentiating early-stage LA from healthy controls, 58 samples (23/35: LA/healthy controls) were enrolled. The disease labels of the double-blind test cohort were unknown and predicted by the classifier. By comparing the predicted disease labels with the true disease status, we computed the sensitivity, specificity, and AUC. A step-by-step protocol describing the preparation of ferric particles, MS data acquisition, clinical sample preparation, and computer-aided diagnosis can be found at Nature Protocol Exchange⁷⁰.
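Comparing predicted and true labels reduces to counting the four outcomes of a binary test. The sketch below shows the standard definitions of sensitivity and specificity; the function name and the example labels are hypothetical and are not results from the study.

```python
def sensitivity_specificity(predicted, actual, positive=1):
    """Return (sensitivity, specificity) from predicted and true labels."""
    tp = fn = tn = fp = 0
    for p, a in zip(predicted, actual):
        if a == positive:
            if p == positive:
                tp += 1  # diseased, correctly flagged
            else:
                fn += 1  # diseased, missed
        else:
            if p == positive:
                fp += 1  # healthy, wrongly flagged
            else:
                tn += 1  # healthy, correctly cleared
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the true labels")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: 1 = early-stage LA, 0 = healthy control.
predicted = [1, 1, 0, 0, 1, 0]
actual = [1, 1, 1, 0, 0, 0]
sens, spec = sensitivity_specificity(predicted, actual)
```

Sensitivity is the fraction of true LA cases that the classifier detects, and specificity is the fraction of healthy controls it correctly clears.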

Potential biomarker identification

To identify the metabolic panel that contributed the most to diagnosis, two major aspects were considered for the 100 tuned models. First, we ranked the m/z features according to the model selected frequency and chose the top m/z features with repeat occurrence over 90% in 100 models. In parallel, we selected m/z features with a p-value < 0.05 according to two-sided Student’s t-test. Verification of the metabolites that were both frequently occurring and displayed a significant difference between early-stage LA and healthy control was conducted manually by m/z feature selection using the human metabolome database (HMDB, http://www.hmdb.ca/) and subsequent validation by tandem MS and accurate mass measurement (for both Na⁺-adducted and K⁺-adducted signals). Pearson correlations were computed between the Na⁺-adducted and K⁺-adducted signals of metabolites. The differential metabolomic profiles reflecting their respective biochemical pathways were analysed by MetaboAnalyst (http://www.metaboanalyst.ca/).
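The two selection criteria above (selected in over 90% of the 100 models, and p < 0.05) amount to a simple intersection. The sketch below assumes each tuned model is summarized as the set of m/z features with non-zero coefficients and that the p-values were computed elsewhere; the feature names and values are hypothetical.

```python
def select_biomarker_candidates(models, p_values, min_frequency=0.9, alpha=0.05):
    """Keep features selected in more than `min_frequency` of the models
    that also have a p-value below `alpha`."""
    counts = {}
    for selected in models:
        for feature in selected:
            counts[feature] = counts.get(feature, 0) + 1
    n_models = len(models)
    return sorted(
        feature
        for feature, count in counts.items()
        if count / n_models > min_frequency
        and p_values.get(feature, 1.0) < alpha  # missing p-value: not significant
    )


# Hypothetical selection results from 100 tuned models.
models = []
for i in range(100):
    chosen = {"mz_a"}          # selected by every model
    if i < 95:
        chosen.add("mz_b")     # selected by 95 models
    if i < 50:
        chosen.add("mz_c")     # selected by 50 models
    models.append(chosen)

p_values = {"mz_a": 0.001, "mz_b": 0.20, "mz_c": 0.01}
candidates = select_biomarker_candidates(models, p_values)
```

Here only `mz_a` survives: `mz_b` is frequently selected but not significant, and `mz_c` is significant but selected too rarely.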

Statistical analysis

Multivariate statistics were performed using the SIMCA software package (version 14.0, Umetrics, Umeå, Sweden). Before analysis, all mass spectra were Pareto (par) scaled, i.e. each mean-centred variable was divided by the square root of its standard deviation. All covariates were tested, including age and sex. A logistic regression model was fitted to evaluate the association of metabolic biomarkers with the presence of early-stage LA. Odds ratios with 95% confidence intervals (CI) were calculated for the metabolic biomarkers (histamine, uracil, cysteine, HPA, UA, IA, and FA (18:2)). Before the analysis, all metabolites were centred and standardized to have a mean of 0 and a standard deviation of 1. Age and sex were added as covariates to the basic logistic regression model to calculate the adjusted odds ratios. An unsupervised principal component analysis (PCA) model was constructed from a number of principal components (PCs, obtained by an orthogonal transformation of the m/z features into linearly uncorrelated variables). The transformation was defined such that the first PC accounted for the largest variance (as much of the variability in the dataset as possible). From the PCA results, we obtained a PCA score plot by visualizing the first two PCs in a two-dimensional space. All the statistical models above were manually optimized. To quantify the reproducibility of clinical serum samples, the p value for the normal distribution test (Lilliefors (Kolmogorov–Smirnov) test) was acquired through the lillietest function in MATLAB, with the null hypothesis tested at the default 5% significance level.
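The two scalings mentioned above differ only in the divisor. The sketch below applies Pareto scaling and standardization to one variable (one m/z feature across samples). It assumes the sample standard deviation (n − 1 denominator); the exact convention used by SIMCA is not stated in the text, and the function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def pareto_scale(column):
    """Mean-centre, then divide by the square root of the standard deviation."""
    m, s = mean(column), stdev(column)
    if s == 0:
        return [0.0 for _ in column]  # a constant variable carries no information
    divisor = sqrt(s)
    return [(v - m) / divisor for v in column]


def standardize(column):
    """Mean-centre, then divide by the standard deviation (mean 0, SD 1)."""
    m, s = mean(column), stdev(column)
    if s == 0:
        return [0.0 for _ in column]
    return [(v - m) / s for v in column]


# Hypothetical intensities of one m/z feature in five samples.
intensities = [1.0, 2.0, 3.0, 4.0, 5.0]
pareto = pareto_scale(intensities)
zscores = standardize(intensities)
```

After Pareto scaling the variable keeps a spread equal to the square root of its original standard deviation, so large peaks are damped without all features being forced to equal weight as they are after standardization.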

Power analysis was performed by uploading 12 samples (6/6: LA patients/healthy controls) as the pilot metabolomic data into MetaboAnalyst at an FDR of 0.1. The target predicted power used to estimate the required sample size was set at 0.8^40,41. To investigate the spectral similarity within one group, we computed the similarity scores for each group (both early-stage LA and healthy controls). Typically, one experimental spectrum obtained from a serum sample in a given cohort was randomly selected and fixed as the reference spectrum. The other experimental spectra within the same cohort were compared with the reference spectrum, and spectral similarity scores were calculated. The similarity score between two mass spectra (i and j) was calculated by the cosine correlation method following a reported algorithm⁴⁴, defined as

$$\cos = \frac{\vec{\mathbf{y}}_{i} \cdot \vec{\mathbf{y}}_{j}}{\left| \vec{\mathbf{Y}}_{i} \right| \cdot \left| \vec{\mathbf{Y}}_{j} \right|} = \frac{\sum_{k=1}^{l} y_{ik}\, y_{jk}}{\sqrt{\sum_{t=1}^{n_i} Y_{it}^{2}} \cdot \sqrt{\sum_{t=1}^{n_j} Y_{jt}^{2}}} \tag{2}$$

where y was the normalized intensity of a peak appearing in both spectrum i and spectrum j (an identical peak), l was the number of identical peaks in the two spectra, Y was the normalized intensity of a peak appearing in a spectrum and n was the number of peaks in a spectrum.
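The similarity score in Eq. (2) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes each spectrum is stored as a dictionary mapping an m/z value to its normalized intensity, and that "identical peaks" are peaks with exactly the same m/z key (a real pipeline would need a mass tolerance). The function name `spectral_cosine` and the toy spectra are hypothetical.

```python
from math import sqrt


def spectral_cosine(spectrum_i, spectrum_j):
    """Cosine similarity between two peak lists, following Eq. (2).

    The numerator sums intensity products over peaks present in both
    spectra; each norm in the denominator runs over all peaks of one
    spectrum. Assumption: identical peaks share the same m/z key.
    """
    shared = set(spectrum_i) & set(spectrum_j)
    numerator = sum(spectrum_i[mz] * spectrum_j[mz] for mz in shared)
    norm_i = sqrt(sum(v * v for v in spectrum_i.values()))
    norm_j = sqrt(sum(v * v for v in spectrum_j.values()))
    if norm_i == 0 or norm_j == 0:
        return 0.0  # an empty or all-zero spectrum has no defined direction
    return numerator / (norm_i * norm_j)


# Toy spectra (hypothetical m/z keys and intensities).
reference = {100.0: 3.0, 200.0: 4.0}
other = {100.0: 3.0, 300.0: 4.0}

score = spectral_cosine(reference, other)          # one shared peak
self_score = spectral_cosine(reference, reference)  # identical spectra
```

A spectrum compared with itself scores 1, and spectra with no shared peaks score 0, so scores close to 1 within a cohort indicate highly similar spectra.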

Other statistical analyses in this work were performed using SPSS software (version 19.0, SPSS Inc., USA) to calculate p values, including the two-sided Student’s t-test and one-way ANOVA. The significance level was set at 5% for all tests. Means comparisons in one-way ANOVA were based on Bonferroni corrections.
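To make the Bonferroni correction concrete, the sketch below multiplies each raw p value by the number of comparisons and caps the result at 1, then compares it with the 5% level. It is an illustration under stated assumptions rather than a reproduction of the SPSS output: the function name `bonferroni_adjust` and the raw p values are hypothetical.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p values (raw p times m, capped at 1)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p values from three pairwise means comparisons.
raw_p = [0.004, 0.02, 0.6]
adjusted_p = bonferroni_adjust(raw_p)

# A comparison is called significant if its adjusted p value is below 5%.
significant = [p < 0.05 for p in adjusted_p]
```

Comparing the adjusted p values with 0.05 is equivalent to comparing the raw p values with 0.05 divided by the number of comparisons.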

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

Metabolites in this study were verified by comparing the m/z features against the Human Metabolome Database (HMDB, http://www.hmdb.ca/). The data that support the findings of this study are available from the corresponding author upon reasonable request. Source data are provided with this paper.

Code availability

Owing to competing financial interests, the custom computer code used in the current study is available from the corresponding author upon reasonable request. Source data are provided with this paper.

References

1. Reck, M. & Rabe, K. F. Precision diagnosis and treatment for advanced non-small-cell lung cancer. N. Engl. J. Med. 377, 849–861 (2017).
2. Zhang, M. et al. Bright quantum dots emitting at similar to 1,600 nm in the NIR-IIb window for deep tissue fluorescence imaging. Proc. Natl Acad. Sci. USA 115, 6590–6595 (2018).
3. Lim, C. T. Future of health diagnostics. View 1, e3 (2020).
4. Cohen, J. D. et al. Detection and localization of surgically resectable cancers with a multi-analyte blood test. Science 359, 926–930 (2018).
5. Henschke, C. I. et al. Survival of patients with stage I lung cancer detected on CT screening. N. Engl. J. Med. 355, 1763–1771 (2006).
6. Goodwin, J. et al. The distinct metabolic phenotype of lung squamous cell carcinoma defines selective vulnerability to glycolytic inhibition. Nat. Commun. 8, 15503 (2017).
7. Sathish, S. et al. Proof-of-concept modular fluid handling prototype integrated with microfluidic biochemical assay modules for point-of-care testing. View 1, e1 (2020).
8. Gootenberg, J. S. et al. Multiplexed and portable nucleic acid detection platform with Cas13, Cas12a, and Csm6. Science 360, 439–444 (2018).
9. Rosell, R. et al. Genetics and biomarkers in personalisation of lung cancer treatment. Lancet 382, 720–731 (2013).
10. The Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature 511, 543–550 (2014).
11. Banerjee, S. et al. Diagnosis of prostate cancer by desorption electrospray ionization mass spectrometric imaging of small metabolites and lipids. Proc. Natl Acad. Sci. USA 114, 3334–3339 (2017).
12. DeBerardinis, R. J. & Chandel, N. S. Fundamentals of cancer metabolism. Sci. Adv. 2, e1600200 (2016).
13. Xu, W. et al. Diagnosis and prognosis of myocardial infarction on a plasmonic chip. Nat. Commun. 11, 1654–1654 (2020).
14. Liu, J. et al. A biomimetic plasmonic nanoreactor for reliable metabolite detection. Adv. Sci. 7, 1903730 (2020).
15. Mayers, J. R. et al. Elevation of circulating branched-chain amino acids is an early event in human pancreatic adenocarcinoma development. Nat. Med. 20, 1193–1198 (2014).
16. Chen, W. L. et al. A distinct glucose metabolism signature of acute myeloid leukemia with prognostic value. Blood 124, 2893–2893 (2014).
17. Hoyles, L. et al. Molecular phenomics and metagenomics of hepatic steatosis in non-diabetic obese women. Nat. Med. 24, 1070–1080 (2018).
18. Jain, M. et al. Metabolite profiling identifies a key role for glycine in rapid cancer cell proliferation. Science 336, 1040–1044 (2012).
19. Wishart, D. S. Emerging applications of metabolomics in drug discovery and precision medicine. Nat. Rev. Drug Discov. 15, 473–484 (2016).
20. Gasilova, N. et al. On-chip spyhole mass spectrometry for droplet-based microfluidics. Angew. Chem. Int. Ed. 53, 4408–4412 (2014).
21. Li, X. & Wang, C. The potential biomedical platforms based on the functionalized Gd@C82 nanomaterials. View 1, e7 (2020).
22. Huang, L. et al. Plasmonic silver nanoshells for drug and metabolite detection. Nat. Commun. 8, 220 (2017).
23. Wu, J. et al. Multifunctional magnetic particles for combined circulating tumor cells isolation and cellular metabolism detection. Adv. Funct. Mater. 26, 4016–4025 (2016).
24. Zhu, Y. et al. Detection of antimicrobial resistance-associated proteins by titanium dioxide-facilitated intact bacteria mass spectrometry. Chem. Sci. 9, 2212–2221 (2018).
25. Yang, J. et al. Urine metabolic fingerprints encode subtypes of kidney diseases. Angew. Chem. Int. Ed. 59, 1703–1710 (2020).
26. Lim, A. Y. et al. Development of nanomaterials for SALDI-MS analysis in forensics. Adv. Mater. 24, 4211–4216 (2012).
27. Chiang, C.-K. et al. Nanoparticle-based mass spectrometry for the analysis of biomolecules. Chem. Soc. Rev. 40, 1269–1281 (2011).
28. Liu, Y.-C. et al. Using a functional nanogold membrane coupled with laser desorption/ionization mass spectrometry to detect lead ions in biofluids. Adv. Funct. Mater. 21, 4448–4455 (2011).
29. Lee, J. et al. Laser desorption/ionization mass spectrometric assay for phospholipase activity based on graphene oxide/carbon nanotube double-layer films. J. Am. Chem. Soc. 132, 14714–14717 (2010).
30. Hong, G. et al. Near-infrared fluorophores for biomedical imaging. Nat. Biomed. Eng. 1, 0010 (2017).
31. Katki, H. A. et al. Development and validation of risk models to select ever-smokers for CT lung cancer screening. JAMA 315, 2300–2311 (2016).
32. Wang, W. et al. Molecular cancer imaging in the second near-infrared window using a renal-excreted NIR-II fluorophore-peptide probe. Adv. Mater. 30, 1800106 (2018).
33. Li, X.-J. et al. A blood-based proteomic classifier for the molecular characterization of pulmonary nodules. Sci. Transl. Med. 5, 207ra142 (2013).
34. Zhang, J. et al. Nondestructive tissue analysis for ex vivo and in vivo cancer diagnosis using a handheld mass spectrometry system. Sci. Transl. Med. 9, eaan3968 (2017).
35. Yagnik, G. B. et al. Large scale nanoparticle screening for small molecule analysis in laser desorption ionization mass spectrometry. Anal. Chem. 88, 8926–8930 (2016).
36. Chiang, C.-K. et al. Nanomaterial-based surface-assisted laser desorption/ionization mass spectrometry of peptides and proteins. J. Am. Soc. Mass Spectr. 21, 1204–1207 (2010).
37. Sim, G.-D. et al. Nanotwinned metal MEMS films with unprecedented strength and stability. Sci. Adv. 3, 1700685 (2017).
38. Chu, H.-W. et al. Nanoparticle-based laser desorption/ionization mass spectrometric analysis of drugs and metabolites. J. Food Drug Anal. 26, 1215–1228 (2018).
39. Qian, K. et al. Laser engineered graphene paper for mass spectrometry imaging. Sci. Rep. 3, 1415 (2013).
40. Chong, J. et al. MetaboAnalyst 4.0: towards more transparent and integrative metabolomics analysis. Nucleic Acids Res. 46, 486–494 (2018).
41. Xia, J. et al. MetaboAnalyst 3.0-making metabolomics more meaningful. Nucleic Acids Res. 43, 251–257 (2015).
42. Otsu, N. A threshold selection method from gray-level histogram. IEEE Trans. Syst. Man Cybern. 9, 62–66 (2007).
43. Cao, J. et al. Metabolic fingerprinting on synthetic alloys for medulloblastoma diagnosis and radiotherapy evaluation. Adv. Mater. 32, 2000906 (2020).
44. Zhu, Y. et al. Sensitive and fast identification of bacteria in blood samples by immunoaffinity mass spectrometry for quick BSI diagnosis. Chem. Sci. 7, 2987–2995 (2016).
45. Bergmeir, C. et al. A note on the validity of cross-validation for evaluating autoregressive time series prediction. Comput. Stat. Data 120, 70–83 (2018).
46. Moen, E. et al. Deep learning for cellular image analysis. Nat. Methods 16, 1233–1246 (2019).
47. Dorfman, H. M. & Gershman, S. J. Controllability governs the balance between Pavlovian and instrumental action selection. Nat. Commun. 10, 5826 (2019).
48. Bin, L. et al. High performance, multiplexed lung cancer biomarker detection on a plasmonic gold chip. Adv. Funct. Mater. 26, 7994–8002 (2016).
49. Zeng, C. et al. Disparities by race, age, and sex in the improvement of survival for major cancers results from the National Cancer Institute Surveillance, Epidemiology, and End Results (SEER) program in the United States, 1990 to 2010. JAMA Oncol. 1, 88–96 (2015).
50. Yang, J. et al. Magnetic solid phase extraction of brominated flame retardants and pentachlorophenol from environmental waters with carbon doped Fe3O4 nanoparticles. Appl. Surf. Sci. 321, 126–135 (2014).
51. Zakett, D. et al. Laser-desorption mass spectrometry/mass spectrometry and the mechanism of desorption ionization. J. Am. Chem. Soc. 103, 1295–1297 (1981).
52. Sun, X. et al. Metabolic fingerprinting on a plasmonic gold chip for mass spectrometry based in vitro diagnostics. ACS Cent. Sci. 4, 223–229 (2018).
53. Hansen, R. L. et al. Sputter-coated metal screening for small molecule analysis and high-spatial resolution imaging in laser desorption ionization mass spectrometry. J. Am. Chem. Soc. 30, 299–308 (2019).
54. Ahmad, R. et al. A rapid triage test for active pulmonary tuberculosis in adult patients with persistent cough. Sci. Transl. Med. 11, eaaz9925 (2019).
55. Bar-Peled, L. et al. Chemical proteomics identifies druggable vulnerabilities in a genetically defined cancer. Cell 171, 696–709 (2017).
56. Yang, X. D. et al. Histamine deficiency promotes inflammation-associated carcinogenesis through reduced myeloid maturation and accumulation of CD11b+Ly6G+ immature myeloid cells. Nat. Med. 17, 87–95 (2010).
57. Lavin, Y. et al. Innate immune landscape in early lung adenocarcinoma by paired single-cell analyses. Cell 169, 750–765 (2017).
58. Seow, W. J. et al. Association of untargeted urinary metabolomics and lung cancer risk among never-smoking women in China. JAMA Netw. Open 2, 1911970–1911970 (2019).
59. Chung, K.-P. et al. Mitofusins regulate lipid metabolism to mediate the development of lung fibrosis. Nat. Commun. 10, 3390 (2019).
60. Xia, J. & Wishart, D. S. Web-based inference of biological patterns, functions and pathways from metabolomic data using MetaboAnalyst. Nat. Protoc. 6, 743–760 (2011).
61. Molins, C. R. et al. Metabolic differentiation of early Lyme disease from southern tick-associated rash illness (STARI). Sci. Transl. Med. 9, eaa12717 (2017).
62. Naviaux, R. K. et al. Metabolic features of chronic fatigue syndrome. Proc. Natl Acad. Sci. USA 114, 3749–3749 (2017).
63. Zheng, H. et al. Honeybee gut microbiota promotes host weight gain via bacterial metabolism and hormonal signaling. Proc. Natl Acad. Sci. USA 114, 4775–4780 (2017).
64. Wang, X. et al. Targeting pyrimidine synthesis accentuates molecular therapy response in glioblastoma stem cells. Sci. Transl. Med. 11, eaau4972 (2019).
65. Stöber, W. et al. Controlled growth of monodisperse silica spheres in the micron size range. J. Colloid Interface Sci. 26, 62–69 (1968).
66. Neese, F. Software update: the ORCA program system, version 4.0. WIRES Comput. Mol. Sci. 8, e1327 (2018).
67. Neese, F. et al. Efficient, approximate and parallel Hartree–Fock and hybrid DFT calculations. A ‘chain-of-spheres’ algorithm for the Hartree–Fock exchange. Chem. Phys. 356, 98–109 (2009).
68. Huang, L. et al. A multifunctional platinum nanoreactor for point-of-care metabolic analysis. Matter 1, 1669–1680 (2019).
69. Winer, L. et al. SOD1 in cerebral spinal fluid as a pharmacodynamic marker for antisense oligonucleotide therapy. JAMA Neuro 70, 201–207 (2013).
70. Huang, L. & Qian, K. Machine learning of serum metabolic patterns encodes early-stage lung adenocarcinoma. Nat. Protoc. Exch. https://doi.org/10.21203/rs.3.pex-963/v1 (2020).

Acknowledgements

We are grateful for the financial support from Projects 81971771 and 81771983 by the National Natural Science Foundation of China (NSFC), Projects 2017YFE0124400 and 2017YFC0909000 by the Ministry of Science and Technology of China, the Innovation Group Project of Shanghai Municipal Health Commission (2019CXJQ03), and Project 16CR2011A by the Clinical Research Plan of SHDC. This work was also sponsored by the Shanghai Rising-Star Programme (19QA1404800) and the Programme for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning.

Author information

Authors and Affiliations

State Key Laboratory for Oncogenes and Related Genes, School of Biomedical Engineering, Shanghai Jiao Tong University, 200030, Shanghai, P. R. China
Lin Huang, Haiyang Su, Jing Yang, Wei Xu, Vadanasundari Vedarethinam, Qian Wang & Kun Qian
Department of Laboratory Medicine, Shanghai Chest Hospital, Shanghai Jiao Tong University, 200030, Shanghai, P. R. China
Lin Wang, Xiaomeng Hu & Jiatao Lou
iMS Clinic, 310052, Hangzhou, P. R. China
Sen Chen, Shu Wu, Bin Liu & Xinze Wan
Department of Chemistry, Southern Methodist University, 3215 Daniel Avenue, Dallas, TX, 75275-0314, USA
Yunwen Tao

Contributions

K.Q. planned this work and designed the overall approach with L.H., L.W., J.L., and Q.W. L.H. and X.H. carried out the experiments and wrote the manuscript. W.S., B.L., X.W., and J.L. helped with sample collection. S.C., Y.T., H.S., W.X., V.V., J.Y., and Q.W. contributed to the data analysis. All authors joined in the critical discussion and edited the manuscript.

Corresponding author

Correspondence to Kun Qian.

Ethics declarations

Competing interests

The authors declare the following competing interests. The authors have filed patents for both the technology and the use of the technology to detect bio-samples.

Additional information

Peer review information Nature Communications thanks Paul Hofman, Jason Locasale and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Huang, L., Wang, L., Hu, X. et al. Machine learning of serum metabolic patterns encodes early-stage lung adenocarcinoma. Nat Commun 11, 3556 (2020). https://doi.org/10.1038/s41467-020-17347-6

Received: 22 January 2020
Accepted: 24 June 2020
Published: 16 July 2020
DOI: https://doi.org/10.1038/s41467-020-17347-6
Springer Nature Limited
