Introduction

The volume of medical literature grows rapidly every day, whereas the time available to medical specialists is limited. To cope with this limitation, the medical field has been subdivided and specialized, and the multidisciplinary tumor board (MTB) was introduced to facilitate communication among specialists in each subfield. An MTB is a physical or virtual meeting in which different specialists convene to discuss clinical cases and determine the best diagnostic or therapeutic approach. MTBs are recommended in a variety of cancer guidelines and are the principal approach for managing cancer patients in many countries [1].

However, owing to the rapid expansion of medical literature and advances in techniques, staying aware of updates is challenging, even for specialists in subdivided medical fields [2]. Thus, it will become increasingly difficult for MTB specialists to present individualized treatment recommendations for cancer patients based on the latest medical research. In this situation, the application of artificial intelligence (AI) in clinical practice could be an ideal solution for assisting the MTB in managing patients with cancer.

AI can support clinicians in the diagnosis, treatment, and prognosis of a variety of diseases through the quick analysis of complex medical data. Therefore, AI has the potential to compensate for human shortcomings and provide better treatment recommendations [3, 4]. Furthermore, reliance on AI in the management of cancer patients is expected to increase over time. However, before AI can be applied in actual medical practice, it must be validated against the existing principal approach.

To investigate the validity of AI in managing cancer patients, previous studies have compared the treatment recommendations of an MTB and AI and analyzed the concordance rate. AI has shown a consistent and high concordance rate of 80–90% with an MTB for breast and colorectal cancer in many countries. In contrast, the concordance rate of AI for gastric cancer varies according to country [5]. In addition, unlike other cancers, there are only a few studies on using AI for the treatment of gastric cancer, and these were limited as they only included advanced stages of gastric cancer or the sample size was very small for each stage [6,7,8].

Therefore, we performed a large-scale study to investigate factors that lower the concordance rate between medical AI and an MTB in the recommendation of gastric cancer treatment. If these factors are supplemented, it is expected that AI will be able to better assist the MTB as one of the members in actual clinical practice.

Methods

Patients and Study Design

Treatment recommendations for 322 consecutive gastric cancer cases from January 2015 to December 2018 were analyzed, and the degree of agreement was compared between Watson for Oncology (WFO) (IBM Watson Health, Cambridge, MA) and a 7-member MTB comprising two surgeons, two medical oncologists, a pathologist, a radiation oncologist, and a radiologist at the Daegu Catholic University Medical Center (DCUMC), Daegu, South Korea. Patients who fulfilled the following criteria were eligible for this study: (1) gastric cancer confirmed by gastroduodenoscopy and histologic examination of the resected specimens, and (2) surveillance or adjuvant chemotherapy after curative gastrectomy, or palliative chemotherapy in stage IV. The exclusion criteria were as follows: (1) neoadjuvant chemotherapy or conversion surgery; (2) history of other malignancies within 5 years; (3) incomplete data; and (4) age below 18 years. Patients with stage IV disease that progressed following systemic therapy (second-line and beyond) were also excluded. The gastric cancer stage was determined according to the 7th edition of the American Joint Committee on Cancer (AJCC) staging system. The clinicopathological characteristics of the patients are summarized in Table 1. The study design was reviewed and approved by the Institutional Review Board (IRB) of DCUMC (DCMC-CR-21-017).

Table 1 Baseline characteristics of patients

AI and MTB Concordance Determination

WFO version 16.4 was used as the medical AI. In gastric cancer treatment, WFO considers not only the disease stage, postoperative pathologic findings, human epidermal growth factor receptor 2 (HER2) status, and the patient’s general condition (critical disease and performance status) but also the age at diagnosis, sex, weight, histologic type, and prior therapy.

WFO provides therapeutic recommendations in three categories: recommended, for consideration, and not recommended. Data were retrospectively analyzed to compare the treatment recommendations of the MTB and AI. MTB recommendations were defined as concordant with WFO if they corresponded to the recommended or for consideration categories and as non-concordant if they corresponded to the not recommended or not available categories. Therapies recommended by the MTB were classified as “not available” if they were not known to the WFO at the time of analysis.
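The concordance rule described above can be sketched in a few lines of Python. This is only an illustration of the definition, not the software used in the study; the names `WfoCategory`, `is_concordant`, and `wfo_category_for`, and the dictionary that stands in for a WFO output, are hypothetical.

```python
from enum import Enum


class WfoCategory(Enum):
    """WFO output categories, plus 'not available' as defined in the text."""
    RECOMMENDED = "recommended"
    FOR_CONSIDERATION = "for consideration"
    NOT_RECOMMENDED = "not recommended"
    NOT_AVAILABLE = "not available"  # therapy not known to WFO at analysis time


# Categories that count as agreement with the MTB choice.
CONCORDANT = {WfoCategory.RECOMMENDED, WfoCategory.FOR_CONSIDERATION}


def is_concordant(category: WfoCategory) -> bool:
    """Return True if the MTB-selected therapy falls in a concordant category."""
    return category in CONCORDANT


def wfo_category_for(mtb_therapy: str, wfo_output: dict) -> WfoCategory:
    """Look up the MTB therapy in a (hypothetical) WFO output mapping.

    Therapies absent from the mapping are treated as 'not available'.
    """
    return wfo_output.get(mtb_therapy, WfoCategory.NOT_AVAILABLE)


# Hypothetical WFO output for one patient (illustrative values only).
example_output = {
    "FOLFOX": WfoCategory.RECOMMENDED,
    "XELOX": WfoCategory.FOR_CONSIDERATION,
}
print(is_concordant(wfo_category_for("XELOX", example_output)))
print(is_concordant(wfo_category_for("S-1 + cisplatin", example_output)))
```

In this sketch, a regimen missing from the WFO output is mapped to "not available" and therefore counted as non-concordant, mirroring the definition given above.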

Data Analysis and Statistics

All statistical analyses were performed using SPSS (version 22.0; IBM Corporation, Armonk, NY, USA), with the significance level set at p < 0.05. Concordance was expressed as percentage agreement. Continuous variables were expressed as mean (standard deviation) and compared using a two-sample t-test. Categorical variables were expressed as frequencies (percentages) and analyzed using the chi-square test. A logistic regression model was used to simultaneously control for the determinants of concordance, and odds ratios and 95% confidence intervals were calculated.

Results

The mean age of the 322 gastric cancer patients was 64.43 years (SD, 12.42). Among them, 285 (88.5%) patients were under 80 years of age and 37 (11.5%) were over 80 years old. The distribution of gastric cancer stages was 163 (50.6%) in stage I, 45 (14%) in stage II, 66 (20.5%) in stage III, and 48 (14.9%) in stage IV (Table 1). The concordance rate between WFO and the MTB was 72.98% at the recommended level (235/322) and 86.96% at the consideration level (280/322) (Fig. 1).
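The overall concordance rates follow directly from the reported counts. A minimal check in Python (the helper name `percent_agreement` is hypothetical):

```python
def percent_agreement(concordant: int, total: int) -> float:
    """Percentage agreement, rounded to two decimals as in the text."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * concordant / total, 2)


TOTAL_PATIENTS = 322
recommended_level = percent_agreement(235, TOTAL_PATIENTS)
consideration_level = percent_agreement(280, TOTAL_PATIENTS)
print(recommended_level, consideration_level)  # 72.98 86.96
```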

Fig. 1

Overall concordance rate between AI and MTB. The concordance rate between WFO and MTB was 72.98% at the recommended level (235/322) and 86.96% at the consideration level (280/322). Abbreviations: AI, artificial intelligence; MTB, multidisciplinary tumor board

In univariate analysis, there were significant differences in age (p = 0.03), sex (p = 0.016), performance status (p = 0.012), tumor location (p = 0.014), and gastric cancer stage (p < 0.001) between the concordance and non-concordance groups. In contrast, there were no significant differences in HER2 status (p = 0.558), lympho-vascular invasion (p = 0.224), perineural invasion (p = 0.468), serum bilirubin level (p = 0.408), or serum creatinine level (p = 0.251) between the two groups (Table 2).

Table 2 Univariate analysis of concordance by variables using chi-square test

More than half of the concordance group (56.4%) had stage I disease, and the concordance rate for stage I was the highest at 96.93%. The concordance rates for stages II and III were 88.89% and 90.91%, respectively. However, the concordance rate for stage IV was the lowest at 45.83% (Fig. 2).
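The stage-specific rates can be reproduced from the stage sizes and the number of non-concordant cases per stage reported in the Results (5, 5, 6, and 26 for stages I to IV). The sketch below also computes a Pearson chi-square statistic for the resulting 4 × 2 table using only the Python standard library. It is an illustrative recomputation under the assumption of an uncorrected Pearson test with 3 degrees of freedom, not the SPSS output of the study, and the function names are hypothetical.

```python
import math

# stage: (patients, non-concordant cases), as reported in the Results
STAGES = {"I": (163, 5), "II": (45, 5), "III": (66, 6), "IV": (48, 26)}


def concordance_rates(stages):
    """Per-stage concordance rate in percent, rounded to two decimals."""
    return {s: round(100 * (n - bad) / n, 2) for s, (n, bad) in stages.items()}


def chi_square_stage_test(stages):
    """Pearson chi-square statistic for a stage x (non-concordant, concordant) table."""
    total = sum(n for n, _ in stages.values())
    total_bad = sum(bad for _, bad in stages.values())
    chi2 = 0.0
    for n, bad in stages.values():
        cells = ((bad, total_bad), (n - bad, total - total_bad))
        for observed, column_total in cells:
            expected = n * column_total / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


rates = concordance_rates(STAGES)
chi2 = chi_square_stage_test(STAGES)
p_value = chi2_sf_df3(chi2)
print(rates)
print(f"chi-square = {chi2:.2f}, p = {p_value:.2e}")
```

The closed-form tail probability is specific to 3 degrees of freedom, that is, (4 stages − 1) × (2 outcomes − 1); a statistics library would be the more general choice for other table sizes.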

Fig. 2

Concordance rate between AI and MTB by gastric cancer stage. The concordance rate for stage I was the highest (96.93%). The concordance rates for stages II and III were 88.89% and 90.91%, respectively, which were close to 90%; however, the concordance rate for stage IV was the lowest at 45.83%. Abbreviations: AI, artificial intelligence; MTB, multidisciplinary tumor board

The treatment options selected for each stage of gastric cancer were as follows (Fig. 3): All patients with stage I to stage III gastric cancer underwent curative gastrectomy. Among 163 gastric cancer patients with stage I disease, the MTB selected surveillance for 160 patients and S-1 adjuvant chemotherapy for three patients. The selected treatment options for the 45 stage II gastric cancer patients were as follows: surveillance for four patients, S-1 adjuvant chemotherapy for 33 patients, and XELOX adjuvant chemotherapy for eight patients. The selected treatment options for 66 patients with stage III gastric cancer were XELOX adjuvant chemotherapy for 37 patients and S-1 adjuvant chemotherapy for 29 patients. Of the 48 patients with stage IV gastric cancer, 32 patients were treated with palliative chemotherapy, of which 21 cases were treated with palliative S-1 + cisplatin chemotherapy, and 11 cases with palliative FOLFOX chemotherapy. The remaining 16 patients with stage IV gastric cancer underwent adjuvant chemotherapy after gastrectomy, and the MTB selected adjuvant FOLFOX chemotherapy for 11 patients, and adjuvant S-1 + cisplatin chemotherapy for five patients.

Fig. 3

Selected treatment options by gastric cancer stage. All patients with stage I to stage III gastric cancer underwent curative gastrectomy. Of the 48 patients with stage IV gastric cancer, 32 patients were treated with palliative chemotherapy, of which 21 cases were treated with palliative S-1 + cisplatin chemotherapy, and 11 cases with palliative FOLFOX chemotherapy. The remaining 16 patients with stage IV gastric cancer underwent adjuvant chemotherapy after gastrectomy; MTB selected adjuvant FOLFOX chemotherapy for 11 patients and adjuvant S-1 + cisplatin chemotherapy for 5 patients. Abbreviations: MTB, multidisciplinary tumor board; XELOX, capecitabine/oxaliplatin; FOLFOX, fluorouracil/leucovorin/oxaliplatin

The discrepancies between the MTB and AI for each gastric cancer stage were as follows (supplementary tables). In stage I, the non-concordance group comprised five patients for whom the AI recommended adjuvant chemotherapy after gastrectomy but the MTB selected surveillance (Table S1). In stage II, five patients belonged to the non-concordance group. The AI recommended adjuvant chemotherapy for four patients, whereas the MTB selected surveillance. For the remaining patient, the AI recommended FOLFOX as adjuvant chemotherapy, but the MTB selected S-1 adjuvant chemotherapy (Table S2). In stage III, there were six patients in the non-concordance group. The AI recommended 5-FU or FOLFOX for four patients, but the MTB selected S-1 adjuvant chemotherapy. For the remaining two patients, the AI recommended S-1, capecitabine + radiation, or capecitabine + cisplatin, but the MTB selected the XELOX regimen (Table S3). In stage IV, the 26 patients for whom the MTB selected the S-1 plus cisplatin regimen belonged to the non-concordance group (Table S4).

Table 3 shows the results of the multivariate analysis of concordance as a function of age, sex, performance status, tumor location, and cancer stage. Age, performance status, and stage IV gastric cancer had a significant effect on the concordance between the MTB and AI.

Table 3 Multivariate analysis of concordance using a binary logistic regression model

Discussion

This retrospective study was conducted to investigate the factors that should be supplemented in AI to assist the MTB in gastric cancer treatment recommendations. Few studies have analyzed the concordance rate between AI and an MTB in gastric cancer treatment recommendations. Choi et al. reported that stage IV gastric cancer was the only significant factor that affected concordance rates [6]. In contrast, Tian et al. reported that HER2 positivity was a significant factor, while stage IV was not [8]. Interestingly, in the present study, HER2 positivity was not significant, but age > 80 years, performance status, and stage IV gastric cancer were all significant factors affecting the concordance rate between the MTB and AI (Table 3).

Recommendations pertaining to patients aged > 80 years were less likely to be concordant than those pertaining to patients aged < 80 years (OR 0.175, 95% CI 0.069–0.441; p < 0.001). Elderly patients have more comorbidities than younger patients, a tendency to refuse chemotherapy requiring hospitalization, and a fluctuating general status [9, 10]. These clinical features could explain the lower concordance rate in patients over 80 years of age.
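An odds ratio and its 95% confidence interval imply an approximate Wald p-value, which is useful when software prints a p-value rounded to zero. The following sketch assumes a symmetric Wald interval on the log-odds scale; it is an illustrative back-calculation from the reported values, not the study's regression output, and `wald_p_from_ci` is a hypothetical helper.

```python
import math


def wald_p_from_ci(odds_ratio: float, ci_low: float, ci_high: float) -> float:
    """Two-sided Wald p-value implied by an odds ratio and its 95% CI."""
    # On the log scale, a 95% CI spans 2 * 1.96 standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    z = math.log(odds_ratio) / se
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Age > 80 years: OR 0.175, 95% CI 0.069-0.441 (values from the text)
p_age = wald_p_from_ci(0.175, 0.069, 0.441)
print(f"{p_age:.5f}")
```

Under these assumptions the implied p-value is small but nonzero, that is, below 0.001.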

The higher the performance score, the lower the probability of concordance between the MTB and AI in gastric cancer recommendations (OR 0.203, 95% CI 0.072–0.574, p = 0.003 for performance score 1; OR 0.191, 95% CI 0.057–0.639, p = 0.007 for performance score 2; OR 0.089, 95% CI 0.026–0.301, p < 0.001 for performance score 3). In other words, as the performance status score increases, the MTB becomes more likely to select a chemotherapy regimen that does not match the AI recommendation. In general, elderly patients have higher performance scores than younger patients [11].

Age > 80 years and performance status were the main factors for discrepancies in stages II and III. In stage II, the MTB selected surveillance as the treatment option for four patients after taking age, performance status score, and comorbidities into account. The patients were aged 68, 69, 72, and 85 years, and their performance status scores were all grade three. They also had comorbidities, such as dementia, chronic kidney disease, and chronic heart failure. For the remaining gastric cancer patient, the MTB selected S-1 adjuvant chemotherapy because of the inconvenience of frequent hospitalization due to the old age of the patient (82 years) and a performance status score of grade two (Table S2). In stage III, the MTB selected S-1 adjuvant chemotherapy for four patients after considering age (all over 80 years), a performance status score of two or three, and the inconvenience of frequent hospitalization. For the remaining two patients, the MTB selected the XELOX regimen after taking age (49 and 64 years), a performance status of one, and the absence of comorbidities into account (Table S3).

In this study, the concordance rates for stage II and stage III did not differ significantly from that for stage I (p = 0.367 for stage II, p = 0.673 for stage III). Interestingly, the concordance rate for stage IV was significantly lower than that for stage I, owing to differences between the local guidelines followed by the MTB and those underlying WFO in the preferred palliative chemotherapy regimen (OR 0.017, 95% CI 0.005–0.055, p = 0.017). S-1 plus cisplatin is a commonly used chemotherapy regimen in Korea and Japan, following the Japanese guidelines [12, 13], whereas S-1 is an investigational agent in the National Comprehensive Cancer Network (NCCN) guidelines of 2018 and is therefore not used at the Memorial Sloan Kettering Cancer Center (MSKCC), which follows the NCCN guidelines [14]. Therefore, since S-1 + cisplatin was not included in the WFO chemotherapy regimens based on the MSKCC data, the concordance rate between the MTB and AI was significantly lower in stage IV (Table S4). These local guideline differences were a primary cause of non-concordance not only in stage IV but also in stage I. D2 lymph node dissection is commonly performed in East Asia, unlike in the Western world, and observation is recommended in the Japanese guidelines for pathologic stage I after curative gastrectomy [13]. The MTB judged that since the five patients with stage I gastric cancer in the non-concordance group had undergone D2 lymph node dissection and were over 65 years of age, there was little benefit from adjuvant chemotherapy considering its complications (Table S1).
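To make the stage IV versus stage I contrast concrete, a crude (unadjusted) odds ratio can be computed from the counts reported in the Results: 22 concordant and 26 non-concordant cases in stage IV, and 158 concordant and 5 non-concordant cases in stage I. This sketch is not the multivariate model of Table 3, which also adjusts for age, sex, performance status, and tumor location, so the value is expected to differ from the adjusted OR; `crude_odds_ratio` is a hypothetical helper.

```python
import math


def crude_odds_ratio(a: int, b: int, c: int, d: int):
    """Unadjusted odds ratio (a/b)/(c/d) with a Wald 95% confidence interval.

    a, b: concordant and non-concordant counts in the group of interest
    c, d: concordant and non-concordant counts in the reference group
    """
    odds_ratio = (a / b) / (c / d)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    high = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, low, high


# Stage IV (22 concordant, 26 non-concordant) versus stage I (158, 5)
or_iv, low_iv, high_iv = crude_odds_ratio(22, 26, 158, 5)
print(f"crude OR {or_iv:.3f} (95% CI {low_iv:.3f} to {high_iv:.3f})")
```

The crude estimate is of the same order of magnitude as the adjusted odds ratio reported above, which suggests that the stage IV effect is not mainly a product of the adjustment.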

In summary, when analyzing the pattern of discordance between AI and MTB for each stage, along with the results of multivariate analysis, discrepancies due to local guideline differences were primarily observed in stage I and stage IV, while discrepancies based on age > 80 years and performance status were mainly evident in stage II and stage III.

AI is applied in various areas of medicine, such as robotics, medical diagnosis, and medical statistics. WFO, an AI system for clinical decision support, is expected to offer many advantages, such as increased work efficiency and a decreased workload for doctors, decision support for junior oncologists, and treatment selection based on the latest medical research, even in hospitals with few or no experts [15, 16]. However, several factors lowered the concordance rate between the medical AI and experts in gastric cancer, resulting in the reduced validity of the medical AI.

First, AI lacks a comprehensive understanding of individual patients. WFO cannot grasp the overall status of a patient, such as patient compliance and rapport with doctors, comorbidities that may affect chemotherapy, and whether abnormal biochemical results are temporary or persistent [5].

However, it is expected that these AI shortcomings will be compensated for as technology advances. For example, a wearable device or sensor that can check the patient’s condition and evaluate their activity 24 h a day could provide continuous rather than fragmentary patient information. Therefore, an accurate individual performance status can be obtained through individual activity history, rather than through performance scores, which are limited in their range [

Availability of Data and Materials

The anonymized data used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Abbreviations

AI: Artificial intelligence
MTB: Multidisciplinary tumor board
WFO: Watson for Oncology
DCUMC: Daegu Catholic University Medical Center
HER2: Human epidermal growth factor receptor 2
XELOX: Capecitabine/oxaliplatin
FOLFOX: Fluorouracil/leucovorin/oxaliplatin
OR: Odds ratio
CI: Confidence interval
NCCN: National Comprehensive Cancer Network
MSKCC: Memorial Sloan Kettering Cancer Center

References

  1. Berardi R, Morgese F, Rinaldi S, Torniai M, Mentrasti G, Scortichini L, et al. Benefits and limitations of a multidisciplinary approach in cancer patient management. Cancer Manag Res. 2020;12:9363.

  2. Curioni-Fontecedro A. A new era of oncology through artificial intelligence. ESMO Open. 2017;2(2).

  3. Mirbabaie M, Stieglitz S, Frick NR. Artificial intelligence in disease diagnostics: a critical review and classification on the current state of research guiding future direction. Heal Technol. 2021;11(4):693–731.

  4. Tang X. The role of artificial intelligence in medical imaging research. BJR Open. 2019;2(1):20190031.

  5. Jie Z, Zhiying Z, Li L. A meta-analysis of Watson for Oncology in clinical application. Sci Rep. 2021;11(1):5792.

  6. Choi YI, Chung JW, Kim KO, Kwon KA, Kim YJ, Park DK, et al. Concordance rate between clinicians and Watson for Oncology among patients with advanced gastric cancer: early, real-world experience in Korea. Can J Gastroenterol Hepatol. 2019;2019.

  7. Suwanvecho S, Suwanrusme H, Sangtian M, Norden AD, Urman A, Hicks A, et al. Concordance assessment of a cognitive computing system in Thailand. Am Soc Clin Oncol. 2017.

  8. Tian Y, Liu X, Wang Z, Cao S, Liu Z, Ji Q, et al. Concordance between Watson for Oncology and a multidisciplinary clinical decision-making team for gastric cancer and the prognostic implications: retrospective study. J Med Internet Res. 2020;22(2):e14122.

  9. Lichtman SM, editor. Chemotherapy in the elderly. Seminars in oncology. Elsevier; 2004.

  10. Radecka B, Czech J, Siedlaczek A, Maczkiewicz M, Jagiełło-Gruszfeld A, Duchnowska R. Chemotherapy compliance in elderly patients with solid tumors: a real-world clinical practice data. Oncol Clin Pract. 2022.

  11. Berthelot G, Johnson S, Noirez P, Antero J, Marck A, Desgorces FD, et al. The age-performance relationship in the general population and strategies to delay age related decline in performance. Arch Public Health. 2019;77:1–9.

  12. Guideline Committee of the Korean Gastric Cancer Association (KGCA), Development Working Group & Review Panel. Korean practice guideline for gastric cancer 2018: an evidence-based, multi-disciplinary approach. J Gastric Cancer. 2019;19(1):1–48.

  13. Japanese Gastric Cancer Association. Japanese gastric cancer treatment guidelines 2014 (ver. 4). Gastric Cancer. 2017;20(1):1–19.

  14. Ajani JA, D'Amico TA, Almhanna K, Bentrem DJ, Chao J, Das P, et al. Gastric cancer, version 3.2016, NCCN clinical practice guidelines in oncology. J Natl Compr Canc Netw. 2016;14(10):1286–312.

  15. Murphy EV. Clinical decision support: effectiveness in improving quality processes and clinical outcomes and factors that may influence success. Yale J Biol Med. 2014;87(2):187.

  16. Printz C. Artificial intelligence platform for oncology could assist in treatment decisions. Cancer. 2017;123(6):905.

  17. Lu L, Zhang J, Xie Y, Gao F, Xu S, Wu X, et al. Wearable health devices in health care: narrative systematic review. JMIR Mhealth Uhealth. 2020;8(11):e18907.

  18. Wu M, Luo J. Wearable technology applications in healthcare: a literature review. Online J Nurs Inform. 2019;23(3).

  19. Hong L, Luo M, Wang R, Lu P, Lu W, Lu L. Big data in health care: applications and challenges. Data Inf Manag. 2018;2(3):175–97.

  20. Lee J, Kim HS, Kim J. Out-of-hospital data: patient generated health data. J Korean Diabetes. 2020;21(3):149–55.

  21. Abdollahi H, Mollahosseini A, Lane JT, Mahoor MH, editors. A pilot study on using an intelligent life-like robot as a companion for elderly individuals with dementia and depression. In: 2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids). IEEE; 2017.

  22. World Health Organization. Technical report: pricing of cancer medicines and its impacts: a comprehensive technical report for the World Health Assembly Resolution 70.12: operative paragraph 2.9 on pricing approaches and their impacts on availability and affordability of medicines for the prevention and treatment of cancer. 2018.

  23. Ocran Mattila P, Ahmad R, Hasan SS, Babar ZUD. Availability, affordability, access, and pricing of anti-cancer medicines in low-and middle-income countries: a systematic review of literature. Front Public Health. 2021:462.

  24. Schulte B. Capacity of ChatGPT to identify guideline-based treatments for advanced solid tumors. Cureus. 2023;15(4).

  25. Gebrael G, Sahu K, Chigarira B, Tripathi N, Mathew Thomas V, Sayegh N, et al. Enhancing triage efficiency and accuracy in emergency rooms for patients with metastatic prostate cancer: a retrospective analysis of artificial intelligence-assisted triage using ChatGPT 4.0. Cancers. 2023;15(14):3717.

  26. Hopkins AM, Logan JM, Kichenadasse G, Sorich MJ. Artificial intelligence chatbots will revolutionize how cancer patients access information: ChatGPT represents a paradigm-shift. JNCI Cancer Spectr. 2023;7(2):pkad010.

  27. Zhou J, Li T, Fong SJ, Dey N, Crespo RG. Exploring chatGPT’S potential for consultation, recommendations and report diagnosis: gastric cancer and gastroscopy reports’ case. IJIMAI. 2023;8(2):7–13.

Author information

Contributions

Yong-Eun Park and Hyundong Chae designed the study. Yong-Eun Park collected the data. Yong-Eun Park and Hyundong Chae analyzed and interpreted the patient data. Hyundong Chae commented on the drafts of the paper. Yong-Eun Park was a major contributor to the writing of the manuscript. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Yong-Eun Park.

Ethics declarations

Ethics Approval and Consent to Participate

This study was approved by the Institutional Review Board of the Daegu Catholic University Medical Center in Daegu, Republic of Korea (IRB No. DCMC-CR-21-017). Written informed consent was obtained from the patients.

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 22 KB)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Park, YE., Chae, H. The Fidelity of Artificial Intelligence to Multidisciplinary Tumor Board Recommendations for Patients with Gastric Cancer: A Retrospective Study. J Gastrointest Canc 55, 365–372 (2024). https://doi.org/10.1007/s12029-023-00967-8

  • DOI: https://doi.org/10.1007/s12029-023-00967-8
