1 Introduction

Chromosomal abnormalities contribute to a wide array of clinical syndromes and conditions, resulting from either numerical or structural alterations in chromosomes. Such defects occur in an estimated 0.6% of newborns, often leading to intellectual disabilities, developmental delays, and various congenital malformations [1]. The incidence rises dramatically to 25% in miscarriages and stillbirths, with 50–60% of these occurring in the first trimester [2]. Chromosome karyotyping (CK), a critical diagnostic tool for identifying chromosome anomalies, is used to detect and characterize such defects [3, 4]. Despite its importance, traditional CK is limited by a low level of automation and a high threshold for technical expertise, necessitating labor-intensive procedures such as manual chromosome segmentation, identification, and analysis during metaphase.

Advancements in digital imaging and artificial intelligence (AI) have brought about significant breakthroughs in medical research. An exemplary achievement in this domain is AlphaFold’s success in predicting protein structures. By mapping out the structures of 350,000 proteins found in the human body, AlphaFold demonstrated that its automated modeling could achieve an average accuracy score of 92.4 out of 100. Scores above 90 are deemed nearly equivalent to those determined by manual experiments [5, 6]. In genetics, the AI algorithm StrVCTVRE has been instrumental in distinguishing pathogenic from benign structural variants (SVs) in coding regions, striking a balance between maintaining 90% sensitivity and enabling clinicians to dismiss nearly half of the SVs as benign [7].

Emerging chromosome analysis software has significantly raised the level of automation in chromosome diagnostics, improving the efficiency of cytogeneticists. These advancements fall into two phases. The first involved traditional image processing technologies employed by companies such as Leica and Zeiss, which increased the accuracy of automated chromosome segmentation to about 50% [8,9,10,11,12]. The second saw the widespread application of deep learning technologies to chromosome analysis [13].

Section 3 is dedicated to the data analysis of our experimental results: a comparison of the application effects of the two software systems, a comparative evaluation of their diagnostic value in normal cases (experiment 1), and an assessment of cases with defects (experiment 2). We also investigate the universality of the efficiency improvement offered by the two software types (experiment 3). Section 4 discusses our research outcomes and findings. Section 5, the conclusion, restates the research questions and main findings.

2 Subjects and methods

2.1 Subjects in study

2.1.1 Experimental installation

The product under study is a chromosome analysis software system (specification and model: AutoVision AI; version: V2.0). The control product is a traditional analysis software system, the Image Analysis and Capture System (manufacturer: Leica Biosystems Richmond, Inc.; specification and model: GSL-120; version: V7.4; registration No.: GXZZ 20192222425).

2.1.2 Experimental data

The metaphase chromosome images were obtained using an upstream chromosome scanner, specifically the GSL-120 model manufactured by Leica. These images were acquired by scanning case slides containing the chromosomes. Following the acquisition, the metaphase chromosome images were processed to produce single chromosome images. This process involved either manual cutting or the use of AI for automated cutting, preparing the images for further chromosome arrangement analysis.

For the experimental dataset, a total of 1168 cases from the Women and Children’s Hospital of Ningbo University were randomly selected for the study. Among these, 1041 were identified as normal cases, comprising 527 males and 514 females. The dataset also included 127 abnormal cases, categorized as follows: 73 cases exhibited numerical abnormalities, 6 cases had longer satellites (ps+), 14 cases showed an increased heterochromatin region (qh+), and 34 cases were identified with structural defects.

2.1.3 Analysis process

The main steps for chromosome analysis with AutoVision are as follows:

  1. Uploading microscope images of cases: the system is first configured to retrieve and upload case data directly from the microscope equipment, so that images are transferred straight to the analysis system. For each case, a standard high-resolution scanning device systematically scans the chromosome slides and captures images of all analyzable metaphase chromosomes without any selection, filtering, or pre-processing. Each analysis system therefore receives the same set of raw, unaltered images, which gives a fair and consistent basis for comparing the efficiency, accuracy, and reliability of different chromosome analysis software on identical image data.

  2. AI automatic analysis: once the case is loaded, the system performs AI-based automated analysis, which includes image screening, image optimization, automatic chromosome counting, and automatic arrangement.

    (1) In the image screening process, metaphase images suitable for analysis are selected through intelligent scoring (according to chromosome length, dispersion, and banding) and placed at the front of the case, eliminating the need for analysts to select images manually.

    (2) In the image optimization process, background noise (nuclei, chromosome fragments, and other debris from surrounding cells) is removed automatically and the image is enhanced, yielding an image with better band contrast that makes chromosome structure and band patterns easier to observe later.

    (3) In the automatic counting process, an intelligent algorithm cuts apart touching, crossing, and complex clustered chromosomes, so that the chromosomes in each karyotype can be counted automatically.

    (4) In the automatic arrangement process, AutoVision recognizes the chromosomes cut by the AI and automatically arranges each under its correct chromosome number.

  3. Manual review: most of the analysis is completed by the AI automatic analysis, but a final manual review of the results is still required. The reviewer completes the case by confirming the count, correcting the cutting, adjusting arrangement positions, and confirming the arrangement of the chromosomes.

  4. Report issuance: once the manual review is done, the report details are entered and the report is issued.

2.2 Research methods

2.2.1 Evaluation metrics

  (1) Automatic analysis accuracy: this is the primary metric for assessing how accurately the system analyzes individual chromosomes automatically. It is the ratio of the number of correctly analyzed chromosomes to the total number of chromosomes. The higher the accuracy, the better the system’s performance.

  (2) Number and proportion of completely correct maps: this metric measures the system’s ability to analyze all chromosomes in a given karyotype without any errors. A “completely correct map” is one in which every chromosome is identified correctly. This metric directly reflects the reliability of the system in practical applications.

  (3) Number and proportion of maps with errors within 2: because completely error-free maps are rare in practice, this metric measures the system’s performance when a limited number of errors (here, no more than 2) is tolerated. It helps characterize the system’s robustness on complex samples.

  (4) Analysis time efficiency ratio: this metric is the ratio of the time experts take to analyze identical cases with the traditional analysis software to the time they take with the intelligent analysis system. A higher ratio means faster analysis, and therefore higher laboratory throughput.
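The four metrics above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used by either software system: the function names, the representation of each karyotype as a count of chromosome errors, and the reading of "within 2 errors" as "at most 2 errors, including zero" are all assumptions introduced here.

```python
from typing import Sequence


def automatic_accuracy(correct_chromosomes: int, total_chromosomes: int) -> float:
    """Metric (1): correctly analyzed chromosomes / total chromosomes."""
    if total_chromosomes <= 0:
        raise ValueError("total_chromosomes must be positive")
    return correct_chromosomes / total_chromosomes


def proportion_within(errors_per_map: Sequence[int], max_errors: int) -> float:
    """Metrics (2) and (3): share of maps with at most `max_errors` errors.

    max_errors=0 gives the proportion of completely correct maps;
    max_errors=2 gives the proportion of maps with errors within 2
    (assumed here to include error-free maps).
    """
    if not errors_per_map:
        raise ValueError("errors_per_map must not be empty")
    hits = sum(1 for e in errors_per_map if e <= max_errors)
    return hits / len(errors_per_map)


def time_efficiency_ratio(traditional_seconds: float, ai_seconds: float) -> float:
    """Metric (4): time with traditional software / time with the AI system."""
    if ai_seconds <= 0:
        raise ValueError("ai_seconds must be positive")
    return traditional_seconds / ai_seconds


# Hypothetical toy data (not from the study): error counts for five karyotypes
# and analysis times in seconds for one case.
toy_errors = [0, 1, 0, 3, 2]
print(proportion_within(toy_errors, 0))      # completely correct maps
print(proportion_within(toy_errors, 2))      # maps with errors within 2
print(time_efficiency_ratio(600.0, 100.0))   # efficiency ratio
```

Expressing metrics (2) and (3) through one threshold function keeps the two definitions consistent with each other, since they differ only in the tolerated number of errors.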

2.2.2 Experiment 1

To compare the automatic-analysis performance of the two software systems, the experiment covered as many cases as possible while keeping the scale manageable, so as to limit the influence of case-to-case differences (such as variation in sample preparation, slide production, and image capture in real-world settings).

One karyotype was randomly selected for each case, and automatic analysis was performed with both the traditional analysis software and the intelligent analysis system. The automatic analysis result for each karyotype was independently reviewed by at least two cytogeneticists; where their reviews disagreed, the result was finally confirmed by a chief expert.

2.2.3 Experiment 2

To further compare the clinical diagnostic value of the software, the experiment selected the most common type of defective karyotype, numerical abnormalities, together with some cases with minor structural variants that do not affect manual identification of chromosome type, such as ps+ and qh+. The experiment assessed the analysis performance and diagnostic value of the two tools for such defects at the case level.

In the experiment, according to the proportion of each defect type in the case bank, 151 cases with abnormalities in number, 7 cases with ps+ and 15 cases with qh+ were randomly selected. Among the cases with abnormalities in number, 94 were trisomy 21, 27 were trisomy 18, 12 were XXX, 13 were XXY and 5 were others. Among the cases with ps+, 3 cases were on Chromosome 21, 2 cases on Chromosome 15, 1 case on Chromosome 13 and 1 case on Chromosome 14. Among the cases with qh+, 10 cases were on Chromosome 1, 3 cases on Chromosome 9, and 2 cases on Chromosome 16.

2.2.4 Experiment 3

To assess the effectiveness of an AI analysis system in enhancing the productivity of cytogeneticists, three representative individuals with different specialties were chosen for this experiment. They were tasked with analyzing various cases using both traditional analysis software and the AI analysis system. The time taken to complete each case was recorded for comparison.

Expert A, the primary investigator of this study, has extensive knowledge of and proficiency in both software systems. Expert B is a junior employee with three years of experience who is unfamiliar with both systems. Expert C is a senior employee with over two decades of experience who is well-versed in the conventional analysis software but unfamiliar with AI analysis systems. For the experiment, 40 cases were randomly sampled from the total of 1168 cases: 20 normal cases and 20 abnormal cases. The abnormal cases comprised 10 cases with numerical abnormalities and 10 cases with structural abnormalities.

3 Results

3.1 Comparison of the application effects of the two analysis software

The experimental results showed that the 1168 cases contained 53,761 single chromosomes in total. The traditional analysis software automatically analyzed 31,254 of them correctly, an automatic analysis accuracy of 0.5814. The intelligent analysis system automatically analyzed 52,763 correctly, an automatic analysis accuracy of 0.9814. One karyotype was analyzed per case, giving 1168 karyotypes. With the traditional analysis software, 25 maps were completely correct (0.0214) and 111 maps had errors within 2 (0.0950). With the intelligent analysis system, 652 maps were completely correct (0.5582) and 1056 maps had errors within 2 (0.9041). See Figs. 1 and 2 for details. The specific comparison of the accuracy of the two groups is shown in Table 1.
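Recomputing the proportions from the reported counts is a quick consistency check. The sketch below uses only the counts stated in this section; the dictionary layout and function name are illustrative choices, not part of either software system.

```python
# Counts as reported in Section 3.1.
TOTAL_CHROMOSOMES = 53_761
TOTAL_KARYOTYPES = 1_168

reported_counts = {
    "traditional": {
        "correct_chromosomes": 31_254,
        "fully_correct_maps": 25,
        "maps_within_2_errors": 111,
    },
    "intelligent": {
        "correct_chromosomes": 52_763,
        "fully_correct_maps": 652,
        "maps_within_2_errors": 1_056,
    },
}


def summarize(counts: dict) -> dict:
    """Turn raw counts into proportions rounded to four decimals."""
    return {
        "accuracy": round(counts["correct_chromosomes"] / TOTAL_CHROMOSOMES, 4),
        "fully_correct": round(counts["fully_correct_maps"] / TOTAL_KARYOTYPES, 4),
        "within_2": round(counts["maps_within_2_errors"] / TOTAL_KARYOTYPES, 4),
    }


summary = {name: summarize(counts) for name, counts in reported_counts.items()}
for name, row in summary.items():
    print(name, row)
```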

Fig. 1 Distribution diagram of the number of errors for traditional chromosome analysis software

Fig. 2 Distribution diagram of the number of errors for intelligent chromosome analysis system

Table 1 Results of automatic analysis accuracy for grouped samples

To reflect day-to-day analysis efficiency, two groups of cytogenetic experts were randomly selected to analyze the same cases with the traditional analysis software and the intelligent analysis system. Each case was analyzed according to the requirement of “arranging 5 maps and counting 25”. With both systems, the analysis time fluctuated considerably with the image capture quality of each case. However, the intelligent analysis system was markedly faster than the traditional software, with a difference in analysis efficiency of about 5–15 times (Fig. 3).

Fig. 3 Comparison of analysis efficiency between the two analysis software

3.2 Comparison of diagnostic value between the two analysis software

3.2.1 Normal cases (experiment 1)

In experiment 1, out of the total 1168 pictures, the intelligent analysis system successfully performed accurate automatic trimming on all of them. Among these, 830 pictures could be directly counted, which accounts for 0.7106 of the total. Additionally, the intelligent analysis system correctly performed automatic trimming and arrangement on 652 pictures, which accounts for 0.5582 of the total.

In routine diagnosis, each case generally yields 30–50 well-captured images that can be counted, of which about 15–30 have bands clear enough to be arranged. Cytogeneticists generally perform software analysis in an “arranging 5 maps, counting 25” manner, in which “counting” requires complete cutting and “arranging” requires both complete cutting and correct arrangement. Applying the proportions estimated above, the expected number of automatically and correctly counted maps per case is 21–35, and the expected number of automatically and correctly arranged maps is 8–16.
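The expected values quoted above follow from multiplying the number of usable images by the success proportions from experiment 1. The sketch below reproduces that arithmetic; treating each image as an independent trial with a fixed success probability is an assumption made here for illustration, and the function name is hypothetical.

```python
# Success proportions from experiment 1.
P_COUNTABLE = 0.7106  # 830 of 1168 images could be counted directly
P_ARRANGED = 0.5582   # 652 of 1168 images were cut and arranged correctly


def expected_range(min_images: int, max_images: int, p_success: float) -> tuple:
    """Expected number of successes (n * p) at both ends of an image range."""
    return (min_images * p_success, max_images * p_success)


# 30-50 countable images per case; 15-30 images with bands clear enough to arrange.
count_low, count_high = expected_range(30, 50, P_COUNTABLE)
arrange_low, arrange_high = expected_range(15, 30, P_ARRANGED)

print(f"expected counted maps:  {count_low:.1f} to {count_high:.1f}")
print(f"expected arranged maps: {arrange_low:.1f} to {arrange_high:.1f}")
```

Rounding these products down to whole maps gives the ranges stated in the text, both of which meet or exceed the “arranging 5 maps, counting 25” requirement only toward the upper end for counting and throughout for arranging.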

3.2.2 Cases with defects (experiment 2)

Examples of chromosomes with structural abnormalities involved in this study are shown in Fig. 4.

Fig. 4 Satellite abnormalities (top) and heterochromatin abnormalities (bottom) included in the experiment

3.2.2.1 Abnormality in number

Among the 151 cases with numerical abnormalities, all abnormal karyotypes were polysomies. The proportion of cases with complete cutting and correct arrangement was 0.3775, and the proportion of successful polysomy detection was 0.9272. In these polysomies, the affected chromosome is present in three copies (for the XXY karyotype observed in this study, two X chromosomes and one Y chromosome). According to Table 1, the probability that a structurally normal chromosome is correctly cut and arranged is about 0.9814, so the expected probability that the three specific chromosomes are all correctly cut and arranged is consistent with the experimental results. The AI analysis system’s high accuracy in chromosome cutting and arrangement, and its stable results across the different numbered chromosomes, can help experts quickly identify numerical chromosome abnormalities.
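The expected probability mentioned above can be made explicit. Assuming that errors on the three copies are independent (an assumption introduced here, not stated in the source), the probability that all three are correctly cut and arranged is the per-chromosome accuracy cubed, about 0.945, which is close to the observed detection proportion of 0.9272.

```python
# Per-chromosome probability of correct cutting and arrangement (Table 1).
P_SINGLE = 0.9814
# Observed proportion of successful polysomy detection (experiment 2).
OBSERVED_DETECTION = 0.9272


def all_correct_probability(p_single: float, n_chromosomes: int) -> float:
    """Probability that n chromosomes are all handled correctly,
    assuming independent errors (illustrative assumption)."""
    return p_single ** n_chromosomes


p_three = all_correct_probability(P_SINGLE, 3)
print(f"expected: {p_three:.4f}, observed: {OBSERVED_DETECTION:.4f}")
print(f"difference: {p_three - OBSERVED_DETECTION:.4f}")
```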

3.2.2.2 Satellite abnormalities

Typical ps+ images are shown in the first row of Fig. 4; chromosomes 13–15 and 21 are acrocentric chromosomes. Chromosomes with elongated satellites show an increased short-arm length, and the number and size of the satellites are usually variable, which makes abnormal satellite chromosomes difficult to identify.

3.2.2.3 Heterochromatin abnormalities

Typical qh+ images are shown in the second row of Fig. 4. With the AI analysis system, in the analysis of the 15 qh+ cases, the proportion of complete cutting and correct arrangement was 0.3333, and the proportion of chromosomes with structural defects was 0.5333. Compared with normal chromosomes and abnormal chromosomes, the accuracy of cutting and arrangement decreased significantly. A possible reason is that qh+ regions are more widely distributed than ps+ regions and appear as lighter-stained parts in G-banding, which are easily misidentified as additional bands. For the 9qh+ example in Fig. 4, the length of Chromosome 9 increased markedly with the additional heterochromatin, and the banding pattern also changed considerably, which made the chromosome difficult to identify.

3.3 Comparison of the universality of efficiency improvement of two kinds of analysis software (experiment 3)

The analysis results are shown in Fig. 5. Expert A, the main researcher of this study, is familiar with both software systems. With the AI analysis system, Expert A’s analysis efficiency for normal cases, cases with numerical abnormalities, and cases with structural abnormalities was 6.3 times, 4.7 times, and 6.8 times that with the traditional analysis software, respectively. Expert B, a junior employee with 3 years of experience, is relatively unfamiliar with both systems; the corresponding efficiency gains were 5.7 times, 3.5 times, and 6.6 times. Expert C, a senior employee with more than 20 years of experience, is skilled in the traditional analysis software but unfamiliar with the AI analysis system; the corresponding efficiency gains were 4.1 times, 3.6 times, and 5.1 times.

Fig. 5 Comparison of the analysis efficiency of the two analysis software for different types of cytogeneticists and different types of cases

The experimental results show that, compared with the traditional analysis software, the AI analysis system greatly reduces the time that experts with different backgrounds need to analyze normal cases, cases with numerical abnormalities, and cases with structural abnormalities. Moreover, the analysis time with the AI analysis system fluctuates less across case types than with the traditional analysis software, which further indicates that the AI analysis system improves the efficiency of case analysis more substantially and more reliably.

4 Discussion

This study underscores the need for fast and accurate automated analysis in clinical settings, particularly in prenatal diagnostics of chromosomal anomalies. Given China’s declining birth rate and the rising incidence of chromosomal defects among expectant mothers, efficient diagnostic processes are needed to meet the growing demand for prenatal examinations [1]. Compared with conventional chromosome karyotyping techniques, our intelligent chromosome analysis system shows substantial improvements in both accuracy and operational efficiency.

Statistics on birth defects reveal a considerable genetic component, with chromosomal disorders accounting for a notable share of these anomalies [19]. This observation is in concordance with the insights of Zhou et al. [21, 22]. Few examples illustrate the practical deployment of such technologies in clinical settings. Our study not only demonstrates the use of AI in routine CK analysis but also compares it systematically with traditional methods. The outcomes of experiment 1 suggest that analysis time can be cut significantly while maintaining high diagnostic accuracy for common chromosomal abnormalities, a notable advance over the findings of Wang et al. [23], who documented only a modest improvement in efficiency with their methods.

In experiment 2, the ability of the intelligent analysis system to identify both numerical and structural abnormalities underscores its robustness and dependability, in line with the evaluations of van Dijk et al. regarding AI applications in genetics [24]. Nonetheless, our results also point to substantial room for refining AI systems, especially in discerning structurally complex chromosomal defects. This supports the observations of Nimitha et al. [25] on the limitations of current AI techniques in detecting intricate cytogenetic anomalies.

5 Conclusion

The incorporation of artificial intelligence (AI) into clinical chromosome karyotyping (CK) analysis marks an important step forward, with the potential to speed up diagnosis and substantially reduce the incidence of birth defects. Our research contributes a comprehensive appraisal of an AI-driven system against conventional CK methods, and thereby offers a reference for the use of AI in clinical cytogenetics.

The amalgamation of AI with traditional practices paves the way for more efficient, reliable, and swift diagnostic procedures, potentially revolutionizing prenatal care and genetic disorder identification.

Nonetheless, it is imperative to acknowledge the limitations inherent in our study. The scope of AI’s efficacy, particularly in identifying complex chromosomal anomalies, remains an area ripe for further exploration. This acknowledgment does not diminish our findings but rather illuminates the path for future research directions. There is a pressing need to refine AI algorithms and enhance their capability to decipher intricate cytogenetic patterns, which, in turn, could broaden the spectrum of detectable genetic conditions.

As we venture forward, it is recommended that subsequent investigations focus on the development of more sophisticated AI models that are capable of overcoming the current limitations. This entails not only the enhancement of algorithmic accuracy and the ability to handle complex cases but also the integration of AI tools within a wider range of clinical settings to validate their efficacy and adaptability. Through such endeavors, we can aspire to fully realize the potential of AI in transforming clinical cytogenetics, making strides towards a future where genetic diagnostics are more accessible, accurate, and expedient.