Introduction

The use of information technology (IT) in the dental field has increased significantly over the past 25 years and has helped reduce cost, time, dependence on human expertise, and medical errors. As a subfield of computer science, artificial intelligence (AI) encompasses both hardware and software that can perceive its environment and take action that maximizes its chances of successfully achieving its goals [1,2,3,4]. Although developments in AI had started in 1943 [5], it was only in 1956 that the term was coined by John McCarthy and adopted during a meeting at Dartmouth College [6]. AI allows examination, organization, representation, and cataloging of medical information, and its robust pattern finding and prediction algorithms are helping to drive discoveries across all sciences [7]. In 2019, Morgan Stanley estimated that the global market for AI in healthcare could surge from $1.3 billion to $10 billion by 2024, growing at an annual compound rate of 40% [8].

While AI is a broad term that includes various classifications, from an algorithmic perspective there are two main categories: symbolic AI and machine learning. Symbolic AI is a collection of techniques based on structuring the algorithm in a human-readable symbolic manner. This category was the paradigm of AI research until the late 1980s and is widely known as GOFAI (good old-fashioned AI) [9]. Symbolic AI is still used for solving problems in which the possible outcomes are limited, computational power is scarce, or human explainability is essential. However, in healthcare, where problems tend to be complex, not always fully understood, and have many explanatory variables, building a model based on a limited set of rules is extremely difficult, if not impossible [10].

Machine learning (ML), a term first coined by Arthur Samuel in 1952, is the current paradigm. The fundamental difference between ML and symbolic AI is that, in ML, the models learn from examples rather than from a set of rules established by a human [7]. By utilizing a mixture of statistical and probabilistic tools, machines can learn from previous models and improve their actions when new data is introduced. This could be in the form of predictions, identifying new patterns, or classifying new data. ML can be categorized into three types, depending on the type of learning of the algorithm and the chosen outcome: supervised learning (used for classification or prediction based on a known outcome), unsupervised learning (finding hidden patterns and structures with unknown outcomes), and reinforcement learning (the machine develops a modified algorithm, based on previous versions, that maximizes the intended reward) [11].
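To make the idea of learning from examples rather than from hand-written rules concrete, the following is a minimal sketch of supervised learning using a one-nearest-neighbour classifier. The training data, feature values, and class labels are invented purely for illustration and are not drawn from any study in this review; the function name `predict` is likewise hypothetical.

```python
import math

# Hypothetical training examples: (feature vector, known outcome).
# All values are invented for illustration; they are not clinical data.
TRAINING = [
    ((1.0, 2.0), "class_A"),
    ((1.5, 1.8), "class_A"),
    ((6.0, 7.0), "class_B"),
    ((6.5, 6.4), "class_B"),
]

def predict(sample, training=TRAINING):
    """Return the label of the training example closest to `sample`."""
    nearest = min(training, key=lambda example: math.dist(example[0], sample))
    return nearest[1]

# No rule for "class_A" or "class_B" is ever written down:
# the prediction comes entirely from the labelled examples.
print(predict((1.2, 2.1)))
print(predict((6.2, 6.9)))
```

Adding further labelled examples to `TRAINING` changes the predictions without any change to the code, which is the practical sense in which such a model improves as new data is introduced.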

Deep learning (DL) is a sub-domain of ML in which the machine itself calculates specific features of a given input. DL’s precursor is the artificial neural network (ANN), which was initially developed in the 1900s. With the exponential increase in computational technology and power, researchers have designed more complicated and “deeper” neural networks to solve more complicated practical problems. These deeper neural networks have become known as “deep learning” [12].

Algorithms used for ML may also be used for data mining. Data mining applies these algorithms to historical data to identify new relationships or patterns and therefore aids practitioners in optimizing decision-making in their daily practice, as well as improving quality of care [13, 14]. Alternatively, if predictions are desired, ML should be utilized. For example, a practitioner can use existing data about a disease to train the machine to calculate predictions about the diagnosis or prognosis of patients who have not yet been seen. Notably, ML predictive models have been shown to have greater accuracy than statistic-based models [15].

In recent years, scoping reviews have become an increasingly adopted approach and have been published across various social sciences and healthcare fields. Orthodontics has been a slow starter on this terrain [16]. Scoping reviews are of particular use when a body of literature has not yet been comprehensively reviewed or exhibits a large, complex, or heterogeneous nature not amenable to a more thorough systematic review [17].

Current orthodontic literature is replete with studies that have documented various applications of AI and ML, utilizing the different types of algorithms mentioned above, albeit in isolation. No study has hitherto attempted to systematically organize the existing literature to review existing AI and ML applications in orthodontics, classify the types of algorithms applied, and provide a comprehensive mapping of studies conducted in this field. Hence, this scoping review aims to provide an overview of the existing evidence of how far the earlier AI and ML advancements in orthodontics have translated into clinical fruition and of the limitations that have precluded their envisioned development. The authors have attempted (1) to chart the evolution of AI in the orthodontic field over the years, (2) to examine the utilization of applications of AI and ML in the field of orthodontics, and (3) to collate the types of artificial intelligence algorithms that have been implemented in orthodontics.

Materials and method

A scoping review of the published literature was performed following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines. This protocol was not registered previously.

The eligibility criteria of the scoping review are outlined in Table 1.

Table 1 Scoping review eligibility criteria

The first author (YMB) conducted an initial literature search in PubMed on 2 July 2020 with the keywords listed in Table 2.

Table 2 Keywords for initial literature search

No restrictions were made concerning year or publication status; however, studies with missing English abstracts were excluded. After eliminating duplicate studies, three of the authors (YMB, AYB, NRV) independently screened the titles and abstracts of the retrieved citations to exclude non-eligible articles based on the study’s eligibility criteria and keywords. A copy of the full text was obtained for the articles considered potentially useful after this selection stage. The same authors then read each full-text article to determine whether it met the inclusion criteria. Additional material about the articles included as an appendix was acquired when needed, and the reference lists of acquired articles were also searched for relevant articles. Any disagreement was resolved by discussion between three authors (YMB, AYB, NRV) until a final consensus was achieved.

Data extraction

Data extraction was charted according to “PICO” guidelines and can be found in the supplementary materials. The collected information included the author names, year of publication, country of origin, whether the study was published in an orthodontic specialty journal, study population or sample size, intervention type, comparator, the outcome of the intervention, type of AI algorithm employed, and finally a broad-based orthodontic outcome domain.

Results

The initial search strategy resulted in 289 records, 65 of which were excluded as duplicates. Six studies that did not appear in the initial search but were identified through reference lists were subsequently added, resulting in 230 records screened for eligibility. A total of 140 records were excluded because they were irrelevant to the topic of AI/ML applications in orthodontics (n = 108), did not feature an English abstract (n = 17), or because their full-text articles could not be retrieved (n = 15). Hence, 90 full-text articles were evaluated for eligibility, of which case series or case reports (n = 3), personal opinions/descriptive papers/interviews (n = 19), and technique articles/proof of concept (n = 6) were excluded, resulting in 62 articles that fulfilled the inclusion criteria of this study (Fig. 1 illustrates the PRISMA-ScR flowchart for the scoping review). The details of the studies included in the scoping review are listed in Supplementary table 1.
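The record counts in the selection process can be checked with simple arithmetic. The short sketch below reproduces the flow using only the figures reported in the paragraph above; the variable names are illustrative and do not correspond to any tool used by the authors.

```python
# Counts as reported in the Results paragraph above.
identified = 289
duplicates = 65
added_from_references = 6

# Records screened after de-duplication and reference-list additions.
screened = identified - duplicates + added_from_references

excluded_at_screening = {
    "irrelevant to AI/ML in orthodontics": 108,
    "no English abstract": 17,
    "full text not retrievable": 15,
}
full_text_assessed = screened - sum(excluded_at_screening.values())

excluded_at_full_text = {
    "case series or case reports": 3,
    "opinions / descriptive papers / interviews": 19,
    "technique articles / proof of concept": 6,
}
included = full_text_assessed - sum(excluded_at_full_text.values())

print(screened, full_text_assessed, included)  # 230 90 62
```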

Fig. 1 PRISMA flow diagram of the scoping review

With regard to the date of publication, 43 of the 62 studies (69.35%) included in this scoping review were published between 2011 and 2020, 12 studies (19.35%) between 2001 and 2010, and 7 studies (11.29%) between 1991 and 2000 (Supplementary table 2). Results reveal that 11 studies originated from the USA, 9 from South Korea, 7 from China, 6 each from Japan and Italy, 4 from Turkey, 3 each from Brazil, India, Germany, and the UK, 2 each from Switzerland, Mexico, Colombia, and Spain, and 1 study each from Russia, Morocco, Thailand, Iran, Serbia, Singapore, and Australia (Supplementary table 3). Thirty-six of the 62 studies (58%) were published in non-orthodontic journals, whereas 26 studies (42%) were published in orthodontic specialty journals (Supplementary table 4).
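The percentage shares quoted above follow directly from the study counts. As a minimal sketch, using only the counts reported in this paragraph and a hypothetical helper named `shares`, they can be recomputed as follows.

```python
TOTAL_STUDIES = 62

# Counts as reported in the paragraph above.
by_decade = {"1991-2000": 7, "2001-2010": 12, "2011-2020": 43}
by_journal = {"non-orthodontic": 36, "orthodontic specialty": 26}

def shares(counts, total=TOTAL_STUDIES):
    """Return each count as a percentage of `total`, rounded to 2 decimals."""
    return {label: round(100 * n / total, 2) for label, n in counts.items()}

# Both breakdowns should account for all 62 included studies.
assert sum(by_decade.values()) == TOTAL_STUDIES
assert sum(by_journal.values()) == TOTAL_STUDIES

print(shares(by_decade))
print(shares(by_journal))
```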

Artificial neural networks (ANNs) were utilized as the AI/ML algorithm in 13 studies, convolutional neural networks (CNNs) and support vector machines (SVMs) in 9 studies each, and regression in 8 studies; in addition, 23 other algorithms were utilized in various studies. Results classified by the type of AI algorithm employed in each study are listed in Table 3.

Table 3 Studies classified by AI/ML algorithm employed

The study results also helped classify artificial intelligence applications into 4 core domains and a 5th domain that grouped together miscellaneous applications. Table 4 lists the results classified by domain of application of artificial intelligence in orthodontics.

Table 4 Studies classified as per domains of applications of artificial intelligence in orthodontics

Discussion

There has been a steady increase in the number of scoping reviews published in orthodontic literature over the last few years [80, 81]. Through this scoping review, the authors have attempted to organize the existing literature in a systematic manner in order to document existing applications of AI and ML in the field of orthodontics.

The study results reveal that one of the earliest studies to document the utilization of AI in the field of orthodontics was published in 1986 in a non-orthodontic journal; however, this was excluded from this scoping review based on the inclusion criteria [82]. The first study included in this review was published in 1991 [57]; across the periods 1991 to 2000, 2001 to 2010, and 2011 to 2020, there was a progressive increase in publications from 7 to 12 to 43, respectively, clearly indicating an exponential rise in research on this subject due to technological advancements and the continuous digitization of orthodontics.

More of these studies were published in non-orthodontic journals (36) than in orthodontic specialty journals (26). This perhaps reflects the far-reaching applications of AI and ML and points toward the scope for further collaboration among various disciplines in the field of AI and ML.

The distribution of research undertaken in the field of AI in orthodontics shows that the largest numbers of studies originated from the USA (11), South Korea (9), China (7), and Japan (6). Apart from these 4 countries, studies were also found to originate from 17 other countries, reflecting the considerable increase in interest in AI and ML and their subsequent utilization in orthodontics.

This scoping review also examined the types of AI algorithms commonly employed in various studies. The results reveal that artificial neural networks (ANNs) were the most widely utilized AI/ML algorithm (13 studies), followed by convolutional neural networks (CNNs) and support vector machines (SVMs) (9 studies each) and regression (logistic and linear; 8 studies), apart from 23 other algorithms utilized in various studies.

The research question, “what are AI and ML applications in the field of orthodontics?”, yielded five major domains. Each domain was addressed with the PICO framework for literature evaluation, and the domains can be enumerated as (1) diagnosis and treatment planning, either broad based or specific, (2) automated anatomic landmark detection and/or analyses, (3) assessment of growth and development, (4) evaluation of treatment outcome, and finally (5) a miscellaneous category.

Our study results show that the maximum number of publications focused on automated anatomic landmark detection and/or analyses as the major domain of AI utilization, chiefly from lateral cephalograms, more recently from CBCT images, and lastly from frontal cephalograms. The first study in this review to utilize AI for automatic extraction of cephalometric landmarks was published in 1998 in an orthodontic specialty journal [75]. Several studies since then [27, 32, 35, 37, 64, 65, 69, 70, 74, 77, 78] have affirmed greater accuracy of landmark detection, and reduced time and human effort spent on anatomic landmark detection and/or analyses, with AI/ML as compared to traditional methods.

Studies have shown that the angles and lengths of cephalometric analysis predicted by neural networks were not statistically significantly different from those calculated from manually plotted points. Yu et al. [32] proposed a system that exhibited > 90% sensitivity, specificity, and accuracy for vertical and sagittal skeletal diagnosis and concluded that the CNN-incorporated system showed potential for skeletal orthodontic diagnosis without the need for intermediary steps requiring complicated diagnostic procedures. The use of CBCT for cephalometric analysis has now become commonplace. Various studies [31, 58,59,60,61, 76] that have employed AI and ML techniques for automatic landmark detection and analysis have shown that the results obtained are as accurate as, and less time-consuming than, those obtained with manual analysis. At least one study [19] included in this review compared the frontal cephalometric landmarking ability of humans with that of artificial neural networks, and the results showed that ANNs could achieve accuracy comparable to humans in placing cephalometric points, and in some cases surpass the accuracy of inexperienced doctors (students, residents, graduate students).
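Sensitivity, specificity, and accuracy, as reported for diagnostic systems such as the one described above, are conventionally derived from the counts of true and false positives and negatives. The sketch below shows these standard definitions; the counts passed in are hypothetical and are not taken from Yu et al. or any other study in this review.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, and accuracy from confusion counts.

    tp/fn: diseased cases correctly / incorrectly classified.
    tn/fp: non-diseased cases correctly / incorrectly classified.
    """
    sensitivity = tp / (tp + fn)            # share of true cases detected
    specificity = tn / (tn + fp)            # share of non-cases correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy

# Hypothetical counts, chosen only to illustrate the calculation.
sens, spec, acc = diagnostic_metrics(tp=46, fp=3, tn=47, fn=4)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Reporting all three together matters because accuracy alone can look high on an imbalanced sample even when one of sensitivity or specificity is poor.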

The second domain incorporating AI and ML utilization was broadly labeled as diagnosis and treatment planning, with applications intended either for broad-based or specific clinical situations. Diagnosis remains the cornerstone of successful orthodontic treatment. However, thus far, no tools exist to lead patients and clinicians out of the decision-making uncertainty they face, especially when a condition has several possible correct treatment options. Over the years, orthodontists have therefore attempted to create systems that take subjective bias out of diagnostic decision-making.

Expert systems [55,56,57, 79] are one of the earliest and most basic implementations of AI and have been popular for diagnosis and treatment planning in the medical and dental fields. They process the input information and provide solutions based on “if-then” rules. “If-then” based expert systems are limited to the data that existed when the system was created, and regular updates are required to ensure that the outcomes are correct and up to date. Rule-based expert systems have now become obsolete due to the aforementioned limitations and the development of newer technologies such as ML.
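The “if-then” structure of such systems can be illustrated with a minimal sketch. The rules, variable names, and thresholds below are entirely hypothetical placeholders chosen to show the mechanism; they are not taken from any system in the cited studies and do not represent clinical guidance.

```python
# Each rule pairs an "if" condition on the input facts with a "then" conclusion.
# Rule content and thresholds are hypothetical placeholders, not clinical guidance.
RULES = [
    (lambda facts: facts["crowding_mm"] >= 8.0, "flag: severe crowding"),
    (lambda facts: facts["overjet_mm"] >= 6.0, "flag: increased overjet"),
]

def evaluate(facts, rules=RULES):
    """Fire every rule whose condition holds and collect its conclusion."""
    return [conclusion for condition, conclusion in rules if condition(facts)]

# A hypothetical patient record expressed as input facts.
print(evaluate({"crowding_mm": 9.5, "overjet_mm": 3.0}))
```

The sketch also makes the stated limitation visible: the system can only respond to conditions that someone has explicitly encoded in `RULES`, so any new knowledge requires a manual update of the rule base.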

To “extract or not to extract” has been the question in orthodontics since time immemorial, with substantial variability noted between orthodontists’ decisions. Unsurprisingly, the largest number of studies in this domain addressed this decision [18, 23, 25, 42, 46, 53, 61]. Thanathornwong [53] utilized a Bayesian network (BN) for assessment of the need for orthodontic treatment and concluded that the results obtained by the decision support system were comparable with those suggested by expert orthodontists. Wang et al. [42] explored the use of an eye-tracking method to evaluate orthodontic treatment need and treatment outcome objectively from the lay perspective, in comparison with traditional methods. The authors employed support vector machine techniques and concluded that the eye-tracking device was able to objectively quantify the effect of malocclusion on facial perception and the impact of orthodontic treatment on malocclusion from a lay perspective.

Predictions of treatment outcomes in class II and class III patients have also been reported. Auconi et al. developed a system to predict outcomes in untreated class III patients [62]. Unsupervised learning was used to cluster patients as hypermandibular, hyperdivergent, or balanced based on cephalometric variables. The system was then applied to a treated sample, where it showed that all of the unsuccessful cases belonged to either the hypermandibular or the hyperdivergent cluster. The same author [73] also attempted to identify critical peculiarities of class II and class III malocclusions and demonstrated that class II subjects exhibited few highly connected orthodontic features, while class III patients showed more compact network structure characterized by strong co-occurrence of normal and abnormal clinical functional and radiological features. The study concluded network analysis could allow orthodontists to visually evaluate and anticipate the co-occurrence of auxological anomalies during individual craniofacial growth and possibly localize reactive sites for a therapeutic approach to a malocclusion.

A group of researchers has specifically studied the applications of AI and ML for the detection of TMJ osteoarthritis (TMJ-OA) [22, 47, 66, 67] and concluded that a deep learning neural network was the most accurate method for classification of TMJ-OA, allowing disease staging of bony changes. The authors expected their efforts to boost future studies into early detection and patient-specific therapeutic interventions for osteoarthritis, and thereby improve the health of articular joints.

Three studies in this scoping review focused on the assessment of maxillary constriction and/or maxillary canine impactions [51, 54, 71]. Chen et al. [71] developed a machine learning algorithm utilizing the Learning-based multi-source IntegratioN frameworK for segmentation (LINKS) with CBCT images to quantify volumetric skeletal maxilla discrepancies and suggested that palatal expansion could be beneficial for those with unilateral canine impaction, as underdevelopment of the maxilla often accompanies canine impaction in the early teen years. Another study [51] concluded that, among the machine learning methods tested to classify data, the best performance was obtained by the random forest method, with an overall accuracy of 88.3% in predicting canine eruption. The authors performed measurements on routinely acquired 2D radiographic images, found them to be independently related to canine impaction, and showed reliable accuracy in predicting maxillary canine eruption. Bayesian network analysis [54] showed that bilateral impaction was associated with palatal impactions and longer treatments, that the pre-treatment alpha-angle was a determinant of the duration of orthodontic traction, and that the post-treatment periodontal outcome was not related to pre-treatment radiographic variables.

One of the challenges for less experienced orthodontists is the selection of the appropriate treatment modality and appliance. A system was developed to help orthodontists select the appropriate type of headgear [63]. Compared to the selections made by eight expert orthodontists, the system correctly identified the appropriate headgear 95.6% of the time.

Isolated studies included in this scoping review under the domain of diagnosis and treatment planning have also investigated the applications of AI and ML for screening of osteoporosis from panoramic radiographs [34, 41]; assessment of airflow dynamics, prediction of upper airway collapsible sites, and obstructive sleep apnea [40]; prediction of the association between C. difficile infection and major surgery in hospitalized patients [52]; genetic risk assessment for non-syndromic orofacial cleft patients [48]; and prediction of the occurrence of obstructive sleep apnea in patients with Down’s syndrome [72].

The third domain of applicability of AI can be described as the evaluation of orthodontic treatment outcomes, and one of the major areas researched is the effect of orthognathic surgery on facial appearance and age perception [36, 38]. These studies concluded that most patients’ appearance improved with treatment (66.4%), resulting in a younger appearance by nearly 1 year, especially after profile-altering surgery. A similar improvement in facial attractiveness was noted in 74.7% of patients, especially after lower jaw surgery, and the authors concluded that AI might be considered for scoring facial attractiveness and apparent age in orthognathic patients [38].

With regard to the assessment of growth and development and/or evaluation of growth patterns, Spampinato et al. [39] proposed and tested several deep learning approaches to assess skeletal bone age automatically, in what was one of the first studies of automated skeletal bone age assessment to be tested on a public dataset, for all age ranges, races, and genders, and with source code available. Results showed an average discrepancy of 0.8 years between manual and automatic assessment, which the authors considered state-of-the-art performance. A recent study [20] compared seven artificial intelligence algorithms (k-nearest neighbors, Naïve Bayes, decision tree, artificial neural networks, support vector machines, random forest, and logistic regression) to determine the preferred method of cervical vertebrae maturation assessment and concluded that ANN was the most stable and preferred method.

There have been many attempts to aid orthodontists in classifying patient growth patterns [83,84,85]. One of the first uses of ANN in evaluating growth occurred in 1998, when the growth of 43 untreated children was classified based on size and shape changes [30]. Nino-Sandoval et al. utilized a support vector machine to classify skeletal patterns through craniomaxillary variables but achieved only 74.51% accuracy in correctly distinguishing class II from class III skeletal patterns and vice versa [24, 44].

Applications of AI and ML that could not be described under the above four major domains were grouped under the miscellaneous category in this review. These include automated tooth segmentation from either CBCT images or dental models [33, 43], detection of the activation pattern of tongue musculature [50], and evaluation of the effects of different curing units and light tips on temperature increase during orthodontic bonding [26].

Limitations

Some AI applications may have been missed because of the inclusion criteria or the search terms used, because they were published in a language other than English, and/or because they were not indexed in PubMed.

By their nature, scoping reviews are not expected to utilize a risk of bias tool to assess the methodological strengths of included studies. The overall idea is to explore, at a broad level, what is currently known in a specific area.

Conclusion

  • This scoping review showcases that there has been an exponential increase in the number of orthodontic studies involving various applications of AI and ML over the past three decades.

  • The majority of these studies originated from the USA, followed by South Korea and China.

  • The number of studies published in non-orthodontic journals (36) was found to be greater than the number published in orthodontic specialty journals (26).

  • Artificial neural networks (ANNs) were found to be the most commonly utilized AI/ML algorithm, followed by convolutional neural networks (CNNs) and support vector machine (SVM).

  • The most commonly utilized AI domains were for diagnosis and treatment planning (33 studies), automated anatomic landmark detection and/or analyses (19), assessment of growth and development (4), and evaluation of treatment outcome (2).