Introduction
In recent years, artificial intelligence (AI) has progressed by leaps and bounds in its ability to imitate human cognitive functions. This progress has benefitted the medical and dental community by making it possible to incorporate AI in diagnosis, treatment planning and treatment simulation and, in some instances, in decisions on modes of therapy. AI has steadily emerged as a cutting-edge technology with tremendous potential to change the healthcare landscape. AI is a general term that encompasses several domains, such as machine learning and deep learning, and learning approaches such as supervised, semi-supervised and unsupervised learning.
A broad understanding of AI, machine learning, deep learning and similar domains, as well as methods and their applications in dentistry and orthodontics, is a prerequisite to comprehend contemporary developments in the field and their clinical relevance.
• AI algorithm applications in orthodontic radiology, including two- and three-dimensional imaging, have substantially influenced the speed of automation, replacing human efforts in the location of landmarks and analysis.
• AI has been used in developing software to assess skeletal growth and predict treatment effects in orthodontics.
• AI applications are increasingly utilised to evaluate facial attractiveness, plan surgical procedures and predict outcomes.
• AI applications are substantial in remote monitoring and teledentistry for orthodontic practice.
The foundation of orthodontic treatment lies in careful diagnosis, essential for achieving accurate planning and ensuring precise treatment approaches. Traditionally, training methods relied on established treatment rules and probabilistic approaches, often resulting in limited knowledge transfer and hindered critical thinking. AI can be a valuable learning tool, particularly for orthodontic residents and newly graduated orthodontists embarking on their orthodontic careers.
In the past few years, AI has grown at a rapid pace owing to three major technical advances which have championed the applications of AI in orthodontics:
1. Increase in the use of ‘big data’, which implies the assessment of large datasets.
2. There has been significant growth in the computational power of processors.
3. The development of deep-learning systems has dramatically impacted the advancement of AI in orthodontics.
Historical aspects of AI and current techniques
Alan Turing began his investigation in 1950 to explore whether machines could possess human-level thinking abilities. He introduced the ‘Turing test’ to determine a machine’s intelligence, which has since become a cornerstone of AI. Professor Cahit Arf discussed the possibility of creating a self-thinking machine at a conference in 1958, asserting its feasibility. Turing’s proposal of the test and Arf’s ideas laid the foundation for exploring machines that exhibit human-like intelligence. John McCarthy coined the term artificial intelligence to describe the ability of machines to execute tasks categorised as intelligent.
AI is a field focused on automating the intellectual tasks that are performed by the human brain. Machine learning and deep learning are notable methods for achieving this goal.
Machine learning utilises the learning aspect of AI and works by developing algorithms that best represent a set of data. The term machine learning was coined by Arthur Samuel in 1959; it differs from symbolic AI in that it enables machines to acquire knowledge from data rather than relying on human-devised rules. In classical programming, an algorithm is coded using known features, whereas in machine learning, subsets of the data are used to create an algorithm that may utilise new or different combinations of features and weights from those that can be derived from first principles. The ultimate goal of machine learning is to enable machines to learn from records and find solutions independently without human intervention.
Machine learning contains many types of learning models and techniques, as illustrated in Fig. 98.1 .
Figure 98.1 A flowchart describing a few data science techniques.
Artificial intelligence is classified under data science and consists of classical programming and machine learning. Machine learning contains many types of learning models and techniques, including deep learning and artificial neural networks.
Supervised learning: Three separate datasets are used in supervised learning: (1) a training dataset, (2) a validation dataset and (3) a test dataset. The training dataset is used to train the algorithm to identify the patterns and relationships between the features and the target. The validation dataset is used to fine-tune or refine the algorithm. Finally, the test dataset is used to determine the performance of the algorithm.
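The three-way split described above can be sketched in a few lines of Python. This is only an illustration: the 70/15/15 proportions, the function name `split_dataset` and the fixed random seed are assumptions made for the example, not values taken from the studies discussed in this chapter.

```python
import random

def split_dataset(records, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle records and split them into training, validation and test sets.

    The fractions are illustrative assumptions; whatever remains after the
    training and validation portions becomes the test set.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, validation, test

# Hypothetical example: 100 labelled cephalogram identifiers
train, validation, test = split_dataset(range(100))
print(len(train), len(validation), len(test))
```

Keeping the test set untouched until the very end is what allows it to give an honest estimate of how the algorithm performs on unseen records.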
In an unsupervised learning model, the algorithm is developed to identify the patterns in a dataset and sort individual instances into categories. Unsupervised learning differs from supervised learning in that the patterns, which may or may not exist in a dataset, are not informed by a target and are left for the algorithm to determine.
Some examples of unsupervised learning tasks include clustering, association and anomaly detection.
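As an illustration of clustering, the sketch below implements a basic k-means loop in plain Python. The data points, the choice of two clusters and the hand-picked starting centroids are all hypothetical; k-means is used here simply as a familiar example of an unsupervised method, not as the technique of any particular study.

```python
import math

def k_means(points, initial_centroids, iterations=10):
    """Group points into clusters without using any target labels."""
    centroids = list(initial_centroids)
    k = len(centroids)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignments, centroids

# Hypothetical measurements: two made-up features per patient
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
assignments, centroids = k_means(points, initial_centroids=[points[0], points[3]])
print(assignments)  # → [0, 0, 0, 1, 1, 1]
```

No target is supplied at any point; the two groups emerge only from the distances between the instances.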
Semi-supervised learning models lie between the two previous models, supervised and unsupervised. The semi-supervised learning approach is advantageous for datasets containing both labelled and unlabelled data, which means that all the features are present in the data, but not all instances have associated targets. This model is typically used for medical, dental or orthodontic radiographic images, for which the labelling process can be time-consuming and expensive. With semi-supervised learning, an expert (an orthodontist, dentist or physician) labels a small set of images. These images are used to train the model, which then classifies the rest of the unlabelled images in the dataset. The resultant labelled dataset is then used to prepare the working model, which, in theory, would outperform the unsupervised learning model.
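The workflow in the paragraph above (label a few, train, label the rest, retrain) can be sketched as follows. The nearest-centroid classifier, the single numeric feature and the class names are hypothetical simplifications chosen to keep the example short; a real system would work on image features and a far stronger model.

```python
def centroid_per_class(features, labels):
    """Mean feature value for each class label."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))

# Small expert-labelled set (hypothetical one-dimensional image feature)
labelled_x = [0.10, 0.20, 0.90, 1.00]
labelled_y = ["normal", "normal", "abnormal", "abnormal"]
unlabelled_x = [0.15, 0.30, 0.80, 0.95]

# Step 1: train on the expert labels only.
model = centroid_per_class(labelled_x, labelled_y)
# Step 2: let that model label the remaining images (pseudo-labels).
pseudo_y = [predict(model, x) for x in unlabelled_x]
# Step 3: prepare the working model from the combined, fully labelled dataset.
working_model = centroid_per_class(labelled_x + unlabelled_x, labelled_y + pseudo_y)
print(pseudo_y)  # → ['normal', 'normal', 'abnormal', 'abnormal']
```

The quality of the working model depends on the pseudo-labels being mostly correct, which is why the initial expert-labelled set still matters.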
Reinforcement learning is a model that iteratively modifies its own algorithm, based on the results of previous versions, to enhance its performance. In this model, the human learning experience is replicated in that the model learns from trial and error rather than only from data. The six main techniques of machine learning methods are highlighted in Fig. 98.2.
Figure 98.2 Different types of machine learning methods.
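Learning by trial and error can be illustrated with a very small reinforcement learning sketch. The epsilon-greedy strategy, the three actions and their fixed rewards are assumptions made purely for illustration; they are not drawn from any orthodontic application.

```python
import random

def epsilon_greedy(rewards, steps=500, epsilon=0.2, seed=0):
    """Learn by trial and error which action pays best.

    `rewards` holds a fixed (hypothetical) reward per action. The agent never
    sees this list; it only observes the reward of the action it tries.
    """
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)   # current belief about each action
    counts = [0] * len(rewards)        # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try an action at random.
            action = rng.randrange(len(rewards))
        else:
            # Exploit: repeat the action that currently looks best.
            action = max(range(len(rewards)), key=lambda a: estimates[a])
        reward = rewards[action]
        counts[action] += 1
        # Incremental mean: nudge the estimate towards the observed reward.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

estimates, counts = epsilon_greedy([0.2, 0.8, 0.5])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action, counts)
```

Each new estimate is a revision of the previous one, so performance improves through experience rather than through a labelled dataset.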
Artificial intelligence and its impact on orthodontics
Orthodontic radiology: Cephalometry
The major application of AI in orthodontic radiology has revolved around the automated identification of landmarks in two-dimensional cephalometric images. Many different AI systems have been employed to assess lateral cephalometric images.
The main approaches or techniques for the automated identification of landmarks on cephalograms can be broadly classified into four different categories.
1. Image filtering and knowledge-based systems: Image segmentation and feature extraction.
2. Model-based approach: Pattern matching algorithms.
3. Hybrid systems.
4. Deep learning.
The accuracy of an AI model for automatically locating landmarks is evaluated by the error in its identification of landmarks relative to the actual or “gold standard” location of those landmarks. Errors in the automated location of landmarks have been reported with a standard deviation of >2 mm, which makes such models unsuited for robust clinical applications. Nevertheless, the application of deep learning has resulted in drastic improvement over previous algorithms, and with constant improvement in the technology, the future seems promising for the automated digitisation of lateral cephalograms. An accurate automated cephalometric analysis saves the orthodontist the time spent tracing the cephalogram and assists in rendering an optimum diagnosis and treatment plan instantly.
Knowledge-based systems
Knowledge-based systems utilise the knowledge of human experts to perform decision-making. Such a system typically consists of a knowledge base and an inference engine. The knowledge base is a collection of information about a specific field, for example, the location of the pogonion (Pg) point. The inference engine uses the information in the knowledge base to draw insights and inferences. In some systems, the knowledge from human experts is coded as rules; these are known as rule-based systems. The disadvantage of knowledge-based systems is that experts have to establish new rules to add new landmarks to the model. Additionally, because of the rigidity of the rules, the quality of the input images plays a significant role in how successfully the model locates the landmarks.
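A rule-based system of this kind can be sketched as a table of hand-written rules (the knowledge base) and a small function that applies them (the inference engine). The rules below are deliberately simplified assumptions: they presume a profile facing right, image coordinates in which x increases anteriorly and y increases downwards, and an already-extracted symphysis contour with made-up coordinates.

```python
# Knowledge base: expert rules coded by hand (simplified, illustrative only).
KNOWLEDGE_BASE = {
    # Pogonion: most anterior point of the chin contour (largest x).
    "Pogonion": lambda contour: max(contour, key=lambda p: p[0]),
    # Menton: lowest point of the chin contour (largest y in image coordinates).
    "Menton": lambda contour: max(contour, key=lambda p: p[1]),
}

def infer_landmarks(contour, knowledge_base=KNOWLEDGE_BASE):
    """Inference engine: apply every rule in the knowledge base to the contour."""
    return {name: rule(contour) for name, rule in knowledge_base.items()}

# Hypothetical symphysis contour as (x, y) pixel coordinates
contour = [(10, 5), (14, 12), (16, 20), (13, 26), (8, 28)]
landmarks = infer_landmarks(contour)
print(landmarks)  # → {'Pogonion': (16, 20), 'Menton': (8, 28)}
```

The sketch also makes the stated disadvantage visible: every additional landmark requires an expert to write a new entry in `KNOWLEDGE_BASE`, and a noisy contour will be processed by the same rigid rules.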
Pattern-matching algorithms
Pattern-matching algorithms utilise the morphology of structures to perform the automated identification of landmarks. The algorithm works by identifying several structures around the landmark so that it can identify the patterns and match the relation of the landmark to the structures in question. The disadvantage is that this model can only be used for the location of relatively few landmarks. Additionally, structures that are similar in morphology but not located near the landmark can result in identification errors. A combination of a knowledge-based system and pattern-matching algorithms can be used to overcome some of these disadvantages.
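One simple form of pattern matching is template matching, in which a small reference patch is slid across the image and the best-fitting position is reported. The sketch below uses the sum of squared differences as the similarity measure; the tiny image, the template and the function name `best_match` are hypothetical, and published systems use considerably richer morphological models.

```python
def best_match(image, template):
    """Slide the template over the image and return the (row, col) of the
    top-left corner with the lowest sum of squared differences, plus the score."""
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, float("inf")
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            score = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(th)
                for j in range(tw)
            )
            if score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score

# Hypothetical grayscale image containing one copy of the pattern
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 5, 9, 0],
    [0, 0, 9, 5, 0],
    [0, 0, 0, 0, 0],
]
template = [[5, 9], [9, 5]]
position, score = best_match(image, template)
print(position, score)  # → (1, 2) 0
```

If a second region of the image had the same appearance, it would score equally well, which is the source of the identification errors mentioned above.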
Edge-detection techniques, also known as line-detection modules, are applied to digital cephalograms to detect a set of distinct lines within the cephalogram. This process consists of four steps: edge detection, contour segmentation, segment joining and reference line detection. This technique has been shown to provide varying success rates, ranging from 65% for landmarks with inadequate contrast, such as Orbitale, to 100% for distinct landmarks, such as the maxillary incisor tip.
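The first of the four steps, edge detection, can be sketched with a Sobel gradient filter. The Sobel operator is chosen here only as a well-known example; the source does not state which operator the cited systems use, and the synthetic image and threshold are assumptions. The later steps (contour segmentation, segment joining and reference line detection) are not shown.

```python
def sobel_edges(image, threshold):
    """Return a binary edge map for a 2D grayscale image (a list of rows).

    Border pixels are left at 0 because the 3x3 kernels do not fit there.
    """
    gx_kernel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
    gy_kernel = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = gy = 0
            for dr in range(3):
                for dc in range(3):
                    pixel = image[r + dr - 1][c + dc - 1]
                    gx += gx_kernel[dr][dc] * pixel
                    gy += gy_kernel[dr][dc] * pixel
            magnitude = (gx * gx + gy * gy) ** 0.5
            edges[r][c] = 1 if magnitude >= threshold else 0
    return edges

# Synthetic 5x6 image: dark on the left, bright on the right
image = [[0, 0, 0, 9, 9, 9] for _ in range(5)]
edge_map = sobel_edges(image, threshold=10)
for row in edge_map:
    print(row)
```

In this example the interior rows are marked as edges only in the two columns that straddle the dark-to-bright boundary. A low-contrast boundary would produce a smaller gradient magnitude and could fall below the threshold, which is consistent with the lower success rates reported for landmarks with inadequate contrast.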
Hybrid approaches
These approaches utilise different methods, such as image-filtering, knowledge-based systems and model-based approaches, to mitigate the disadvantages of the individual methods. A few landmarks in cephalograms, such as Porion, Gonion and A-point, pose more difficulty than others. In a hybrid approach, the shape of the adjacent structures is first noted, and then reference lines are generated before defining the location of the landmarks. The different types of hybrid approaches and their percentage rates of success in the identification of cephalometric landmarks are shown in Table 98.1.
TABLE 98.1
Accuracy of different approaches for the automated location of landmarks within a mean radial error of 2 mm
| Type of approach | Description | % rates of success |
|---|---|---|
| Combination of random forest regression and sparse shape composition | The first step in this method is landmark detection, which uses a random forest model. The second step in this method is landmark modification by adding shape constraints using shape composition. | 44.11% |
| Data-driven image displacement estimation method | In this method, the training set is modelled by manually sampling several square patches around the true landmark location. Then, the calculations of the displacements to the landmarks of these square patches are included in the training data. In the test set, the square patches are generated automatically, and the combined response of all such patches is used to calculate the automated location of landmarks. | 43.89% |
| Automatic globally optimal pictorial structures with random decision forest | The intensity of the pixels, the spatial location of pixels and the filtered response are utilised in this method. In the training set, the features are manually identified around the actual landmark location as true positive samples and other locations as true negative samples. | 62.32% |
| Machine learning tree-based approach | In a machine learning tree-based approach, landmark recognition is treated as a binary classification problem. A pixel is identified as a positive class if it falls in the input range of the landmark. Otherwise, it is identified as a negative class. The final landmark location is detected by identifying the median location of the pixels that are classified as positive with the highest confidence. | 70.26% |
| Game theory and random forest technique | In this model, the intensity of landmarks is identified by Haar-like features, and then the random forests model is used to identify the location of landmarks. The spatial associations between the landmarks are then identified by using Gaussian kernel density estimation. | 72.74% |
Based on data derived from the cited references.
Performance evaluation
Almost none of the techniques listed here can reach the desired accuracy for optimal orthodontic diagnosis. The errors in the identification of cephalometric landmarks should be around 0.59 mm on the x-axis (horizontal error) and 0.56 mm on the y-axis (vertical error). In the current orthodontic literature, a difference of up to 2 mm between the AI model and the human operator is deemed successful by most authors, and an error of up to 4 mm is considered acceptable (Fig. 98.3).
Figure 98.3 The radius of error in the automated location of landmarks: 2 mm is identified as successful; 2–4 mm is identified as acceptable and >4 mm is identified as unacceptable.
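The grading scheme above can be expressed as a short calculation. In this sketch the radial error is taken as the straight-line distance between the predicted and reference landmark positions in millimetres; treating the 2 mm and 4 mm boundaries as inclusive is an assumption, and the coordinates are invented for the example.

```python
import math

def radial_error(predicted, reference):
    """Straight-line distance (mm) between predicted and reference positions."""
    return math.dist(predicted, reference)

def grade(error_mm):
    """Grade an error using the 2 mm and 4 mm radii (boundaries assumed inclusive)."""
    if error_mm <= 2.0:
        return "successful"
    if error_mm <= 4.0:
        return "acceptable"
    return "unacceptable"

def success_rate(errors, threshold_mm=2.0):
    """Percentage of landmarks located within the threshold radius."""
    return 100.0 * sum(e <= threshold_mm for e in errors) / len(errors)

# Hypothetical (predicted, reference) landmark coordinates in mm
pairs = [
    ((10.0, 10.0), (11.0, 10.0)),
    ((20.0, 20.0), (23.0, 20.0)),
    ((30.0, 30.0), (33.0, 34.0)),
]
errors = [radial_error(p, ref) for p, ref in pairs]
grades = [grade(e) for e in errors]
print(errors)   # → [1.0, 3.0, 5.0]
print(grades)   # → ['successful', 'acceptable', 'unacceptable']
print(round(success_rate(errors), 2))  # → 33.33
```

A percentage computed in this way, at a 2 mm radius, is the kind of figure reported as a rate of success for automated landmark location.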
Deep learning
Deep learning includes AI models in which algorithms with several layers of processing nodes are trained on a large amount of data to identify complex patterns and structures in labelled or unlabelled data. The term deep neural network implies an artificial neural network (ANN) with two or more hidden layers. In deep learning, ‘deep’ refers to the multiple layers through which the data are transformed.
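The idea of data being transformed through successive layers can be illustrated with a forward pass through a tiny network with two hidden layers. The weights below are hand-picked for readability rather than learned, and the ReLU activation and layer sizes are assumptions made for the example; training is not shown.

```python
def relu(values):
    """Non-linear activation: negative values become zero."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One layer of nodes: each node outputs a weighted sum of its inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def forward(x, layers):
    """Pass the input through every layer; the output of one layer feeds the next."""
    activation = x
    for index, (weights, biases) in enumerate(layers):
        activation = dense(activation, weights, biases)
        if index < len(layers) - 1:      # hidden layers use the activation
            activation = relu(activation)
    return activation

# Hand-picked (not learned) weights: two hidden layers and one output layer
layers = [
    ([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]], [0.0, 0.0, 0.0]),  # hidden layer 1
    ([[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]], [0.0, 0.0]),         # hidden layer 2
    ([[0.5, 2.0]], [1.0]),                                     # output layer
]
output = forward([1.0, 2.0], layers)
print(output)  # → [2.5]
```

With two hidden layers, this network meets the minimum depth given in the definition above; practical networks have many more layers and nodes, with weights learned from data.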
The main aim of deep neural networks is to acquire high-level features from the input data. Training uses several approaches: self-improvement, supervised, unsupervised and/or semi-supervised training from the input datasets. The output obtained from each layer is used as input for the subsequent layer. The output from the last of these layers is used as input for a final supervised layer, which is used for fine-tuning and training the whole network from the initial layer to the final layer. Table 98.2 shows the four main principles on which deep learning image analysis depends: the four V’s.
TABLE 98.2
The four main principles (4 V’s) for image analysis based on deep learning
| Principle | Description |
|---|---|
| Volume | A high volume of data is needed to build a robust model. |
| Variety | This refers to the heterogeneity of the data. Variety is crucial for minimising errors. |
| Veracity | Veracity is influenced by two main variables: (i) image quality, which includes resolution, brightness, contrast and other features, and (ii) label quality, which relies on the accuracy of the experts labelling the data. |
| Velocity | Velocity refers to the processing capability of the processor, for example, the graphics processing unit (GPU). |
Convolutional neural networks
A convolutional neural network (CNN) is a type of deep learning system used to analyse images by assigning significance to various aspects of the image and distinguishing between them. The term convolution refers to both the result function and the computing process. It excels in image processing by learning different features of an image and is more adept at handling image complexity compared to regular classification algorithms.
CNN is modelled on the lines of the human neural network and consists of an abstracted model of interconnected neurons whose linkages and arrangements can be used to solve a wide array of problems. CNN belongs to the multilayer perceptron family of deep learning techniques. A perceptron is a mathematical analogue of a biological neuron. In a human neuron, the dendrites receive electric signals from the axons of other neurons. In a perceptron, these electric signals are represented as numerical values. When multiple perceptron layers are connected, a multilayer perceptron is created, which is trained using a supervised learning technique called backpropagation. CNN has been quite successful in cephalometric image analysis, and one of the prime reasons is its ability to identify specific appearance patterns or image ‘patches’ relating to the location of the anatomical landmarks. CNN utilises a hierarchical arrangement in which image patches are used in the first layer, and the information from the image patches is propagated to the subsequent layers while maintaining the local spatial relationship between them. Usually, the CNN architecture consists of a repetitive application of at least three layers in addition to multiple hidden layers:
(1) Convolution layer: It is the first step in CNN, and its function is to extract specific details, such as gradients or edges, from the image through mathematical transformations.
(2) Non-linear activation layer: The non-linear activation layer allows for backpropagation in a CNN model. It functions to map the input and output signals.
(3) Pooling layer: It summarises the features in a region of a feature map generated by a convolutional layer by decreasing the number of parameters to learn and the amount of computation performed in the network (Fig. 98.4).
Furthermore, data augmentation methods, such as zoom, shear, rotate and elastic transformation, could be performed on the input data images to enlarge the dataset.
Figure 98.4 Convolutional neural network (CNN) architecture for landmark identification.
(A) Input image from which image patches are identified, (B) identification of the pixels, (C) data augmentation, (D) CNN for identification of the landmark Point A, (E) patches classified as Point A, (F) scatter plot and distribution for ‘Point A’, (G) shape-based model refinements, (H) final output of landmark identification.
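The three layer types listed above (convolution, non-linear activation and pooling) can be traced on a toy image in plain Python. The kernel values are hand-picked to respond to a dark-to-bright vertical edge, whereas a real CNN learns its kernels during training; ReLU and 2×2 max pooling are assumed as common choices, and the convolution is implemented without flipping the kernel, as is usual in deep learning software.

```python
def convolve2d(image, kernel):
    """Convolution layer: slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(len(image[0]) - kw + 1)
        ]
        for r in range(len(image) - kh + 1)
    ]

def relu2d(feature_map):
    """Non-linear activation layer: keep positive responses, zero the rest."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool2d(feature_map, size=2):
    """Pooling layer: summarise each size x size region by its maximum."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

# Toy 5x5 image: a bright vertical band on a dark background
image = [[0, 0, 1, 1, 0] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]   # hand-picked: responds to dark-to-bright edges

feature_map = convolve2d(image, kernel)   # 4x4 map of edge responses
activated = relu2d(feature_map)           # negative responses removed
pooled = max_pool2d(activated)            # 2x2 summary
print(feature_map[0])  # → [0, 2, 0, -2]
print(pooled)          # → [[2, 0], [2, 0]]
```

The pooled output is a quarter of the size of the feature map yet still records where the edge was found, which is how pooling reduces the amount of computation in later layers.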
Various methods have been proposed for automated landmark location in lateral cephalograms using AI. The main purpose of these models is their application in orthodontic diagnosis, with stringent emphasis placed on achieving high accuracy in the identification of cephalometric landmarks. The methods include knowledge-based systems, pattern-matching algorithms, edge-detection techniques applied to digital cephalograms and hybrid approaches combining different methodologies. While each approach has its advantages, the main limitation is that none of these methods has achieved an accuracy that can be considered appropriate for clinical use. For that reason, deep learning methods have been investigated in recent years. The initial results from deep learning approaches seem promising, but more research is needed, with studies designed similarly to those of clinically validated models.
Applications of artificial intelligence in cone-beam computed tomography (CBCT)
Automation in the detection of landmarks on volumetric images generated from multidetector computed tomography (MDCT) or CBCT (interpretation and analysis) has recently gained momentum. Automated assessment of the volume of nasopharyngeal and airway spaces has been reported by Neelapu et al. CBCT scans provide a three-dimensional visualisation of the head and neck structures. The analysis and interpretation of CBCT scans are complex and time-intensive and require meticulous attention to detail. Therefore, the automated analysis of CBCT scans would benefit orthodontists and radiologists alike. Numerous techniques have been used for this purpose.
- Feature-based and voxel similarity: A feature-based and voxel similarity-based method for the automated location of landmarks in CBCT scans has been used by Sahidi et al. However, its accuracy has been described as low, with a standard deviation greater than 3.4 mm compared with measurements by human experts.
- Reeb graphs: The Reeb graphs method has shown a higher accuracy of about 90%. In this technique, a three-dimensional mesh database is extracted, the landmarks are localised and the images are transformed into a reference shape model. Even though the Reeb graphs method has been reported to be accurate for most landmarks, it is only somewhat accurate in identifying specific landmarks such as Sella and posterior nasal spine (PNS).
- Hybrid approach: A hybrid approach using active shape models has been developed to increase the accuracy of landmark identification in CBCT. The advantage of this model is that it is time-efficient and has lowered the error to about 2.5 mm. However, the error is still greater than 2 mm, and significant improvements are needed before this technology can be implemented for everyday use.
- CNN: Recently, efforts have been concentrated on analysing CBCT scans using three-dimensional CNN. A typical three-dimensional CNN architecture includes four main components, as shown in Fig. 98.5. The three-dimensional CNN architecture is similar to two-dimensional CNN, but there are a few differences, as highlighted in Table 98.3. A recent technique based on deep geodesic learning has been reported for automated segmentation of the mandible from the CBCT scan and automatic location of landmarks on the segmented mandible. Three essential steps of this technique are:
  1. Application of a unified algorithm combining U-Net and DenseNet (densely connected networks) with a high number of layers, known as Tiramisu, for the segmentation of the mandible.
  2. Application of geodesic learning to locate some important landmarks such as Menton.
  3. Use of long short-term memory (LSTM) to locate landmarks such as B point and Pogonion based on the position of Menton.
  This method has shown good accuracy in an initial study that included 30 CBCT scans of patients. However, further research is needed to evaluate the accuracy of automated segmentation and location of landmarks in CBCT volumes.
-
1.
