Table 2 Summary of the included studies

From: Performance of artificial intelligence on cervical vertebral maturation assessment: a systematic review and meta-analysis

Author, year

Data modality

Data set size (train/valid/test)

Inclusion and exclusion criteria (if any)

AI type

Labeling procedure

Pre-processing

Augmentation

Model structure

Performance measurements

Outcome

Akay, G., et al. 2023 [27]

Cephalograms

588 (447/141)

Inclusion: patients between 8–22 years, clear C2, C3 and C4, images with no artifacts and distortions

Deep learning

By two dentomaxillofacial radiologists; final decision by agreement among observers

Cropping and labeling, size reduction

NA

CNN

Kappa coefficient

0.88

Precision (per class)

(0.47–0.82)

Recall (per class)

(0.37–0.74)

F1 score (per class)

(0.44–0.76)

Accuracy

0.57

Khazaei M. et al. 2023 [28]

Cephalograms

1846 (1477/-/369)

Inclusion: age between 5–18 years, clear C2, C3 and C4, no trauma or surgery in head and neck area, no orthodontic treatment, no medical condition affecting bone development, no systemic disease, no growth delay, no craniofacial anomalies, no growth disorders and no growth hormone therapy

Deep learning

By two orthodontists

Cropping

Translation, rotation, zoom, intensity shift, normalization

ConvNextBase-296

Accuracy

0.82

F-score

0.81

EfficientNetB3-386

Accuracy

0.81

F-score

0.80

DenseNet-121

Accuracy

0.80

F-score

0.80

DenseNet-169

Accuracy

0.67

F-score

0.66

VGG-16

Accuracy

0.75

F-score

0.74

VGG-19

Accuracy

0.68

F-score

0.67

ResNet-101

Accuracy

0.65

F-score

0.65

ResNet50

Accuracy

0.65

F-score

0.65

Li H., et al. 2022 [29]

Cephalograms

10,200 (7111/1544/1545)

Inclusion: no congenital or acquired malformation of the cervical vertebrae, no trauma or operation in the head and neck area, no disorder affecting bone development, no systemic disease, no growth and development retardation, no congenital or acquired malformations in the head and neck region, clear C2, C3, and C4

Deep learning

By two orthodontists; in case of disagreement, a third orthodontist was consulted

Automatic ROI extraction using YOLOv3 and shape recognition network

NA

ConvNet

Precision (per class)

(0.57–0.85)

Recall (per class)

(0.63–0.81)

F1 score (per class)

(0.60–0.81)

Accuracy

0.70

AUC

0.94

ICC

0.94

Makaremi, M., et al. (2019) [30]

Cephalograms

600 (300/ 200/ 100) and 900 and 1900

NA

Deep learning

By a radiologic technician

Cropping, Sobel filtering, entropy filter

NA

CNN

Precision (per class) 6 layers

(0.59–0.99)

Recall (per class) 6 layers

(0.67–0.99)

F1 score (per class) 6 layers

(0.74–0.92)

Recall (per class) 7 layers

(0.67–0.99)

Precision (per class) 7 layers

(0.59–0.99)

F1 score (per class) 7 layers

(0.74–0.92)

Zhou, J., et al. 2021 [21]

Cephalograms

1080 (980/ -/ 100)

Inclusion: clear contour of C2, C3 and C4, age 6–22 years; Exclusion: congenital disease

Deep learning

By two examiners; in case of disagreement, a third examiner was consulted

Cropping, extracting and crafting the features (measurement between landmarks)

NA

CNN

ICC

0.98

Accuracy

0.71

Kim, E.-G et al. 2021 [31]

Cephalograms

600 (fivefold cross validation)

Inclusion: 6–18 years

Deep learning

By two specialists

Automated ROI extraction using U-net

Rotation, horizontal and vertical flip, changes in brightness, saturation, contrast and hue

CNN

Accuracy

0.62

Makaremi, M., et al. (2020) [32]

Cephalograms

600 (300/ 200/ 100) and (200/200/200)

NA

Deep learning

By an expert

Cropping, Sobel filtering

NA

CNN

Accuracy

0.90

Kok, H., et al. 2020 [33]

Cephalograms

360 (fivefold cross validation)

Inclusion: 8–17 years; Exclusion: disease preventing bone development, systemic diseases and syndromes, growth and development retardation, anomalies preventing craniofacial growth, endocrine disorders or malnutrition, long-term infectious disease

Deep learning, Machine learning

By an orthodontist

Extracting and crafting the features (measurement between landmarks)

NA

ANN

Accuracy

0.94

Precision (per class)

(0.83–1.0)

Recall (per class)

(0.83–1.0)

F1 score (per class)

(0.83–0.97)

Kappa coefficient

0.95

Naïve Bayes model

Accuracy

0.68

Kappa coefficient

0.61

Precision (per class)

(0.25–1.0)

Recall (per class)

(0.05–1.0)

F1 score (per class)

(0.08–0.90)

Kok, H., et al. 2021 [34]

Cephalograms

419 patients (293/ 63/63)

Inclusion: 8–17 years

Deep learning

By an experienced researcher

Extracting and crafting the features (measurement between landmarks)

NA

ANN

Accuracy

0.94

Sensitivity (per class)

(0.88–1.0)

Specificity (per class)

(0.97–1.0)

F1 score (per class)

(0.90–1.0)

Amasya, H., et al. 2020 [35]

Cephalograms

647 (498/ -/ 149)

Inclusion: no congenital or acquired malformation of the cervical vertebrae, proper visualization of C2, C3, C4 and C5, age between 10 and 30

Deep learning, Machine learning

By software and two radiologists

Extracting and crafting the features (measurement between landmarks)

NA

ANN

Agreement

0.86

Kappa coefficient (wk)

0.92

LR

Agreement

0.78

Kappa coefficient (wk)

0.86

SVM

Agreement

0.81

Kappa coefficient (wk)

0.87

RF

Agreement

0.82

Kappa coefficient (wk)

0.90

DT

Agreement

0.85

Kappa coefficient (wk)

0.92

Amasya H. et al. 2020 [36]

Cephalograms

647

Inclusion: age between 10 and 30, no congenital or acquired malformation of the cervical vertebrae, good quality of C2, C3, C4 and C5

Exclusion: current orthodontic treatment, permanent incisors or first molars missing, erupted or supernumerary teeth overlapping incisor apex, obvious skeletal asymmetry

Deep learning

By software and two radiologists

NA

NA

ANN

Agreement with observers

0.58

Mohammad-Rahimi, H., et al. 2022 [37]

Cephalograms

890 (692/ 99/ 99)

Inclusion: cephalograms with visible C2 to C4; Exclusion: images of patients wearing items, non-standard images, low quality images

Deep learning

By two orthodontists

Cropping

Random cropping, random color jitter, random affine, random Gaussian noise

ResNet 101

Accuracy

0.61

Precision (per class)

(0.25–0.88)

Recall (per class)

(0.33–0.78)

F1 score (per class)

(0.29–0.82)

Liao, N., et al. 2022 [38]

Cephalograms

900 (fivefold cross-validation)

Inclusion: 7–25 years

Deep learning

By three orthodontists and radiologists

Cropping

Random horizontal flipping, color jittering, random rotation

Resnet 50- iCVM

Accuracy (CVM-900, CVM-900-subset)

(0.69, 0.84)

Kappa coefficient (CVM-900, CVM-900-subset)

(0.94, 0.96)

MAE (CVM-900, CVM-900-subset)

(0.33, 0.16)

Li, H., et al. 2022 [39]

Cephalograms

6079 (4255/912/ 912)

Inclusion: complete medical record, qualified cephalograms, age less than 18 years old; Exclusion: syndromes, metabolic disease, special drugs, disease affecting growth and development

Deep learning

By two experienced orthodontists; in case of disagreement, a third orthodontist was consulted

Cropping

Random translation, random rotation, adaptive histogram equalization

ResNet-152

Kappa coefficient

0.82

AUC

0.93

Accuracy

0.67

Precision (per class)

(0.52–0.77)

Recall (per class)

(0.52–0.84)

F1 score (per class)

(0.52–0.81)

VGG16

Kappa coefficient

0.79

AUC

0.92

Accuracy

0.61

GoogLeNet

Kappa coefficient

0.81

AUC

0.92

Accuracy

0.64

DenseNet161

Kappa coefficient

0.81

AUC

0.92

Accuracy

0.64

Seo, H., et al. 2021 [24]

Cephalograms

600 (480/ -/120)

Inclusion: 6–19 years

Deep learning

By a radiologist

Cropping

Rotation, horizontal and vertical translation, horizontal and vertical scaling

Inception-Resnet v2

Accuracy

0.94 ± 0.018

Precision

0.84 ± 0.064

Recall

0.84 ± 0.061

F1 score

0.84 ± 0.051

ResNet-18

Accuracy

0.92 ± 0.025

Precision

0.80 ± 0.094

Recall

0.80 ± 0.065

F1 score

0.80 ± 0.074

MobileNet-v2

Accuracy

0.91 ± 0.022

Precision

0.77 ± 0.111

Recall

0.77 ± 0.040

F1 score

0.77 ± 0.070

ResNet-50

Accuracy

0.92 ± 0.025

Precision

0.80 ± 0.096

Recall

0.80 ± 0.068

F1 score

0.80 ± 0.075

ResNet-101

Accuracy

0.93 ± 0.020

Precision

0.82 ± 0.113

Recall

0.83 ± 0.096

F1 score

0.82 ± 0.054

Inception-v3

Accuracy

0.93 ± 0.027

Precision

0.82 ± 0.119

Recall

0.83 ± 0.100

F1 score

0.82 ± 0.082

Atici, S. F., et al. 2023 [40]

Cephalograms

1012 (823/-/189)

Inclusion: clear and visible C2, C3 and C4; Exclusion: abnormalities of head and neck, low image quality

Deep learning

By an orthodontist

Segmentation and cropping

Random translation, rotation and auto contrast

Aggregate net

Accuracy

(male: 0.75, female: 0.82)

Intra-examiner agreement (wk)

0.95

Inter-examiner agreement (wk)

0.90

Atici, S. F., et al. 2022 [25]

Cephalograms

1018 (761/ -/ 257)

Inclusion: age between 4 and 29, adequate quality, clear C2/C3/C4; Exclusion: poor quality, head and neck malformation

Deep learning, Machine learning

By an expert orthodontist scientist and by an oral and maxillofacial surgeon

Automatic ROI extraction by Aggregate channel features object detector

Not needed

CNN

Accuracy

0.84

Intra-examiner agreement (wk)

0.95

Inter-examiner agreement

0.90

Recall (per class)

(0.52–0.77)

Precision (per class)

(0.55–0.78)

F1 score (per class)

(0.55–0.76)

MobileNet V2

Accuracy (with directional filters)

0.69

ResNet101

Accuracy (with directional filters)

0.68

Xception

Accuracy (with directional filters)

0.71

SVM

Accuracy (with directional filters)

0.60

Radwan, M., et al. 2022 [41]

Cephalograms

1501 (1201/150/150)

Inclusion: patients between 7–25 years; Exclusion: artifacts, incomplete C2, C3 or C4, syndromes affecting the maxillofacial region, incorrect head position

Deep learning

By an orthodontic resident

Automatic ROI extraction using U-net

NA

AlexNet

ICC

0.97

Kappa coefficient

0.87 ± 0.027

Accuracy (per class)

(0.80–0.91)

Sensitivity (per class)

(0.45–0.98)

Specificity (per class)

(0.75–0.94)

F1 score (per class)

(0.57–0.90)

Xie, L., et al. 2021 [42]

CBCT

231

Inclusion: no history of systemic or physiological disorders, no history of trauma or surgery in the dentofacial region and reliable CBCT scans, female, 7–17 years old

Machine learning

By three orthodontists

Reorientation, MPR mode, extracting and crafting the features (measurement between landmarks)

NA

LR

Accuracy

0.87

AUC

0.94

Kok, H., et al. 2019 [13]

Cephalograms

300 (fivefold cross validation)

Inclusion: 8–17 years, balance quality, clear C2/C3/C4, no trauma, operation, congenital or acquired malformations in the head and neck area, no history of orthodontic treatment, no disorder interfering with bone development, no systemic disease or growth and development retardation

Deep learning, Machine learning

By an orthodontist

Extracting and crafting the features (measurement between landmarks)

NA

DT

Classification accuracy (per class)

(0.83–0.99)

AUC (per class)

(0.71–0.98)

F1 score (per class)

(0.42–0.97)

Precision (per class)

(0.40–0.97)

Recall (per class)

(0.40–0.98)

kNN

Classification accuracy (per class)

(0.81–0.92)

AUC (per class)

(0.81–0.95)

F1 score (per class)

(0.44–0.82)

Precision (per class)

(0.48–0.78)

Recall (per class)

(0.38–0.86)

SVM

Classification accuracy (per class)

(0.88–0.95)

AUC (per class)

(0.90–0.99)

F1 score (per class)

(0.50–0.91)

Precision (per class)

(0.51–0.84)

Recall (per class)

(0.50–0.98)

RF

Classification accuracy (per class)

(0.83–0.97)

AUC (per class)

(0.84–0.99)

F1 score (per class)

(0.39–0.95)

Precision (per class)

(0.40–0.91)

Recall (per class)

(0.38–0.98)

Neural network

Classification accuracy (per class)

(0.85–0.97)

AUC (per class)

(0.90–0.99)

F1 score (per class)

(0.48–0.95)

Precision (per class)

(0.47–0.93)

Recall (per class)

(0.50–0.97)

Naïve Bayes

Classification accuracy (per class)

(0.83–0.95)

AUC (per class)

(0.85–0.98)

F1 score (per class)

(0.38–0.88)

Precision (per class)

(0.44–0.92)

Recall (per class)

(0.33–0.85)

LR

Classification accuracy (per class)

(0.81–0.90)

AUC (per class)

(0.81–0.96)

F1 score (per class)

(0.25–0.75)

Precision (per class)

(0.36–0.75)

Recall (per class)

(0.19–0.98)

Sokic, E., et al. 2012 [43]

Cephalograms

211

Inclusion: 8–16 years

Machine learning

By orthodontists

Prescaling, bilinear projective transformation, special markers, extracting and crafting the features (measurement between landmarks)

NA

Fuzzy C means

Accuracy

0.70

Xie, L., et al. 2022 [44]

CBCT

709 (447/-/262)

Inclusion: 7–19 years, no history of systemic or physiological syndromes, no history of trauma or surgery in the dentofacial area, dependable and suitable CBCT scans

Statistical modeling

By three orthodontists

Reorientation, MPR mode, extracting and crafting the features (measurement between landmarks)

NA

LR

Agreement percentage

0.88

Kappa coefficient

0.90

AUC

0.96

ICC (range)

(0.94–0.99)

Yang, Y. M., et al. (2014) [45]

CBCT

121

Inclusion: 6–18 years; Exclusion: cleft lip and/or palate, trauma, or syndromes

Statistical modeling

By an investigator

NA

NA

Regression models

\(R^{2}\) (female–male)

(0.84–0.9)

Baptista, R. S., et al. 2012 [46]

Cephalograms

188 (tenfold cross validation)

NA

Machine learning

By a specialist in orthodontics and radiology and a specialist in orthodontics, then by an examiner using software

Extracting and crafting the features (measurement between landmarks), cropping

NA

Naïve Bayes 1

Kappa coefficient

0.99 ± 0.019

Accuracy

0.90

Feng, X., et al. (2021) [47]

Cephalograms and CBCT

60

Inclusion: 8–16 years, in the age of growth and development; Exclusion: unclear CBCT, incomplete C2 and C4, history of craniofacial deformity, syndrome affecting the shape of the cervical spine, intense systemic STDs

Rule-based AI

By a researcher with three years of experience in CVM assessment

Otsu’s method, three-dimensional least squares method, superpixel segmentation, marking the selected points automatically with a morphological algorithm and manually, extracting and crafting the features (measurement between landmarks)

NA

Decision Tree

Kappa coefficient

0.87

Gamma value

0.99

NA Not assigned, CNN Convolutional neural network, ICC Intraclass correlation coefficient, ROI Region of interest, ANN Artificial neural network, MAE Mean absolute error, CVM Cervical vertebral maturation, AUC Area under curve, wk Weighted kappa, MPR Multiplanar reformation, CBCT Cone beam computed tomography, STD Sexually transmitted diseases, AI Artificial intelligence, LR Logistic regression, DT Decision tree, RF Random forest, SVM Support vector machine, kNN k-nearest neighbors
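
As a reading aid for the "Performance measurements" column, the following minimal Python sketch shows how accuracy, per-class precision, recall and F1 score, and an unweighted Cohen's kappa can be derived from a multi-class confusion matrix. The matrix `toy_cm` and all function names are hypothetical and are not taken from any included study. Several studies above report a weighted kappa (wk), which this sketch does not implement.

```python
from typing import Dict, List

Matrix = List[List[int]]  # cm[i][j] = number of cases of true class i predicted as class j


def accuracy(cm: Matrix) -> float:
    """Share of all cases that lie on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total


def per_class_metrics(cm: Matrix) -> List[Dict[str, float]]:
    """Precision, recall and F1 score for each class, one class against the rest."""
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append({"precision": precision, "recall": recall, "f1": f1})
    return results


def cohen_kappa(cm: Matrix) -> float:
    """Unweighted Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = accuracy(cm)
    p_chance = sum(
        sum(cm[k]) * sum(cm[i][k] for i in range(n)) for k in range(n)
    ) / total ** 2
    if p_chance == 1.0:
        return float("nan")  # undefined when every case falls in a single class
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical three-stage example; rows are reference labels, columns are model output.
toy_cm = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 3, 7],
]

if __name__ == "__main__":
    print("accuracy:", round(accuracy(toy_cm), 3))
    print("kappa:", round(cohen_kappa(toy_cm), 3))
    for stage, m in enumerate(per_class_metrics(toy_cm), start=1):
        print(stage, {name: round(value, 3) for name, value in m.items()})
```

For `toy_cm`, 22 of 30 cases are on the diagonal and chance agreement is 1/3, so kappa is 0.6. The per-class values differ between stages, which is why the table reports them as ranges.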