Table 1 Performance of large language models on DUS oral pathology questions

From: Artificial intelligence performance in answering multiple-choice oral pathology questions: a comparative analysis

 

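| Model      | Correct | Incorrect | P*    |
|------------|---------|-----------|-------|
| Gemini 1.5 | 81      | 19        | 0.000 |
| Gemini 2   | 82      | 18        |       |
| Copilot    | 61      | 39        |       |
| Deepseek   | 82      | 18        |       |
| Claude     | 84      | 16        |       |
| ChatGPT 4o | 79      | 21        |       |
| ChatGPT 4  | 69      | 31        |       |
| ChatGPT o1 | 96      | 4         |       |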

*Pearson chi-square test. The statistical significance level was set at P ≤ 0.05.
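
As a rough illustration of the test named in the footnote, the sketch below computes a Pearson chi-square statistic from the counts in Table 1. It assumes the single reported P value comes from one overall comparison of all eight models (an 8 × 2 table of correct and incorrect answers); the source does not spell this out, and the helper name `pearson_chi_square` is hypothetical, not taken from the article.

```python
# Counts copied from Table 1: (correct, incorrect) for each model.
counts = {
    "Gemini 1.5": (81, 19),
    "Gemini 2": (82, 18),
    "Copilot": (61, 39),
    "Deepseek": (82, 18),
    "Claude": (84, 16),
    "ChatGPT 4o": (79, 21),
    "ChatGPT 4": (69, 31),
    "ChatGPT o1": (96, 4),
}


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a contingency table.

    Hypothetical helper: assumes an overall test of independence
    between model and answer correctness.
    """
    rows = [tuple(row) for row in table]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)

    statistic = 0.0
    for row in rows:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            # Expected count if correctness were independent of the model.
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


stat, dof = pearson_chi_square(counts.values())
print(f"chi-square = {stat:.2f}, df = {dof}")
```

Under that assumption the statistic is about 46.2 with 7 degrees of freedom, well above the 0.05 critical value of roughly 14.07, which would be consistent with a P value that rounds to 0.000.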