LLM | Case-Based: Correct | Case-Based: Incorrect | Case-Based: P* | Knowledge-Based: Correct | Knowledge-Based: Incorrect | Knowledge-Based: P* | P** |
---|---|---|---|---|---|---|---|
Gemini 1.5 | 25 | 4 | 0.034 | 56 | 15 | 0.000 | 0.292 |
Gemini 2 | 24 | 5 | | 58 | 13 | | 0.574 |
Copilot | 22 | 7 | | 39 | 32 | | 0.041 |
Deepseek | 21 | 8 | | 61 | 10 | | 0.098 |
Claude | 26 | 3 | | 58 | 13 | | 0.252 |
ChatGPT 4o | 22 | 7 | | 57 | 14 | | 0.404 |
ChatGPT 4 | 17 | 12 | | 52 | 19 | | 0.117 |
ChatGPT o1 | 27 | 2 | | 69 | 2 | | 0.330 |
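
The table reports raw counts only, so accuracy rates have to be derived from them. The following is a minimal sketch of that arithmetic, not the authors' analysis: the counts are copied from the table, the question totals (29 case-based, 71 knowledge-based) are inferred from the row sums, and the names `counts`, `accuracy`, and `accuracies` are hypothetical. It does not attempt to reproduce the P* or P** values, since the tests behind them are not stated here.

```python
# Counts copied from the table above, per model:
# (case-based correct, case-based incorrect,
#  knowledge-based correct, knowledge-based incorrect)
counts = {
    "Gemini 1.5": (25, 4, 56, 15),
    "Gemini 2": (24, 5, 58, 13),
    "Copilot": (22, 7, 39, 32),
    "Deepseek": (21, 8, 61, 10),
    "Claude": (26, 3, 58, 13),
    "ChatGPT 4o": (22, 7, 57, 14),
    "ChatGPT 4": (17, 12, 52, 19),
    "ChatGPT o1": (27, 2, 69, 2),
}


def accuracy(correct, incorrect):
    """Return the percentage of correct answers, rounded to one decimal."""
    return round(100 * correct / (correct + incorrect), 1)


# Map each model to (case-based accuracy %, knowledge-based accuracy %).
accuracies = {
    model: (accuracy(cc, ci), accuracy(kc, ki))
    for model, (cc, ci, kc, ki) in counts.items()
}

for model, (case_acc, know_acc) in accuracies.items():
    print(f"{model:<11} case-based {case_acc:5.1f}%  knowledge-based {know_acc:5.1f}%")
```

Read this way, Copilot's gap between the two question types (75.9% versus 54.9%) is the widest in the table, and it is the only row whose P** falls below 0.05.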