Comparison of ChatGPT and Gemini on Neurosurgical Evaluation Questions

Abstract

Introduction: Artificial intelligence (AI) is a transformative technology that emulates human intelligence through computer systems. ChatGPT and Gemini are two prominent conversational AI models developed by OpenAI and Google AI, respectively. We sought to evaluate the appropriateness of the responses of ChatGPT 4.0 and Gemini to neurosurgical evaluation questions. Methods: The study used 40 questions randomly selected from the board exam prepared by an internationally recognized neurosurgical institution. The free version of the Gemini AI system developed by Google and the free version of the ChatGPT 4.0 multilingual algorithm system developed by OpenAI were tested. Questions in four different categories were posed to both AI programs in different orders on 10 different days. The responses were recorded, and the correct and incorrect answers were reported. Results: ChatGPT 4.0 was significantly more successful than Gemini on the anatomical and radiological evaluation questions (p<0.001). In the answers given by the Gemini AI system, a significant difference was found between the question categories (F=26.33, p<0.001). Conclusion: In our study, ChatGPT 4.0 was more successful than Gemini in the anatomical and radiological evaluation of neurosurgical questions.