Evaluating the Performance of ChatGPT on Board-Style Examination Questions in Ophthalmology: A Meta-Analysis
