The applications of ChatGPT and other large language models in anesthesiology and critical care: a systematic review

Jackson A. ChatGPT one year on: adoption, transformation, regulation; 2024. Available from URL: https://aimagazine.com/machine-learning/chatgpt-one-year-on-adoption-transformation-regulation (accessed March 2025).

Shahzad T, Mazhar T, Tariq MU, Ahmad W, Ouahada K, Hamam H. A comprehensive review of large language models: issues and solutions in learning environments. Discov Sustain 2025; 6. https://doi.org/10.1007/s43621-025-00815-8

Muftić F, Kadunić M, Mušinbegović A, Almisreb AA. Exploring medical breakthroughs: a systematic review of ChatGPT applications in healthcare; 2023. Available from URL: https://www.researchgate.net/publication/370832949_Exploring_Medical_Breakthroughs_A_Systematic_Review_of_ChatGPT_Applications_in_Healthcare#fullTextFileContent (accessed May 2025).

Caruccio L, Cirillo S, Polese G, Solimando G, Sundaramurthy S, Tortora G. Can ChatGPT provide intelligent diagnoses? A comparative study between predictive models and ChatGPT to define a new medical diagnostic bot. Expert Syst Appl 2024; 235: 121186. https://doi.org/10.1016/j.eswa.2023.121186

Omar M, Soffer S, Charney AW, Landi I, Nadkarni GN, Klang E. Applications of large language models in psychiatry: a systematic review. Front Psychiatry 2024; 15: 1422807. https://doi.org/10.3389/fpsyt.2024.1422807

Lucas HC, Upperman JS, Robinson JR. A systematic review of large language models and their implications in medical education. Med Educ 2024; 58: 1276–85. https://doi.org/10.1111/medu.15402

Kim JH, Kim H, Jang JS, et al. Development and validation of a difficult laryngoscopy prediction model using machine learning of neck circumference and thyromental height. BMC Anesthesiol 2021; 21: 125. https://doi.org/10.1186/s12871-021-01343-4

Tavolara TE, Gurcan MN, Segal S, Niazi MK. Identification of difficult to intubate patients from frontal face images using an ensemble of deep learning models. Comput Biol Med 2021; 136: 104737. https://doi.org/10.1016/j.compbiomed.2021.104737

Afshar S, Boostani R, Sanei S. A combinatorial deep learning structure for precise depth of anesthesia estimation from EEG signals. IEEE J Biomed Health Inform 2021; 25: 3408–15. https://doi.org/10.1109/jbhi.2021.3068481

Park Y, Han SH, Byun W, Kim JH, Lee HC, Kim SJ. A real-time depth of anesthesia monitoring system based on deep neural network with large EDO tolerant EEG analog front-end. IEEE Trans Biomed Circuits Syst 2020; 14: 825–37. https://doi.org/10.1109/tbcas.2020.2998172

Tewfik G, Naftalovich R, Kaila J, Adaralegbe A. ChatGPT and its potential implications for clinical practice: an anesthesiology perspective. Biomed Instrum Technol 2023; 57: 26–30. https://doi.org/10.2345/0899-8205-57.1.26

Haltaufderheide J, Ranisch R. The ethics of ChatGPT in medicine and healthcare: a systematic review on large language models (LLMs). NPJ Digit Med 2024; 7: 183. https://doi.org/10.1038/s41746-024-01157-x

Schiavo JH. PROSPERO: an international register of systematic review protocols. Med Ref Serv Q 2019; 38: 171–80. https://doi.org/10.1080/02763869.2019.1588072

Page MJ, McKenzie JE, Bossuyt PM, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Br Med J 2021; 372: n71. https://doi.org/10.1136/bmj.n71

Sterne JA, Hernán MA, Reeves BC, et al. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. Br Med J 2016; 355: i4919. https://doi.org/10.1136/bmj.i4919

Munn Z, Barker TH, Moola S, et al. Methodological quality of case series studies: an introduction to the JBI critical appraisal tool. JBI Evid Synth 2020; 18: 2127–33. https://doi.org/10.11124/jbisrir-d-19-00099

Falagas ME, Kouranos VD, Arencibia-Jorge R, Karageorgopoulos DE. Comparison of SCImago Journal Rank indicator with journal impact factor. FASEB J 2008; 22: 2623–8. https://doi.org/10.1096/fj.08-107938

Akhondi-Asl A, Yang Y, Luchette M, Burns JP, Mehta NM, Geva A. Comparing the quality of domain-specific versus general language models for artificial intelligence-generated differential diagnoses in PICU patients. Pediatr Crit Care Med 2024; 25: e273–82. https://doi.org/10.1097/pcc.0000000000003468

Amacher SA, Arpagaus A, Sahmer C, et al. Prediction of outcomes after cardiac arrest by a generative artificial intelligence model. Resusc Plus 2024; 18: 100587. https://doi.org/10.1016/j.resplu.2024.100587

Ando K, Sato M, Wakatsuki S, et al. A comparative study of English and Japanese ChatGPT responses to anaesthesia-related medical questions. BJA Open 2024; 10: 100296. https://doi.org/10.1016/j.bjao.2024.100296

Blacker SN, Kang M, Chakraborty I, et al. Utilizing artificial intelligence and chat generative pretrained transformer to answer questions about clinical scenarios in neuroanesthesiology. J Neurosurg Anesthesiol 2023; 36: 346–51. https://doi.org/10.1097/ana.0000000000000949

Choi J, Oh AR, Park J, et al. Evaluation of the quality and quantity of artificial intelligence-generated responses about anesthesia and surgery: using ChatGPT 3.5 and 4.0. Front Med (Lausanne) 2024; 11: 1400153. https://doi.org/10.3389/fmed.2024.1400153

Chung P, Fong CT, Walters AM, Aghaeepour N, Yetisgen M, O’Reilly-Shah VN. Large language model capabilities in perioperative risk prediction and prognostication. JAMA Surg 2024; 159: 928–37. https://doi.org/10.1001/jamasurg.2024.1621

Cruz G, Pedroza S, Ariza F. ChatGPT’s learning and reasoning capacity in anesthesiology. Colomb J Anesthesiol 2024; 52: e1092. https://doi.org/10.5554/22562087.e1092

Gakuba C, Le Barbey C, Sar A, et al. Evaluation of ChatGPT in predicting 6-month outcomes after traumatic brain injury. Crit Care Med 2024; 52: 942–50. https://doi.org/10.1097/ccm.0000000000006236

Gondode P, Duggal S, Garg N, Sethupathy S, Asai O, Lohakare P. Comparing patient education tools for chronic pain medications: artificial intelligence chatbot versus traditional patient information leaflets. Indian J Anaesth 2024; 68: 631–6. https://doi.org/10.4103/ija.ija_204_24

Guthrie E, Levy D, Del Carmen G. The Operating and Anesthetic Reference Assistant (OARA): a fine-tuned large language model for resident teaching. Am J Surg 2024; 234: 28–34. https://doi.org/10.1016/j.amjsurg.2024.02.016

Hurley NC, Gupta RK, Schroeder KM, Hess AS. Danger, danger, Gaston Labat! Does zero-shot artificial intelligence correlate with anticoagulation guidelines recommendations for neuraxial anesthesia? Reg Anesth Pain Med 2024; 49: 661–7. https://doi.org/10.1136/rapm-2023-104868

Khan AA, Yunus R, Sohail M, et al. Artificial intelligence for anesthesiology board-style examination questions: role of large language models. J Cardiothorac Vasc Anesth 2024; 38: 1251–9. https://doi.org/10.1053/j.jvca.2024.01.032

Levin C, Kagan T, Rosen S, Saban M. An evaluation of the capabilities of language models and nurses in providing neonatal clinical decision support. Int J Nurs Stud 2024; 155: 104771. https://doi.org/10.1016/j.ijnurstu.2024.104771

Liu T, Duan Y, Li Y, Hu Y, Su L, Zhang A. ChatGPT achieves comparable accuracy to specialist physicians in predicting the efficacy of high-flow oxygen therapy. Heliyon 2024; 10: e31750. https://doi.org/10.1016/j.heliyon.2024.e31750

Mootz AA, Carvalho B, Sultan P, Nguyen TP, Reale SC. The accuracy of ChatGPT-generated responses in answering commonly asked patient questions about labor epidurals: a survey-based study. Anesth Analg 2024; 138: 1142–4. https://doi.org/10.1213/ane.0000000000006801
