AgentMRI: A Vision Language Model-Powered AI System for Self-regulating MRI Reconstruction with Multiple Degradations

Vlaardingerbroek MT, Boer JA: Magnetic Resonance Imaging: Theory and Practice, Berlin: Springer, 2013

Hore PJ: Nuclear Magnetic Resonance, Oxford: Oxford University Press, 2015

Deshmane A, Gulani V, Griswold MA, Seiberlich N: Parallel MR imaging. J Magn Reson Imaging 36:55–72, 2012

Pruessmann KP, Weiger M, Scheidegger MB, Boesiger P: SENSE: Sensitivity encoding for fast MRI. Magn Reson Med 42:952–962, 1999

Lustig M, Donoho DL, Santos JM, Pauly JM: Compressed sensing MRI. IEEE Signal Process Mag 25:72–82, 2008

Haldar JP: Low-rank modeling of local k-space neighborhoods (LORAKS) for constrained MRI. IEEE Trans Med Imaging 33:668–681, 2014

Liang D, Cheng J, Ke Z, Ying L: Deep magnetic resonance image reconstruction: Inverse problems meet neural networks. IEEE Signal Process Mag 37:141–151, 2020

Yamashita R, Nishio M, Do RKG, Togashi K: Convolutional neural networks: An overview and application in radiology. Insights Imaging 9:611–629, 2018

Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020

Cui L, Song Y, Wang Y, Wang R, Wu D, Xie H, Li J, Yang G: Motion artifact reduction for magnetic resonance imaging with deep learning and k-space analysis. PLoS One 18:e0278668, 2023

Manso Jimeno M, Ravi KS, Fung M, Oyekunle D, Ogbole G, Vaughan JT, Geethanath S: Automated detection of motion artifacts in brain MR images using deep learning. NMR Biomed 38:e5276, 2025

Latif S, Asim M, Usman M, Qadir J, Rana R: Automating motion correction in multishot MRI using generative adversarial networks. arXiv preprint arXiv:1811.09750, 2018

Ilicak E, Saritas E, Cukur T: Automated parameter selection for accelerated MRI reconstruction via low-rank modeling of local k-space neighborhoods. Z Med Phys 33:203–219, 2023

Toma TT, Weller DS: Fast automatic parameter selection for MRI reconstruction. In: Proc IEEE Int Symp Biomed Imaging (ISBI), pp. 1078–1081, 2020

Mathew RS, Paul JS: Automated regularization parameter selection using continuation based proximal method for compressed sensing MRI. IEEE Trans Comput Imaging 6:1309–1319, 2020

Ramani S, Liu Z, Rosen J, Nielsen JF, Fessler JA: Regularization parameter selection for nonlinear iterative image restoration and MRI reconstruction using GCV and SURE-based methods. IEEE Trans Image Process 21:3659–3672, 2012

Okinaka A, Saju G, Chang Y: Automating kernel size selection in MRI reconstruction via a transparent and interpretable search approach. In: Int Symp Vis Comput, pp. 420–430, 2023

Lin FH, Kwong KK, Belliveau JW, Wald LL: Parallel imaging reconstruction using automatic regularization. Magn Reson Med 51:559–567, 2004

Reishofer G, Koschutnig K, Enzinger C, Ischebeck A, Keeling S, Stollberger R, Ebner F: Automated macrovessel artifact correction in dynamic susceptibility contrast magnetic resonance imaging using independent component analysis. Magn Reson Med 65:848–857, 2011

Yang J, Jin H, Tang R, Han X, Feng Q, Jiang H, Zhong S, Yin B, Hu X: Harnessing the power of LLMs in practice: A survey on ChatGPT and beyond. ACM Trans Knowl Discov Data 18:1–32, 2024

Naveed H, Khan A, Qiu S, Saqib M, Anwar S, Usman M, Akhtar N, Barnes N, Mian A: A comprehensive overview of large language models. arXiv preprint arXiv:2307.06435, 2023

Zhao WX, Zhou K, Li J, Tang T, Wang X, Hou Y, Min Y, Zhang B, Zhang J, Dong Z, Du Y: A survey of large language models. arXiv preprint arXiv:2303.18223, 2023

Zhang J, Huang J, Jin S, Lu S: Vision-language models for vision tasks: A survey. IEEE Trans Pattern Anal Mach Intell, https://doi.org/10.1109/TPAMI.2024.3371387, 2024

Wu J, Gan W, Chen Z, Wan S, Philip SY: Multimodal large language models: A survey. In: Proc IEEE Int Conf Big Data (BigData), pp. 2247–2256, 2023

Wang L, Ma C, Feng X, Zhang Z, Yang H, Zhang J, Chen Z, Tang J, Chen X, Lin Y, Zhao WX: A survey on large language model based autonomous agents. Front Comput Sci 18:186345, 2024

Ge Y, Hua W, Mei K, Tan J, Xu S, Li Z, Zhang Y: OpenAGI: When LLM meets domain experts. Adv Neural Inf Process Syst 36:5539–5568, 2023

Thirunavukarasu AJ, Ting DS, Elangovan K, Gutierrez L, Tan TF, Ting DS: Large language models in medicine. Nat Med 29:1930–1940, 2023

Tian D, Jiang S, Zhang L, Lu X, Xu Y: The role of large language models in medical image processing: A narrative review. Quant Imaging Med Surg 14:1108, 2023

Hu M, Qian J, Pan S, Li Y, Qiu RL, Yang X: Advancing medical imaging with language models: Featuring a spotlight on ChatGPT. Phys Med Biol 69:10TR01, 2024

Nakaura T, Ito R, Ueda D, Nozaki T, Fushimi Y, Matsui Y, Yanagawa M, Yamada A, Tsuboyama T, Fujima N, Tatsugami F: The impact of large language models on radiology: A guide for radiologists on the latest innovations in AI. Jpn J Radiol 1–2, 2024

Baumli K, Baveja S, Behbahani F, Chan H, Comanici G, Flennerhag S, Gazeau M, Holsheimer K, Horgan D, Laskin M, Lyle C: Vision-language models as a source of rewards. arXiv preprint arXiv:2312.09187, 2023

Raffel C, Shazeer N, Roberts A, Lee K, Narang S, Matena M, Zhou Y, Li W, Liu PJ: Exploring the limits of transfer learning with a unified text-to-text transformer. J Mach Learn Res 21:1–67, 2020

Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A, Agarwal S: Language models are few-shot learners. Adv Neural Inf Process Syst 33:1877–1901, 2020

Ouyang L, Wu J, Jiang X, Almeida D, Wainwright C, Mishkin P, Zhang C, Agarwal S, Slama K, Ray A, Schulman J: Training language models to follow instructions with human feedback. Adv Neural Inf Process Syst 35:27730–27744, 2022

Chowdhery A, Narang S, Devlin J, Bosma M, Mishra G, Roberts A, Barham P, Chung HW, Sutton C, Gehrmann S, Schuh P: PaLM: Scaling language modeling with pathways. J Mach Learn Res 24:1–113, 2023

Touvron H, Martin L, Stone K, Albert P, Almahairi A, Babaei Y, Bashlykov N, Batra S, Bhargava P, Bhosale S, Bikel D: LLaMA 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288, 2023

Radford A, Kim JW, Hallacy C, Ramesh A, Goh G, Agarwal S, Sastry G, Askell A, Mishkin P, Clark J, Krueger G: Learning transferable visual models from natural language supervision. In: Proc Int Conf Mach Learn (ICML), pp. 8748–8763, 2021

Yao L, Huang R, Hou L, Lu G, Niu M, Xu H, Liang X, Li Z, Jiang X, Xu C: FILIP: Fine-grained interactive language-image pre-training. arXiv preprint arXiv:2111.07783, 2021

Yu J, Wang Z, Vasudevan V, Yeung L, Seyedhosseini M, Wu Y: CoCa: Contrastive captioners are image-text foundation models. arXiv preprint arXiv:2205.01917, 2022

Achiam J, Adler S, Agarwal S, Ahmad L, Akkaya I, Aleman FL, Almeida D, Altenschmidt J, Altman S, Anadkat S, Avila R: GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023

Team G, Anil R, Borgeaud S, Alayrac JB, Yu J, Soricut R, Schalkwyk J, Dai AM, Hauth A, Millican K, Silver D: Gemini: A family of highly capable multimodal models. arXiv preprint arXiv:2312.11805, 2023

Li J, Li D, Xiong C, Hoi S: BLIP: Bootstrapping language-image pre-training for unified vision-language understanding and generation. In: Proc Int Conf Mach Learn (ICML), pp. 12888–12900, 2022

Xi Z, Chen W, Guo X, He W, Ding Y, Hong B, Zhang M, Wang J, Jin S, Zhou E, Zheng R: The rise and potential of large language model based agents: A survey. Sci China Inf Sci 68:121101, 2025

Davidson D: Actions, reasons, and causes. J Philos 60:685–700, 1963

Wei J, Wang X, Schuurmans D, Bosma M, Xia F, Chi E, Le QV, Zhou D: Chain-of-thought prompting elicits reasoning in large language models. Adv Neural Inf Process Syst 35:24824–24837, 2022

Sung F, Yang Y, Zhang L, Xiang T, Torr PH, Hospedales TM: Learning to compare: Relation network for few-shot learning. In: Proc IEEE Conf Comput Vis Pattern Recognit (CVPR), pp. 1199–1208, 2018

Xian Y, Schiele B, Akata Z: Zero-shot learning—the good, the bad and the ugly. In: Proc IEEE Conf Comput Vis Pattern Recognit (CVPR), pp. 4582–4591, 2017

Nakano R, Hilton J, Balaji S, et al.: WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332, 2021

Yao S, Zhao J, Yu D, et al.: ReAct: Synergizing reasoning and acting in language models. In: Proc Int Conf Learn Represent (ICLR), Kigali, Rwanda, May 1–5, 2023

Schick T, Dwivedi-Yu J, Dessì R, et al.: Toolformer: Language models can teach themselves to use tools. arXiv preprint arXiv:2302.04761, 2023

Zhai Y, Bai H, Lin Z, Pan J, Tong S, Zhou Y, Suhr A, Xie S, LeCun Y, Ma Y, Levine S: Fine-tuning large vision-language models as decision-making agents via reinforcement learning. arXiv preprint arXiv:2405.10292, 2024

Niu R, Li J, Wang S, Fu Y, Hu X, Leng X, Kong H, Chang Y, Wang Q: ScreenAgent: A vision language model-driven computer control agent. arXiv preprint arXiv:2402.07945, 2024

Zhou G, Hong Y, Wang Z, Wang XE, Wu Q: NavGPT-2: Unleashing navigational reasoning capability for large vision-language models. In: Proc Eur Conf Comput Vis (ECCV), pp. 260–278, 2024

Griswold MA, Jakob PM, Heidemann RM, Nittka M, Jellus V, Wang J, Kiefer B, Haase A: Generalized autocalibrating partially parallel acquisitions (GRAPPA). Magn Reson Med 47:1202–1210, 2002

Vogel CR, Oman ME: Iterative methods for total variation denoising. SIAM J Sci Comput 17:227–238, 1996

Chang Y, Li Z, Saju G, Mao H, Liu T: Deep learning-based rigid motion correction for magnetic resonance imaging: A survey. Meta-Radiology 1:100001, 2023

Russell SJ, Norvig P: Artificial Intelligence: A Modern Approach, 3rd ed., Boston: Pearson, 2016

Zhu JY, Park T, Isola P, Efros AA: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proc IEEE Int Conf Comput Vis (ICCV), pp. 2223–2232, 2017

Knoll F, Zbontar J, Sriram A, Muckley MJ, Bruno M, Defazio A, Parente M, et al.: fastMRI: A publicly available raw k-space and DICOM dataset of knee images for accelerated MR image reconstruction using machine learning. Radiol Artif Intell 2:e190007, 2020

Lee S, Jung S, Jung KJ, Kim DH: Deep learning in MR motion correction: A brief review and a new motion simulation tool (view2Dmotion). Invest Magn Reson Imaging 24:196–206, 2020

Wang X, Zhang Y, Zohar O, Yeung-Levy S: VideoAgent: Long-form video understanding with large language model as agent. In: Proc Eur Conf Comput Vis (ECCV), pp. 58–76, 2024
