Hyperbolic vision language representation learning on chest radiology images
