Hyperbolic vision language representation learning on chest radiology images

He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR); 2016. p. 770–8.

Tarvainen A, Valpola H. Mean teachers are better role models: weight-averaged consistency targets improve semi-supervised deep learning results. In: Proceedings of the 31st international conference on neural information processing systems, NIPS’17. Red Hook: Curran Associates Inc.; 2017. p. 1195–204.

Qiao Z, Bae A, Glass LM, Xiao C, Sun J. FLANNEL (focal loss based neural network ensemble) for COVID-19 detection. J Am Med Inform Assoc. 2020;28(3):444–52.

Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation; 2015.

He K, Chen X, Xie S, Li Y, Dollár P, Girshick R. Masked autoencoders are scalable vision learners; 2021.

Qiao Z, Ouyang H, Chu D, Yuan H, Zhen X, Dong P, Qian Z. Coarse-fine view attention alignment-based GAN for CT reconstruction from biplanar X-rays. In: 2023 IEEE international conference on bioinformatics and biomedicine (BIBM); 2023. p. 2175–8.

Peterson JC, Battleday RM, Griffiths TL, Russakovsky O. Human uncertainty makes classification more robust. In: Proceedings of the IEEE/CVF international conference on computer vision (ICCV); 2019.

Peters B, Kriegeskorte N. Capturing the objects of vision with neural networks; 2021.

Touvron H, et al. Llama 2: open foundation and fine-tuned chat models; 2023.

Zhang H, Chen J, Jiang F, Yu F, Chen Z, Li J, Chen G, Wu X, Zhang Z, Xiao Q, Wan X, Wang B, Li H. HuatuoGPT, towards taming language model to be a doctor; 2023.

Radford A, Kim JW, Hallacy C, Ramesh A, Goh G, Agarwal S, Sastry G, Askell A, Mishkin P, Clark J, Krueger G, Sutskever I. Learning transferable visual models from natural language supervision. In: Proceedings of the 38th international conference on machine learning; 2021. p. 8748–63.

Chen T, Kornblith S, Norouzi M, Hinton G. A simple framework for contrastive learning of visual representations. Proc 37th Int Conf Mach Learn. 2020;119:1597–607.

Zbontar J, Jing L, Misra I, LeCun Y, Deny S. Barlow Twins: self-supervised learning via redundancy reduction; 2021.

Wu Q, Tan H, Qiao Z, Dong P, Shen D, Wang M, Xue Z. Cross-view contrastive mutual learning across masked autoencoders for mammography diagnosis. In: Machine learning in medical imaging. Cham: Springer; 2024. p. 74–83.

Chambon P, Bluethgen C, Delbrouck J-B, Sluijs RV, Połacin M, Chaves JMZ, Abraham TM, Purohit S, Langlotz CP, Chaudhari A. RoentGen: vision-language foundation model for chest X-ray generation; 2022.

Li B, Weinberger KQ, Belongie S, Koltun V, Ranftl R. Language-driven semantic segmentation; 2022.

Vinker Y, Pajouheshgar E, Bo JY, Bachmann RC, Bermano AH, Cohen-Or D, Zamir A, Shamir A. CLIPasso: semantically-aware object sketching; 2022.

Zhao Z, Wang S, Gu J, Zhu Y, Mei L, Zhuang Z, Cui Z, Wang Q, Shen D. ChatCAD+: towards a universal and reliable interactive CAD using LLMs. IEEE Trans Med Imaging; 2024:1

Eslami S, Melo G, Meinel C. Does CLIP benefit visual question answering in the medical domain as much as it does in the general domain? 2021.

Kim C, Gadgil SU, DeGrave AJ, Omiye JA, Cai ZR, Daneshjou R, Lee S-I. Transparent medical image AI via an image-text foundation model grounded in medical literature. Nat Med. 2024.

Lauritsen SM, Kristensen M, Olsen MV, Larsen MS, Lauritsen KM, Jørgensen MJ, Lange J, Thiesson B. Explainable artificial intelligence model to predict acute critical illness from electronic health records; 2019.

Xie X, Niu J, Liu X, Chen Z, Tang S, Yu S. A survey on incorporating domain knowledge into deep learning for medical image analysis. Med Image Anal. 2021;69:101985.

Luo L, Chen H, Xiao Y, Zhou Y, Wang X, Vardhanabhuti V, Wu M, Han C, Liu Z, Fang XHB, Tsougenis E, Lin H, Heng P-A. Rethinking annotation granularity for overcoming shortcuts in deep learning-based radiograph diagnosis: a multicenter study; 2021.

Mehta S, Mercan E, Bartlett J, Weaver D, Elmore JG, Shapiro L. Y-Net: joint segmentation and classification for diagnosis of breast biopsy images; 2018.

Qu C, Zhang T, Qiao H, Liu J, Tang Y, Yuille AL, Zhou Z. AbdomenAtlas-8K: annotating 8,000 CT volumes for multi-organ segmentation in three weeks. In: Proceedings of the 37th international conference on neural information processing systems, NIPS ’23. Red Hook: Curran Associates Inc; 2024.

Huang S-C, Shen L, Lungren MP, Yeung S. GLoRIA: a multimodal global-local representation learning framework for label-efficient medical image recognition. In: 2021 IEEE/CVF international conference on computer vision (ICCV); 2021. p. 3922–31.

Vendrov I, Kiros R, Fidler S, Urtasun R. Order-embeddings of images and language; 2016.

Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: visual explanations from deep networks via gradient-based localization. In: ICCV; 2017. p. 618–26.

Nickel M, Kiela D. Poincaré embeddings for learning hierarchical representations. In: Proceedings of the 31st international conference on neural information processing systems, NIPS’17. Red Hook: Curran Associates Inc; 2017. p. 6341–50.

Nickel M, Kiela D. Learning continuous hierarchies in the Lorentz model of hyperbolic geometry. Proc 35th Int Conf Mach Learn. 2018;80:3779–88.

Xu Y, Wang D, Chen B, Lu R, Duan Z, Zhou M. HyperMiner: topic taxonomy mining with hyperbolic embedding. Adv Neural Inf Process Syst. 2022;35:31557–70.

Xu S-L, Sun Y, Zhang F, Xu A, Wei X-S, Yang Y. Hyperbolic space with hierarchical margin boosts fine-grained learning from coarse labels. Adv Neural Inf Process Syst. 2023;36:71263–74.

Fu X, Wei Y, Sun Q, Yuan H, Wu J, Peng H, Li J. Hyperbolic geometric graph representation learning for hierarchy-imbalance node classification. In: Proceedings of the ACM web conference 2023, WWW ’23. New York: Association for Computing Machinery; 2023. p. 460–8.

Qiao Z, Han L, Zhen X, Gao J-H, Qian Z. HYDEN: hyperbolic density representations for medical images and reports; 2024.

Desai K, Nickel M, Rajpurohit T, Johnson J, Vedantam SR. Hyperbolic image-text representations. Proc 40th Int Conf Mach Learn. 2023;202:7694–731.

Cheng P, Lin L, Lyu J, Huang Y, Luo W, Tang X. PRIOR: prototype representation joint learning from medical images and reports. In: 2023 IEEE/CVF international conference on computer vision (ICCV); 2023. p. 21304–14.

Liu B, Lu D, Wei D, Wu X, Wang Y, Zhang Y, Zheng Y. Improving medical vision-language contrastive pretraining with semantics-aware triage. IEEE Trans Med Imaging. 2023;42(12):3579–89.

Amiri Z, Heidari A, Navimipour NJ, Esmaeilpour M, Yazdani Y. The deep learning applications in IoT-based bio- and medical informatics: a systematic literature review. Neural Comput Appl. 2024;36(11):5757–97.

Amiri Z, Heidari A, Zavvar M, Navimipour NJ, Esmaeilpour M. The applications of nature-inspired algorithms in internet of things-based healthcare service: a systematic literature review. Trans Emerg Telecommun Technol. 2024;35(6):4969.

Wang F, Zhou Y, Wang S, Vardhanabhuti V, Yu L. Multi-granularity cross-modal alignment for generalized medical visual representation learning. In: Proceedings of the 36th international conference on neural information processing systems; 2022.

Socher R, Karpathy A, Le QV, Manning CD, Ng A. Grounded compositional semantics for finding and describing images with sentences. Trans Assoc Comput Linguistics. 2014;2:207–18.

Ma L, Lu Z, Shang L, Li H. Multimodal convolutional neural networks for matching image and sentence. In: Proceedings of the 2015 IEEE international conference on computer vision (ICCV), ICCV ’15. USA: IEEE Computer Society; 2015. p. 2623–31.

Socher R, Chen D, Manning CD, Ng AY. Reasoning with neural tensor networks for knowledge base completion. In: Proceedings of the 26th international conference on neural information processing systems, NIPS’13, vol. 1. Red Hook: Curran Associates Inc; 2013. p. 926–34.

Le M, Roller S, Papaxanthos L, Kiela D, Nickel M. Inferring concept hierarchies from text corpora via hyperbolic embeddings. ACL; 2019.

Su Y, Fang T, Xiao H, Wang W, Song Y, Zhang T, Chen L. EntailE: introducing textual entailment in commonsense knowledge graph completion. arXiv; 2024.

Chen D, Li Y, Yang M, Zheng H-T, Shen Y. Knowledge-aware textual entailment with graph attention network. In: Proceedings of the 28th ACM international conference on information and knowledge management, CIKM ’19. New York: Association for Computing Machinery; 2019. p. 2145–8.

Khrulkov V, Mirvakhabova L, Ustinova E, Oseledets I, Lempitsky V. Hyperbolic image embeddings. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR); 2020. p. 6417–27.

Zhang Y, Jiang H, Miura Y, Manning CD, Langlotz CP. Contrastive learning of medical visual representations from paired images and text; 2022.

Boecking B, Usuyama N, Bannur S, Castro DC, Schwaighofer A, Hyland S, Wetscherek M, Naumann T, Nori A, Alvarez-Valle J, et al. Making the most of text semantics to improve biomedical vision-language processing. In: European conference on computer vision. Springer; 2022. p. 1–21.

Huang X, Fang Y, Lu M, Yan F, Yang J, Xu Y. Dual-ray net: automatic diagnosis of thoracic diseases using frontal and lateral chest X-rays. J Med Imaging Health Inform. 2020;10:348–55.

Wu C, Zhang X, Zhang Y, Wang Y, Xie W. MedKLIP: medical knowledge enhanced language-image pre-training for X-ray diagnosis. In: 2023 IEEE/CVF international conference on computer vision (ICCV); 2023. p. 21315–26.

Wu JT, Agu NN, Lourentzou I, Sharma A, Paguio JA, Yao JS, Dee EC, Mitchell W, Kashyap S, Giovannini A, Celi LA, Moradi M. Chest ImaGenome dataset for clinical reasoning. CoRR; 2021.

Bodenreider O. The unified medical language system (UMLS): integrating biomedical terminology. Nucleic Acids Res. 2004;32:267–70.

Alsentzer E, Murphy JR, Boag W, Weng W-H, Jin D, Naumann T, McDermott MBA. Publicly available clinical BERT embeddings; 2019.

Shen L, Johnson AEW, Pollard TJ, Lehman L-W, Feng M, Ghassemi MM, Moody BE, Szolovits P, Celi LAG, Mark RG. MIMIC-III, a freely accessible critical care database; 2016.

Cheng P, Lin L, Lyu J, Huang Y, Luo W, Tang X. PRIOR: prototype representation joint learning from medical images and reports. In: Proceedings of the IEEE/CVF international conference on computer vision (ICCV); 2023. p. 21361–71.

Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I. Attention is all you need. In: Proceedings of the 31st international conference on neural information processing systems, NIPS’17. Red Hook: Curran Associates Inc.; 2017. p. 6000–10.

Mu N, Kirillov A, Wagner D, Xie S. Slip: Self-supervision meets language-image pre-training. In: Computer vision—ECCV 2022: 17th European conference. Tel Aviv, Israel, 23–27 Oct 2022, Proceedings, Part XXVI. Berlin, Heidelberg: Springer; 2022. p. 529–44.

Rahman T, Khandakar A, Kadir MA, Islam KR, Islam KF, Mazhar R, Hamid T, Islam MT, Kashem S, Mahbub ZB, Ayari MA, Chowdhury MEH. Reliable tuberculosis detection using chest X-ray with deep learning, segmentation and visualization. IEEE Access. 2020;8:191586–601.

Kaggle: Society for imaging informatics in medicine: SIIM-ACR pneumothorax segmentation; 2019.

Shih G, Wu CC, Halabi SS, et al. Augmenting the national institutes of health chest radiograph dataset with expert annotations of possible pneumonia. Radiol Artif Intell; 2019.

Wang X, Peng Y, Lu L, Lu Z, Bagheri M, Summers RM. ChestX-ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR); 2017. p. 3462–71.

Bdeir A, Schwethelm K, Landwehr N. Fully hyperbolic convolutional neural networks for computer vision. In: ICLR 2024; 2024.

Amiri Z, Heidari A, Darbandi M, Yazdani Y, Navimipour N, Esmaeilpour M, Sheykhi F, Unal M. The personal health applications of machine learning techniques in the internet of behaviors. Sustainability. 2023;3:4.

Amiri Z. Leveraging AI-enabled information systems for healthcare management. J Comput Inf Syst. 2024;0(0):1–28.

Luo Y, Shi M, Khan MO, Afzal MM, Huang H, Yuan S, Tian Y, Song L, Kouhana A, Elze T, Fang Y, Wang M. FairCLIP: harnessing fairness in vision-language learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR); 2024. p. 12289–301.
