Author: Monajatipoor, Masoud; Rouhsedaghat, Mozhdeh; Li, Liunian Harold; Chien, Aichi; Kuo, C.-C. Jay; Scalzo, Fabien; Chang, Kai-Wei
Title: BERTHop: An Effective Vision-and-Language Model for Chest X-ray Disease Diagnosis
Cord-id: shpudmuv
Document date: 2021-08-10
                    
Document: Vision-and-language (V&L) models take image and text as input and learn to capture the associations between them. Prior studies show that pre-trained V&L models can significantly improve performance on downstream tasks such as Visual Question Answering (VQA). However, V&L models are less effective in the medical domain (e.g., on X-ray images and clinical notes) because of the domain gap. In this paper, we investigate the challenges of applying pre-trained V&L models to medical applications. In particular, we identify that the visual representation in general-domain V&L models is not suitable for processing medical data. To overcome this limitation, we propose BERTHop, a transformer-based model built on PixelHop++ and VisualBERT, for better capturing the associations between the two modalities. Experiments on the OpenI dataset, a commonly used thoracic disease diagnosis benchmark, show that BERTHop achieves an average Area Under the Curve (AUC) of 98.12%, 1.62% higher than the state of the art (SOTA), while being trained on a nine-times-smaller dataset.
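
The abstract summarizes the architecture only at a high level. Below is a minimal, illustrative PyTorch sketch of a BERTHop-style pipeline under stated assumptions: a visual encoder turns a chest X-ray into a sequence of patch features, those features are concatenated with report-token embeddings and fused in a single transformer encoder, and a linear head emits one logit per thoracic finding, scored with per-class AUC as in the paper's evaluation. The ToyVLClassifier name, all dimensions, and the simple Conv2d stand-in for the visual encoder are hypothetical; this is not the paper's PixelHop++ or VisualBERT implementation.

import torch
import torch.nn as nn
from sklearn.metrics import roc_auc_score

class ToyVLClassifier(nn.Module):
    """Hypothetical stand-in for a BERTHop-style V&L classifier."""

    def __init__(self, vocab_size=30522, dim=256, n_classes=14):
        super().__init__()
        # Placeholder visual encoder: 16x16 patches from a 1-channel X-ray.
        # In BERTHop this role is played by PixelHop++ features, not a Conv2d.
        self.visual = nn.Conv2d(1, dim, kernel_size=16, stride=16)
        self.text_emb = nn.Embedding(vocab_size, dim)  # report-token embeddings
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=4)  # joint V&L encoder
        self.head = nn.Linear(dim, n_classes)  # one logit per thoracic finding

    def forward(self, image, token_ids):
        v = self.visual(image).flatten(2).transpose(1, 2)  # (B, n_patches, dim)
        t = self.text_emb(token_ids)                       # (B, n_tokens, dim)
        fused = self.fusion(torch.cat([t, v], dim=1))      # single-stream early fusion
        return self.head(fused.mean(dim=1))                # pooled multi-label logits

# Smoke test on random inputs; macro AUC mirrors the paper's averaged metric.
model = ToyVLClassifier()
images = torch.randn(4, 1, 224, 224)           # grayscale chest X-rays
tokens = torch.randint(0, 30522, (4, 32))      # tokenized report snippets
labels = torch.zeros(4, 14)                    # synthetic multi-label targets
labels[:2] = 1.0                               # ensure both classes per column
logits = model(images, tokens)
print(roc_auc_score(labels.numpy(), logits.detach().numpy(), average="macro"))

The early-fusion design, concatenating visual and text tokens before self-attention, mirrors the single-stream VisualBERT architecture that BERTHop builds on.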
 