Author: Xuehai He; Xingyi Yang; Shanghang Zhang; Jinyu Zhao; Yichen Zhang; Eric Xing; Pengtao Xie
Title: Sample-Efficient Deep Learning for COVID-19 Diagnosis Based on CT Scans
Document date: 2020_4_17
ID: l3f469ht_3
Document: In this work, we aim to address these two problems by (1) building a publicly available dataset containing hundreds of CT scans that are positive for COVID-19 and (2) developing sample-efficient deep learning methods that can achieve high diagnostic accuracy for COVID-19 from CT scans even when the number of training CT images is limited. We first collect the COVID19-CT dataset, which contains 349 CT images with clinical findings of COVID-19 from 216 patient cases. The images are collected from medRxiv and bioRxiv papers about COVID-19. CTs containing COVID-19 abnormalities are selected by reading the figure captions in the papers. We manually remove artifacts in the original images, such as text, numbers, and arrows. Figure 1 shows some examples of the COVID-19 CT scans. To the best of our knowledge, it is the largest COVID-19 CT dataset to date, and all the images are open to the public for research purposes. Given this dataset, we develop deep learning (DL) methods to perform CT-based diagnosis of COVID-19. Though the largest of its kind, COVID19-CT is still limited in the number of images. DL models are data-hungry and have a high risk of overfitting when trained on small datasets. To address this problem, we develop sample-efficient methods to train high-performing DL models in spite of data deficiency. Specifically, we investigate two learning paradigms for mitigating data deficiency: transfer learning and self-supervised learning.