Author: Saporta, A.; Gui, X.; Agrawal, A.; Pareek, A.; Truong, S. Q.; Nguyen, C. D.; Ngo, V.-D.; Seekins, J.; Blankenberg, F. G.; Ng, A.; Lungren, M. P.; Rajpurkar, P.
Title: Deep learning saliency maps do not accurately highlight diagnostically relevant regions for medical image interpretation
ID: mp77i8j7
Document date: 2021-03-02
Document: Deep learning has enabled automated medical image interpretation at a level often surpassing that of practicing medical experts. However, many clinical practices have cited a lack of model interpretability as reason to delay the use of "black-box" deep neural networks in clinical workflows. Saliency maps, which "explain" a model's decision by producing heat maps that highlight the areas of the medical image that influence model prediction, are often presented to clinicians as an aid in diagnostic decision-making. In this work, we demonstrate that the most commonly used saliency map generating method, Grad-CAM, results in low performance for 10 pathologies on chest X-rays. We examined under what clinical conditions saliency maps might be more dangerous to use compared to human experts, and found that Grad-CAM performs worse for pathologies that had multiple instances, were smaller in size, and had shapes that were more complex. Moreover, we showed that model confidence was positively correlated with Grad-CAM localization performance, suggesting that saliency maps were safer for clinicians to use as a decision aid when the model had made a positive prediction with high confidence. Our work demonstrates that several important limitations of interpretability techniques for medical imaging must be addressed before use in clinical workflows.
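The record itself contains no code, but the abstract centers on Grad-CAM heat maps that highlight the image regions influencing a model's prediction. The following is a minimal illustrative sketch of that technique, not the authors' implementation: it assumes a PyTorch classifier (a DenseNet-121 stand-in for a chest X-ray model), a hypothetical helper name `grad_cam`, and an arbitrary choice of target layer and class index.

```python
# Minimal Grad-CAM sketch (illustrative only, not the paper's code): given a CNN
# classifier and an input image, produce a heat map of the regions that most
# influence one class score.
import torch
import torch.nn.functional as F
from torchvision import models

def grad_cam(model, image, target_layer, class_idx):
    """Return an [H, W] heat map in [0, 1] for `class_idx` on a single image
    tensor of shape [1, 3, H, W]. `target_layer` is the conv layer to explain."""
    activations, gradients = {}, {}

    def fwd_hook(_, __, output):
        activations["value"] = output          # feature maps: [1, C, h, w]

    def bwd_hook(_, __, grad_output):
        gradients["value"] = grad_output[0]    # gradients w.r.t. feature maps

    fh = target_layer.register_forward_hook(fwd_hook)
    bh = target_layer.register_full_backward_hook(bwd_hook)

    model.eval()
    scores = model(image)                      # [1, num_classes] logits
    model.zero_grad()
    scores[0, class_idx].backward()            # gradient of the chosen class score

    fh.remove()
    bh.remove()

    # Weight each feature map by its average gradient, sum, keep positive evidence.
    weights = gradients["value"].mean(dim=(2, 3), keepdim=True)   # [1, C, 1, 1]
    cam = F.relu((weights * activations["value"]).sum(dim=1))     # [1, h, w]

    # Upsample to input resolution and normalize to [0, 1] for overlaying on the X-ray.
    cam = F.interpolate(cam.unsqueeze(1), size=image.shape[2:],
                        mode="bilinear", align_corners=False)[0, 0]
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return cam

if __name__ == "__main__":
    # Hypothetical setup: an untrained DenseNet-121 and a random tensor in place
    # of a real chest X-ray; in practice these would be a trained model and image.
    model = models.densenet121(weights=None)
    image = torch.randn(1, 3, 224, 224)
    heat_map = grad_cam(model, image, model.features.denseblock4, class_idx=0)
    print(heat_map.shape)                      # torch.Size([224, 224])
```

In a study like the one described, such a heat map would then be compared against expert-annotated pathology regions (e.g., via an overlap metric) to quantify localization performance; the specific evaluation protocol is detailed in the paper, not here.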