CUED Publications database

Improving Interpretability in Medical Imaging Diagnosis using Adversarial Training

Margeloiu, A., Simidjievski, N., Jamnik, M. and Weller, A. Improving Interpretability in Medical Imaging Diagnosis using Adversarial Training. (Unpublished)

We investigate the influence of adversarial training on the interpretability of convolutional neural networks (CNNs), specifically applied to diagnosing skin cancer. We show that gradient-based saliency maps of adversarially trained CNNs are significantly sharper and more visually coherent than those of standardly trained CNNs. Furthermore, we show that adversarially trained networks highlight regions with significant color variation within the lesion, a common characteristic of melanoma. We find that fine-tuning a robust network with a small learning rate further improves saliency maps' sharpness. Lastly, we provide preliminary work suggesting that robustifying the first layers to extract robust low-level features leads to visually coherent explanations.
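
To make the two ingredients of the abstract concrete, here is a minimal sketch of adversarial training followed by a gradient-based saliency map. It is an illustration under stated assumptions, not the authors' setup. The abstract describes CNNs on skin-lesion images and does not name the attack or the saliency method. This sketch swaps in a three-feature logistic classifier on synthetic data, a single-step FGSM attack, and the absolute input gradient of the loss as the saliency map. All function names, the perturbation budget `eps`, and the data are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, b, x):
    # Probability of class 1 under a logistic model (stand-in for a CNN).
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def loss(w, b, x, y):
    # Binary cross-entropy, clipped to avoid log(0).
    p = min(max(predict(w, b, x), 1e-12), 1.0 - 1e-12)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def input_gradient(w, b, x, y):
    # For a logistic model, d(loss)/dx = (p - y) * w.
    err = predict(w, b, x) - y
    return [err * wi for wi in w]


def saliency_map(w, b, x, y):
    # Gradient-based saliency: magnitude of the input gradient per feature.
    return [abs(g) for g in input_gradient(w, b, x, y)]


def fgsm(w, b, x, y, eps):
    # Single-step attack (assumed here): move each feature by eps in the
    # direction that increases the loss.
    grad = input_gradient(w, b, x, y)
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]


def train(data, eps=0.0, epochs=50, lr=0.1):
    # eps == 0 gives standard training; eps > 0 trains on FGSM examples.
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in data:
            x_in = fgsm(w, b, x, y, eps) if eps > 0 else x
            err = predict(w, b, x_in) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x_in)]
            b -= lr * err
    return w, b


# Synthetic data (hypothetical): only the first feature carries the label.
rng = random.Random(0)
data = []
for _ in range(200):
    y = rng.randint(0, 1)
    x = [rng.gauss(2.0 * y - 1.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    data.append((x, y))

w_std, b_std = train(data)           # standard training
w_adv, b_adv = train(data, eps=0.3)  # adversarial training

x0, y0 = data[0]
print("standard saliency:   ", saliency_map(w_std, b_std, x0, y0))
print("adversarial saliency:", saliency_map(w_adv, b_adv, x0, y0))
```

In a CNN setting the analytic gradient in `input_gradient` would be replaced by automatic differentiation through the network, and the saliency map would have one value per pixel. A toy linear model like this cannot reproduce the sharpness or visual coherence effects reported in the abstract. It only shows where the adversarial perturbation enters the training loop and how the saliency map is read off the input gradient.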

Item Type: Article
Uncontrolled Keywords: cs.LG
Divisions: Div F > Computational and Biological Learning
Date Deposited: 11 Dec 2020 20:04
Last Modified: 01 Jul 2021 09:43