Building Trust in Deep Learning Models via a Self-Interpretable Visual Architecture
Author
Abstract

Deep learning models are being utilized and further developed in many application domains, but challenges still exist regarding their interpretability and consistency. Interpretability is important for providing users with transparent information that enhances trust between the user and the learning model. It also gives developers feedback for improving the consistency of their deep learning models. In this paper, we present a novel architectural design that embeds interpretation into the architecture of the deep learning model. We apply dynamic pixel-wise weights to input images and produce a highly correlated feature map for classification. This feature map provides interpretation and transparent information about the decision-making of the deep learning model while retaining full context about the relevant features, in contrast to previous interpretation algorithms. The proposed model achieved 92% accuracy on CIFAR-10 classification without fine-tuning the hyperparameters. Furthermore, it achieved 20% accuracy under an 8/255 PGD adversarial attack with 100 iterations without any defense method, indicating additional natural robustness compared to other Convolutional Neural Network (CNN) models. The results demonstrate the feasibility of the proposed architecture.
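The abstract describes applying dynamic pixel-wise weights to the input image and classifying the resulting weighted feature map, which then doubles as the explanation. The sketch below illustrates that idea only; the module names (WeightGenerator, SelfInterpretableNet), layer sizes, and the sigmoid weighting are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class WeightGenerator(nn.Module):
    """Produces a per-pixel weight map with the same shape as the input image."""
    def __init__(self, in_channels=3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, in_channels, 3, padding=1),
            nn.Sigmoid(),  # weights in [0, 1] so the map reads as pixel importance
        )

    def forward(self, x):
        return self.body(x)

class SelfInterpretableNet(nn.Module):
    """Classifies the weighted image; the weight map itself serves as the explanation."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.weight_gen = WeightGenerator()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        w = self.weight_gen(x)       # dynamic pixel-wise weights
        weighted = x * w             # feature map highly correlated with the input
        return self.backbone(weighted), w  # logits plus the interpretation map

# Usage: the returned weight map can be visualized directly as the explanation.
model = SelfInterpretableNet()
logits, explanation = model(torch.randn(1, 3, 32, 32))  # CIFAR-10-sized input
```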

Year of Publication
2023
Date Published
August
URL
https://ieeexplore.ieee.org/document/10320191
DOI
10.1109/PST58708.2023.10320191