Building Trust in Deep Learning Models via a Self-Interpretable Visual Architecture
Author
Abstract

Deep learning models are being utilized and further developed in many application domains, but challenges still exist regarding their interpretability and consistency. Interpretability is important for providing users with transparent information that enhances trust between the user and the learning model. It also gives developers feedback for improving the consistency of their deep learning models. In this paper, we present a novel architectural design that embeds interpretation into the architecture of the deep learning model. We apply dynamic pixel-wise weights to input images and produce a highly correlated feature map for classification. This feature map provides interpretation and transparent information about the decision-making of the deep learning model while retaining full context about the relevant features, in contrast to previous interpretation algorithms. The proposed model achieved 92% accuracy on CIFAR-10 classification without fine-tuning the hyperparameters. Furthermore, it achieved 20% accuracy under an 8/255 PGD adversarial attack with 100 iterations without any defense method, indicating additional natural robustness compared to other Convolutional Neural Network (CNN) models. The results demonstrate the feasibility of the proposed architecture.
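The abstract describes applying dynamic pixel-wise weights to the input image and classifying the resulting weighted feature map, which then doubles as the explanation. The sketch below illustrates that idea only; the module names (WeightGenerator, SelfInterpretableNet), layer sizes, and the sigmoid weighting are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class WeightGenerator(nn.Module):
    """Produces a per-pixel weight map with the same shape as the input image."""
    def __init__(self, in_channels=3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, in_channels, 3, padding=1),
            nn.Sigmoid(),  # weights in [0, 1] so the map reads as pixel importance
        )

    def forward(self, x):
        return self.body(x)

class SelfInterpretableNet(nn.Module):
    """Classifies the weighted image; the weight map itself serves as the explanation."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.weight_gen = WeightGenerator()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        w = self.weight_gen(x)       # dynamic pixel-wise weights
        weighted = x * w             # feature map highly correlated with the input
        return self.backbone(weighted), w  # logits plus the interpretation map

# Usage: the returned weight map can be visualized directly as the explanation.
model = SelfInterpretableNet()
logits, explanation = model(torch.randn(1, 3, 32, 32))  # CIFAR-10-sized input
```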

Year of Publication
2023
Date Published
August
URL
https://ieeexplore.ieee.org/document/10320191
DOI
10.1109/PST58708.2023.10320191