

Date of Award

January 2022

Degree Type

Degree Name

Computer Science

First Advisor

Dongxiao Zhu


Transformer-based pretrained NLP models have become the primary choice for almost all NLP tasks because of their overall outstanding performance and robustness. However, understanding a transformer-based model's predictions remains an open problem due to the complexity of its stacked multi-head self-attention architecture. In this thesis, we adapt the idea behind the class activation map (CAM) technique, originally used to explain image classification, and propose the class activation transformer (CAT) for explaining the general transformer framework. We also analyze the technical soundness of CAT and of other gradient-based deep neural network explanation methods. Experiments demonstrate that CAT+transformer can serve as a general interpretation+prediction framework in both NLP and CV tasks.
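As background, the CAM idea the thesis builds on can be sketched in a few lines: each channel of the last convolutional feature maps is weighted by the classifier weight for the target class and summed into a spatial heatmap. The sketch below uses hypothetical toy data and is not the thesis's CAT method itself, only the underlying CAM computation.

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """Minimal CAM sketch: weight each channel by its classifier weight,
    sum over channels, and normalize to [0, 1] for visualization.

    feature_maps:  (C, H, W) activations from the last conv layer
    class_weights: (C,) classifier weights for the target class
    returns:       (H, W) map highlighting class-relevant regions
    """
    # Contract the channel axis: cam[h, w] = sum_c w[c] * F[c, h, w]
    cam = np.tensordot(class_weights, feature_maps, axes=([0], [0]))
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()
    return cam

# Toy example: 8 channels over a 4x4 spatial grid (random stand-ins
# for real CNN activations and classifier weights).
rng = np.random.default_rng(0)
features = rng.random((8, 4, 4))
weights = rng.random(8)
cam = class_activation_map(features, weights)
print(cam.shape)  # (4, 4)
```

In the original CAM setting the weights come from a global-average-pooling classifier head; the thesis's contribution is extending this style of attribution to stacked multi-head self-attention transformers.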
