Selected Publications

Influence Patterns for Explaining Information Flow in BERT [NeurIPS 2021]

Lu, Kaiji, Zifan Wang, Piotr Mardziel, and Anupam Datta. "Influence Patterns for Explaining Information Flow in BERT." NeurIPS, 2021.

Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models [ACL 2020]

Lu, Kaiji, Piotr Mardziel, Klas Leino, Matt Fredrikson, and Anupam Datta. "Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models." ACL, 2020.

Gender Bias in Neural Natural Language Processing [Springer 2020]

Lu, Kaiji, Piotr Mardziel, Fangjing Wu, Preetam Amancharla, and Anupam Datta. "Gender Bias in Neural Natural Language Processing." Springer, 2020.

Machine Learning Explainability and Robustness: Connected at the Hip [KDD 2020]

Datta, Anupam, Matt Fredrikson, Klas Leino, Kaiji Lu, Shayak Sen, and Zifan Wang. "Machine Learning Explainability and Robustness: Connected at the Hip." KDD, 2020.