Reading Group
The Brainstorm XAI Reading Group is an international group of researchers who enjoy discussing and brainstorming about scientific papers. The group fosters a friendly, drama-free atmosphere and values diversity, equity, and inclusion. We encourage members to express their opinions freely and celebrate the diversity of perspectives and cultural backgrounds.
The primary goal of each meeting is not only to understand the papers but also to brainstorm about their weak points, potential extensions, strengths, applications, and more. The group is open to anyone (students, researchers, or professors) who enjoys sharing ideas, brainstorming collaboratively, and critically analyzing papers. To join, simply fill out this form: LINK. Once registered, you’ll be added to the mailing list and given access to calendar events.
Active since 2023, the group focuses on papers related to Explainable AI (XAI). We assume members have a basic understanding of XAI, and discussions span a wide range of domains, including vision, graphs, NLP, reinforcement learning, and classical AI.
Below, you can find updates on the scheduled presentations for the 2024/2025 season. Currently, we meet every other Tuesday at 6:30 PM CET / 9:30 AM Los Angeles Time.
- 10 Dec 24
- “GraphTrail: Translating GNN Predictions into Human-Interpretable Logical Rules”. Burouj Armgaan, Manthan Dalmia, Sourav Medya, and Sayan Ranu
- 26 Nov 24
- “Linear Explanations for Individual Neurons”. Tuomas Oikarinen, Tsui-Wei Weng
- 12 Nov 24
- “MambaLRP: Explaining Selective State Space Sequence Models”. Farnoush Rezaei Jafari, Grégoire Montavon, Klaus-Robert Müller, and Oliver Eberle
- 29 Oct 24
- “Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts”. Andong Tan, Fengtao Zhou, and Hao Chen
Past presentations (2023/2024)
- Concept Learning for Interpretable Multi-Agent Reinforcement Learning. Renos Zabounidis, Joseph Campbell, Simon Stepputtis, Dana Hughes, Katia Sycara
- Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents. Quentin Delfosse, Sebastian Sztwiertnia, Mark Rothermel, Wolfgang Stammer, Kristian Kersting
- IA-RED2: Interpretability-Aware Redundancy Reduction for Vision Transformers. Bowen Pan, Rameswar Panda, Yifan Jiang, Zhangyang Wang, Rogerio Feris, Aude Oliva
- This Looks Like Those: Illuminating Prototypical Concepts Using Multiple Visualizations. Chiyu Ma, Brandon Zhao, Chaofan Chen, Cynthia Rudin
- CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks. Tuomas Oikarinen, Tsui-Wei Weng
- Everybody Needs a Little HELP: Explaining Graphs via Hierarchical Concepts. Jonas Jürß, Lucie Charlotte Magister, Pietro Barbiero, Pietro Liò, Nikola Simidjievski
- Concept Bottleneck Generative Models. Aya Abdelsalam Ismail, Julius Adebayo, Hector Corrada Bravo, Stephen Ra, Kyunghyun Cho
- KAN: Kolmogorov-Arnold Networks. Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljačić, Thomas Y. Hou, Max Tegmark
- Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet. Adly Templeton, Tom Conerly, Jonathan Marcus, Jack Lindsey, Trenton Bricken, Brian Chen, Adam Pearce, Craig Citro, Emmanuel Ameisen, Andy Jones, Hoagy Cunningham, Nicholas L Turner, Callum McDougall, Monte MacDiarmid, Alex Tamkin, Esin Durmus, Tristan Hume, Francesco Mosconi, C. Daniel Freeman, Theodore R. Sumers, Edward Rees, Joshua Batson, Adam Jermyn, Shan Carter, Chris Olah, Tom Henighan
- Interpreting Language Models with Contrastive Explanations. Kayo Yin, Graham Neubig
- Locating and Editing Factual Associations in GPT. Kevin Meng, David Bau, Alex Andonian, Yonatan Belinkov
- From attribution maps to human-understandable explanations through Concept Relevance Propagation. Reduan Achtibat, Maximilian Dreyer, Ilona Eisenbraun, Sebastian Bosse, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin
- What's in the Box? Exploring the Inner Life of Neural Networks with Robust Rules. Jonas Fischer, Anna Olah, Jilles Vreeken
- Labeling Neural Representations with Inverse Recognition. Kirill Bykov, Laura Kopf, Shinichi Nakajima, Marius Kloft, Marina M.-C. Höhne