Explaining Neurons
This page is a summary sheet for this research area: its general goal, reference papers (both mine and external) that give an overview of the topic, and the domains explored so far. We are also interested in extending these techniques beyond their traditional domains. If you have expertise in other areas (e.g., neuroscience, gaming, or audio/speech modeling), we would be happy to explore potential extensions into those fields.
Goal: The goal of this research area is to understand what deep neural networks learn during the training process. Recently, this field has been categorized under the umbrella term Mechanistic Interpretability. My research projects usually focus on analyzing the behavior of individual neurons and groups of neurons, identifying the concepts they learn to recognize, and understanding the relationships between these concepts. They typically combine tools from classical AI (e.g., heuristic search and clustering), statistical analysis, and recent advancements in AI to explore this direction.
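As a toy illustration of what "identifying the concepts a neuron recognizes" can mean in practice, the sketch below scores a single neuron against a set of annotated concept masks using the intersection-over-union (IoU) of its binarized activation maps. All identifiers, the percentile threshold, and the random toy data are illustrative assumptions, not the exact procedure used in the papers referenced below.

```python
import numpy as np

def binarize_activations(activations: np.ndarray, percentile: float = 99.5) -> np.ndarray:
    """Turn a neuron's activation maps (n_images, H, W) into boolean masks
    by thresholding at a high percentile of its activation distribution."""
    threshold = np.percentile(activations, percentile)
    return activations > threshold

def iou(neuron_mask: np.ndarray, concept_mask: np.ndarray) -> float:
    """Intersection-over-union between two boolean masks of the same shape."""
    union = np.logical_or(neuron_mask, concept_mask).sum()
    return float(np.logical_and(neuron_mask, concept_mask).sum() / union) if union > 0 else 0.0

def explain_neuron(activations: np.ndarray, concept_masks: dict) -> tuple:
    """Return the concept whose annotations best overlap the neuron's
    high-activation regions, together with the corresponding IoU score."""
    neuron_mask = binarize_activations(activations)
    scores = {name: iou(neuron_mask, mask) for name, mask in concept_masks.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Toy usage with fake data: 10 activation maps of size 7x7 for one neuron,
# and two hypothetical concept annotations aligned with the same images.
rng = np.random.default_rng(0)
activations = rng.random((10, 7, 7))
concepts = {"dog": rng.random((10, 7, 7)) > 0.7, "grass": rng.random((10, 7, 7)) > 0.7}
print(explain_neuron(activations, concepts))
```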
Domains: NLP, Vision.
Reference Papers:
- Logical (Compositional) Explanations (a simplified sketch follows this list): [(La Rosa et al., 2023)], [(Makinwa et al., 2022)]
- Linear Explanations: [Link]
- Circuits (chains of neurons): [Link]
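The logical (compositional) explanations in the first item above extend the single-concept match to short logical formulas over concepts. The function below is a deliberately simplified, greedy version of that idea; the referenced papers rely on more careful search procedures (e.g., beam or heuristic search) and richer evaluation, so treat this as an assumption-based illustration rather than their algorithm. It can be applied to the neuron mask and concept dictionary from the previous sketch.

```python
import numpy as np

def iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union between two boolean masks."""
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union > 0 else 0.0

def compositional_explanation(neuron_mask: np.ndarray, concepts: dict, max_length: int = 3):
    """Greedily grow a logical formula over concept masks (AND, OR, AND NOT)
    that maximizes IoU with a neuron's binarized activation mask."""
    # Start from the single concept that best matches the neuron.
    formula, mask = max(concepts.items(), key=lambda item: iou(neuron_mask, item[1]))
    score = iou(neuron_mask, mask)
    for _ in range(max_length - 1):
        best = None
        for name, m in concepts.items():
            for op, combined in (("AND", mask & m),
                                 ("OR", mask | m),
                                 ("AND NOT", mask & ~m)):
                s = iou(neuron_mask, combined)
                if best is None or s > best[0]:
                    best = (s, f"({formula} {op} {name})", combined)
        if best[0] <= score:  # stop when no composition improves the fit
            break
        score, formula, mask = best
    return formula, score
```

The greedy loop trades the exhaustiveness of a beam search for brevity: each step keeps only the single best extension, which is enough to show why a longer formula can describe a neuron's behavior better than any single concept.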
References
- La Rosa et al. (2023). Towards a fuller understanding of neurons with Clustered Compositional Explanations. In Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), 2023.
- Makinwa et al. (2022). Detection Accuracy for Evaluating Compositional Explanations of Units. In AIxIA 2021 - Advances in Artificial Intelligence, 2022.