Neuron Alignment
This page is a summary sheet covering the general goal, reference papers (both mine and external) that give an overview of the topic, and the domains explored so far. We are also interested in extending these techniques beyond their traditional domains. If you have expertise in other areas (e.g., neuroscience, gaming, or audio/speech modeling), we would be happy to explore potential extensions into those fields.
Goal: The goal of this research area is to understand what deep neural networks learn during training. My research focuses on capturing the alignment between neuron activations and human-defined knowledge (e.g., concepts). These methods typically combine tools from classical AI (e.g., heuristic search and clustering), statistical analysis, and recent advancements in AI.
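As a rough illustration of what "alignment" can mean here, the sketch below binarizes one neuron's activations and scores them against concept masks using intersection over union (IoU), keeping the best-matching concept. It is a simplified, single-concept example and not the method of any specific paper listed on this page; the function names, the quantile threshold, and the mask layout are all assumptions made for illustration.

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two boolean masks of the same shape."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

def best_aligned_concept(activations, concept_masks, quantile=0.8):
    """Return the (concept name, IoU) pair that best matches one neuron.

    activations: 1-D array of the neuron's activation at each location.
    concept_masks: dict mapping a concept name to a boolean array marking
        the locations where that concept is present (hypothetical layout).
    quantile: activations above this quantile count as "firing"
        (an assumed choice, not a prescribed value).
    """
    threshold = np.quantile(activations, quantile)
    neuron_mask = activations > threshold
    scores = {name: iou(neuron_mask, mask) for name, mask in concept_masks.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Compositional approaches replace the single concept with a logical formula over several concepts (e.g., AND, OR, NOT), which is where heuristic search becomes necessary, since the space of formulas is too large to enumerate.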
Domains: NLP, Vision.
Reference Papers:
- Compositional Explanations: [Seminal Paper]
- Clustered Compositional Explanations: [(La Rosa et al., 2023)]
- Open Vocabulary Compositional Explanations: [(La Rosa & Gilpin, 2025)]
- Optimal Compositional Explanations: [(La Rosa & Gilpin, 2025)]
- Metrics: [(Makinwa et al., 2022)]
References
2025
- Pre-Print
- Pre-Print
2023
- Conference: Towards a fuller understanding of neurons with Clustered Compositional Explanations. In Thirty-seventh Conference on Neural Information Processing Systems, 2023
2022
- Conference: Detection Accuracy for Evaluating Compositional Explanations of Units. In AIxIA 2021 - Advances in Artificial Intelligence, 2022