Neuron Alignment

This page summarizes the research area: its general goal, reference papers (both mine and external) that give an overview of the topic, and the domains explored so far. We are also interested in applying these techniques beyond their traditional domains. If you have expertise in other areas (e.g., neuroscience, gaming, or audio/speech modeling), we would be happy to explore potential extensions into those fields.

Goal: The goal of this research area is to understand what deep neural networks learn during training. My research focuses on capturing the alignment between neuron activations and human-defined knowledge (e.g., concepts). These methods typically combine tools from classical AI (e.g., heuristic search and clustering), statistical analysis, and recent advancements in AI.
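To make "alignment" concrete, here is a minimal sketch under stated assumptions, not the method of any paper listed below. It assumes alignment is scored with intersection over union (IoU) between a thresholded neuron activation mask and binary concept masks, a metric commonly used in the compositional-explanations literature but not specified on this page. The function names, the threshold, and the toy data are all hypothetical, and an exhaustive search over pairwise formulas stands in for the heuristic search mentioned above.

```python
from itertools import combinations


def binarize(activations, threshold):
    """Return the set of input positions where the neuron fires above threshold."""
    return {i for i, a in enumerate(activations) if a > threshold}


def iou(a, b):
    """Intersection over union of two index sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_explanation(neuron_mask, concepts):
    """Score single concepts and pairwise AND/OR formulas against the neuron mask.

    Assumption: exhaustive search over formulas of length <= 2, used here
    only because the toy concept set is tiny.
    """
    candidates = dict(concepts)
    for (n1, m1), (n2, m2) in combinations(concepts.items(), 2):
        candidates[f"({n1} OR {n2})"] = m1 | m2
        candidates[f"({n1} AND {n2})"] = m1 & m2
    scored = ((name, iou(neuron_mask, mask)) for name, mask in candidates.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical toy data: six input positions and three annotated concepts.
activations = [0.9, 0.1, 0.8, 0.7, 0.0, 0.2]
concepts = {"water": {0, 2}, "blue": {2, 3}, "grass": {4, 5}}

neuron_mask = binarize(activations, threshold=0.5)
label, score = best_explanation(neuron_mask, concepts)
print(label, score)
```

In this toy case no single concept covers all positions where the neuron fires, while a disjunction of two concepts does, which is the kind of situation that motivates compositional rather than single-concept explanations.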

Domains: NLP, Vision.

Reference Papers:

  1. Compositional Explanations: [Seminal Paper]
  2. Clustered Compositional Explanations: [(La Rosa et al., 2023)]
  3. Open Vocabulary Compositional Explanations: [(La Rosa & Gilpin, 2025)]
  4. Optimal Compositional Explanations: [(La Rosa & Gilpin, 2025)]
  5. Metrics: [(Makinwa et al., 2022)]

References

2025

  1. Pre-Print
    Open Vocabulary Compositional Explanations for Neurons
    Biagio La Rosa and Leilani H. Gilpin
    2025
  2. Pre-Print
    Guaranteed Optimal Compositional Explanations for Neurons
    Biagio La Rosa and Leilani H. Gilpin
    2025

2023

  1. Conference
    Towards a fuller understanding of neurons with Clustered Compositional Explanations
    Biagio La Rosa, Leilani H. Gilpin, and Roberto Capobianco
    In Thirty-seventh Conference on Neural Information Processing Systems, 2023

2022

  1. Conference
    Detection Accuracy for Evaluating Compositional Explanations of Units
    Sayo M. Makinwa, Biagio La Rosa, and Roberto Capobianco
    In AIxIA 2021 - Advances in Artificial Intelligence, 2022