Feature Geometry in Attention Heads

When and why the singular vectors of attention matrices align with the features a model uses.

Several studies, including my own work, have noticed that a model's features can often be read off the singular vectors of its attention matrices, but it was not clear why this happens. In this ICML 2026 paper (Franco et al., 2026), we give an answer. We first show that singular vectors reliably align with features in a setting where the features can be observed directly, and then prove that this alignment is expected under a range of conditions. We also identify sparse attention decomposition as a testable signature of the alignment and find it in real models.
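To make the idea of a sparse decomposition concrete, here is a minimal NumPy sketch. It is not the paper's code. It assumes an attention score that is bilinear in the query-side and key-side vectors, `x_q @ W @ x_k`, and it plants orthonormal features in a hypothetical matrix `W`. The dimensions, feature strengths, and noise level are illustrative choices. Under those assumptions, the score splits into one term per singular vector pair, and a single term carries almost all of it.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 8  # hypothetical model dimension and number of planted features

# Assumption: features are orthonormal directions on the query and key sides.
F, _ = np.linalg.qr(rng.standard_normal((d, r)))  # query-side features
G, _ = np.linalg.qr(rng.standard_normal((d, r)))  # key-side features
strengths = np.linspace(5.0, 1.5, r)              # distinct feature strengths

# Toy attention matrix that pairs feature k on the query side with feature k on the key side.
W = F @ np.diag(strengths) @ G.T

# Singular value decomposition: W = U @ diag(S) @ Vt.
U, S, Vt = np.linalg.svd(W)

# Query and key inputs that each carry one feature (index 2) plus a little noise.
x_q = F[:, 2] + 0.01 * rng.standard_normal(d)
x_k = G[:, 2] + 0.01 * rng.standard_normal(d)

# The score is a sum of per-singular-vector terms: sigma_k * (u_k . x_q) * (v_k . x_k).
terms = S * (U.T @ x_q) * (Vt @ x_k)
score = x_q @ W @ x_k

# Fraction of the total absolute contribution carried by the largest term.
top_fraction = np.abs(terms).max() / np.abs(terms).sum()
print(f"score = {score:.3f}, top term share = {top_fraction:.3f}")
```

Because the singular vectors coincide with the planted features here, the active feature pair lights up exactly one term. When the singular vectors do not align with the features, the same score would generally be spread over many terms.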

In a controlled setting, the singular vectors of an attention head come to align with the model's features over the course of training.
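One way to quantify this kind of alignment when the features are known is the cosine similarity between each top singular vector and its best-matching feature. The sketch below is a toy stand-in, not the paper's experiment. Training is replaced by plain gradient descent on a squared-error loss toward a planted matrix, and the helper `alignment`, the sizes, and the learning rate are all hypothetical.

```python
import numpy as np

def alignment(W, features, k):
    """Mean over the top-k left singular vectors of the best |cosine| with any feature.

    Assumes the columns of `features` are unit vectors.
    """
    U, _, _ = np.linalg.svd(W)
    cos = np.abs(U[:, :k].T @ features)  # shape (k, number of features)
    return cos.max(axis=1).mean()

rng = np.random.default_rng(0)
d, r = 32, 4  # hypothetical dimension and number of features

# Planted orthonormal features and the matrix a trained head would ideally reach.
F, _ = np.linalg.qr(rng.standard_normal((d, r)))
G, _ = np.linalg.qr(rng.standard_normal((d, r)))
W_target = F @ np.diag([4.0, 3.0, 2.0, 1.0]) @ G.T

# Random initialization, then gradient descent on 0.5 * ||W - W_target||^2.
W = rng.standard_normal((d, d)) / np.sqrt(d)
history = []
for step in range(201):
    if step % 50 == 0:
        history.append((step, alignment(W, F, r)))
    W -= 0.1 * (W - W_target)  # the gradient of the loss is (W - W_target)

for step, score in history:
    print(f"step {step:3d}: alignment = {score:.3f}")
```

In this toy, the alignment starts near the level expected for random directions and approaches 1 as `W` converges to the planted matrix.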

Code is available at svf-alignment.

References

2026

  1. Gabriel Franco, Carson Loughridge, and Mark Crovella. Singular Vectors of Attention Heads Align with Features. Proceedings of the 43rd International Conference on Machine Learning (ICML), 2026.