Mark As Completed Discussion

Interpretability: Heads Learn Linguistic Jobs

One cool side effect: you can inspect attention maps.

Different attention heads specialize:

  • Some heads follow long-distance dependencies (“making … more difficult”).
  • Some heads link pronouns to the right noun (its → product, for example).
  • Some seem to track syntactic structure (which words group into a phrase).

That means different heads act like different “relation detectors.” The model learns these roles without explicit syntax labels.

Interpretability: Heads Learn Linguistic Jobs