Beyond Translation: Parsing

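To check that the Transformer is not just a translation model, the authors also tested it on English constituency parsing (turning a sentence into a full syntax tree). The task is hard for a sequence-to-sequence model because the output must obey strong structural constraints and is significantly longer than the input.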
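To make the "output longer than input" point concrete, here is a minimal sketch of how a parse tree can be flattened into a token sequence that a sequence-to-sequence model could predict. The notes above do not say how trees were encoded, so the bracket format, the `linearize` helper, and the toy sentence are all assumptions for illustration, not the authors' preprocessing.

```python
from typing import List, Tuple, Union

# Hypothetical tree encoding: a constituent is (label, children),
# a leaf is just the word itself.
Tree = Union[str, Tuple[str, list]]


def linearize(tree: Tree) -> List[str]:
    """Depth-first walk that emits an open and a close token per constituent."""
    if isinstance(tree, str):
        return [tree]  # leaf: emit the word unchanged
    label, children = tree
    tokens = ["(" + label]
    for child in children:
        tokens.extend(linearize(child))
    tokens.append(")" + label)
    return tokens


# Toy example (assumed, not taken from the Penn Treebank).
words = ["John", "sleeps"]
sentence_tree = ("S", [("NP", ["John"]), ("VP", [("V", ["sleeps"])])])

target = linearize(sentence_tree)
print(" ".join(target))
print(len(words), "input tokens ->", len(target), "output tokens")
```

Even for a two-word sentence the target sequence has ten tokens, and every opening bracket must be matched by the right closing bracket, which is the kind of structural constraint that makes the task demanding.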

They trained a 4-layer Transformer (with d_model = 1024) on:

  • Only the Wall Street Journal (WSJ) portion of the Penn Treebank (~40K training sentences), and
  • A semi-supervised setup that added larger high-confidence parsed corpora (about 17M sentences).

Result:

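  • Even with only the ~40K WSJ sentences, the Transformer scored 91.3 F1, better than every previously reported model except the Recurrent Neural Network Grammar.
  • With semi-supervised data, it reached 92.7 F1 and surpassed the earlier semi-supervised parsers, showing that the architecture generalizes beyond translation.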