Optimizers: How Variables Learn
Variables are tensors you can update. Optimizers compute updates from gradients:
- SGD: w ← w − η·g
- Momentum: adds a velocity term that smooths successive updates
- Adam: adaptive learning rates per parameter, from running mean and variance estimates of the gradient

A sketch of the three update rules is shown below.
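The following is a minimal plain-Python/NumPy sketch of these rules, not any particular library's implementation; the function names and hyperparameter defaults are illustrative.

```python
import numpy as np

def sgd_step(w, g, lr=0.01):
    # Plain SGD: step against the gradient.
    return w - lr * g

def momentum_step(w, g, v, lr=0.01, beta=0.9):
    # Momentum: accumulate a velocity that smooths successive updates.
    v = beta * v + g
    return w - lr * v, v

def adam_step(w, g, m, s, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Adam: running estimates of the gradient mean (m) and uncentered
    # variance (s) give each parameter its own effective step size.
    m = beta1 * m + (1 - beta1) * g
    s = beta2 * s + (1 - beta2) * g**2
    m_hat = m / (1 - beta1**t)   # bias correction for early steps (t starts at 1)
    s_hat = s / (1 - beta2**t)
    return w - lr * m_hat / (np.sqrt(s_hat) + eps), m, s
```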
Under the hood, these are additional ops in the graph that read gradients and write new values.
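As a concrete illustration, here is a minimal sketch assuming TensorFlow 2.x; the variable, toy loss, and learning rate are made up for this example. `apply_gradients` is the step that builds and runs the update ops, reading the gradients and writing new values into the variable.

```python
import tensorflow as tf

w = tf.Variable(3.0)                        # a tensor the optimizer can update
opt = tf.keras.optimizers.SGD(learning_rate=0.1)

with tf.GradientTape() as tape:
    loss = (w - 1.0) ** 2                   # toy loss with its minimum at w = 1
grads = tape.gradient(loss, [w])

# Update op: read the gradient, write the new value back into w.
opt.apply_gradients(zip(grads, [w]))
print(w.numpy())                            # 3.0 - 0.1 * 2*(3.0 - 1.0) = 2.6
```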

