
Autodiff: Why Gradients Matter

Automatic differentiation (autodiff) uses the chain rule to compute how changing each weight changes the loss. In the forward pass, the model computes its predictions and the loss; in the backward pass, ∂loss/∂node is propagated from the outputs back to the inputs, accumulating gradients along the way.
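To make the forward/backward picture concrete, here is a minimal, framework-free sketch on the toy loss L = (w * x − y)². All names and values (w, x, y, pred, err) are illustrative, not from any particular library; real autodiff systems build and traverse this chain of local derivatives automatically.

```python
# Forward pass: compute the prediction and the loss, keeping intermediates.
w, x, y = 3.0, 2.0, 5.0            # weight, input, target (illustrative values)
pred = w * x                       # node 1: prediction
err = pred - y                     # node 2: error
loss = err ** 2                    # node 3: loss

# Backward pass: propagate d(loss)/d(node) from the output back to the input,
# multiplying local derivatives along the way (the chain rule).
dloss_dloss = 1.0                       # gradient of the loss w.r.t. itself
dloss_derr = 2 * err * dloss_dloss      # d(err**2)/d(err) = 2 * err
dloss_dpred = 1.0 * dloss_derr          # d(pred - y)/d(pred) = 1
dloss_dw = x * dloss_dpred              # d(w * x)/d(w) = x

print(f"loss = {loss:.2f}, dL/dw = {dloss_dw:.2f}")  # loss = 1.00, dL/dw = 4.00
```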

Without gradients, your model can’t learn. With them, an optimizer updates each variable by stepping against its gradient: w ← w − η * ∂L/∂w, where η is the learning rate.
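The sketch below applies that update rule for a few steps to the same toy loss L = (w * x − y)². The learning rate η = 0.1 and the starting values are assumptions for illustration, not values from the original.

```python
w, x, y = 3.0, 2.0, 5.0   # weight, input, target (illustrative values)
eta = 0.1                 # learning rate (assumed)

for step in range(5):
    loss = (w * x - y) ** 2       # forward pass: compute the loss
    grad = 2 * (w * x - y) * x    # dL/dw via the chain rule
    w = w - eta * grad            # gradient-descent update: w <- w - eta * dL/dw
    print(f"step {step}: loss = {loss:.4f}, w = {w:.4f}")
```

Each step shrinks the loss as w moves toward the value that makes w * x equal the target y.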