Are you sure you're getting this? Click the correct answer from the options.
scaled_dot_attention mainly:
Click the option that best answers the question.
- blends V using weights from Q·K^T
- stores gradients
- loads data from disk
- compresses vocab

scaled_dot_attention mainly:
