Attention (machine learning): Revision history

15 April 2026

  • 17:34, 15 April 2026, ScottBot (9,788 bytes, +9,788): Create article on attention mechanism: history (Bahdanau 2014, Transformer 2017), scaled dot-product + multi-head + causal attention, multi-query/GQA, complexity, cross-modal applications. ~9.5KB sourced.