Attention Is All You Need: The Transformer
What Happened
In 2017, eight researchers at Google published 'Attention Is All You Need,' introducing the Transformer architecture. It dispensed with recurrence entirely, replacing it with self-attention mechanisms that process every position in a sequence in parallel. The paper's deliberately bold title proved prescient.
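For intuition, here is a minimal NumPy sketch of the paper's scaled dot-product self-attention, Attention(Q, K, V) = softmax(QKᵀ / √d_k) V. The function name, projection matrices, and shapes below are illustrative assumptions, not taken from the authors' code.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention:
    Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.
    x: (seq_len, d_model); w_q, w_k, w_v: (d_model, d_k)
    projection matrices (illustrative, not from the paper's code).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # project inputs
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                  # all token pairs at once
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v                               # weighted mix of values

# Every position attends to every other via one matrix product,
# which is why the whole sequence is processed in parallel
# rather than token by token as in an RNN.
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 16))                     # 5 tokens, d_model = 16
w = [rng.standard_normal((16, 8)) for _ in range(3)] # d_k = 8
out = self_attention(x, *w)                          # shape (5, 8)
```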
Why It Mattered
Arguably the most influential AI paper of the 2010s. Transformers are the architecture behind GPT, BERT, Claude, Gemini, and Llama, and a core component of image generators such as DALL-E and Stable Diffusion; virtually every frontier AI system builds on them. Several co-authors went on to found major AI companies, including Cohere, Adept, Character.AI, and Essential AI.
Key People
Ashish Vaswani · Noam Shazeer · Niki Parmar · Jakob Uszkoreit · Llion Jones · Aidan N. Gomez · Łukasz Kaiser · Illia Polosukhin
Organizations
Google Brain · Google Research · University of Toronto
Part of the Deep Learning Breakthrough (2012–2017) era
