Research Breakthroughs

38 milestones in AI history

The Birth of AI (1956–1969)

The Dartmouth Conference

A two-month workshop at Dartmouth College where the term 'Artificial Intelligence' was officially coined. The proposal stated: 'Every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.' This gathering brought together the founders of the field.

John McCarthy · Marvin Minsky · Dartmouth College · MIT

The Perceptron

Frank Rosenblatt built the Mark I Perceptron, the first hardware implementation of an artificial neural network. It could learn to classify simple visual patterns. The New York Times reported it as an 'Electronic Brain' that the Navy expected would 'be able to walk, talk, see, write, reproduce itself and be conscious of its existence.'

Frank Rosenblatt · Cornell Aeronautical Laboratory

ELIZA: The First Chatbot

Joseph Weizenbaum created ELIZA, a program that simulated a Rogerian psychotherapist using simple pattern matching. Despite being purely rule-based with no understanding, users became emotionally attached to it and insisted it truly understood them — a phenomenon Weizenbaum found deeply disturbing.

Joseph Weizenbaum · MIT
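
ELIZA's core trick is small enough to reproduce in a few lines. The rules below are illustrative stand-ins, not Weizenbaum's original DOCTOR script:

```python
import re

# ELIZA-style rules: regex pattern -> response template.
# These rules are hypothetical examples, not the original DOCTOR script.
RULES = [
    (r"I need (.*)", "Why do you need {0}?"),
    (r"I am (.*)", "How long have you been {0}?"),
    (r"(.*) mother(.*)", "Tell me more about your family."),
    (r"(.*)", "Please go on."),  # catch-all keeps the conversation moving
]

def eliza_respond(text: str) -> str:
    """Return the response for the first rule whose pattern matches."""
    for pattern, template in RULES:
        match = re.match(pattern, text, re.IGNORECASE)
        if match:
            return template.format(*match.groups())
    return "Please go on."

print(eliza_respond("I need a vacation"))  # Why do you need a vacation?
print(eliza_respond("I am feeling sad"))   # How long have you been feeling sad?
```

The reflection of the user's own words back as a question is the entire mechanism; there is no model of meaning anywhere, which is precisely what made users' emotional attachment so unsettling to Weizenbaum.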

SHRDLU: Natural Language Understanding

Terry Winograd created SHRDLU, a program that could understand and respond to English commands about a simulated 'blocks world.' Users could ask it to move objects, answer questions about their arrangement, and even understand pronouns and context within its limited domain.

Terry Winograd · MIT

Shakey the Robot

Shakey was the first mobile robot that could reason about its actions. It combined computer vision, natural language processing, and planning to navigate rooms, push objects, and solve simple tasks. It used the A* search algorithm and STRIPS planner.

Charles Rosen · Nils Nilsson · Stanford Research Institute
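
The A* search at the heart of Shakey's navigation can be sketched on a grid world. This reconstruction, using a Manhattan-distance heuristic, is illustrative only and is not Shakey's actual code:

```python
import heapq

def a_star(grid, start, goal):
    """A* on a 2D grid: 0 = free cell, 1 = obstacle.
    Returns the length of the shortest 4-connected path, or None."""
    rows, cols = len(grid), len(grid[0])
    # Manhattan distance: admissible for 4-connected grids, so A* is optimal.
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_set = [(h(start), 0, start)]   # (f = g + h, g, cell)
    best_g = {start: 0}
    while open_set:
        f, g, cell = heapq.heappop(open_set)
        if cell == goal:
            return g
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = cell[0] + dr, cell[1] + dc
            if 0 <= r < rows and 0 <= c < cols and grid[r][c] == 0:
                ng = g + 1
                if ng < best_g.get((r, c), float("inf")):
                    best_g[(r, c)] = ng
                    heapq.heappush(open_set, (ng + h((r, c)), ng, (r, c)))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(a_star(grid, (0, 0), (2, 0)))  # 6: the wall forces a detour
```

The heuristic lets A* expand far fewer cells than blind search while still guaranteeing the shortest path, which is why it remains a planning workhorse decades after Shakey.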

DENDRAL: The First Expert System

DENDRAL automated chemical structure determination from mass spectrometry data. It used heuristic rules from domain experts to solve problems that normally required PhD-level expertise. Its successor Meta-DENDRAL could even generate new rules automatically.

Edward Feigenbaum · Joshua Lederberg · Stanford University

Deep Learning Breakthrough (2012–2017)

AlexNet: The ImageNet Moment

AlexNet, a deep convolutional neural network, won the ImageNet competition by a staggering margin, cutting the top-5 error rate from 26.2% to 15.3%. Trained on two NVIDIA GTX 580 GPUs, it was dramatically deeper and more powerful than previous entries. The AI community was stunned.

Alex Krizhevsky · Ilya Sutskever · University of Toronto

Word2Vec: Words as Vectors

Google researchers published Word2Vec, showing that relatively small neural networks could efficiently learn meaningful vector representations of words from large text corpora. The famous example `king - man + woman ≈ queen` made the idea vivid: semantic relationships could be captured geometrically in vector space.

Tomas Mikolov · Google
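
The analogy can be checked with a few lines of vector arithmetic. The 2-D embeddings below are hand-made for illustration; real Word2Vec vectors are learned from text and typically have 100-300 dimensions:

```python
import numpy as np

# Toy 2-D embeddings (axes: roughly "royalty" and "gender").
# Hand-crafted for illustration only; Word2Vec learns such vectors from data.
emb = {
    "king":  np.array([1.0,  1.0]),
    "queen": np.array([1.0, -1.0]),
    "man":   np.array([0.0,  1.0]),
    "woman": np.array([0.0, -1.0]),
}

def nearest(vec, exclude):
    """Return the word whose embedding is most cosine-similar to vec."""
    cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max((w for w in emb if w not in exclude),
               key=lambda w: cos(vec, emb[w]))

# king - man + woman lands on queen's vector: the gender offset transfers.
result = nearest(emb["king"] - emb["man"] + emb["woman"],
                 exclude={"king", "man", "woman"})
print(result)  # queen
```

Excluding the query words is standard practice when evaluating analogies, since the input vectors are otherwise often their own nearest neighbors.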

DeepMind's DQN Masters Atari Games

DeepMind demonstrated a deep reinforcement learning agent (Deep Q-Network) that learned to play Atari 2600 games directly from pixel inputs, achieving superhuman performance on many games with no task-specific engineering. Google acquired DeepMind for ~$500 million shortly after.

Volodymyr Mnih · Demis Hassabis · DeepMind
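
DQN builds on the classic Q-learning update, replacing the lookup table with a deep network that reads raw pixels. The tabular version of the update it approximates can be shown on a toy environment (a hypothetical 4-state chain, not an Atari game):

```python
import random

random.seed(0)

# Toy chain MDP: states 0..3, actions 0 = left, 1 = right.
# Reaching state 3 yields reward 1 and ends the episode.
# Illustrative only; DQN swaps this Q-table for a convolutional network.
N_STATES, GAMMA, ALPHA = 4, 0.9, 0.5
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    reward = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, reward, s2 == N_STATES - 1

for _ in range(300):                        # episodes
    s, done, steps = 0, False, 0
    while not done and steps < 100:
        a = random.randrange(2)             # random behavior policy (Q-learning is off-policy)
        s2, r, done = step(s, a)
        # Core update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        target = r + (0.0 if done else GAMMA * max(Q[s2]))
        Q[s][a] += ALPHA * (target - Q[s][a])
        s, steps = s2, steps + 1

greedy_policy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(N_STATES)]
print(greedy_policy)  # the learned policy moves right toward the reward
```

The "no task-specific engineering" claim refers to exactly this loop: the same update rule, fed only pixels and scores, was applied unchanged across dozens of games.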

Generative Adversarial Networks (GANs)

Ian Goodfellow introduced GANs — two neural networks (generator and discriminator) competing against each other, one creating fake data and the other trying to detect it. The concept allegedly came to him during a bar conversation. Yann LeCun called GANs 'the most interesting idea in the last 10 years in ML.'

Ian Goodfellow · Université de Montréal
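
The competition is formalized as a minimax game over the value V(D, G) = E[log D(x)] + E[log(1 − D(G(z)))]. The toy fixed discriminator and generator below are hypothetical, chosen only to make the objective concrete rather than to train anything:

```python
import math
import random

random.seed(1)

# Toy setup: real data ~ N(4, 1); the generator shifts noise by a bias b.
# A logistic "discriminator" D(x) = sigmoid(w*x + c) scores realness.
# All parameters are hypothetical illustrations of the GAN value function.
sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
D = lambda x, w=1.0, c=-3.0: sigmoid(w * x + c)   # rates x near 4+ as "real"
G = lambda z, b: z + b                            # generator: shift noise by b

def value(b, n=5000):
    """Monte Carlo estimate of V(D, G) for a generator with bias b."""
    reals = [random.gauss(4, 1) for _ in range(n)]
    fakes = [G(random.gauss(0, 1), b) for _ in range(n)]
    return (sum(math.log(D(x)) for x in reals) / n +
            sum(math.log(1 - D(x)) for x in fakes) / n)

# The generator wants to MINIMIZE V: shifting fakes onto the real
# distribution (b = 4) fools this discriminator far more than b = 0 does.
print(value(0.0), value(4.0))
```

In actual GAN training both networks update by gradient descent on this value, the discriminator ascending and the generator descending, until the fakes become statistically indistinguishable from the data.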

ResNet: Deeper Than Ever

Microsoft Research introduced ResNet with skip connections (residual connections), enabling the training of networks with 152+ layers — 8x deeper than previous networks. ResNet won ImageNet 2015 with 3.57% error, surpassing human-level performance (5.1%) for the first time.

Kaiming He · Xiangyu Zhang · Microsoft Research
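
The skip connection itself is one line: a residual block computes y = x + F(x). The dense toy layers below stand in for ResNet's convolutions:

```python
import numpy as np

rng = np.random.default_rng(0)

# A residual block computes y = x + F(x): the skip connection adds the
# input back onto the transformed output. Toy dense layers stand in for
# ResNet's conv + batch-norm stacks.
def residual_block(x, W1, W2):
    h = np.maximum(0.0, x @ W1)        # linear + ReLU in place of conv + ReLU
    return x + h @ W2                  # skip connection: add the input back

d = 8
x = rng.normal(size=(1, d))

# Key property: with zero weights, F(x) = 0 and the block is the identity.
# Each block therefore only needs to learn a small correction to the
# identity map, which is what makes 152-layer stacks trainable.
W1 = np.zeros((d, d))
W2 = np.zeros((d, d))
print(np.allclose(residual_block(x, W1, W2), x))  # True
```

Without the skip, very deep plain networks degrade as depth grows; with it, gradients flow through the identity path no matter how many blocks are stacked.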

Attention Is All You Need: The Transformer

Eight researchers at Google published 'Attention Is All You Need,' introducing the Transformer architecture. It replaced recurrence with self-attention mechanisms that could process entire sequences in parallel. The paper's title was deliberately bold — and proved prescient.

Ashish Vaswani · Noam Shazeer · Google Brain · Google Research
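
The self-attention at the paper's core is scaled dot-product attention, softmax(QKᵀ/√d_k)V, sketched here with toy sizes (the base model in the paper uses d_k = 64 per head):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """The Transformer's core operation: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)    # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d_k = 4, 8                    # toy sizes for illustration
Q = rng.normal(size=(seq_len, d_k))
K = rng.normal(size=(seq_len, d_k))
V = rng.normal(size=(seq_len, d_k))

out, weights = scaled_dot_product_attention(Q, K, V)
print(out.shape)             # (4, 8): one output vector per position
print(weights.sum(axis=-1))  # each row of attention weights sums to 1
```

Because every position attends to every other position in one matrix multiply, the whole sequence is processed in parallel, which is the property that replaced recurrence.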

AlphaGo Zero: Learning From Scratch

AlphaGo Zero achieved superhuman Go performance with zero human knowledge: no training data from human games, no hand-crafted features. It learned entirely through self-play, and within 40 days surpassed all previous versions, including the one that beat Lee Sedol.

David Silver · DeepMind

The Transformer Era (2018–2021)

BERT: Bidirectional Language Understanding

Google published BERT (Bidirectional Encoder Representations from Transformers), which could understand language context from both directions simultaneously. BERT shattered records on 11 NLP benchmarks. Google integrated it into Search, affecting 10% of all queries.

Jacob Devlin · Google AI

GPT-1: Generative Pre-training

OpenAI released GPT-1, demonstrating that a Transformer trained on vast amounts of text using unsupervised pre-training could then be fine-tuned for specific NLP tasks. With 117 million parameters, it showed the potential of scaling language models.

Alec Radford · OpenAI

GPT-2: 'Too Dangerous to Release'

OpenAI announced GPT-2 (1.5 billion parameters) but initially refused to release the full model, calling it 'too dangerous' due to its ability to generate convincing fake text. The decision was controversial — some praised the caution, others called it a publicity stunt. The full model was eventually released in November 2019.

Alec Radford · OpenAI

GPT-3: The 175 Billion Parameter Leap

OpenAI released GPT-3 with 175 billion parameters — 100x larger than GPT-2. Without any fine-tuning, GPT-3 could write essays, code, poetry, translate languages, and answer questions through 'few-shot learning' (learning from just a few examples in the prompt). The API launched in beta, enabling thousands of applications.

Tom Brown · OpenAI

AlphaFold 2: Protein Folding Solved

DeepMind's AlphaFold 2 solved the 50-year-old protein structure prediction problem, achieving accuracy comparable to experimental methods at CASP14. It could predict how proteins fold from their amino acid sequences — a problem that had stumped biologists for half a century.

John Jumper · Demis Hassabis · DeepMind

DALL-E: Text to Image Generation

OpenAI unveiled DALL-E, a model that could generate images from text descriptions — 'an armchair in the shape of an avocado' became iconic. Built on GPT-3's architecture adapted for images, it showed that language models could bridge the gap between text and visual creativity.

OpenAI

Generative AI Revolution (2022–2024)

GPT-4: Multimodal Intelligence

OpenAI released GPT-4, a multimodal model that could understand both text and images. It passed the bar exam (90th percentile), scored 1410 on the SAT, and demonstrated remarkably nuanced reasoning. It was a massive leap from GPT-3.5 in accuracy, safety, and capability.

OpenAI

Sora: AI Video Generation

OpenAI previewed Sora, a model that could generate photorealistic videos up to a minute long from text descriptions. The quality stunned the world — realistic physics, complex camera movements, and coherent scenes that looked like professional cinematography.

OpenAI

Gemini 1.5 Pro: Million-Token Context

Google released Gemini 1.5 Pro with a 1 million token context window (later extended to 2M) — able to process entire codebases, books, or hours of video in a single prompt. It could find a needle in a haystack across millions of tokens with near-perfect recall.

Google DeepMind

OpenAI o1: Reasoning Models

OpenAI released o1, a model trained to 'think before it speaks' using chain-of-thought reasoning at inference time. It could solve complex math, coding, and science problems by spending more compute thinking through multi-step solutions — trading speed for accuracy on hard problems.

OpenAI

Nobel Prizes Awarded for AI Work

The 2024 Nobel Prize in Physics went to Geoffrey Hinton and John Hopfield for foundational work on neural networks and machine learning. The Nobel Prize in Chemistry went to Demis Hassabis and John Jumper (AlphaFold) alongside David Baker for computational protein design. AI research received the highest scientific recognition.

Geoffrey Hinton · John Hopfield · Nobel Committee · DeepMind
