[Image: Reinforcement learning agent-environment interaction diagram]

TD-Gammon: Reinforcement Learning Plays Backgammon

What Happened

Gerald Tesauro created TD-Gammon, a neural network that learned to play backgammon at expert level through self-play, using temporal-difference (TD) reinforcement learning. Along the way it discovered novel strategies that surprised human experts.
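The core idea — nudging a state's value estimate toward the reward plus the value of the successor state — can be sketched with the simplest member of the TD family, TD(0), on a toy random-walk task. This is an illustrative sketch only: TD-Gammon itself used TD(λ) with a neural network evaluating backgammon positions, not a lookup table.

```python
import random

def td0_value_estimates(num_states=5, episodes=500, alpha=0.1, gamma=1.0, seed=0):
    """Estimate state values for a simple random-walk chain with TD(0).

    States 1..num_states sit between two terminal cells (indices 0 and
    num_states + 1). Episodes start in the middle and step left or right
    at random; reaching the right end yields reward 1, the left end 0.
    """
    rng = random.Random(seed)
    V = [0.0] * (num_states + 2)  # indices 0 and num_states+1 are terminal
    for _ in range(episodes):
        s = (num_states + 1) // 2
        while 0 < s < num_states + 1:
            s_next = s + (1 if rng.random() < 0.5 else -1)
            r = 1.0 if s_next == num_states + 1 else 0.0
            # TD(0) update: move V(s) toward the bootstrapped target r + gamma*V(s')
            V[s] += alpha * (r + gamma * V[s_next] - V[s])
            s = s_next
    return V[1:num_states + 1]
```

After enough episodes the estimates approach the true values of the walk (1/6 through 5/6 for five states), with no model of the environment and no waiting for episode outcomes before updating.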

Why It Mattered

A pioneering demonstration of combining reinforcement learning with neural networks, it foreshadowed AlphaGo's self-play approach by 24 years.

Related Milestones

Research

Samuel's Checkers Program

Arthur Samuel created a checkers-playing program at IBM that could learn from experience, improving its play over time. He coined the term 'machine learning' to describe programs that learn without being explicitly programmed.

Arthur Samuel · IBM
Research

DeepMind's DQN Masters Atari Games

DeepMind demonstrated a deep reinforcement learning agent (Deep Q-Network) that learned to play Atari 2600 games directly from pixel inputs, achieving superhuman performance on many games with no task-specific engineering. Google acquired DeepMind for ~$500 million shortly after.

Volodymyr Mnih · Demis Hassabis · DeepMind
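At DQN's heart is the Q-learning update, which tabular methods apply to a lookup table; DQN approximates that table with a deep convolutional network reading raw pixels, stabilized by experience replay and a target network. A minimal tabular sketch on a hypothetical 1-D corridor task (the task and parameter values are illustrative, not the DQN algorithm itself):

```python
import random

def q_learning_corridor(n=6, episodes=300, alpha=0.2, gamma=0.9, eps=0.2, seed=1):
    """Tabular Q-learning on a 1-D corridor: start at cell 0, reward 1 at cell n-1.

    Actions: 0 = step left, 1 = step right (clamped at the walls).
    """
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n)]  # Q[state][action]
    for _ in range(episodes):
        s = 0
        while s != n - 1:
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda act: Q[s][act])
            s_next = max(0, s - 1) if a == 0 else min(n - 1, s + 1)
            r = 1.0 if s_next == n - 1 else 0.0
            # Q-learning target: r + gamma * max_a' Q(s', a')
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
            s = s_next
    return Q
```

The same update rule drives DQN; the deep network simply replaces `Q[s][a]` with a parameterized function trained by gradient descent on the squared TD error.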
[Image: NETtalk neural network back-propagation diagram]
Research

NETtalk: Neural Network Learns to Speak

NETtalk was a neural network that learned to pronounce English text aloud, starting from babbling sounds and gradually becoming intelligible — mimicking how a child learns to speak. It captured public imagination and demonstrated backpropagation's potential.

Terrence Sejnowski · Charles Rosenberg · Johns Hopkins University
[Image: LeNet-5 convolutional neural network architecture]
Research

LeNet: Convolutional Neural Networks

Yann LeCun demonstrated that convolutional neural networks (CNNs) could be trained with backpropagation to recognize handwritten digits. The refined LeNet-5 (1998) achieved 99%+ accuracy on MNIST and was deployed by banks to read checks — running in ATMs for years.

Yann LeCun · AT&T Bell Labs
[Image: LSTM recurrent neural network cell diagram]
Research

Long Short-Term Memory (LSTM)

Hochreiter and Schmidhuber published the LSTM architecture, solving the vanishing gradient problem that plagued recurrent neural networks. LSTMs could learn long-range dependencies in sequential data by maintaining a memory cell with gates that controlled information flow.

Sepp Hochreiter · Jürgen Schmidhuber · Technical University of Munich
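The gating mechanism can be sketched for a single scalar-valued cell. Real LSTMs operate on vectors with learned weight matrices; the weight names and dictionary layout here are illustrative assumptions, not notation from the paper.

```python
import math

def lstm_step(x, h_prev, c_prev, W):
    """One LSTM cell step for scalar input and state (illustrative sizes).

    W holds weights for the forget (f), input (i), and output (o) gates
    and the candidate cell value (g). Each gate squashes a linear
    combination of the input x and the previous hidden state h_prev.
    """
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    f = sigmoid(W['wf_x'] * x + W['wf_h'] * h_prev + W['bf'])   # forget gate
    i = sigmoid(W['wi_x'] * x + W['wi_h'] * h_prev + W['bi'])   # input gate
    o = sigmoid(W['wo_x'] * x + W['wo_h'] * h_prev + W['bo'])   # output gate
    g = math.tanh(W['wg_x'] * x + W['wg_h'] * h_prev + W['bg'])  # candidate value
    c = f * c_prev + i * g   # gated memory-cell update
    h = o * math.tanh(c)     # gated hidden-state output
    return h, c
```

Because the cell state `c` is carried forward additively (scaled by the forget gate rather than squashed through a nonlinearity at every step), gradients can flow across many time steps — the fix for the vanishing-gradient problem the paper describes.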
