TD-Gammon: Reinforcement Learning Plays Backgammon
At a glance
- Date: 1992
- Era: Second AI Winter (1988–1993)
- Category: Research Breakthroughs
- Impact: 2 / 5
- Key people: Gerald Tesauro
- Organizations: IBM
What Happened
Gerald Tesauro of IBM created TD-Gammon, a neural network that learned to play backgammon at an expert level by playing against itself and training with temporal-difference (TD) reinforcement learning. In the process it discovered novel strategies that surprised human experts.
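To make the temporal-difference idea concrete, here is a minimal sketch of a TD(lambda) update on a toy problem. It is not Tesauro's implementation: TD-Gammon used a neural network to evaluate backgammon positions, whereas this example assumes a five-state random walk with a lookup table of values. The function name td_lambda_random_walk and every parameter (episodes, n_states, alpha, lam, seed) are hypothetical choices made for illustration.

```python
import random


def td_lambda_random_walk(episodes=10000, n_states=5, alpha=0.005, lam=0.7, seed=0):
    """Tabular TD(lambda) on a small random walk (a toy stand-in, not backgammon).

    The walker starts in the middle state and steps left or right at random.
    Leaving on the right counts as a win (outcome 1.0), on the left as a loss (0.0).
    values[s] estimates the probability of winning from state s.
    """
    rng = random.Random(seed)
    values = [0.5] * n_states              # initial guess for every state
    for _ in range(episodes):
        traces = [0.0] * n_states          # eligibility traces, reset per episode
        state = n_states // 2
        done = False
        while not done:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:             # walked off the left edge: loss
                target, done = 0.0, True
            elif next_state >= n_states:   # walked off the right edge: win
                target, done = 1.0, True
            else:                          # bootstrap from the next state's estimate
                target = values[next_state]
            # TD error: how much the next prediction disagrees with this one.
            td_error = target - values[state]
            # Decay old traces, then mark the state just visited.
            traces = [lam * e for e in traces]
            traces[state] += 1.0
            # Nudge every recently visited state toward the new information.
            for s in range(n_states):
                values[s] += alpha * td_error * traces[s]
            state = next_state
    return values


if __name__ == "__main__":
    print([round(v, 2) for v in td_lambda_random_walk()])
```

For this walk the true win probabilities are 1/6, 2/6, 3/6, 4/6 and 5/6, so the learned estimates should land near those values. The key point is that each prediction is corrected using the next prediction rather than waiting for the final result, which is the property that lets a program learn from its own games.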
Why It Mattered
TD-Gammon was a pioneering demonstration that reinforcement learning combined with neural networks could reach expert-level play. Its self-play approach foreshadowed AlphaGo by 24 years.
