TD-Gammon: Reinforcement Learning Plays Backgammon
What Happened
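In 1992, Gerald Tesauro created TD-Gammon, a neural network that learned to play backgammon at expert level. It trained by playing games against itself and adjusting its position evaluations with temporal-difference (TD) reinforcement learning, rather than by imitating human games. Along the way it discovered strategies, including unconventional opening plays, that surprised human experts.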
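To make the temporal-difference idea concrete, here is a minimal sketch of a TD(λ) update with eligibility traces. It is not Tesauro's implementation: it swaps backgammon for a toy five-state random walk and the neural network for a lookup table, and the function name, step size, and trace-decay value are all assumptions chosen for illustration. What carries over is the core step: each estimate is nudged toward the estimate of the state that follows it, so learning needs only the final outcome of each game.

```python
import random


def td_lambda_random_walk(episodes=5000, alpha=0.002, lam=0.7, n_states=5, seed=0):
    """Tabular TD(lambda) on a toy random walk (illustrative stand-in, not backgammon).

    States 0..n_states-1 lie on a line. Stepping off the left end ends the
    episode with reward 0; stepping off the right end ends it with reward 1.
    The value of a state is therefore its probability of exiting on the right.
    """
    rng = random.Random(seed)
    values = [0.5] * n_states              # initial guess for every state
    for _ in range(episodes):
        traces = [0.0] * n_states          # eligibility traces, reset per episode
        state = n_states // 2              # every episode starts in the middle
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:             # fell off the left end
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:   # fell off the right end
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False

            # TD error: how much the next estimate disagrees with the current
            # one (no discounting, i.e. gamma = 1).
            td_error = reward + next_value - values[state]

            # Mark the current state as recently visited, then credit every
            # state in proportion to its trace and let the traces decay.
            traces[state] += 1.0
            for s in range(n_states):
                values[s] += alpha * td_error * traces[s]
                traces[s] *= lam

            if done:
                break
            state = next_state
    return values


if __name__ == "__main__":
    for index, value in enumerate(td_lambda_random_walk()):
        print(f"state {index}: estimated value {value:.3f}")
```

For this walk the exact values are 1/6, 2/6, 3/6, 4/6, and 5/6, so the learned estimates can be checked against them. In a self-play game learner, the table would be replaced by a function approximator that scores board positions, and the reward would be the result of the game.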
Why It Mattered
TD-Gammon was a pioneering demonstration that reinforcement learning combined with a neural network could reach expert-level play in a complex game. Its self-play training foreshadowed AlphaGo's approach by 24 years.
Key People
Gerald Tesauro
Organizations
IBM (Thomas J. Watson Research Center)
Part of the Second AI Winter (1988–1993) era · 1992 milestone
