| AlphaGo | |
|---|---|
| Name | AlphaGo |
| Developer | DeepMind |
| Released | 2015 |
| Genre | Artificial intelligence |
| Platforms | Google Cloud Platform |
**AlphaGo** is a computer program developed by DeepMind that became the first to defeat a professional human player at the full-sized game of Go without handicap. This achievement, long considered a "holy grail" of artificial intelligence research, was realized through a novel combination of Monte Carlo tree search and deep neural networks. The program's historic victory over world champion Lee Sedol in 2016 marked a paradigm shift in the capabilities of machine learning and its potential to solve complex problems.
The project was initiated by researchers at DeepMind, a London-based company acquired by Google in 2014. The core challenge of Go, whose branching factor vastly exceeds that of chess, had resisted the brute-force search methods used by earlier game-playing systems such as IBM's chess computer Deep Blue. A key milestone came in October 2015, when the program won a closed-door match 5–0 against European champion Fan Hui, a result detailed in a paper published in the journal Nature in January 2016. This success demonstrated the efficacy of its neural network approach and set the stage for a publicly broadcast contest against one of the game's greatest modern players. The team was led by Demis Hassabis and David Silver, who integrated advances from reinforcement learning and supervised learning.
The system's strength stemmed from its integration of several machine learning components. It utilized two primary neural networks: a **policy network** to predict the next move, and a **value network** to evaluate board positions. The policy network was first trained by supervised learning on a large corpus of expert games from the KGS Go Server. The program then refined its strategy through millions of games of self-play, a form of reinforcement learning in which it competed against earlier versions of itself. During actual gameplay, these networks guided a Monte Carlo tree search algorithm, allowing it to simulate and evaluate promising sequences of moves far more efficiently than previous programs. The architecture ran on distributed hardware using CPUs and GPUs; Google later revealed that the version that faced Lee Sedol ran on its tensor processing units (TPUs).
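The search described above can be sketched in miniature. In the PUCT selection rule used by this family of programs, the policy network supplies move priors that steer exploration, while the value network replaces random rollouts for position evaluation. Everything below is an illustrative stand-in (a one-ply search with toy "networks"), not DeepMind's actual implementation:

```python
import math
import random

class Node:
    def __init__(self, prior):
        self.prior = prior          # P(s, a) from the policy network
        self.visit_count = 0
        self.value_sum = 0.0

    def value(self):
        return self.value_sum / self.visit_count if self.visit_count else 0.0

def puct_score(parent, child, c_puct=1.5):
    # Exploit the running value estimate, but explore moves the
    # policy network rates highly yet the search has rarely visited.
    explore = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.value() + explore

def policy_net(state, actions):
    # Stand-in: uniform priors; the real network predicts expert moves.
    return {a: 1.0 / len(actions) for a in actions}

def value_net(state):
    # Stand-in: random score in [-1, 1]; the real network evaluates positions.
    return random.uniform(-1, 1)

def mcts(root_state, legal_actions, num_simulations=100):
    root = Node(prior=1.0)
    children = {a: Node(prior=p)
                for a, p in policy_net(root_state, legal_actions).items()}
    for _ in range(num_simulations):
        # Selection: pick the child maximising the PUCT score.
        action, child = max(children.items(),
                            key=lambda kv: puct_score(root, kv[1]))
        # Evaluation: the value network stands in for a rollout.
        v = value_net((root_state, action))
        # Backup: update visit counts and the running value.
        child.visit_count += 1
        child.value_sum += v
        root.visit_count += 1
    # Play the most-visited move, as AlphaGo did.
    return max(children.items(), key=lambda kv: kv[1].visit_count)[0]

best = mcts(root_state="empty board", legal_actions=["A1", "B2", "C3"])
print(best)
```

A full implementation would recurse down a tree of positions; this one-ply version only shows how the two networks divide the labour inside the search.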
The five-game match, held in Seoul in March 2016, was a landmark event in the history of both Go and artificial intelligence. Lee Sedol, an 18-time world champion, was defeated four games to one. The match featured several astonishing and creative plays, most famously **Move 37** in the second game, a seemingly unorthodox move that experts later hailed as profoundly creative. Lee Sedol managed a single victory in the fourth game, where his celebrated "wedge" at move 78 exploited a rare weakness in the program's assessment. The event was streamed live on YouTube and covered extensively by media outlets such as The New York Times, symbolizing a dramatic leap in machine intelligence and prompting widespread discussion about the future of human–computer interaction.
The success of the program fundamentally altered the perception of artificial intelligence's potential, proving that deep learning systems could master intuitive and strategic domains. Within the Go community, it revolutionized theory and practice, introducing novel strategies and concepts that human players began to study and adopt. The underlying technologies demonstrated broad applicability beyond games, influencing subsequent research in areas like protein folding and materials science. The event also sparked significant philosophical and ethical debates about the pace of technological advancement, the nature of human creativity, and the future of work, discussed by thinkers at institutions like the Future of Humanity Institute.
Building on its architecture, DeepMind rapidly developed more advanced and generalized successors. The most direct evolution was AlphaGo Master, which won 60 consecutive online games against top professionals. It was followed by AlphaGo Zero, a version that learned solely through self-play without any human game data, starting from random play and surpassing all previous versions. The line culminated in AlphaZero, a single algorithm that achieved superhuman performance not only in Go but also in chess and shogi, given nothing but the rules of each game and learning entirely through self-play. These systems demonstrated the power of reinforcement learning and general game-playing, providing a blueprint for artificial general intelligence research and applied projects like AlphaFold.
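The self-play idea behind these successors can be illustrated with a toy loop: the program starts from a random policy, plays against a copy of itself, and reinforces moves that led to wins. The tiny "higher number wins" game and the tabular policy below are stand-ins for the deep networks and full game AlphaGo Zero actually used:

```python
import random

ACTIONS = [0, 1, 2]

def play_game(policy):
    # Stand-in game: each side picks a number; the higher number wins.
    weights = [policy[x] for x in ACTIONS]
    a = random.choices(ACTIONS, weights=weights)[0]
    b = random.choices(ACTIONS, weights=weights)[0]
    outcome = 1 if a > b else (-1 if a < b else 0)
    return a, outcome

def self_play_train(iterations=2000, lr=0.01):
    # Start from a uniform (random-play) policy, as AlphaGo Zero did.
    policy = {a: 1.0 for a in ACTIONS}
    for _ in range(iterations):
        action, outcome = play_game(policy)
        # Reinforce moves that won, discourage moves that lost;
        # clamp to keep every weight positive.
        policy[action] = max(1e-3, policy[action] + lr * outcome)
    return policy

trained = self_play_train()
# Over many games the weight typically concentrates on the strongest action.
print(trained)
```

Real systems replace the tabular update with gradient descent on a neural network and use search-improved move targets, but the feedback loop (play yourself, learn from the outcome, repeat) is the same.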
Category:Artificial intelligence Category:Computer programs Category:Board games