AlphaGo

Friday, January 29th, 2016

Researchers at DeepMind staged a machine-versus-man Go contest in October, at the company’s offices in London:

The DeepMind system, dubbed AlphaGo, matched its artificial wits against Fan Hui, Europe’s reigning Go champion, and the AI system went undefeated in five games witnessed by an editor from the journal Nature and an arbiter representing the British Go Federation. “It was one of the most exciting moments in my career, both as a researcher and as an editor,” the Nature editor, Dr. Tanguy Chouard, said during a conference call with reporters on Tuesday.

This morning, Nature published a paper describing DeepMind’s system, which makes clever use of, among other techniques, an increasingly important AI technology called deep learning. Using a vast collection of Go moves from expert players — about 30 million moves in total — DeepMind researchers trained their system to play Go on its own. But this was merely a first step. In theory, such training only produces a system as good as the best humans. To beat the best, the researchers then matched their system against itself. This allowed them to generate a new collection of moves they could then use to train a new AI player that could top a grandmaster.
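In other words: first imitate the experts, then improve past them with self-play. Here is a toy Python sketch of the imitation stage; the linear softmax "policy", the feature encoding, and the random data are all stand-ins of mine, not the paper's (the real system is a deep convolutional network trained on actual board positions):

    # Stage 1 (sketch): supervised learning on expert moves.
    # A tiny linear softmax policy stands in for AlphaGo's deep
    # convolutional policy network; states and moves are random toys.
    import numpy as np

    N_FEATURES, N_MOVES = 32, 361              # 361 = 19x19 board points
    rng = np.random.default_rng(0)
    W = np.zeros((N_FEATURES, N_MOVES))        # policy parameters

    def policy(state):
        """Probability distribution over moves for an encoded state."""
        logits = state @ W
        p = np.exp(logits - logits.max())
        return p / p.sum()

    # Stand-in for the ~30 million expert (position, move) pairs.
    expert_data = [(rng.standard_normal(N_FEATURES), int(rng.integers(N_MOVES)))
                   for _ in range(1000)]

    for state, move in expert_data:            # maximize log-prob of the
        p = policy(state)                      # move the expert chose
        grad = -np.outer(state, p)
        grad[:, move] += state                 # d log p[move] / dW
        W += 0.01 * grad

The follow-on self-play stage is sketched further down, where the article gets to reinforcement learning.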

“The most significant aspect of all this…is that AlphaGo isn’t just an expert system, built with handcrafted rules,” says Demis Hassabis, who oversees DeepMind. “Instead, it uses general machine-learning techniques to learn how to win at Go.”

[...]

“Go is implicit. It’s all pattern matching,” says Hassabis. “But that’s what deep learning does very well.”

[...]

At DeepMind, Edinburgh, and Facebook, researchers hoped neural networks could master Go by “looking” at board positions, much as a human player does. As Facebook showed in a recent research paper, the technique works quite well. By pairing deep learning with Monte Carlo tree search, Facebook beat some human players, though it couldn’t beat CrazyStone and other top creations.
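The "pairing" here means, roughly, that the network's move probabilities steer which branches the tree search bothers to explore. Below is a bare-bones PUCT-style sketch of that coupling; the Game interface and policy_net function are hypothetical stand-ins of mine, and real programs add rollout policies, parallelism, and much more:

    import math

    def mcts(root, game, policy_net, n_sims=400, c_puct=1.0):
        """Toy PUCT search: priors from a policy network bias exploration."""
        N, Q, P = {}, {}, {}          # visit counts, mean values, priors

        def simulate(state):
            # Returns the value of `state` for the player to move there.
            if game.is_terminal(state):
                return game.terminal_value(state)
            key = game.key(state)
            if key not in P:          # unexpanded leaf: ask the network
                P[key] = policy_net(state)        # dict: move -> prior prob
                N[key] = {m: 0 for m in P[key]}
                Q[key] = {m: 0.0 for m in P[key]}
                return game.rollout_value(state)  # e.g. a random playout
            total = 1 + sum(N[key].values())
            move = max(P[key], key=lambda m:      # UCB: value + prior-guided
                       Q[key][m] + c_puct * P[key][m]
                       * math.sqrt(total) / (1 + N[key][m]))
            value = -simulate(game.play(state, move))  # opponent's turn
            N[key][move] += 1                     # back up the result
            Q[key][move] += (value - Q[key][move]) / N[key][move]
            return value

        for _ in range(n_sims):
            simulate(root)
        root_key = game.key(root)
        return max(N[root_key], key=N[root_key].get)  # most-visited move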

But DeepMind pushes this idea much further. After training on 30 million human moves, a DeepMind neural net could predict the next human move about 57 percent of the time — an impressive number (the previous record was 44 percent). Then Hassabis and team matched this neural net against slightly different versions of itself through what’s called reinforcement learning. Essentially, as the neural nets play each other, the system tracks which move brings the most reward — the most territory on the board. Over time, it gets better and better at recognizing which moves will work and which won’t.
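That "tracks which move brings the most reward" step is a policy-gradient update; the Nature paper describes a REINFORCE-style rule. Continuing the toy policy from the first sketch above, and simplifying the reward to just the game's outcome, it looks something like this (play_game is a hypothetical engine that plays one full game between the two policies):

    # Stage 2 (sketch): reinforcement learning by self-play. A frozen
    # earlier copy of the weights plays the role of the "slightly
    # different version of itself" from the article.
    opponent_W = W.copy()

    def reinforce_update(play_game, lr=0.001):
        """One REINFORCE step. play_game is a hypothetical Go engine:
        it plays a full game and returns the current policy's
        (state, move) pairs plus whether that policy won."""
        global W
        states, moves, won = play_game(W, opponent_W)
        reward = 1.0 if won else -1.0      # simplest reward: the outcome
        for state, move in zip(states, moves):
            p = policy(state)              # policy() from the first sketch
            grad = -np.outer(state, p)
            grad[:, move] += state         # d log p[move] / dW
            W += lr * reward * grad        # reinforce moves that won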

“AlphaGo learned to discover new strategies for itself, by playing millions of games between its neural networks, against themselves, and gradually improving,” says DeepMind researcher David Silver.

According to Silver, this allowed AlphaGo to top other Go-playing AI systems, including CrazyStone. Then the researchers fed the results into a second neural network. Taking the moves suggested by the first network, this one uses many of the same techniques to look ahead to the result of each move. This is similar to what older systems like Deep Blue did with chess, except that AlphaGo keeps learning as it goes along, as it analyzes more data, rather than exploring every possible outcome through brute force. In this way, AlphaGo learned to beat not only existing AI programs but a top human as well.
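One way to picture the second (“value”) network’s job: score only the few moves the first network likes, using a learned guess at who wins from each resulting position, instead of expanding every branch. A one-ply sketch, where value_net and game are again hypothetical stand-ins (the real system folds this into the tree search sketched earlier):

    def choose_move(state, game, policy_net, value_net, top_k=5):
        """Look ahead one move: try only the policy's top suggestions,
        judge each resulting position with the value network."""
        probs = policy_net(state)          # dict: move -> probability
        candidates = sorted(probs, key=probs.get, reverse=True)[:top_k]
        # value_net(s) ~ chance that the player to move at s wins, so
        # pick the move that leaves the opponent worst off.
        return min(candidates, key=lambda m: value_net(game.play(state, m)))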

Comments

  1. Ross says:

    Cool story. Didn’t think Go AI would move quite this quickly. That said…

    Gary Marcus wrote that this Euro-Go chap is ranked something like 600. Given DeepMind’s play, it might be playing at roughly a 275 ranking. So there’s a bit further to go (NPI) before “Skynet”.

    FWIW this is not undifferentiated pattern matching à la the current rage, “deep neural nets”. I really look forward to reading the background, but IIRC there’s a bit of tree search blended with the hidden layers as well. That’s cool.

  2. Candide III says:

    There is both Monte Carlo tree search using deep-learning results and extensive use of fixed programmed features and patterns. The games look sane, not like CrazyStone’s, and beating the European amateur champion is quite an impressive feat, though I’d like to go over the games with a stronger player. Still, from how the games feel, I wonder how AlphaGo will fare against a serious Korean or Chinese pro playing at tournament strength. Fan Hui became a 1-dan pro at 16, achieved only 2 dan in 3 years, and moved to France, effectively ending his professional career. There are hundreds, if not thousands, of such rejects from the extremely rigorous and competitive Korean, Chinese, and Japanese professional ladders, occupying a gray area between amateurs and professionals. If, say, Lee Sedol plays seriously instead of as a PR feature for the AI team (as such matches often seem to be), I bet he will crush AlphaGo.
