AlphaGo

Friday, January 29th, 2016

Researchers at DeepMind staged a machine-versus-man Go contest in October, at the company’s offices in London:

The DeepMind system, dubbed AlphaGo, matched its artificial wits against Fan Hui, Europe’s reigning Go champion, and the AI system went undefeated in five games witnessed by an editor from the journal Nature and an arbiter representing the British Go Federation. “It was one of the most exciting moments in my career, both as a researcher and as an editor,” the Nature editor, Dr. Tanguy Chouard, said during a conference call with reporters on Tuesday.

This morning, Nature published a paper describing DeepMind’s system, which makes clever use of, among other techniques, an increasingly important AI technology called deep learning. Using a vast collection of Go moves from expert players — about 30 million moves in total — DeepMind researchers trained their system to play Go on its own. But this was merely a first step. In theory, such training only produces a system as good as the best humans. To beat the best, the researchers then matched their system against itself. This allowed them to generate a new collection of moves they could then use to train a new AI player that could top a grandmaster.
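In other words: first imitate the experts, then improve past them with self-play. Here is a toy Python sketch of the imitation stage; the linear softmax "policy", the feature encoding, and the random data are all stand-ins of mine, not the paper's (the real system is a deep convolutional network trained on actual board positions):

    # Stage 1 (sketch): supervised learning on expert moves.
    # A tiny linear softmax policy stands in for AlphaGo's deep
    # convolutional policy network; states and moves are random toys.
    import numpy as np

    N_FEATURES, N_MOVES = 32, 361              # 361 = 19x19 board points
    rng = np.random.default_rng(0)
    W = np.zeros((N_FEATURES, N_MOVES))        # policy parameters

    def policy(state):
        """Probability distribution over moves for an encoded state."""
        logits = state @ W
        p = np.exp(logits - logits.max())
        return p / p.sum()

    # Stand-in for the ~30 million expert (position, move) pairs.
    expert_data = [(rng.standard_normal(N_FEATURES), int(rng.integers(N_MOVES)))
                   for _ in range(1000)]

    for state, move in expert_data:            # maximize log-prob of the
        p = policy(state)                      # move the expert chose
        grad = -np.outer(state, p)
        grad[:, move] += state                 # d log p[move] / dW
        W += 0.01 * grad

The follow-on self-play stage is sketched further down, where the article gets to reinforcement learning.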

“The most significant aspect of all this…is that AlphaGo isn’t just an expert system, built with handcrafted rules,” says Demis Hassabis, who oversees DeepMind. “Instead, it uses general machine-learning techniques to learn how to win at Go.”

[...]

“Go is implicit. It’s all pattern matching,” says Hassabis. “But that’s what deep learning does very well.”

[...]

At DeepMind, Edinburgh, and Facebook, researchers hoped neural networks could master Go by “looking” at board positions, much as a human player does. As Facebook showed in a recent research paper, the technique works quite well. By pairing deep learning with Monte Carlo tree search, Facebook beat some human players, though it couldn’t beat CrazyStone and other top creations.
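The "pairing" here means, roughly, that the network's move probabilities steer which branches the tree search bothers to explore. Below is a bare-bones PUCT-style sketch of that coupling; the Game interface and policy_net function are hypothetical stand-ins of mine, and real programs add rollout policies, parallelism, and much more:

    import math

    def mcts(root, game, policy_net, n_sims=400, c_puct=1.0):
        """Toy PUCT search: priors from a policy network bias exploration."""
        N, Q, P = {}, {}, {}          # visit counts, mean values, priors

        def simulate(state):
            # Returns the value of `state` for the player to move there.
            if game.is_terminal(state):
                return game.terminal_value(state)
            key = game.key(state)
            if key not in P:          # unexpanded leaf: ask the network
                P[key] = policy_net(state)        # dict: move -> prior prob
                N[key] = {m: 0 for m in P[key]}
                Q[key] = {m: 0.0 for m in P[key]}
                return game.rollout_value(state)  # e.g. a random playout
            total = 1 + sum(N[key].values())
            move = max(P[key], key=lambda m:      # UCB: value + prior-guided
                       Q[key][m] + c_puct * P[key][m]
                       * math.sqrt(total) / (1 + N[key][m]))
            value = -simulate(game.play(state, move))  # opponent's turn
            N[key][move] += 1                     # back up the result
            Q[key][move] += (value - Q[key][move]) / N[key][move]
            return value

        for _ in range(n_sims):
            simulate(root)
        root_key = game.key(root)
        return max(N[root_key], key=N[root_key].get)  # most-visited move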

But DeepMind pushes this idea much further. After training on 30 million human moves, a DeepMind neural net could predict the next human move about 57 percent of the time — an impressive number (the previous record was 44 percent). Then Hassabis and team matched this neural net against slightly different versions of itself through what’s called reinforcement learning. Essentially, as the neural nets play each other, the system tracks which move brings the most reward — the most territory on the board. Over time, it gets better and better at recognizing which moves will work and which won’t.
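That "tracks which move brings the most reward" step is a policy-gradient update; the Nature paper describes a REINFORCE-style rule. Continuing the toy policy from the first sketch above, and simplifying the reward to just the game's outcome, it looks something like this (play_game is a hypothetical engine that plays one full game between the two policies):

    # Stage 2 (sketch): reinforcement learning by self-play. A frozen
    # earlier copy of the weights plays the role of the "slightly
    # different version of itself" from the article.
    opponent_W = W.copy()

    def reinforce_update(play_game, lr=0.001):
        """One REINFORCE step. play_game is a hypothetical Go engine:
        it plays a full game and returns the current policy's
        (state, move) pairs plus whether that policy won."""
        global W
        states, moves, won = play_game(W, opponent_W)
        reward = 1.0 if won else -1.0      # simplest reward: the outcome
        for state, move in zip(states, moves):
            p = policy(state)              # policy() from the first sketch
            grad = -np.outer(state, p)
            grad[:, move] += state         # d log p[move] / dW
            W += lr * reward * grad        # reinforce moves that won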

“AlphaGo learned to discover new strategies for itself, by playing millions of games between its neural networks, against themselves, and gradually improving,” says DeepMind researcher David Silver.

According to Silver, this allowed AlphaGo to top other Go-playing AI systems, including CrazyStone. Then the researchers fed the results into a second neural network. Taking the moves suggested by the first network, this one uses many of the same techniques to look ahead to the result of each move. This is similar to what older systems like Deep Blue did with chess, except that AlphaGo keeps learning as it goes along, as it analyzes more data, rather than exploring every possible outcome through brute force. In this way, AlphaGo learned to beat not only existing AI programs but a top human as well.
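One way to picture the second (“value”) network’s job: score only the few moves the first network likes, using a learned guess at who wins from each resulting position, instead of expanding every branch. A one-ply sketch, where value_net and game are again hypothetical stand-ins (the real system folds this into the tree search sketched earlier):

    def choose_move(state, game, policy_net, value_net, top_k=5):
        """Look ahead one move: try only the policy's top suggestions,
        judge each resulting position with the value network."""
        probs = policy_net(state)          # dict: move -> probability
        candidates = sorted(probs, key=probs.get, reverse=True)[:top_k]
        # value_net(s) ~ chance that the player to move at s wins, so
        # pick the move that leaves the opponent worst off.
        return min(candidates, key=lambda m: value_net(game.play(state, m)))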

Comments

  1. Ross says:

    Cool story. Didn’t think Go AI would move quite this quickly. That said…

    Gary Marcus wrote that this Euro-Go chap is ranked something like 600. Given DeepMind’s play, it might be playing at roughly a 275 ranking. So there’s a bit further to go (NPI) before “Skynet”.

    FWIW this is not undifferentiated pattern matching à la the current rage, “deep neural nets”. I really look forward to reading the background, but IIRC there’s a bit of tree search blended with the hidden layers as well. That’s cool.

  2. Candide III says:

    There is both Monte Carlo tree search using deep-learning results and extensive use of fixed programmed features and patterns. The games look sane, not like CrazyStone’s, and beating the European amateur champion is quite an impressive feat, though I’d like to go over the games with a stronger player. Still, from how the games feel, I wonder how AlphaGo will fare against a serious Korean or Chinese pro playing at tournament strength. Fan Hui became a 1-dan pro at 16, achieved only 2 dan in 3 years, and moved to France, effectively ending his professional career. There are hundreds, if not thousands, of such rejects from the extremely rigorous and competitive Korean, Chinese, and Japanese professional ladders, occupying a gray area between amateurs and professionals. If, say, Lee Sedol plays seriously instead of as a PR feature for the AI team (as such matches often seem to be), I bet he will crush AlphaGo.
