Go engine

Mastering the game of Go with deep neural networks and tree search

David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis

The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.

Solved game

A solved game is a game whose outcome (win, lose, or draw) can be correctly predicted from any position, given that both players play perfectly. Games which have not been solved are said to be “unsolved”. Games for which only some positions have been solved are said to be “partially solved”. This article focuses on two-player games that have been solved.

A two-player game can be “solved” on several levels:[1][2]

Ultra-weak

Prove whether the first player will win, lose, or draw from the initial position, given perfect play on both sides. This can be a non-constructive proof (possibly involving a strategy-stealing argument) that need not actually determine any moves of the perfect play.

Weak

Provide an algorithm that secures a win for one player, or a draw for either, against any possible moves by the opponent, from the beginning of the game. That is, produce at least one complete ideal game (all moves start to end) with proof that each move is optimal for the player making it. It does not necessarily mean a computer program using the solution will play optimally against an imperfect opponent. For example, the checkers program Chinook will never turn a drawn position into a losing position (since the weak solution of checkers proves that it is a draw), but it might possibly turn a winning position into a drawn position because Chinook does not expect the opponent to play a move that will not win but could possibly lose, and so it does not analyze such moves completely.

Strong

Provide an algorithm that can produce perfect play (moves) from any position, even if mistakes have already been made on one or both sides.

Despite the name, many game theorists believe that “ultra-weak” proofs are the deepest, most interesting and most valuable. “Ultra-weak” proofs require a scholar to reason about the abstract properties of the game, and show how these properties lead to certain outcomes if perfect play is realized.[citation needed]

By contrast, “strong” proofs often proceed by brute force — using a computer to exhaustively search a game tree to figure out what would happen if perfect play were realized. The resulting proof gives an optimal strategy for every possible position on the board. However, these proofs aren’t as helpful in understanding deeper reasons why some games are solvable as a draw, and other, seemingly very similar games are solvable as a win.

Given the rules of any two-person game with a finite number of positions, one can always trivially construct a minimax algorithm that would exhaustively traverse the game tree. However, since for many non-trivial games such an algorithm would require an infeasible amount of time to generate a move in a given position, a game is not considered to be solved weakly or strongly unless the algorithm can be run by existing hardware in a reasonable time. Many algorithms rely on a huge pre-generated database, and are effectively nothing more.
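The exhaustive minimax traversal described above can be sketched on a game small enough to search completely. The following is a minimal illustration, not a reference implementation: tic-tac-toe is chosen only because its tree is tiny, and the board encoding (a 9-character string) and the names `winner` and `minimax` are assumptions made for this example.

```python
from functools import lru_cache

# The eight winning lines of a 3x3 board, cells indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax(board, player):
    """Game value from X's point of view: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if "." not in board:
        return 0  # board full, nobody won
    other = "O" if player == "X" else "X"
    values = [minimax(board[:i] + player + board[i + 1:], other)
              for i, cell in enumerate(board) if cell == "."]
    # X picks the child with the highest value, O the lowest.
    return max(values) if player == "X" else min(values)


print(minimax("." * 9, "X"))  # 0: the empty board is a draw under perfect play
```

Because the function can be called on any reachable board, this brute-force search acts as a strong solution for this one small game. The same code shape applied to a game like checkers would not finish in any reasonable time, which is the feasibility caveat the paragraph above makes.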

As an example of a strong solution, the game of tic-tac-toe is solvable as a draw for both players with perfect play (a result even manually determinable by schoolchildren). Games like nim also admit a rigorous analysis using combinatorial game theory.
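For nim, the combinatorial analysis reduces to a short calculation. The sketch below assumes the normal-play convention (the player who takes the last object wins) and uses the standard nim-sum rule: a position is lost for the player to move exactly when the bitwise XOR of the heap sizes is zero. The function names are illustrative.

```python
from functools import reduce
from operator import xor


def nim_sum(heaps):
    """Bitwise XOR of all heap sizes."""
    return reduce(xor, heaps, 0)


def winning_move(heaps):
    """Return (heap_index, new_size) for a winning move under normal play,
    or None if every move loses against perfect play."""
    s = nim_sum(heaps)
    if s == 0:
        return None  # zero nim-sum: the player to move is lost
    for i, h in enumerate(heaps):
        target = h ^ s
        if target < h:
            # Shrinking heap i to `target` makes the nim-sum zero.
            return i, target


print(winning_move([3, 4, 5]))  # (0, 1): leaves heaps [1, 4, 5]
print(winning_move([1, 2, 3]))  # None: already a lost position
```

Unlike a database lookup, this rule yields a perfect move from any position with a handful of XOR operations, so the solution scales to heaps of any size.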

Whether a game is solved is not necessarily the same as whether it remains interesting for humans to play. Even a strongly solved game can still be interesting if its solution is too complex to be memorized; conversely, a weakly solved game may lose its attraction if the winning strategy is simple enough to remember (e.g. Maharajah and the Sepoys). An ultra-weak solution (e.g. Chomp or Hex on a sufficiently large board) generally does not affect playability.

In non-perfect information games, one also has the notion of essentially weakly solved.[3] A game is said to be essentially weakly solved if a human lifetime of play is not sufficient to establish with statistical significance that the strategy is not an exact solution. As an example, the poker variation heads-up limit Texas hold ’em has been essentially weakly solved by the poker bot Cepheus.[3][4][5]

Perfect play

In game theory, perfect play is the behavior or strategy of a player that leads to the best possible outcome for that player regardless of the response by the opponent. Based on the rules of a game, every possible final position can be evaluated (as a win, loss or draw). By backward reasoning, one can recursively evaluate a non-final position as identical to that of the position that is one move away and best valued for the player whose move it is. Thus a transition between positions can never result in a better evaluation for the moving player, and a perfect move in a position would be a transition between positions that are equally evaluated. As an example, a perfect player in a drawn position would always get a draw or win, never a loss. If there are multiple options with the same outcome, perfect play is sometimes considered the fastest method leading to a good result, or the slowest method leading to a bad result.
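The backward reasoning described above can be made concrete on a toy game. The sketch below uses a hypothetical subtraction game chosen only for illustration: a pile of stones, each player removes 1 to 3 stones per turn, and whoever cannot move loses. Values are stored from the point of view of the player to move (+1 win, -1 loss), so a position's value is the best negated value among its successors.

```python
def solve_subtraction_game(max_stones, max_take=3):
    """Evaluate every pile size from 0 to max_stones by backward induction."""
    value = [-1]  # 0 stones: the player to move has no move and has lost
    for n in range(1, max_stones + 1):
        # A successor's value belongs to the opponent, so it is negated.
        value.append(max(-value[n - k]
                         for k in range(1, min(max_take, n) + 1)))
    return value


def perfect_moves(value, n, max_take=3):
    """Moves that keep the evaluation of position n unchanged for the mover."""
    return [k for k in range(1, min(max_take, n) + 1)
            if -value[n - k] == value[n]]


values = solve_subtraction_game(12)
print([n for n, v in enumerate(values) if v == -1])  # [0, 4, 8, 12]
print(perfect_moves(values, 5))  # [1]: leave the opponent on a multiple of 4
```

No move ever improves the mover's own evaluation: from a winning pile of 5, only taking 1 stone preserves the win, while from a losing pile of 4 every move is "perfect" in the sense that none can be better than a loss.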

Perfect play can be generalized to non-perfect information games, as the strategy that would guarantee the highest minimal expected outcome regardless of the strategy of the opponent. As an example, the perfect strategy for Rock, Paper, Scissors would be to randomly choose each of the options with equal (1/3) probability. The disadvantage in this example is that this strategy will never exploit non-optimal strategies of the opponent, so the expected outcome of this strategy versus any strategy will always be equal to the minimal expected outcome.
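The Rock, Paper, Scissors claim can be checked numerically. The sketch below assumes a payoff of +1 for a win, 0 for a tie and -1 for a loss, and represents a mixed strategy as a dictionary of probabilities; these conventions and the function names are choices made for this example.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(a, b):
    """Payoff to the player choosing `a` against an opponent choosing `b`."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1


def expected_payoff(p, q):
    """Expected payoff of mixed strategy p against mixed strategy q."""
    return sum(p[a] * q[b] * payoff(a, b) for a in MOVES for b in MOVES)


uniform = {m: Fraction(1, 3) for m in MOVES}
always_rock = {"rock": Fraction(1), "paper": Fraction(0), "scissors": Fraction(0)}
always_paper = {"rock": Fraction(0), "paper": Fraction(1), "scissors": Fraction(0)}

print(expected_payoff(uniform, always_rock))       # 0
print(expected_payoff(always_rock, always_paper))  # -1
```

The uniform strategy earns exactly 0 even against an opponent who always plays rock, which shows both sides of the trade-off: it cannot be exploited, but it also never profits from an opponent's predictable play. A predictable strategy such as always playing rock, by contrast, can be driven to the worst possible payoff.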

Although the optimal strategy of a game may not (yet) be known, a game-playing computer might still benefit from solutions of the game from certain endgame positions (in the form of endgame tablebases), which will allow it to play perfectly after some point in the game. Computer chess programs are well known for doing this.

Solved games

Awari (a game of the Mancala family)
The variant of Oware allowing game ending “grand slams” was strongly solved by Henri Bal and John Romein at the Vrije Universiteit in Amsterdam, Netherlands (2002). Either player can force the game into a draw.
Checkers
See “Draughts, English”
Chopsticks
The second player can always force a win.[6]
Connect Four
Solved first by James D. Allen (Oct 1, 1988), and independently by Victor Allis (Oct 16, 1988).[7] First player can force a win. Strongly solved by John Tromp’s 8-ply database[8] (Feb 4, 1995). Weakly solved for all board sizes where width+height is at most 15[7] (Feb 18, 2006).
Draughts, English (Checkers)
This 8×8 variant of draughts was weakly solved on April 29, 2007 by the team of Jonathan Schaeffer, known for Chinook, the “World Man-Machine Checkers Champion”. From the standard starting position, both players can guarantee a draw with perfect play.[9] Checkers is the largest game that has been solved to date, with a search space of 5×10^20.[10] The number of calculations involved was 10^14, which were done over a period of 18 years. The computation used about 200 desktop computers at its peak, later tapering to around 50.[11]

The game of checkers has roughly 500 billion billion possible positions (5 × 10^20). The task of solving the game, determining the final result in a game with no mistakes made by either player, is daunting. Since 1989, almost continuously, dozens of computers have been working on solving checkers, applying state-of-the-art artificial intelligence techniques to the proving process. This paper announces that checkers is now solved: Perfect play by both sides leads to a draw. This is the most challenging popular game to be solved to date, roughly one million times as complex as Connect Four. Artificial intelligence technology has been used to generate strong heuristic-based game-playing programs, such as Deep Blue for chess. Solving a game takes this to the next level by replacing the heuristics with perfection.

Google Hummingbird

Google Hummingbird is a search algorithm used by Google. To celebrate its 15th birthday, on September 27, 2013 Google launched[1] a new “Hummingbird” algorithm,[2] claiming that Google search can be a more human way to interact with users and provide a more direct answer.[3]

Google said it started using Hummingbird around August 30, 2013,[4] but announced the change only on September 26.

 

What type of “new” search activity does Hummingbird help?

“Conversational search” is one of the biggest examples Google gave. People, when speaking searches, may find it more useful to have a conversation.

I thought Google did this conversational search stuff already!

It does (see Google’s Impressive “Conversational Search” Goes Live On Chrome), but until now it had really only been doing so within its Knowledge Graph answers. Hummingbird is designed to apply the meaning technology to billions of pages from across the web, in addition to Knowledge Graph facts, which may bring back better results.

How do you know all this stuff?

Google shared some of it at its press event today, and then I talked with two of Google’s top search execs, Amit Singhal and Ben Gomes, after the event for more details. I also hope to do a more formal look at the changes from those conversations in the near future. But for now, hopefully you’ve found this quick FAQ based on those conversations to be helpful.

By the way, another term for the “meaning” connections that Hummingbird does is “entity search,” and we have an entire panel on that at our SMX East search marketing show in New York City, next week. The Coming “Entity Search” Revolution session is part of an entire “Semantic Search” track that also gets into ways search engines are discovering meanings behind words. Learn more about the track and the entire show on the agenda page.