Go engine

Mastering the game of Go with deep neural networks and tree search

David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis

The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.

Anki

Anki is a program which makes remembering things easy. Because it’s a lot more efficient than traditional study methods, you can either greatly decrease your time spent studying, or greatly increase the amount you learn.

Anyone who needs to remember things in their daily life can benefit from Anki. Since it is content-agnostic and supports images, audio, videos and scientific markup (via LaTeX), the possibilities are endless.
For example:

  • Learning a language
  • Studying for medical and law exams
  • Memorizing people’s names and faces
  • Brushing up on geography
  • Mastering long poems
  • Even practicing guitar chords!

Solved game

A solved game is a game whose outcome (win, lose, or draw) can be correctly predicted from any position, given that both players play perfectly. Games which have not been solved are said to be “unsolved”. Games for which only some positions have been solved are said to be “partially solved”. This article focuses on two-player games that have been solved.

A two-player game can be “solved” on several levels:[1][2]

Ultra-weak

Prove whether the first player will win, lose, or draw from the initial position, given perfect play on both sides. This can be a non-constructive proof (possibly involving a strategy-stealing argument) that need not actually determine any moves of the perfect play.

Weak

Provide an algorithm that secures a win for one player, or a draw for either, against any possible moves by the opponent, from the beginning of the game. That is, produce at least one complete ideal game (all moves start to end) with proof that each move is optimal for the player making it. It does not necessarily mean a computer program using the solution will play optimally against an imperfect opponent. For example, the checkers program Chinook will never turn a drawn position into a losing position (since the weak solution of checkers proves that it is a draw), but it might possibly turn a winning position into a drawn position because Chinook does not expect the opponent to play a move that will not win but could possibly lose, and so it does not analyze such moves completely.

Strong

Provide an algorithm that can produce perfect play (moves) from any position, even if mistakes have already been made on one or both sides.

Despite the name, many game theorists believe that “ultra-weak” proofs are the deepest, most interesting and valuable. “Ultra-weak” proofs require a scholar to reason about the abstract properties of the game, and show how these properties lead to certain outcomes if perfect play is realized.[citation needed]

By contrast, “strong” proofs often proceed by brute force — using a computer to exhaustively search a game tree to figure out what would happen if perfect play were realized. The resulting proof gives an optimal strategy for every possible position on the board. However, these proofs aren’t as helpful in understanding deeper reasons why some games are solvable as a draw, and other, seemingly very similar games are solvable as a win.

Given the rules of any two-person game with a finite number of positions, one can always trivially construct a minimax algorithm that would exhaustively traverse the game tree. However, since for many non-trivial games such an algorithm would require an infeasible amount of time to generate a move in a given position, a game is not considered to be solved weakly or strongly unless the algorithm can be run by existing hardware in a reasonable time. Many algorithms rely on a huge pre-generated database, and are effectively nothing more.
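The exhaustive traversal described above can be sketched for tic-tac-toe, a game small enough to search completely. This is a minimal illustration, not a reference implementation: the board encoding (a 9-character string with '.' for empty cells) and the names `winner`, `value`, and `best_moves` are assumptions chosen for this example.

```python
from functools import lru_cache

# Board: 9-character string, row by row; '.' marks an empty cell (assumed encoding).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that side has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != '.' and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Value from X's point of view (+1 win, 0 draw, -1 loss)
    with `player` to move and perfect play on both sides."""
    w = winner(board)
    if w is not None:
        return 1 if w == 'X' else -1
    if '.' not in board:
        return 0
    nxt = 'O' if player == 'X' else 'X'
    results = [value(board[:i] + player + board[i + 1:], nxt)
               for i, cell in enumerate(board) if cell == '.']
    # X maximizes the value, O minimizes it.
    return max(results) if player == 'X' else min(results)


def best_moves(board, player):
    """Indices of every move that preserves the value of the position."""
    target = value(board, player)
    nxt = 'O' if player == 'X' else 'X'
    return [i for i, cell in enumerate(board)
            if cell == '.'
            and value(board[:i] + player + board[i + 1:], nxt) == target]


print(value('.' * 9, 'X'))  # 0: the empty board is a draw
```

Because `value` answers for any position, not just the opening one, this sketch corresponds to a strong solution. The same code applied to a larger game would be correct but impractically slow, which is the feasibility caveat made above.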

As an example of a strong solution, the game of tic-tac-toe is solvable as a draw for both players with perfect play (a result even manually determinable by schoolchildren). Games like nim also admit a rigorous analysis using combinatorial game theory.
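For nim, the combinatorial analysis reduces to a single computation. The sketch below assumes the normal-play convention (the player who takes the last object wins), under which the player to move can force a win exactly when the bitwise XOR of the heap sizes (the "nim-sum") is non-zero. The function names are illustrative.

```python
from functools import reduce
from operator import xor


def nim_sum(heaps):
    """Bitwise XOR of all heap sizes."""
    return reduce(xor, heaps, 0)


def winning_move(heaps):
    """Return (heap_index, new_size) for a move that leaves nim-sum 0,
    or None if the player to move is already lost under perfect play.
    Assumes normal play: taking the last object wins."""
    s = nim_sum(heaps)
    if s == 0:
        return None
    for i, h in enumerate(heaps):
        target = h ^ s
        if target < h:  # a legal move must shrink the heap
            return i, target
    return None  # not reached when s != 0


print(winning_move([3, 4, 5]))  # (0, 1): reduce the first heap from 3 to 1
print(winning_move([1, 2, 3]))  # None: nim-sum is already 0
```

Unlike exhaustive search, this gives perfect play from any position in time proportional to the number of heaps.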

Whether a game is solved is not necessarily the same as whether it remains interesting for humans to play. Even a strongly solved game can still be interesting if its solution is too complex to be memorized; conversely, a weakly solved game may lose its attraction if the winning strategy is simple enough to remember (e.g. Maharajah and the Sepoys). An ultra-weak solution (e.g. Chomp or Hex on a sufficiently large board) generally does not affect playability.

In non-perfect information games, one also has the notion of essentially weakly solved[3]. A game is said to be essentially weakly solved if a human lifetime of play is not sufficient to establish with statistical significance that the strategy is not an exact solution. As an example, the poker variation heads-up limit Texas hold ’em has been essentially weakly solved by the poker bot Cepheus[3][4][5].

Perfect play

In game theory, perfect play is the behavior or strategy of a player that leads to the best possible outcome for that player regardless of the response by the opponent. Based on the rules of a game, every possible final position can be evaluated (as a win, loss or draw). By backward reasoning, one can recursively evaluate a non-final position as identical to that of the position that is one move away and best valued for the player whose move it is. Thus a transition between positions can never result in a better evaluation for the moving player, and a perfect move in a position would be a transition between positions that are equally evaluated. As an example, a perfect player in a drawn position would always get a draw or win, never a loss. If there are multiple options with the same outcome, perfect play is sometimes considered the fastest method leading to a good result, or the slowest method leading to a bad result.

Perfect play can be generalized to non-perfect information games, as the strategy that would guarantee the highest minimal expected outcome regardless of the strategy of the opponent. As an example, the perfect strategy for Rock, Paper, Scissors would be to randomly choose each of the options with equal (1/3) probability. The disadvantage in this example is that this strategy will never exploit non-optimal strategies of the opponent, so the expected outcome of this strategy versus any strategy will always be equal to the minimal expected outcome.
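The Rock, Paper, Scissors claim can be checked numerically. The sketch below assumes a payoff of +1 for a win, 0 for a tie and -1 for a loss, and represents a strategy as a dictionary of probabilities; both choices are illustrative assumptions.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(a, b):
    """+1 if move a beats move b, -1 if it loses, 0 on a tie."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1


def expected_payoff(p, q):
    """Expected payoff to the player using mixed strategy p against q."""
    return sum(p[a] * q[b] * payoff(a, b) for a in MOVES for b in MOVES)


uniform = {m: Fraction(1, 3) for m in MOVES}
always_rock = {"rock": Fraction(1), "paper": Fraction(0), "scissors": Fraction(0)}
always_paper = {"rock": Fraction(0), "paper": Fraction(1), "scissors": Fraction(0)}

print(expected_payoff(uniform, always_rock))       # 0: cannot be exploited, but does not exploit
print(expected_payoff(always_paper, always_rock))  # 1: a non-optimal strategy that happens to exploit
```

The uniform strategy scores exactly 0 against every opponent, which is both its guarantee and the disadvantage noted above.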

Although the optimal strategy of a game may not (yet) be known, a game-playing computer might still benefit from solutions of the game from certain endgame positions (in the form of endgame tablebases), which will allow it to play perfectly after some point in the game. Computer chess programs are well known for doing this.

Solved games

Awari (a game of the Mancala family)
The variant of Oware allowing game ending “grand slams” was strongly solved by Henri Bal and John Romein at the Vrije Universiteit in Amsterdam, Netherlands (2002). Either player can force the game into a draw.
Checkers
See “Draughts, English”
Chopsticks
The second player can always force a win.[6]
Connect Four
Solved first by James D. Allen (Oct 1, 1988), and independently by Victor Allis (Oct 16, 1988).[7] First player can force a win. Strongly solved by John Tromp’s 8-ply database[8] (Feb 4, 1995). Weakly solved for all board sizes where width+height is at most 15[7] (Feb 18, 2006).
Draughts, English (Checkers)
This 8×8 variant of draughts was weakly solved on April 29, 2007 by the team of Jonathan Schaeffer, known for Chinook, the “World Man-Machine Checkers Champion”. From the standard starting position, both players can guarantee a draw with perfect play.[9] Checkers is the largest game that has been solved to date, with a search space of 5×10^20.[10] The number of calculations involved was 10^14, which were done over a period of 18 years. The process ran on 200 desktop computers at its peak, later dropping to around 50.[11]

The game of checkers has roughly 500 billion billion possible positions (5 × 10^20). The task of solving the game, determining the final result in a game with no mistakes made by either player, is daunting. Since 1989, almost continuously, dozens of computers have been working on solving checkers, applying state-of-the-art artificial intelligence techniques to the proving process. This paper announces that checkers is now solved: Perfect play by both sides leads to a draw. This is the most challenging popular game to be solved to date, roughly one million times as complex as Connect Four. Artificial intelligence technology has been used to generate strong heuristic-based game-playing programs, such as Deep Blue for chess. Solving a game takes this to the next level by replacing the heuristics with perfection.

Hindsight bias

Hindsight bias, also known as the knew-it-all-along effect or creeping determinism, is the inclination, after an event has occurred, to see the event as having been predictable, despite there having been little or no objective basis for predicting it, prior to its occurrence.[1][2] It is a multifaceted phenomenon that can affect different stages of designs, processes, contexts, and situations.[3] Hindsight bias may cause memory distortion, where the recollection and reconstruction of content can lead to false theoretical outcomes. It has been suggested that the effect can cause extreme methodological problems while trying to analyze, understand, and interpret results in experimental studies. A basic example of the hindsight bias is when, after viewing the outcome of a potentially unforeseeable event, a person believes he or she “knew it all along”. Such examples are present in the writings of historians describing outcomes of battles, physicians recalling clinical trials, and in judicial systems trying to attribute responsibility and predictability of accidents.

Power and dominance

Nonverbal expressions of power and dominance are gestures or motions that assert one’s authority over another.

handshakes
waving
smiling

The colors one wears affect others’ perceptions of one’s authority:

purple: people of high status adorn their clothing with purple to distinguish themselves as noble or wealthy

people attribute greater authority to others wearing red

It is human to strive for power and dominance in social settings

simple gestures establish authority

A firmer handshake
Better posture
Causing slight interruptions in conversation

can raise authority in group situations

many peers view nonverbal expressions of power and dominance as manipulation for self gain

Their abuse can be disastrous

Men and women have different perceptions of nonverbal expressions of power and dominance

Nodding is misinterpreted in cross gender communication

women interpret a nod as a signal of understanding

men interpret a nod as a signal of agreement

small miscommunications and misinterpretations lead to disagreement and confrontation

Russell (as cited in Dunbar & Burgoon, 2005) describes, “the fundamental concept in social science is power, in the same way that energy is the fundamental concept in physics”. Power and dominance-submission are two key concepts in relationships, especially close relationships where individuals rely on one another to achieve their goals (Dunbar & Burgoon, 2005), and as such it is important to be able to identify indicators of dominance.

Power and dominance are different concepts yet share similarities. Power is the ability to influence behavior (Bachrach & Lawler; Berger; Burgoon et al.; Foa & Foa; French & Raven; Gray-Little & Burks; Henley; Olson & Cromwell; Rollins & Bahr, as cited in Dunbar & Burgoon, 2005) and may or may not be fully evident until challenged by an equal force (Huston, as cited in Dunbar & Burgoon, 2005). Unlike power, which may be latent, dominance is manifest, reflecting individual (Komter, as cited in Dunbar & Burgoon, 2005), situational and relationship patterns where control attempts are either accepted or rejected (Rogers-Millar & Millar, as cited in Dunbar & Burgoon, 2005). Moskowitz, Suh, and Desaulniers (1994) mention two similar ways that people can relate to the world in interpersonal relationships: agency and communion. Agency includes status and is a continuum from assertiveness-dominance to passive-submissiveness – it can be measured by subtracting submissiveness from dominance. Communion is a second way to interact with others and includes love with a continuum from warm-agreeable to cold-hostile-quarrelsomeness. Power and dominance relate together in such a way that those with the greatest and least power typically do not assert dominance while those with more equal relationships make more control attempts (Dunbar & Burgoon, 2005).

As one can see, power and dominance are important, intertwined concepts that greatly impact relationships. In order to understand how dominance shapes relationships, one must understand the influence of gender and social roles while watching for verbal and nonverbal indicators of dominance.

Wagon-wheel effect

The wagon-wheel effect (alternatively, stagecoach-wheel effect or stroboscopic effect) is an optical illusion in which a spoked wheel appears to rotate differently from its true rotation. The wheel can appear to rotate more slowly than the true rotation, it can appear stationary, or it can appear to rotate in the opposite direction from the true rotation. This last form of the effect is sometimes called the reverse rotation effect.

The wagon-wheel effect is most often seen in film or television depictions of stagecoaches or wagons in Western movies, although recordings of any regularly spoked wheel will show it, such as helicopter rotors and aircraft propellers. In these recorded media, the effect is a result of temporal aliasing.[1] It can also commonly be seen when a rotating wheel is illuminated by flickering light. These forms of the effect are known as stroboscopic effects: the original smooth rotation of the wheel is visible only intermittently. A version of the wagon-wheel effect can also be seen under continuous illumination.
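The temporal aliasing in recorded media can be sketched numerically. The model below is a simplification and rests on two assumptions not stated in the text: all spokes are identical and evenly spaced, and the viewer perceives the smallest spoke displacement between consecutive frames. The function name and parameters are illustrative.

```python
def apparent_rotation(rev_per_sec, spokes, fps):
    """Apparent rotation rate (rev/s) of a wheel with `spokes` identical,
    evenly spaced spokes when sampled at `fps` frames per second.
    Negative values mean the wheel appears to turn backwards."""
    # Advance between consecutive frames, measured in spoke spacings.
    step = rev_per_sec * spokes / fps
    # Spokes are indistinguishable, so whole spacings are invisible;
    # fold to the nearest equivalent step in [-0.5, 0.5).
    folded = (step + 0.5) % 1.0 - 0.5
    return folded * fps / spokes


for true_rate in (0.5, 1.9, 2.0, 2.1):
    print(true_rate, round(apparent_rotation(true_rate, spokes=12, fps=24), 3))
```

With 12 spokes filmed at 24 frames per second, a wheel turning at 2 rev/s advances exactly one spoke spacing per frame and appears stationary; at 1.9 rev/s it appears to turn slowly backwards, and at 2.1 rev/s slowly forwards.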

Rushton (1967[5]) observed the wagon-wheel effect under continuous illumination while humming. The humming vibrates the eyes in their sockets, effectively creating stroboscopic conditions within the eye. By humming at a frequency of a multiple of the rotation frequency, he was able to stop the rotation. By humming at slightly higher and lower frequencies, he was able to make the rotation reverse slowly and to make the rotation go slowly in the direction of rotation. A similar stroboscopic effect is now commonly observed by people eating crunchy foods, such as carrots, while watching TV: the image appears to shimmer.[6] The crunching vibrates the eyes at a multiple of the frame rate of the TV. Besides vibrations of the eyes, the effect can be produced by observing wheels via a vibrating mirror. Rear-view mirrors in vibrating cars can produce the effect.

Truly continuous illumination

The first to observe the wagon-wheel effect under truly continuous illumination (such as from the sun) was Schouten (1967[7]). He distinguished three forms of subjective stroboscopy which he called alpha, beta, and gamma: Alpha stroboscopy occurs at 8–12 cycles per second; the wheel appears to become stationary, although “some sectors [spokes] look as though they are performing a hurdle race over the standing ones” (p. 48). Beta stroboscopy occurs at 30–35 cycles per second: “The distinctness of the pattern has all but disappeared. At times a definite counterrotation is seen of a grayish striped pattern” (pp. 48–49). Gamma stroboscopy occurs at 40–100 cycles per second: “The disk appears almost uniform except that at all sector frequencies a standing grayish pattern is seen … in a quivery sort of standstill” (pp. 49–50). Schouten interpreted beta stroboscopy, reversed rotation, as consistent with there being Reichardt detectors in the human visual system for encoding motion. Because the spoked wheel patterns he used (radial gratings) are regular, they can strongly stimulate detectors for the true rotation, but also weakly stimulate detectors for the reverse rotation.

There are two broad theories for the wagon-wheel effect under truly continuous illumination. The first is that human visual perception takes a series of still frames of the visual scene and that movement is perceived much like a movie. The second is Schouten’s theory: that moving images are processed by visual detectors sensitive to the true motion and also by detectors sensitive to opposite motion from temporal aliasing. There is evidence for both theories, but the weight of evidence favours the latter.

Discrete frames theory

Purves, Paydarfar, and Andrews (1996[8]) proposed the discrete-frames theory. One piece of evidence for this theory comes from Dubois and VanRullen (2011[9]). They reviewed experiences of users of LSD who often report that under the influence of the drug a moving object is seen trailing a series of still images behind it. They asked such users to match their drug experiences with movies simulating such trailing images viewed when not under the drug. They found that users selected movies around 15–20 Hz. This is between Schouten’s alpha and beta rates.

Other evidence for the theory is reviewed next.

Temporal aliasing theory

Kline, Holcombe, and Eagleman (2004[10]) confirmed the observation of reversed rotation with regularly spaced dots on a rotating drum. They called this “illusory motion reversal”. They showed that these reversals occurred only after a long time of viewing the rotating display (from about 30 seconds to as long as 10 minutes for some observers). They also showed that the incidences of reversed rotation were independent in different parts of the visual field. This is inconsistent with discrete frames covering the entire visual scene. Kline, Holcombe, and Eagleman (2006[11]) also showed that reversed rotation of a radial grating in one part of the visual field was independent of superimposed orthogonal motion in the same part of the visual field. The orthogonal motion was of a circular grating contracting so as to have the same temporal frequency as the radial grating. This is inconsistent with discrete frames covering local parts of the visual scene. Kline et al. concluded that the reverse rotations were consistent with Reichardt detectors for the reverse direction of rotation becoming sufficiently active to dominate perception of the true rotation in a form of rivalry. The long time required to see the reverse rotation suggests that neural adaptation of the detectors responding to the true rotation has to occur before the weakly stimulated reverse-rotation detectors can contribute to perception.

Some small doubts about the results of Kline et al. (2004) sustain adherents of the discrete-frame theory. These doubts include Kline et al.’s finding in some observers more instances of simultaneous reversals from different parts of the visual field than would be expected by chance, and finding in some observers differences in the distribution of the durations of reversals from that expected by a pure rivalry process (Rojas, Carmona-Fontaine, López-Calderón, & Aboitiz, 2006[12]).

In 2008, Kline and Eagleman demonstrated that illusory reversals of two spatially overlapping motions could be perceived separately, providing further evidence that illusory motion reversal is not caused by temporal sampling.[13] They also showed that illusory motion reversal occurs with non-uniform and non-periodic stimuli (for example, a spinning belt of sandpaper), which also cannot be compatible with discrete sampling. Kline and Eagleman proposed instead that the effect results from a “motion during-effect”, meaning that a motion after-effect becomes superimposed on the real motion.

Dangers

Because of the illusion this can give to moving machinery, it is advised that single-phase lighting be avoided in workshops and factories. For example, a factory that is lit from a single-phase supply with basic fluorescent lighting will have a flicker of twice the mains frequency, either at 100 or 120 Hz (depending on country); thus, any machinery rotating at multiples of this frequency may appear to not be turning. Seeing that the most common types of AC motors are locked to the mains frequency, this can pose a considerable hazard to operators of lathes and other rotating equipment. Solutions include deploying the lighting over a full 3-phase supply, or using high-frequency controllers that drive the lights at safer frequencies.[14] Traditional incandescent light bulbs, which employ filaments that glow continuously, offer another option as well, albeit at the expense of increased power consumption. Smaller incandescent lights can be used as task lighting on equipment to help combat this effect and avoid the cost of operating larger quantities of incandescent lighting in a workshop environment.
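The arithmetic behind the hazard can be sketched as follows. This is an idealized model: it treats the flicker as sharp flashes at twice the mains frequency, and it adds a `symmetry` parameter (the number of identical positions per revolution, such as the jaws of a chuck) that is an assumption of this example rather than something stated in the text.

```python
from fractions import Fraction


def flicker_hz(mains_hz):
    """Flicker frequency of basic single-phase fluorescent lighting."""
    return 2 * mains_hz


def appears_stationary(rpm, mains_hz, symmetry=1):
    """True if a part turning at `rpm` (an integer) looks frozen under the flicker.
    `symmetry` is the number of indistinguishable positions per revolution
    (hypothetical parameter; 1 means a single distinguishing mark)."""
    # Symmetry steps advanced between two flashes, kept exact with Fraction.
    steps_per_flash = Fraction(rpm, 60) * symmetry / flicker_hz(mains_hz)
    # A whole number of steps means every flash shows the same picture.
    return steps_per_flash.denominator == 1


print(appears_stationary(6000, 50))               # True: 100 rev/s under 100 Hz flicker
print(appears_stationary(3000, 50))               # False with a single mark
print(appears_stationary(3000, 50, symmetry=2))   # True for a two-fold symmetric part
```

Real fluorescent flicker does not fully extinguish the light between peaks, so the illusion is usually weaker than this on/off model suggests.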

 
