Solving Tic Tac Toe, Making it Interesting?

Tic Tac Toe (TTT) is a minimally complicated game, so it is relatively easy to find all possible moves and see which will lead you to a victory, tie, or loss. I’ve never solved TTT before, so I threw together some code to do that!

Solving Tic Tac Toe

Without removing boards that are effectively identical (via rotation and reflection), there are 5,478 possible board configurations. Of those, there are 626 boards where Player 1 (P1) has won, 316 for P2, and 16 where they tie.

What’s the optimal strategy? First, let’s give a point score for each outcome. A win is worth +1, a loss is -1, and a tie is 0. The basic idea is to look at the possible moves and pick the one that maximizes your points. Simple!

For that to work, we need to find the termination case probabilities p = ( probability that P1 wins, probability that P2 wins, probability of a tie ), given a board. For boards that describe a finished game, we know p by inspection. If P1 wins, p = [1,0,0]. If P2 wins, p = [0,1,0]. And if there is a tie, p= [0,0,1].

To find p for any given board requires more work. To do this practically, you need to have lists of the ‘parents’ (the boards that lead to the current one) and ‘children’ (possible boards after making a move). Then for a given board, we can find the different p for each child. The optimal strategy for P1 is then to pick the child that maximizes the expected score: 1*p(P1 wins)-p(P2 wins)+0*(p(tie)).

This analysis method lets us compare the results of different strategies, which are summarized in the next table, where the first row and first column show the strategy of the different players. For example, if both players play randomly, P1 wins 58% of the time. The effect of skill can be seen in the upper right and lower left corners. Playing against a random-chooser, the smart strategies almost always win! Of course, optimal play by both always leads to a tie: p=(0%, 0%, 100%). You probably already learned this many years ago.

	P1 Random	P1 Optimal
P2 Random	(58%, 29%, 13%)	(99.5%, 0, 0.5%)
P2 Optimal	(0.7%, 93%, 6%)	(0, 0, 100%)

Why isn’t Tic Tac Toe fun?

You probably haven’t played TTT against an adult since you were a kid, right? Once you’ve figured out that TTT always leads to a draw, it stops being interesting.

How can we fix this? We need some way to make it more complex, or non-deterministic. Ideally, this would be a minor change that doesn’t require lots of extra equipment or learning complex rules. There are a lot of different varieties of TTT (see this book, or Wikipedia), but most avoid randomness. I think that randomizing the token would be an interesting start.

Randomized TTT

Let’s randomize the token, and the first player to make a line of three will be the winner! What’s the optimal strategy for this new style?

With randomly determined tokens, we need the probability of using an X, x, and the probability of using an O, (1-x). For each board, find all the children and the token needed to reach that child. Find the probability ( pX ) for the best result when playing X, , and the same for O ( pO ). Combining the odds of pulling each token with the best odds of winning, the termination odds are p= x*pX + (1-x)*pO.

I haven’t described the how the tokens are selected! Let’s first try flipping a coin: 50/50 odds for X or O. This simple change means that the game is substantially more complex. There are now 18,753 possible boards, with 3,820 where P1 wins, 3,808 where P2 wins, and 32 where they tie. The table below shows the results of different strategies.

	P1 Random	P1 Optimal
P2 Random	(49%, 45%, 6%)	(86%, 9%, 6%)
P2 Optimal	(15%, 84%, 1%)	(69%, 27%, 4%)

That sounds like a more interesting game! There are very few ties, which I would prefer. Unskilled players have nearly equal odds of winning, so they don’t feel put-out by not going first. When an optimal player goes against a novice, the optimal player has at least a 69% higher chance of winning, regardless of going first or second. In the case where two optimal players are competing, P1 wins 42% more often, which is more interesting than the eternal ties with basic tic-tac-toe.

Other variants

How about a different scoring style? Here, although the tokens are random, P1 wins if the first line is made of Xs, and P2 wins if it is Os. Played randomly, ties are rare. There is a substantial benefit to being skilled, against random players. Optimal play leads to a smaller first-move advantage, compared to the earlier version. An interesting variant, but it doesn’t grab my interest for some reason.

	P1 Random	P1 Optimal
P2 Random	(47%, 47%, 6%)	(74%, 23%, 2%)
P2 Optimal	(24%, 73%, 3%)	(43%, 35%, 22%)

Let’s go back to the first-to-draw-a-line method, but change the distribution of tokens. What if there are only 10 tokens total, 5 Xs and 5 Os? This is equivalent to drawing colored stones from a bag. This reduces the number of possible boards to 17,713. This one has a middling level of first-move-advantage, which may keep the game more interesting at higher levels. I also like the idea of drawing colored stones from a bag.

	P1 Random	P1 Optimal
P2 Random	(46%, 41%, 13%)	(84%, 12%, 4%)
P2 Optimal	(15%, 84%, 1%)	(55%, 31%, 14%)

Future work

I’d like to take a look at even larger versions, not just 3*3, say Nx*Ny. Unfortunately, the upper limit of board configurations scales as 3^(Nx*Ny), so a 4×4 board has about 2,000 times more possible configurations than a 3×3 board. Rather than brute-forcing the solutions, I’ll have to brush off my Reinforcement Learning!

Example gameplay

Let’s look at a game of the first random version, where the token is chosen by the flip of a coin. Player 1 goes first with X, and I’ll report the expected odds after each step.

P1 plays the center. p=(69%, 27%, 4%)
. . .
.X.
. . .

P2 draws an O and places it there for some reason. p=(60%, 36%, 4%)
. . .
.XO
. . .

If P1 plays the X anywhere else, P2 has a 50% chance of winning on the next turn. p=(56%, 38%, 6%)
. . .
XXO
. . .

P2 avoids the right side of the board in order to remove the 50% win chance for P1. Unclear why the middle is better than the left. p=(50%, 38%, 13%)
. O .
XXO
. . .

P1 draws an O and plays it down there? Unclear why! p=(50%, 25%, 25%)
. O .
XXO
O . .

P2 plays an X there for some reason… p=(38%, 25%, 38%)
. O .
XXO
OX .

P1 plays the O in the corner, giving P2 a 50% chance of winning. p=(50%, 50%, 0%)
. OO
XXO
OX .

P2 did not get an O, and it has been forced to lose on the next turn. p=(100%, 0%, 0%)
. OO
XXO
OXX

P1 wins with an X, but could have won with an O.
XOO
XXO
OXX

This simple game is more thought-provoking than I expected! I am still puzzled by some of these moves.

Doing a bit more analysis, there are actually 1,359 unique boards for this version (after removing rotations, reflections, and swapping tokens). In contrast, basic TTT only has 608 unique boards.

After 10,000 simulated games, the optimal players only used 118 of possible boards. The first three moves above are the standard opening, regardless of the drawn tokens: play center, one side, then the other. This results in about 25% of games ending at move 3. After that, the game tree grows fairly quickly.

Now at least you know the standard opening, in case you play this coin-flipping Tic Tac Toe. Good luck!