I wanted to create a set of numbers that look like different numbers when they’re rotated 180°, but I especially wanted a computer to do it for me. I’ve been using TensorFlow recently, and that sounded like a good place to start.
There is a standard data set used for number identification, the MNIST database, which has 70,000 hand-written numbers and their intended values.
The first attempt trained a convolutional neural network to act as a classifier of the MNIST numbers. Applied to one image of a MNIST number, it outputs a vector that gives its percentage confidence that the picture is either [0,1,2,3,4,5,6,7,8,9]. So if it’s very sure that the image is of a 4, then the output corresponding to 4 would be greater than 0.9, and the other values would be quite small. If it’s unsure, there may be a few values with nearly the same value. To train this network, it is rewarded for having the highest confidence in the correct number.
Once that could classify numbers reasonably well, I could use it to invent drawings of digits via gradient ascent. That means that I start with an image that doesn’t look like a number, then iteratively change it slightly to increase how much it looks like the number I want. Using this method, the following picture has the average digits from the data set on top, and the generated digits on the bottom, starting from random noise:
Oh man, that’s bad. But it’s (on average) 98% sure that those random-looking patterns are the digits 0-9. Is there a way to clean this up? An easy solution is to start with the average of the MNIST data set. This beginning image is a dark blur in the middle, white at the edges, and has smooth gradients. The result:
Much better, but still not great! There are strange artifacts that don’t look at all like pen strokes: spikes and blurs. Still, these could be read as numbers.
Now I want to do the same basic approach, but have the result look like different digits when right-side-up vs up-side down. I change the gradient ascent to do that, and here we go! The left column and top row act as guides for which pair of numbers are attempted:
Hmm, that sort of works. Since the generation method is symmetric, for example, the 0/1 pair rotated 180° becomes the 1/0 pair. If you scan across each row, they sort of look like the proper number. Some of them are just terrible, though. Regardless of what we humans think, the classifier is extremely sure that it did a good job. It generally thinks these are the proper digits with 99% certainty. It thinks it did the worst on the 1-9 pair (only 92%!), which doesn’t look outstanding in any way to me.
Clearly, some changes are needed.