I watched a great video about the creation of the NATO phonetic spelling alphabet, and wondered if I could do something similar with optimization.
The video showed that the NATO alphabet was designed to make it easier to spell words over radio and telephone channels, which may have poor signal quality. There are a few requirements:
- One word per letter of the English alphabet.
- Words are easy to distinguish from each other.
- Words are fairly short.
- Words are easy to pronounce for speakers of English, French, and Spanish.
(To simplify my approach to the problem, I’m going to ignore the multilingual requirement.)
So how can we approach this algorithmically? We need a few things:
- A function that tells us the “distance” between two words, based on their pronunciation.
- A list of common words, filtered for length and complexity.
- Code that will optimize a list of words (one per letter) to minimize their similarity.
Since this involves a bunch of word processing, I'll use Python.
## Distance between words
What does it even mean to compute the distance between words? I want them to sound different, so I’ll need a way to turn a word into phonemes, then a metric for comparing those to another word’s phonemes.
Luckily, there are libraries to do these things! First, I’ll use pronouncing to convert the words into ARPABET phonemes, then use phonecodes to convert to the International Phonetic Alphabet, then convert each phoneme into a feature vector using panphon. A feature vector has 22 entries to describe the sound: where it’s formed in the mouth, if it’s sonorant, voiced, nasal, etcetera. Ultimately, each word becomes a list of feature vectors. (Well, if the word has multiple pronunciations, it’ll be a list of lists. But close enough.)
Great, now I can use panphon's weighted edit distance to tell me how much editing is needed to transform one word into another. This is weighted by the feature vectors, since some sounds substitute for each other more readily than others.
Finally, I’ve got a function to compute that (bat, bat) has a distance of 0, (bat, cat) has a distance of 2.25, and (bat, dinosaur) is 39.125. Identical words have distance 0, similar words have a small distance, and very different words have a large distance. Good!
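Under the hood, that's a weighted edit distance over sequences of feature vectors. Here's a self-contained sketch of the idea with made-up three-entry feature vectors (panphon's real vectors have 22 entries and real weights, so the numbers won't match the ones above):

```python
# A toy weighted edit distance over phoneme feature vectors.
# Real feature vectors come from panphon and have 22 entries;
# these 3-entry vectors (voiced, nasal, sonorant) are made up.
FEATURES = {
    "b":  (1, 0, 0),
    "k":  (0, 0, 0),
    "t":  (0, 0, 0),
    "ae": (1, 0, 1),
}

def sub_cost(p, q):
    """Substitution cost: the number of features that differ."""
    return float(sum(a != b for a, b in zip(FEATURES[p], FEATURES[q])))

def weighted_edit_distance(xs, ys, indel=1.0):
    """Edit distance over phoneme sequences; substitutions between
    similar-sounding phonemes cost less than between dissimilar ones."""
    m, n = len(xs), len(ys)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * indel
    for j in range(1, n + 1):
        d[0][j] = j * indel
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + indel,                               # deletion
                d[i][j - 1] + indel,                               # insertion
                d[i - 1][j - 1] + sub_cost(xs[i - 1], ys[j - 1]),  # substitution
            )
    return d[m][n]

print(weighted_edit_distance(["b", "ae", "t"], ["b", "ae", "t"]))  # 0.0
print(weighted_edit_distance(["b", "ae", "t"], ["k", "ae", "t"]))  # 1.0
```

With the real feature vectors and weights, "bat" vs "cat" lands at the 2.25 above instead of 1.0, but the mechanics are the same.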
Next, I need to create a list of candidate words.
## List of words
The pattern library has a useful file: a sorted list of the normalized frequency of English language words. It has 11,990 words, ranging in frequency from “the” to “zoo”. Right around the middle is “vacuum”.
I'll filter this list to limit the syllable count and number of phonemes, remove profanity, and drop words that sound like they start with the wrong letter (like express, know, effort, and colonel).
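As a sketch of that filtering, with a tiny hypothetical pronunciation table standing in for the pronouncing library, syllables approximated by counting vowel phonemes, and an illustrative (incomplete) map of which first phonemes are acceptable per letter:

```python
# Stand-in pronunciations (ARPABET-style, stress digits stripped).
# The real code pulls these from the `pronouncing` library.
PRONUNCIATIONS = {
    "kick":     ["K", "IH", "K"],
    "know":     ["N", "OW"],   # starts with an N sound, not a K sound!
    "kilogram": ["K", "IH", "L", "AH", "G", "R", "AE", "M"],
}

VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}

# Which first phonemes are acceptable for a letter (illustrative only).
LETTER_SOUNDS = {"k": {"K"}}

def keep(word, max_phonemes=6, max_syllables=3):
    """Keep a word only if it's short enough and sounds like it
    starts with the letter it's supposed to represent."""
    phones = PRONUNCIATIONS[word]
    syllables = sum(p in VOWELS for p in phones)
    starts_right = phones[0] in LETTER_SOUNDS.get(word[0], {phones[0]})
    return (len(phones) <= max_phonemes
            and syllables <= max_syllables
            and starts_right)

print([w for w in PRONUNCIATIONS if keep(w)])  # ['kick']
```

Here "know" is rejected for starting with the wrong sound and "kilogram" for having too many phonemes.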
## Optimization
What would the best spelling alphabet look like? I want to be sure the words aren’t confused, so it should maximize the minimum distance between any words. That would ensure that no two words are similar. Then as a secondary metric, the average distance between words should be large.
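That objective can be expressed as a tuple compared lexicographically: minimum pairwise distance first, mean pairwise distance as the tiebreaker. A sketch, with a stand-in distance function in place of the phonetic one:

```python
from itertools import combinations
from statistics import mean

def score(alphabet, dist):
    """Score a candidate alphabet: maximize the minimum pairwise
    distance first, then break ties on the mean pairwise distance.
    Tuples compare lexicographically, so bigger is better."""
    ds = [dist(a, b) for a, b in combinations(alphabet, 2)]
    return (min(ds), mean(ds))

# Stand-in distance for illustration: absolute length difference.
toy_dist = lambda a, b: abs(len(a) - len(b))

print(score(["alpha", "bravo", "charlie"], toy_dist))
print(score(["alpha", "bravo", "delta"], toy_dist))
```

Swap `toy_dist` for the phonetic distance function and this scores real alphabets.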
Unfortunately, there are an enormous number of possible combinations of words. Using the word list with 11,990 entries, each letter has between 3 (X) and 1,418 (S) candidate words. That yields 4.4×10^62 combinations. That's absurd. Truly impossible to check each one. But that's why we have the frequency-ordered list. We can just keep the top N most frequent words for each letter, assuming they are valid, and work with those. If we keep the top 50, that yields 9.7×10^41. Still absurd, but much smaller!
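Keeping the top N per letter is a single pass over the frequency-sorted list. A minimal sketch:

```python
from collections import defaultdict

def top_n_per_letter(words_by_frequency, n=50):
    """Keep only the n most frequent words for each starting letter.
    Assumes the input is already sorted from most to least common
    (and already filtered for validity)."""
    buckets = defaultdict(list)
    for word in words_by_frequency:
        if len(buckets[word[0]]) < n:
            buckets[word[0]].append(word)
    return dict(buckets)

words = ["the", "time", "about", "after", "than", "apple"]
print(top_n_per_letter(words, n=2))
# {'t': ['the', 'time'], 'a': ['about', 'after']}
```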
So we can’t do an exhaustive search. Time to be slightly more clever.
The simplest approach is random search, with some narrowing over time. At the start, we'd swap 50% of the entries with random words, and at the end just try one at a time. At each iteration, we keep the new alphabet if it's better than the previous one. This works, but isn't smart.
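A sketch of that narrowing random search; `score` and `candidates` are whatever objective function and per-letter candidate lists you're using:

```python
import random

def random_search(alphabet, candidates, score, iterations=1000):
    """Random search: swap a shrinking fraction of entries for random
    candidate words, keeping the new alphabet whenever it scores better.

    candidates[i] is the list of valid words for position i."""
    best = list(alphabet)
    best_score = score(best)
    for i in range(iterations):
        frac = 0.5 * (1 - i / iterations)        # narrow from 50% toward 0
        n_swaps = max(1, int(frac * len(best)))  # always try at least one
        trial = list(best)
        for j in random.sample(range(len(best)), n_swaps):
            trial[j] = random.choice(candidates[j])
        trial_score = score(trial)
        if trial_score > best_score:
            best, best_score = trial, trial_score
    return best
```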
How about starting with a better alphabet? For each letter, we could choose the word that is farthest from the candidate words for all the other letters. That's a slow initial investment (about 25·26·N² comparisons, or 6.5 million with N=100), but yields an initial alphabet that's quite distinct.
Instead, we could go letter-by-letter and choose the word that is most distinct from the rest of the word list. That would provide a decent starting list.
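A sketch of the farthest-start idea, picking each letter's word to maximize its total distance to every candidate for the other letters (`dist` is the phonetic distance from earlier; this is the slow quadratic version):

```python
def farthest_start(candidates, dist):
    """For each letter, pick the candidate word whose total distance to
    every candidate word for the *other* letters is largest.

    candidates maps a letter to its list of valid words."""
    letters = sorted(candidates)
    others = {
        L: [w for M in letters if M != L for w in candidates[M]]
        for L in letters
    }
    return {
        L: max(candidates[L], key=lambda w: sum(dist(w, o) for o in others[L]))
        for L in letters
    }
```

The cheaper letter-by-letter variant would compare each candidate against the whole word list once instead of against every other letter's candidates.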
Another way to improve is greedy: find the letter whose word is the worst in the list, then swap it with the best available replacement.
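A sketch of one greedy step, using "distance to the nearest neighbour" as each word's measure of badness (an assumption on my part; any per-word badness works here):

```python
def greedy_swap(alphabet, candidates, dist):
    """One greedy step: find the word closest to its nearest neighbour,
    then replace it with the candidate that improves that the most.

    candidates[i] is the list of valid words for position i."""
    def nearest(i, words):
        return min(dist(words[i], w) for j, w in enumerate(words) if j != i)

    worst = min(range(len(alphabet)), key=lambda i: nearest(i, alphabet))
    best_word = max(
        candidates[worst],
        key=lambda w: nearest(worst, alphabet[:worst] + [w] + alphabet[worst + 1:]),
    )
    out = list(alphabet)
    out[worst] = best_word
    return out
```

Repeating this step until no swap improves the score gives the greedy optimizer.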
But I have no idea what'll work well, so let's try each combination of initial conditions (random choice, NATO, most dissimilar) and optimizers (random search, greedy swap).
## Results
Matching the NATO alphabet’s constraints:
I’ll limit the number of phonemes and syllables to match the NATO alphabet, and hang onto 100 words per letter (and the NATO words). How do the results look?
- The NATO alphabet is a good starting point. Farthest-start is better. Random start is bad because it has no thought behind it.
- The optimization works! In each case, the optimizer does better than the initial data. Random search works decently, but the greedy search does better.
- The best result was farthest-start with the greedy optimization.
- A minimum word distance of 21 is pretty great, compared to the NATO value of 8.9.

## Limiting syllables
So we can optimize for more dissimilar words, but what about more restricted cases? Like fixing the number of syllables? For these weird cases, if the filter removes all the words for a letter, I’ll use the NATO word for that one.
- I’m surprised that the optimized one-syllable word list is “better” than the original NATO alphabet.
- The syllabically-limited cases aren’t as optimal as the earlier result, but still decent.
- Oddly, the 2-syllable case is the best of these.
| Constraint | NATO original | 1 syllable | 2 syllables | 3 syllables | 4 syllables | NATO-like |
|---|---|---|---|---|---|---|
| Min Distance | 8.9 | 10.5 | 18.5 | 17.875 | 13.375 | 21.5 |
| Mean Distance | 24.9 | 22.7531 | 33.6208 | 37.6246 | 48.0515 | 34.9742 |
| A | alpha | act | anger | advantage | acknowledgement | abruptly |
| B | bravo | blind | breakdown | beautifully | bureaucratic | behind |
| C | charlie | change | cambridge | concerto | carbohydrate | closely |
| D | delta | dry | doorway | digestion | disappointment | drink |
| E | echo | ear | earthquake | erosion | eternally | east |
| F | foxtrot | french | foxtrot | factual | formulation | foxtrot |
| G | golf | glimpse | goldsmith | genuine | genuinely | genuine |
| H | hotel | her | household | hopelessly | hierarchical | heroic |
| I | india | inch | instinct | influence | indigestion | instinct |
| J | juliet | joke | judgement | journalist | jurisdiction | justify |
| K | kilo | kick | kibbutz | kilogram | kilo | kilogram |
| L | lima | lounge | livestock | likelihood | legislative | loyalty |
| M | mike | march | mainstream | murderer | menstruation | mixture |
| N | november | noon | network | nostalgia | negligible | nitrogen |
| O | oscar | ounce | opera | observer | objectively | orthodox |
| P | papa | plump | pleasure | picturesque | perimeter | power |
| Q | quebec | queue | quantum | quotation | questionable | quantity |
| R | romeo | realm | restraint | regiment | reconstruction | romeo |
| S | sierra | strength | shakespeare | scientist | subsequently | spirit |
| T | tango | twelve | transport | treasurer | triumphantly | temple |
| U | uniform | urge | unkind | unbroken | unemployment | uniform |
| V | victor | valve | viewpoint | volcano | vocational | volcano |
| W | whiskey | warmth | wildlife | warrior | wonderfully | wildlife |
| X | x-ray | x-ray | x-ray | x-ray | x-ray | x-ray |
| Y | yankee | york | yorkshire | yesterday | yankee | youngest |
| Z | zulu | zoo | zulu | zimbabwe | zulu | zimbabwe |
## The worst spelling alphabet
What if I try to make the worst spelling alphabet? That’d have the words that are the easiest to confuse. In this case, I optimized for the lowest average distance between words.
Here’s a matrix showing how similar each word is to each other one, for a mean distance of 9.7. That’s much, much worse than the useful cases above.

For example, it has "tight / sight / night". Also "bait / date / gate / hate". Overall, just a terrible list of words. If you want to really be useless at spelling, use these.
## Conclusion
Personally, I like the two-syllable word list. It feels more resistant to errors than the single-syllable list, and feels less silly than the broadly-optimized list.
It’s fun to optimize for silly things.