Best Starting Wordle
If you’re like me, your entire newsfeed has been colonized by five-wide grids of green, yellow, and black cells. The Times describes the lexical fad as “a love story,” but many of us are just wondering, along with WIRED: what’s the best starting word?
USA Today suggests ADIEU as a first word to narrow down vowels early on. This feels attractive since we know one of AEIOUY must occur in the word, except the game itself doesn’t care about vowels.
So the next idea I’ve heard is to eschew vowels and just look at overall English letter frequencies in dictionaries, with E being the most common letter (11%) followed by R (9.1%), A (7.8%), N (7.2%), and T (6.7%). Unfortunately this isn’t quite right either since Wordle has a very specific dictionary and, perhaps more importantly, some letters often appear in combination together so trying both doesn’t necessarily reveal the most information.
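To make the dictionary point concrete, here is a minimal sketch of measuring letter frequency against a specific word list rather than general English. The six-word list and the function name `letter_share` are assumptions for illustration, not the real Wordle dictionary or the method used for the rankings here.

```python
from collections import Counter

# Toy stand-in for the game's answer list (an assumption for illustration).
WORDS = ["arise", "crane", "donut", "towel", "slate", "inert"]

def letter_share(words):
    """Fraction of words containing each letter (repeats within a word count once)."""
    counts = Counter(letter for word in words for letter in set(word))
    return {letter: n / len(words) for letter, n in counts.items()}

shares = letter_share(WORDS)
# Show the five most common letters in this particular list.
for letter, share in sorted(shares.items(), key=lambda kv: (-kv[1], kv[0]))[:5]:
    print(letter, f"{share:.0%}")
```

Counting each letter once per word matters because a guess only learns whether a letter is present, and it still ignores the problem of letters that tend to travel together.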
Instead, we could try to guarantee that each guess eliminates as many words as possible. For any given starting word the worst-case scenario is getting zero letters correct, so we want to see which five letters eliminate the most words when absent. This is related to overall letter frequency but, as described above, the peculiarities of the game make it subtly different.
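The idea can be sketched in a few lines: for each candidate opener, count how many answers share no letter with it, then prefer the opener that leaves the fewest. The word lists and the names `survivors_if_all_gray` and `rank_openers` are hypothetical, and the toy lists are far too small to reproduce the real rankings.

```python
# Toy stand-ins for the real answer and guess lists (assumptions for illustration).
ANSWERS = ["arise", "crane", "slate", "inert", "towel",
           "donut", "shire", "stair", "short", "corny"]
GUESSES = ["adieu", "arise", "raise", "donut"]

def survivors_if_all_gray(guess, answers):
    """Count answers that share no letter with `guess` (the all-gray outcome)."""
    letters = set(guess)
    return sum(1 for word in answers if letters.isdisjoint(word))

def rank_openers(guesses, answers):
    """Order guesses by how few answers survive an all-gray result (ties alphabetical)."""
    return sorted(guesses, key=lambda g: (survivors_if_all_gray(g, answers), g))

for guess in rank_openers(GUESSES, ANSWERS):
    print(guess, survivors_if_all_gray(guess, ANSWERS))
```

Because only the set of letters matters here, anagrams such as "arise" and "raise" always get the same score.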
And the winner is…
AESIR, ARISE, RAISE, REAIS, or SERAI eliminate the most words even if they lack matching characters.
Sorry ADIEU (#907), INERT (#644), SLATE (#170), and other suggestions I’ve read.
I should note that, in the first round, anagrams are ranked the same since we just care about the case when nothing matches. I’m not sure it matters much to differentiate within anagrams (e.g. whether AESIR is preferable to RAISE) since the usefulness of information gained would be inversely proportional to the odds of obtaining that information. But hey, I’m lazy and not writing a formal proof.
Of course, there is one advantage to be had by RAISE and ARISE: they’re potential winners. Once every ~772 days you’ll score an elusive 1/6.
Okay, great, but what about the second word?
After solving the opening move, I next wondered whether a gambit can be worthwhile on the second word: a play that doesn’t include all known letters. A word known to be a loser, but one that might reveal more information.
I didn’t really have a gut feeling on this, but it turns out if the only match is an “R”, then none of the best second words actually contain an “R.” Better to narrow down the search space than attempt to place the “R” precisely as CYTON, DONUT, LOUND, and NOULD all provide more information.
Of course it isn’t always best to play a known loser. If “E” is the only matched letter, the best guesses become OWLET, TOWEL, or TOLED, each of which would leave you with a worst case of 10 remaining words. If you really wanted to avoid an “E”, BOULT is only slightly worse with 11 remaining words.
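A "worst case of N remaining words" can be computed by grouping the remaining candidates by the feedback a guess would produce and taking the largest group. The following is a minimal sketch under that assumption; the function names and the toy candidate list are hypothetical, and "whomp" is used only as an illustrative gambit word, not as a claim about the real guess list.

```python
from collections import Counter

def feedback(guess, answer):
    """Wordle-style feedback: 'g' green, 'y' yellow, '-' gray (handles repeated letters)."""
    result = ["-"] * len(guess)
    unmatched = Counter()
    # First pass: mark greens and tally answer letters not matched in place.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "g"
        else:
            unmatched[a] += 1
    # Second pass: mark yellows while unmatched copies of the letter remain.
    for i, g in enumerate(guess):
        if result[i] == "-" and unmatched[g] > 0:
            result[i] = "y"
            unmatched[g] -= 1
    return "".join(result)

def worst_case(guess, candidates):
    """Size of the largest group of candidates that share one feedback pattern."""
    buckets = Counter(feedback(guess, c) for c in candidates)
    return max(buckets.values())

def best_second_guesses(guesses, candidates):
    """Return the guesses with the smallest worst case, plus that worst case."""
    scores = {g: worst_case(g, candidates) for g in guesses}
    best = min(scores.values())
    return sorted(g for g, s in scores.items() if s == best), best

# Toy candidates still consistent with earlier feedback (an assumption).
CANDIDATES = ["bound", "found", "hound", "mound", "pound", "wound"]
print(worst_case("bound", CANDIDATES))  # a possible winner
print(worst_case("whomp", CANDIDATES))  # a known loser
print(best_second_guesses(["bound", "whomp"], CANDIDATES))
```

In this toy case the known loser splits the candidates far better than a word that could actually win, which is the gambit in miniature.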
Oh, yeah, and for the 7.6% of the time ARISE matches nothing, just go with BLUDY.
 There are 2315 potential correct words and an additional 10656 allowed guesses. The Wordle dictionary is in its JS along with how it selects the daily word. Don’t look at it unless you want the game spoiled.
 This is intuitive but not a priori correct: a different dictionary might contain, for example, more words with “E” and without ARIS than words without any of ARISE. But, alas, for this particular dictionary, the lowest-information result on a first guess always comes from having all wrong letters.
 I was a little surprised not to see either “T” or “N” in the top words. All I can think is that they frequently occur with other popular letters, such that they don’t add as much information.
 I love that “O,” despite being more popular than “U” or “Y” as a vowel, isn’t included in this guess.