Emojle: Solving Wordle by Emoji Alone

Emojle is an emoji-based Wordle solver. Using only publicly-tweeted color square patterns and Wordle’s set of 12,972 five-letter English words, this tool narrows the set of words down, ideally ending on the one and only word that could have produced the combination of tweeted emoji patterns.

You can do a Twitter search to find the latest puzzle, then narrow the search to the day’s quoted puzzle number (such as "Wordle 225").

If you just want to see how it works, try these: YG..G, YYYG., GYGYY, GGGYY, and .G.YG. Based on those five emoji strings, the tool can uniquely identify the word “posse”: those guesses must have belonged to players who had guessed words like “solve”, “sposh”, “pesos”, “poses”, and “loupe” respectively. If you skip the first guess YG..G or remove it using the remove button to the right, you’d see that “alien” is also a candidate, in which case the guesses must have been words like “lines”, “anile”, “aline”, and “sloan”.

Background and Motivation

Josh Wardle’s game Wordle, originating at powerlanguage.co.uk and now hosted at the New York Times, is a word guessing game. Players have six tries to guess the five-letter English word; each guess must be a valid word. The letters of each guess are then marked: yellow (blue in color-blind mode) for a letter that is in the word but in the wrong position, and green (orange in color-blind mode) for a correct letter in the right position. Only one puzzle is available per day, though all 2,315 puzzles (June 2021 to October 2027) are known via the game’s posted code.
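The marking rules can be sketched as a small scoring function. This is a minimal sketch rather than the game's actual code: the name scoreGuess and the G/Y/. output letters (borrowed from the notation used earlier on this page) are assumptions for illustration, and repeated letters are assumed to follow Wordle's usual behavior, where greens are assigned first and yellows are handed out only while unmatched copies of a letter remain.

```javascript
// Hypothetical helper: score a guess against a target word.
// Returns a 5-character string of "G" (green), "Y" (yellow), "." (gray).
function scoreGuess(guess, target) {
  const result = Array(5).fill(".");
  const remaining = {}; // unmatched target letters -> count

  // First pass: mark greens and count the target letters left over.
  for (let i = 0; i < 5; i++) {
    if (guess[i] === target[i]) {
      result[i] = "G";
    } else {
      remaining[target[i]] = (remaining[target[i]] || 0) + 1;
    }
  }

  // Second pass: mark yellows while unmatched copies of the letter remain.
  for (let i = 0; i < 5; i++) {
    if (result[i] === "G") continue;
    if (remaining[guess[i]] > 0) {
      result[i] = "Y";
      remaining[guess[i]]--;
    }
  }
  return result.join("");
}

console.log(scoreGuess("solve", "posse")); // YG..G
console.log(scoreGuess("poses", "posse")); // GGGYY
```

The two calls reproduce the first and fourth example patterns for “posse” given near the top of this page.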

Wordle’s popularity is due in part to the clever social media trick of encouraging users to post their spoiler-free guesses: rather than posting the words, the app encourages sharing a grid of emoji. In this way, as The Verge’s James Vincent points out, “each grid tells a story with wonderful concision. With just 30 squares and three colors, Wordle’s emoji results convey narratives of luck, frustration, perseverance, and failure; each grid a miniature story, like a landscape painted in a matchbox.” From the shape of the squares, a reader can tell how quickly a solution came—or how close the player came without a victory.

My strategy of starting with a different word each day is beginning to show its flaws. But if I’ve learned anything from being an American, it’s never to change course no matter how catastrophic a policy proves to be.

Wordle 224 4/6

⬜⬜⬜⬜⬜
⬜🟩⬜🟩⬜
⬜🟩🟩🟩🟩
🟩🟩🟩🟩🟩

— John Green (@johngreen) February 1, 2022

Of course, some players post their unredacted games or screenshots, and others post their commentary or oblique hints. Clever readers can also infer from the scores whether a word is common or obscure, or based on a known player’s strategy or common starting word they could infer something about that first guess. But what about the emoji itself: If nobody gave any hints other than emoji, could you still guess the word?

The emoji do convey information. Of the 12,972 valid input words, a result like “⬜🟩🟩🟩🟩” (.GGGG) could only come from the 8,385 words where replacing the first letter produces another valid word. A word like “churn” would be impossible, though “frail” and “grail” are both possible. Likewise, “🟩🟨🟨🟩🟩” (GYYGG) narrows the set to the 458 words where the second and third letters could be swapped to produce another valid word. With the data from enough interesting guesses, you could narrow the candidates down to the single word that could have produced every visible emoji string.
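Those two observations can be checked directly, without a full Wordle evaluator. The following is a small sketch under stated assumptions: the function names and the toy word list are hypothetical, and the pair “salve”/“slave” is my own illustration rather than an example from the tool.

```javascript
// Toy word list for illustration; "salve" and "slave" are not from the original text.
const TOY_WORDS = ["frail", "grail", "churn", "salve", "slave"];

// ".GGGG" is possible for a target only if another word differs in just the first letter.
function canShowFourTrailingGreens(target, words) {
  return words.some((w) => w !== target && w.slice(1) === target.slice(1));
}

// "GYYGG" is possible only if swapping the 2nd and 3rd letters yields another word.
function canShowSecondThirdSwap(target, words) {
  const swapped = target[0] + target[2] + target[1] + target.slice(3);
  return swapped !== target && words.includes(swapped);
}

console.log(canShowFourTrailingGreens("frail", TOY_WORDS)); // true (via "grail")
console.log(canShowFourTrailingGreens("churn", TOY_WORDS)); // false
console.log(canShowSecondThirdSwap("salve", TOY_WORDS));    // true (via "slave")
```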

How it works

There are 3⁵ = 243 possible emoji patterns: each of the 5 squares can be in any of 3 states. In practice, some of those are impossible; any pattern with four green squares and one yellow is nonsense, since the yellow letter has no open position to swap to. Likewise, some of them have no value: all-white and all-green patterns exist for every single word. For this prototype, I assume that we care about the full set of 243 patterns per word, in part because that makes it very easy to convert from pattern to number and back again.
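One way to convert between a pattern and a number is to read the pattern as a five-digit base-3 value. This is only a sketch of the idea: the digit assignment (gray 0, yellow 1, green 2) and the function names are arbitrary assumptions, not necessarily what the tool uses.

```javascript
// Hypothetical encoding: treat a pattern as a 5-digit base-3 number.
const DIGITS = ".YG"; // gray = 0, yellow = 1, green = 2 (an arbitrary choice)

function patternToIndex(pattern) {
  let index = 0;
  for (const ch of pattern) {
    index = index * 3 + DIGITS.indexOf(ch);
  }
  return index; // 0 .. 242
}

function indexToPattern(index) {
  let pattern = "";
  for (let i = 0; i < 5; i++) {
    pattern = DIGITS[index % 3] + pattern; // peel off the lowest digit
    index = Math.floor(index / 3);
  }
  return pattern;
}

console.log(patternToIndex("....."));                 // 0
console.log(patternToIndex("GGGGG"));                 // 242
console.log(indexToPattern(patternToIndex("YG..G"))); // YG..G
```

Each index then names one bit position in a 243-bit bitfield.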

With that in mind, the concept is simple:

Precalculate which of the 243 patterns are possible for each word in the dictionary. I wrote a short Node script and Wordle evaluator for this. For each word A, loop through every word B, calculate the pattern, and set the corresponding bit in that word’s 243-bit (31-byte) bitfield. (The reverse evaluation shares the same green positions and the same number of yellows, though the yellows can land in different positions; I didn’t take advantage of that.) The bitfield is stored as base64 in a text file along with the dictionary, resulting in a ~660KB dictionary file after about 2 minutes of computation on my laptop.
In a given search session, start with the full set of 12,972 words.
Each time a guess is submitted, remove any words from the current open set that do not have that pattern’s bit set. Equivalently, the union of the guesses forms a bit mask, and every bit set in the mask must also be set in the word’s bitfield; that is, (wordBits & mask) === mask must hold, or else the word is eliminated from the search.
To find “examples” of how an arbitrary word fits an arbitrary pattern, ignore the bitfield and loop through the words.
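The first three steps above can be sketched end to end on a toy dictionary. This is an illustration under assumptions, not the tool's actual script: it uses JavaScript BigInt values in place of the 31-byte base64 bitfields, the function names are hypothetical, the base-3 bit numbering is an arbitrary choice, and the word list is just the eleven example words from the top of this page.

```javascript
// Toy dictionary taken from the examples on this page; the real tool uses all 12,972 words.
const WORDS = ["posse", "alien", "solve", "sposh", "pesos", "poses",
               "loupe", "lines", "anile", "aline", "sloan"];

// Hypothetical scorer: returns the pattern as a base-3 index (gray 0, yellow 1, green 2).
function patternIndex(guess, target) {
  const marks = [0, 0, 0, 0, 0];
  const remaining = {}; // unmatched target letters -> count
  for (let i = 0; i < 5; i++) {
    if (guess[i] === target[i]) marks[i] = 2;
    else remaining[target[i]] = (remaining[target[i]] || 0) + 1;
  }
  for (let i = 0; i < 5; i++) {
    if (marks[i] === 0 && remaining[guess[i]] > 0) {
      marks[i] = 1;
      remaining[guess[i]]--;
    }
  }
  return marks.reduce((acc, m) => acc * 3 + m, 0);
}

// Precalculation: for each target word, set one bit per pattern that some guess produces.
function buildBitfields(words) {
  const bitfields = new Map();
  for (const target of words) {
    let bits = 0n;
    for (const guess of words) {
      bits |= 1n << BigInt(patternIndex(guess, target));
    }
    bitfields.set(target, bits);
  }
  return bitfields;
}

// Convert "YG..G" notation to the same base-3 index.
function parsePattern(pattern) {
  return [...pattern].reduce((acc, ch) => acc * 3 + ".YG".indexOf(ch), 0);
}

// Filtering: OR the observed patterns into a mask, keep words that contain every mask bit.
function solve(bitfields, patterns) {
  let mask = 0n;
  for (const p of patterns) mask |= 1n << BigInt(parsePattern(p));
  return [...bitfields.keys()].filter((w) => (bitfields.get(w) & mask) === mask);
}

const bitfields = buildBitfields(WORDS);
console.log(solve(bitfields, ["YG..G", "YYYG.", "GYGYY", "GGGYY", ".G.YG"])); // [ 'posse' ]
console.log(solve(bitfields, ["YYYG.", "GYGYY", "GGGYY", ".G.YG"]));          // [ 'posse', 'alien' ]
```

Even on this tiny list, dropping the YG..G pattern lets “alien” back in, matching the behavior described in the introduction.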

Results

The good news is: After computing the dictionary, each word had a unique bitfield. None of the 12,972 words had identical bitfields.

The bad news is that uniqueness alone isn’t always enough to identify the words: even though every word has a unique bitfield, some of those bitfields are subsets of other words’ bitfields. In all, 4,465 of the 12,972 words (34.4%) have bitfields that are subsets of other words’ bitfields, meaning that those words could never be identified as the given word using this tool. If word A is a subset of word B, then all possible patterns that work for word A also work for word B, so no pattern could eliminate B in favor of A (but some pattern would necessarily eliminate A in favor of B, since no two words have purely identical bitfields). If we could be confident that 100% of the 12,972 input words were tried and posted, we could conclude that the absence of certain guesses meant that those guesses were impossible, but that’s not the case here: you can’t use the absence of a tweet to infer the impossibility of a pattern.
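The subset relationship is a single bitwise test. A minimal sketch, assuming the bitfields are held as BigInt values (an assumption of this example) and using made-up 4-bit values in place of real 243-bit ones:

```javascript
// True when every pattern bit of `a` is also set in `b` (a is a subset of b).
function isSubset(a, b) {
  return (a & b) === a;
}

// Made-up 4-bit bitfields for illustration only; the real ones are 243 bits wide.
const wordA = 0b0110n;
const wordB = 0b1110n;

console.log(isSubset(wordA, wordB)); // true: no pattern A allows can rule out B
console.log(isSubset(wordB, wordA)); // false: B's extra bit can rule out A
```

Running this test over every pair of words is one way to count how many words are subsets of another.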

In this sense, the speed and accuracy are also constrained by the breadth of tweets and the breadth of the dictionary. Wordle’s popularity is relatively new, and older puzzles don’t have as extensive a backlog of tweets. Even then, though Wordle’s puzzle dictionary is famously human-curated, its input dictionary contains some obscure words and initialisms (“aahed”, “mensa”, “motis”). The word “grail” might be solvable with only GYGGY and YGYGG, but only if a human tried the rare words “glair” and “argil” and then tweeted about it in a searchable location.

Still, the tool works: It has solved some recent puzzles, and it can reduce many others to a dozen candidate words or fewer. In practice it seems to take about 30-40 human-curated (“interesting”) lines to get down to 2-5 answers.

Analyses

Using the database of patterns can yield some unique analyses, particularly when compared to other analyses based on entropy. In some ways these analyses converge with the other recommendations, but other measures diverge.

Words with the most patterns

tares 211
teras 210
pores 205
pares 205
pelas 204
tears 203
cares 202
tales 201
dares 201
teals 200
rates 200
pears 200
dores 200
bares 200

Words with the fewest patterns

jazzy 69
huzzy 69
fluff 69
phpht 68
ayaya 68
jiffy 67
fuffy 66
queue 65
oxbow 65
xylyl 64
pzazz 61
jujus 61
jeeze 59
qajaq 47

Because Emojle calculates with bitsets, some of the clearest analyses check the cardinality of the sets. Which words have the most bits set (i.e. match the most output patterns), and which have the fewest (i.e. match the fewest output patterns)? “Tares” is the clear winner with 211: the dictionary is capable of producing 211 different emoji patterns for “tares”, whereas “qajaq” is capable of producing only 47 patterns. Intuitively, this makes sense: the words with the most pattern diversity have distinct, common letters, and the ones with the least pattern diversity have repeated rare letters.
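Cardinality here is just a count of set bits. A minimal sketch, again assuming BigInt bitfields and a hypothetical function name:

```javascript
// Count the set bits (cardinality) of a BigInt bitfield.
function countPatterns(bits) {
  let count = 0;
  while (bits > 0n) {
    bits &= bits - 1n; // clear the lowest set bit
    count++;
  }
  return count;
}

console.log(countPatterns(0b101101n)); // 4
```

Applied to a real 243-bit bitfield, this would yield the per-word counts listed above.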

Incidentally, it’s worth noting here that patterns are nearly symmetric: guessing “tares” against an arbitrary target word produces the same green positions and the same number of yellows as guessing that same word against “tares”, though the yellow squares can land in different positions. If “tares” were ever picked as the word to find, then it would be easy, as most guesses would produce a valuable pattern.

This analysis also overlaps with the assertion that “tares” produces the highest entropy: a first guess capable of the most patterns tends to split the word list into more, smaller groups, reducing the average number of words that remain in play and giving the player a smaller remaining set to continue to search.
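For comparison, the entropy of a guess depends on how evenly the candidate words are spread across the patterns it can produce, not only on how many patterns there are. The sketch below uses the standard Shannon entropy formula on made-up group sizes; the function name and the numbers are illustrative assumptions, not measurements from the tool.

```javascript
// Shannon entropy (in bits) of how a guess splits the candidates into pattern groups.
function entropy(groupSizes) {
  const total = groupSizes.reduce((a, b) => a + b, 0);
  return groupSizes.reduce((h, n) => {
    if (n === 0) return h; // empty groups contribute nothing
    const p = n / total;
    return h - p * Math.log2(p);
  }, 0);
}

// Made-up splits of 8 candidate words into 4 patterns each.
console.log(entropy([2, 2, 2, 2])); // 2
console.log(entropy([5, 1, 1, 1])); // lower: same pattern count, less even split
```

Both splits use four patterns, yet the even split carries more information, which is one reason the pattern-count ranking and entropy rankings can agree at the top without being identical.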

Most-popular patterns

GGGGG 12972
Y.... 12972
.Y... 12972
..YY. 12972
..Y.. 12972
...Y. 12972
....Y 12972
..... 12972
Y.Y.. 12971
Y..Y. 12970
..G.. 12970
...G. 12968
..Y.Y 12967
YY... 12966

Least-popular patterns

GGYYY 430
YGGYG 378
YGYYG 352
YGYGY 348
GYGYG 336
YGGGY 254
YYGGG 252
GGYGY 236
YGGYY 222
YYGGY 214
YYGYG 165
GYGYY 156
GYGGY 138
GYYGY 124

In the reverse direction, which patterns are the most popular? The first eight are possible for every one of the 12,972 words, so none of them provide any information to Emojle. Seven are obvious: every word is capable of all-green, all-gray, and each of the five single-yellow outputs. The eighth is ..YY., which every word can also produce from at least one guess in the dictionary. After that, the next most-popular (least-helpful) pattern is Y.Y.., which rules out only “ayaya”.

The greater value is in the least-popular entries, which can drastically reduce the input set. Most of these rare, valuable patterns are mixes of two or three green squares with yellow squares for the rest: the only words that remain are words that have simple letter-swapped anagrams (such as “frail” and “flair”).

I’ve omitted all patterns that are four green and one yellow, since all of those are impossible.

Words with the most subsets

tares 1606
tales 1053
pares 1039
pelas 951
tears 813
cares 728
dures 727
pales 703
cores 634
pores 616
rates 611
dares 589
bares 464
teras 455

Words with the most supersets

qajaq 3188
jeeze 1763
scuzz 1596
jujus 1556
oxbow 1268
jazzy 1233
quipu 1194
phpht 1182
heeze 1118
xylyl 802
huzzy 782
ayaya 777
chizz 728
vozhd 703

Finally, the superset analysis: as above, if word A is a subset of word B, then all patterns that work for word A also work for word B. Consequently, even with a complete archive of guesses, Emojle could not rule out word B when word A is the answer. 4,465 words have this subset behavior, leaving 8,507 that can, in theory, be uniquely identified.

By virtue of its high count of possibilities, “tares” tops the list here too: it is by far the most common word to wind up at the end of an Emojle solve, since it remains in the final candidate set for each of the 1,606 words whose bitfields are subsets of its own. This is similar to the list of words with the most patterns above, but not identical: even though “teras” is capable of producing almost the same number of unique emoji patterns, there exist significantly more valid word-pattern matches that can rule out “teras” compared to “tares”.

The list of words with the most supersets is likewise similar to the list of words with the fewest patterns: The set of possible patterns is small, so there is much less data that could uniquely identify those words. When only a few patterns are possible, many of the high-cardinality words are supersets, and consequently they remain in play. Should “qajaq” be picked as the target word, Emojle would not be very helpful: it would be hidden among 3,188 other candidate words.

Of course, these are all theoretical results if every word in the dictionary were played and posted publicly. In practice, even the words that can be uniquely identified may rely on a pattern from a single obscure word that is unlikely to be guessed.

Future work

Future extensions include:

Directly integrating with Twitter’s API to automate searching, though a human may still need to filter out memes and joke entries.
Identifying the probabilities of the remaining solutions, possibly using the usage frequency of words to determine which ones should have been guessed.
Suggesting which would be the highest-value emoji patterns to find, or which ones would hypothetically help select a word while the results are still inconclusive.
Heavily compressing the word list to save on bandwidth and improve latency.

Contact

If you have any questions about Emojle, please contact me.