Emojle: Solving Wordle by Emoji Alone

2022-02-06

Emojle is an emoji-based Wordle solver. Using only publicly tweeted colored-square patterns and Wordle’s list of 12,972 valid five-letter words, this tool narrows down the set of candidate words, ideally ending on the one and only word that could have produced every tweeted emoji pattern.

You can do a Twitter search to find the latest puzzle, then hone the search to the day’s quoted puzzle (such as "Wordle 225").

If you just want to see how it works, try these: YG..G, YYYG., GYGYY, GGGYY, and .G.YG. Based on those five emoji strings, the tool can uniquely identify the word “posse”: those guesses must have belonged to players who had guessed words like “solve”, “sposh”, “pesos”, “poses”, and “loupe” respectively. If you skip the first guess YG..G or remove it using the remove button to the right, you’d see that “alien” is also a candidate, in which case the guesses must have been words like “lines”, “anile”, “aline”, and “sloan”.


Background and Motivation

Josh Wardle’s game Wordle, hosted at powerlanguage.co.uk, is a word guessing game. Players have six tries to guess the five-letter English word; each guess must be a valid word. The letters are then marked: yellow (blue in color-blind mode) for a letter that is in the word but in the wrong position, and green (orange in color-blind mode) for a correct letter in the right position. Only one puzzle is available per day, though all 2,315 puzzles (June 2021 to October 2027) are known via the game’s posted code.

Wordle’s popularity is due in part to the clever social media trick of encouraging users to post their spoiler-free guesses: rather than posting the words, the app encourages sharing a grid of emoji. In this way, as The Verge’s James Vincent points out, “each grid tells a story with wonderful concision. With just 30 squares and three colors, Wordle’s emoji results convey narratives of luck, frustration, perseverance, and failure; each grid a miniature story, like a landscape painted in a matchbox.” From the shape of the squares, a reader can tell how quickly a solution came—or how close the player came without a victory.

My strategy of starting with a different word each day is beginning to show its flaws. But if I’ve learned anything from being an American, it’s never to change course no matter how catastrophic a policy proves to be.

Wordle 224 4/6

⬜⬜⬜⬜⬜
⬜🟩⬜🟩⬜
⬜🟩🟩🟩🟩
🟩🟩🟩🟩🟩

— John Green (@johngreen) February 1, 2022

Of course, some players post their unredacted games or screenshots, and others post their commentary or oblique hints. Clever readers can also infer from the scores whether a word is common or obscure, or use a known player’s strategy or usual starting word to infer something about that first guess. But what about the emoji themselves: if nobody gave any hints other than emoji, could you still guess the word?

The emoji do convey information. Of the 12,972 valid input words, a result like “⬜🟩🟩🟩🟩” (.GGGG) could only come from the 8,385 words for which replacing the first letter produces another valid word. A word like “churn” would be impossible, though “frail” and “grail” are both possible. Likewise, “🟩🟨🟨🟩🟩” (GYYGG) narrows the set to the 458 words whose second and third letters can be swapped to produce another valid word. With the data from enough interesting guesses, you could determine the single word that could have produced every visible emoji string.

How it works

An emoji pattern has 3⁵ = 243 possible values: each of the 5 squares can be in any of 3 states. In practice, some of those are impossible; any pattern with four green squares and one yellow is nonsense, since the yellow letter has no open position to move to. Likewise, some of them have no value: all-white and all-green patterns exist for every single word. For this prototype, I assume that we care about the full set of 243 patterns per word, in part because that makes it very easy to convert from pattern to number and back again.
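
As a rough illustration of that conversion (a sketch, not the code Emojle actually uses), each pattern can be read as a five-digit base-3 number. The digit order, the `.`/`Y`/`G` symbol mapping, and the function names are assumptions chosen for this example, and it is written in Python rather than the Node used for the real tool.

```python
SYMBOLS = ".YG"  # assumed digit values: gray=0, yellow=1, green=2


def pattern_to_index(pattern: str) -> int:
    """Read a 5-character pattern as a base-3 number in the range 0..242."""
    index = 0
    for ch in pattern:
        index = index * 3 + SYMBOLS.index(ch)
    return index


def index_to_pattern(index: int) -> str:
    """Inverse of pattern_to_index."""
    chars = []
    for _ in range(5):
        index, digit = divmod(index, 3)
        chars.append(SYMBOLS[digit])
    return "".join(reversed(chars))


def is_four_green_one_yellow(pattern: str) -> bool:
    """The impossible case: the lone yellow letter has nowhere else to go."""
    return pattern.count("G") == 4 and pattern.count("Y") == 1


all_patterns = [index_to_pattern(i) for i in range(3 ** 5)]
impossible = [p for p in all_patterns if is_four_green_one_yellow(p)]
print(len(all_patterns), len(impossible))  # 243 5
```

Keeping the five impossible patterns in the numbering wastes a few bits per word, but it keeps the mapping a plain base conversion.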

With that in mind, the concept is simple:

  1. Precalculate which of the 243 patterns are possible for each word in the dictionary. I wrote a short Node script and Wordle evaluator for this. For each word A, loop through every word B, calculate the pattern that guessing B produces against A, and set the corresponding bit in A’s 243-bit (31-byte) bitfield. (The green squares will be the same for B’s evaluation of A, though the yellow squares can differ; I didn’t take advantage of that.) The bitfield is stored as base64 in a text file along with the dictionary, resulting in a ~660KB dictionary file after about 2 minutes of computation on my laptop.
  2. In a given search session, start with the full set of 12,972 words.
  3. Each time a guess is submitted, remove any words from the current open set that do not have that pattern’s bit set. Equivalently, the union of the guesses forms a bit mask, and a word stays in the search only if every bit set in the mask is also set in its bitfield ((wordBits & mask) === mask).
  4. To find “examples” of how an arbitrary word fits an arbitrary pattern, ignore the bitfield and loop through the words.
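
The steps above can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the Node script described in step 1: the names `score`, `pattern_bit`, `build_bitfields`, and `candidates` are hypothetical, the bit numbering is an arbitrary base-3 encoding, and the eleven-word list is only the set of words from the “posse” example near the top of this post rather than the full 12,972-word dictionary.

```python
from collections import Counter

SYMBOLS = ".YG"  # assumed digit values: gray=0, yellow=1, green=2


def score(guess: str, target: str) -> str:
    """Wordle pattern for `guess` against `target`, e.g. 'YG..G'."""
    result = ["."] * 5
    remaining = Counter()  # target letters not matched by a green
    for i, (g, t) in enumerate(zip(guess, target)):
        if g == t:
            result[i] = "G"
        else:
            remaining[t] += 1
    for i, g in enumerate(guess):
        if result[i] == "." and remaining[g] > 0:
            result[i] = "Y"
            remaining[g] -= 1  # each target letter can justify one yellow
    return "".join(result)


def pattern_bit(pattern: str) -> int:
    """Single-bit integer for a pattern, using a base-3 index."""
    index = 0
    for ch in pattern:
        index = index * 3 + SYMBOLS.index(ch)
    return 1 << index


def build_bitfields(words):
    """Step 1: for each target, OR together the bits of every reachable pattern."""
    fields = {}
    for target in words:
        bits = 0
        for guess in words:
            bits |= pattern_bit(score(guess, target))
        fields[target] = bits
    return fields


def candidates(fields, patterns):
    """Steps 2 and 3: keep the words whose bitfield covers the whole mask."""
    mask = 0
    for p in patterns:
        mask |= pattern_bit(p)
    return [w for w, bits in fields.items() if (bits & mask) == mask]


WORDS = ["posse", "solve", "sposh", "pesos", "poses", "loupe",
         "alien", "lines", "anile", "aline", "sloan"]
fields = build_bitfields(WORDS)

print(candidates(fields, ["YG..G", "YYYG.", "GYGYY", "GGGYY", ".G.YG"]))
# ['posse']
print(candidates(fields, ["YYYG.", "GYGYY", "GGGYY", ".G.YG"]))
# ['posse', 'alien']
```

Python integers are arbitrary-precision, so a 243-bit field fits in one value; in a language with fixed-width integers the same idea needs a byte array or a big-integer type.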

Results

The good news is: After computing the dictionary, each word had a unique bitfield. None of the 12,972 words had identical bitfields.

The bad news is that uniqueness alone isn’t always enough to identify the words: even though every word has a unique bitfield, some of those bitfields are subsets of other words’ bitfields. In all, 4,465 of the 12,972 words (34.4%) have bitfields that are subsets of another word’s bitfield, meaning that those words could never be identified as the given word using this tool. If word A’s bitfield is a subset of word B’s, then all possible patterns that work for word A also work for word B, so no pattern could eliminate B in favor of A (but some pattern would necessarily eliminate A in favor of B, since no two words have identical bitfields). If we could be confident that every one of the 12,972 input words had been tried and posted, we could conclude that the absence of certain patterns meant that those patterns were impossible, but that’s not the case here: you can’t use the absence of a tweet to infer the impossibility of a pattern.
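
To make the subset problem concrete, here is a small Python sketch. The three words and their 4-bit bitfields are made up for illustration (real bitfields have 243 bits), and `is_subset`, `hidden_words`, and `survivors` are hypothetical helper names.

```python
def is_subset(a: int, b: int) -> bool:
    """True if every pattern bit set in a is also set in b."""
    return (a & b) == a


# Toy 4-bit bitfields with made-up names (not real Emojle data).
fields = {"word_a": 0b0101, "word_b": 0b0111, "word_c": 0b1010}


def hidden_words(fields):
    """Words whose bitfield is a subset of some other word's bitfield."""
    return [
        w for w, bits in fields.items()
        if any(o != w and is_subset(bits, other) for o, other in fields.items())
    ]


def survivors(fields, mask):
    """Words that remain after filtering by a mask of observed patterns."""
    return [w for w, bits in fields.items() if (bits & mask) == mask]


print(hidden_words(fields))  # ['word_a']

# Any mask that keeps word_a also keeps word_b ...
masks_keeping_a = [m for m in range(1, 16) if "word_a" in survivors(fields, m)]
print(all("word_b" in survivors(fields, m) for m in masks_keeping_a))  # True

# ... but some mask eliminates word_a while keeping word_b.
print(survivors(fields, 0b0010))  # ['word_b', 'word_c']
```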

In this sense, the speed and accuracy are also constrained by the breadth of tweets and the breadth of the dictionary. Wordle’s popularity is relatively new, and older puzzles don’t have as extensive a backlog of tweets. Even then, though Wordle’s puzzle dictionary is famously human-curated, its input dictionary contains some obscure words and initialisms (“aahed”, “mensa”, “motis”). The word “grail” might be solvable with only GYGGY and YGYGG, but only if a human tried the rare words “glair” and “argil” and then tweeted about it in a searchable location.

Still, the tool works: It has solved some recent puzzles, and it can reduce many others to a dozen candidate words or fewer. In practice it seems to take about 30-40 human-curated (“interesting”) lines to get down to 2-5 answers.

Analyses

The database of patterns enables some unusual analyses, particularly when compared to entropy-based analyses of Wordle. In some ways the results converge with those recommendations; in others they diverge.

Words with the most patterns

  1. tares 211
  2. teras 210
  3. pores 205
  4. pares 205
  5. pelas 204
  6. tears 203
  7. cares 202
  8. tales 201
  9. dares 201
  10. teals 200
  11. rates 200
  12. pears 200
  13. dores 200
  14. bares 200

Words with the fewest patterns

  1. jazzy 69
  2. huzzy 69
  3. fluff 69
  4. phpht 68
  5. ayaya 68
  6. jiffy 67
  7. fuffy 66
  8. queue 65
  9. oxbow 65
  10. xylyl 64
  11. pzazz 61
  12. jujus 61
  13. jeeze 59
  14. qajaq 47

Because Emojle calculates with bitsets, some of the clearest analyses check the cardinality of the sets. Which words have the most bits set (i.e. match the most output patterns), and which have the fewest set (i.e. match the fewest output patterns)? “Tares” is the clear winner with 211: the dictionary is capable of producing 211 different emoji patterns for “tares”, whereas “qajaq” is capable of producing only 47 patterns. Intuitively, this makes sense: the words with the most pattern diversity have unique common letters, and the ones with the least pattern diversity have repeated rare letters.
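
The cardinality check itself is just a population count. A minimal Python sketch, using made-up 4-bit bitfields and a hypothetical `pattern_count` helper rather than the real 243-bit data:

```python
def pattern_count(bits: int) -> int:
    """Number of set bits, i.e. distinct patterns the word can produce."""
    return bin(bits).count("1")


# Toy bitfields with made-up names (not real Emojle data).
fields = {"word_a": 0b0101, "word_b": 0b0111, "word_c": 0b1000}

ranked = sorted(fields, key=lambda w: pattern_count(fields[w]), reverse=True)
print(ranked)  # ['word_b', 'word_a', 'word_c']
```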

Incidentally, it’s worth noting here that patterns are only partly symmetric: swapping the guess and the target keeps the green squares in place, but the yellow squares can move. Guessing “solve” against “posse” produces YG..G, whereas guessing “posse” against “solve” produces .GY.G. The counts above treat each word as the target. If “tares” were ever picked as the word to find, then it would be easy, as most guesses would produce a valuable pattern.

This analysis also overlaps with the assertion that “tares” produces the highest entropy: a word with common, distinct letters tends to produce many different patterns whether it is the guess or the target, and a first guess that splits the dictionary across many patterns tends to leave the player a smaller remaining set to search.

Most popular patterns

  1. GGGGG 12972
  2. Y.... 12972
  3. .Y... 12972
  4. ..YY. 12972
  5. ..Y.. 12972
  6. ...Y. 12972
  7. ....Y 12972
  8. ..... 12972
  9. Y.Y.. 12971
  10. Y..Y. 12970
  11. ..G.. 12970
  12. ...G. 12968
  13. ..Y.Y 12967
  14. YY... 12966

Least popular patterns

  1. GGYYY 430
  2. YGGYG 378
  3. YGYYG 352
  4. YGYGY 348
  5. GYGYG 336
  6. YGGGY 254
  7. YYGGG 252
  8. GGYGY 236
  9. YGGYY 222
  10. YYGGY 214
  11. YYGYG 165
  12. GYGYY 156
  13. GYGGY 138
  14. GYYGY 124

In the reverse direction, which patterns are the most popular? Seven of the first eight are obvious: every word is capable of all-green, all-gray, and single-yellow outputs, and none of those provide any information to Emojle. Likewise, the emoji string ..YY. provides no useful information to Emojle, since every word has at least one guess that produces it. After that, the next most-popular (least-helpful) pattern is Y.Y.., which rules out only “ayaya”.

The greater value is in the least-popular entries, which can drastically reduce the input set. Most of these rare, valuable patterns are mixes of two or three green squares with yellow squares for the rest: the only words that remain are words that have simple letter-swapped anagrams (such as “frail” and “flair”).

I’ve omitted all patterns that are four green and one yellow, since all of those are impossible.
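
Counting in this direction means tallying, for each bit position, how many words have that bit set. A rough Python sketch, again with made-up 4-bit bitfields (the real data would use 243 positions) and a hypothetical `pattern_popularity` helper:

```python
def pattern_popularity(fields, n_patterns):
    """For each pattern index, count the words whose bitfield has that bit set."""
    counts = [0] * n_patterns
    for bits in fields.values():
        for i in range(n_patterns):
            if (bits >> i) & 1:
                counts[i] += 1
    return counts


# Toy bitfields with made-up names (not real Emojle data).
fields = {"word_a": 0b0101, "word_b": 0b0111, "word_c": 0b1101}

counts = pattern_popularity(fields, 4)
print(counts)  # [3, 1, 3, 1]

# Patterns every word can produce carry no information for the solver.
uninformative = [i for i, c in enumerate(counts) if c == len(fields)]
print(uninformative)  # [0, 2]
```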

Words with the most subsets

  1. tares 1606
  2. tales 1053
  3. pares 1039
  4. pelas 951
  5. tears 813
  6. cares 728
  7. dures 727
  8. pales 703
  9. cores 634
  10. pores 616
  11. rates 611
  12. dares 589
  13. bares 464
  14. teras 455

Words with the most supersets

  1. qajaq 3188
  2. jeeze 1763
  3. scuzz 1596
  4. jujus 1556
  5. oxbow 1268
  6. jazzy 1233
  7. quipu 1194
  8. phpht 1182
  9. heeze 1118
  10. xylyl 802
  11. huzzy 782
  12. ayaya 777
  13. chizz 728
  14. vozhd 703

Finally, the superset analysis: as above, if word A’s bitfield is a subset of word B’s, then all patterns that work for word A also work for word B. Consequently, even with a complete archive of guesses, Emojle could never narrow the candidates to A alone, because B would always remain. 4,465 words have this subset behavior, leaving 8,507 that can, in theory, be uniquely identified.

By virtue of its high count of possibilities, “tares” tops the list here too: the bitfields of 1,606 other words are subsets of its bitfield, so “tares” is by far the most common word to linger in the final candidate set at the end of an Emojle solve. This is similar to the list of words with the most patterns above, but not identical: even though “teras” is capable of producing almost the same number of unique emoji patterns, there exist significantly more valid word-pattern matches that can rule out “teras” compared to “tares”.

The list of words with the most supersets is likewise similar to the list of words with the fewest patterns: The set of possible patterns is small, so there is much less data that could uniquely identify those words. When only a few patterns are possible, many of the high-cardinality words are supersets, and consequently they remain in play. Should “qajaq” be picked as the target word, Emojle would not be very helpful: it would be hidden among 3,188 other candidate words.

Of course, these are all theoretical results if every word in the dictionary were played and posted publicly. In practice, even the words that can be uniquely identified may rely on a pattern from a single obscure word that is unlikely to be guessed.

Future work

Future extensions include:

  • Directly integrating with Twitter’s API to automate searching, though a human may still need to filter out memes and joke entries.
  • Identifying the probabilities of the remaining solutions, possibly using the usage frequency of words to determine which ones should have been guessed.
  • Suggesting which would be the highest-value emoji patterns to find, or which ones would hypothetically help select a word while the results are still inconclusive.
  • Heavily compressing the word list to save on bandwidth and improve latency.

Contact

If you have any questions about Emojle, please contact me.