A favorite elementary school word puzzle is the 'how many other words can you make from the letters of this big word' variety. So, for example, the word "bread" will yield: red, are, ear, dare, dear, read, bar, etc. There are lots more.
How to Solve It
How would you get a list of all of the right answers? First, you have to figure out how to ennumerate the complete list of letter subsets found in the "parent"word, and then you have to figure out what words (if any) can be formed by rearranging the letters in any one subset.
To generate an initial list of all possible letter subsets from the "parent" word, we can use binary numbers.
Any given subset of letters from our original word can be represented as as either used (1) or not used (0) … as a string of 1s and 0s according to whether the letter in the parent word was used or not. Consider the word "east":
# |
word |
binary |
subset |
1 |
east |
0001 |
t |
2 |
east |
0010 |
s |
3 |
east |
0011 |
st |
4 |
east |
0100 |
a |
5 |
east |
0101 |
at |
6 |
east |
0110 |
as |
7 |
east |
0111 |
ast |
8 |
east |
1000 |
e |
9 |
east |
1001 |
et |
10 |
east |
1010 |
es |
11 |
east |
1011 |
est |
12 |
east |
1100 |
ea |
13 |
east |
1101 |
eat |
14 |
east |
1110 |
eas |
15 |
east |
1111 |
east |
Map the letters you use to a "1" and any that you don't use to a "0", and the letter combo you use suddenly has a unique binary "signature".
For a word N characters long, there must be 2^N - 1 possible combinations of letters (you subtract one more since one of the combos is all 0s ... no letters used ... so it can't ever help). (The PHP "decbin" function is a quick way to express each number in the sequence as a binary string.)
For a four letter word like "east", that's 2^4 - 1 = 16 - 1 = 15 combos, and the binary form of each number from 1 to 15 can be used as a map for which letters are included in the subset.
Many "parent" words will have repeating letters, so some of these binary letter subsets will boil down to the same set of letters (just originating from different places in the "parent" word), so we only keep the list of unique letter combos.
Finally, we recognize each subset may represent more than one word, depending on how the letters are arranged (for example, the subset of "e", "a", and "t" can generate words "eat", "ate", and "tea"). But this is just the Jumble problem we've solved elsewhere, so we use that code to get the set of matches from the dictionary for each subset of letters we identify.
The dictionary I use contains over 100,000 words, and sometimes the matches are abbreviations or otherwise suspect. A lot depends on the word list you compare to.
One challenge to come home from the 2nd Grade was to find the 39 words hidden in "ornaments". Using this script, we found 368! Time to start talking about extra credit ...
Mark's Blog
(for www.vandine.biz)
mark.vandine on gmail
vandinem on Twitter