I've been playing a lot of Letter Tycoon on BoardGameArena (it seems
that losing at Scrabble to my wife is good training for this). I
wanted a practice opponent. So I wrote one.
The basic idea is simple enough: given ten available letter
cards, make high-scoring words out of them. This gets slightly
complicated by the special rules: if you've previously played and
staked a claim to the letter X, for example, you can duplicate any one
letter in your hand, while if you have a claim on B and your word
starts and ends with a vowel it's worth double points. (Otherwise I'd
have used the existing an
program, which does this sort of anagram
finding quite fast.)
So one part of this is clearly "given a word that I can assemble, how
much will it score". The other part is reducing the list of dictionary
words to only those that one can assemble.
There are existing anagram checkers (it seems to be a standard compsci
homework problem) but they take the form "is string A an anagram of
string B", i.e. do all the letters appear in whatever order. Not
useful.
Generating every possible combination of letters (including subsets of
them) is clearly going to be desperately slow.
The trick was to build the right hash function. I assign each of the
first 26 primes to a letter (a=2, b=3, c=5, etc.). The hash of a word
is the product of the primes of each of its letters. So the same set
of letters will produce the same hash, no matter what order it's in.
(Given that I potentially have 12-letter words, and the 26th prime is
101, this could potentially exceed 64 bits – though I could get
cunning by assigning the higher primes to less-common letters.
Fortunately Rust has a 128-bit integer type so I don't have to.)
Matching those numbers would do for checking whether seaside
is an
anagram of disease
, but not whether I can make aside
out of it.
(Because if I have a claim on B, as noted above, that might score more
points than the longer word.) But using this hash I can simply test:
hash(available) % hash(word) == 0
Consider the combination of letters abc
, with value 30 (2×3×5), as
my available set. Any word with 0 or 1 of each of a, b and c will have
a hash of which 30 is an integer multiple. But if the word has any
other prime factor, or an extra factor of one of these, its hash won't
evenly divide 30. So I have a quick comparison for choosing which
dictionary words to send off to the rather slower score-calculating
function.
(Still haven't written the code for the V claim, though, which lets
you play two shorter words rather than one longer one – and each of
your other claims can only be scored with one of them.)
Comments on this post are now closed. If you have particular grounds for adding a late comment, comment on a more recent post quoting the URL of this one.