Tuesday 3 April 2012


(LISZTOMANIA just went in the wordlist)

Late breaking news: Cross Nerd puzzles are now available via Alex Boisvert's Crossword Butler service. If you haven't checked out Crossword Butler, do it here and do it now. Also, hit up Alex's site for some of his fantastic puzzles and a host of useful constructing tools (the algebraic theme entry tool is particularly handy when brainstorming letter-manipulation type themes).

In other news, I've been doing some intense wordlist maintenance lately. Of course, wordlist additions are a daily thing, but up until now my list has been unscored. I recently read through all of Todd McClary's Autofill Project posts, and I was inspired to start thinking about how to best score and organize my list. And now would be a good time, since my list is relatively small: I've got about 120,000 entries, which is a slightly edited version of the Cruciverb.com "all" database plus 13,000 of my own entries. Now, to the layperson this would seem substantial. When I first took a naive crack at construction several years back, I was stoked to have like 200 solid 10-letter entries after an evening-long brainstorming session. I thought I would be an all-star constructor in no time with these hot words. I gave up in discouragement shortly after, surprise surprise (SURPRISESURPRISE just went in the wordlist). But I've heard that some of the top dogs have scored databases of 750,000+ words! Obviously, I've got a long way to go, but I've started a more rigorous word-gathering process (carefully poaching all of the good wordspy.com entries and adding all of their inflections, adding lists of US presidents, Popes, alcoholic shots with colorful names like Muff Diver (MUFFDIVER/MUFFDIVERS just went in the wordlist), the "1001 Movies You Must See Before You Die", etc.), and I really should get scoring before the list blows up.

Traditionally, words are given an absolute numerical score. Autofill algorithms use these scores to produce the fill that is most desirable to the constructor, by starting with the highest-scored words and trying to maximize the combined score of all of the words in the grid. This is a simple but effective way to get the most out of the autofill process, but I see some problems with scoring a wordlist solely by assigning fixed numeric values.

For one thing, words may be more or less desirable depending on the context, so it would be nice to have variable scores for certain words. For instance, when making the Cross Nerd puzzles I welcome vulgarity and off-color entries and thus would be tempted to score something like BABYBATTER quite highly (BABYBATTER was already in the wordlist, fyi), although if I were filling a grid for just about any other market I wouldn't want that entry within 100 miles of my puzzle.

Furthermore, I don't really know how high or low to score words relative to each other to get the best fill, so I'd like to be able to easily and quickly adjust scores for the purposes of experimentation. Of course I can tell you that BLANKETHOG and SKRILLEX and ETPHONEHOME should be scored higher than RESEATED and UNAMASSED, but how much higher? What differences would you see in the fill if solid but boring 3-letter entries (SEA, e.g.) were scored much higher than showstopping long entries, or vice versa?

My solution, which I've begun to implement in a small wordlist management app, is to tag each word with one or more symbolic values rather than a single absolute score. When a scored wordlist is needed for filling a particular grid, one can be "compiled" using some or all of the symbolically-scored words on the master list. I'm thinking that it might be interesting to be able to break down the master list by length and symbolic score, and assign relative values to each sublist using a graphical interface (say, by placing every sublist at some point along a scale).
Numerical scores would then be assigned based on position during the compilation process. That way, you could tinker with the relative values of various types of entries and see what sort of fill results, without ever having to consider whether an entry should be 90 or 91. I'm thinking that I'll have 5 or 6 different scores for general legitimacy/freshness/scrabbly-ness ("Unusable", "Will do in a pinch", "Perfectly legit but not terribly exciting", "OMG must put in your next puzzle", etc.) as well as a number of tags like "vulgar", "too indie for the NYT", etc. that can be used to filter out undesirable entries altogether.

The software to do all of this is not coming together terribly quickly because I tend not to have time for much other than making crosswords (and living the rest of my life, here and there), but I'm excited to at least begin. If I end up with an end product that isn't too esoteric or unwieldy I'll release it free for y'all to use. I'd love to be able to give back to the community, as many of you have already provided tremendously useful tools and advice. The rest of you are just nice people.

For the time being, let me know what you think about this approach to wordlist scoring. Do you see any shortcomings? Is this an approach that you would try yourself? Feedback is always welcome here, so don't be afraid to comment or drop me a line.
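
For the code-minded, here is a rough Python sketch of what that "compile" step might look like. This is not the actual app, just an illustration under my own assumptions: the grade names, the tag names, the `Entry` class, the `compile_wordlist` function, and the 0-to-1 scale positions are all made up for the example. Each sublist is identified by (length, grade), and a grade-wide position is used when a particular length hasn't been placed on the scale separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One master-list entry: a word, a symbolic grade, and optional tags."""
    word: str
    grade: str                      # hypothetical names: "must_use", "legit", ...
    tags: frozenset = frozenset()   # hypothetical names: "vulgar", "too_indie"


def compile_wordlist(master, scale, banned_tags=frozenset(), max_score=100):
    """Turn symbolic grades into numeric scores for one particular grid.

    scale maps either (length, grade) or just grade to a position
    between 0.0 and 1.0. Sublists with no position are left out.
    """
    compiled = {}
    for entry in master:
        if entry.tags & banned_tags:
            continue  # filtered out altogether for this market
        key = (len(entry.word), entry.grade)
        position = scale.get(key, scale.get(entry.grade))
        if position is None:
            continue  # this sublist was never placed on the scale
        compiled[entry.word] = round(position * max_score)
    return compiled


master = [
    Entry("BLANKETHOG", "must_use"),
    Entry("SKRILLEX", "must_use", frozenset({"too_indie"})),
    Entry("BABYBATTER", "must_use", frozenset({"vulgar"})),
    Entry("SEA", "legit"),
    Entry("RESEATED", "pinch"),
    Entry("UNAMASSED", "unusable"),
]

# Slide these numbers around to experiment; "unusable" is simply not placed.
scale = {"must_use": 1.0, "legit": 0.6, "pinch": 0.3, (3, "legit"): 0.8}

mainstream = compile_wordlist(master, scale, banned_tags={"vulgar", "too_indie"})
cross_nerd = compile_wordlist(master, scale)
```

The point of the design is that nothing on the master list ever holds a number. Bumping solid 3-letter entries above the showstoppers is one edit to `scale`, and switching markets is one change to `banned_tags`.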

Moving on to the puzzle, this week's offering is a silly take on an old workhorse of a theme. Started off as a vague idea that made me laugh, but turned out to be a bitch to execute. Hopefully the result is at least coherent. We may never know how Dr. Fill would fare on this one, but we get to see what Watson makes of it.

More words, crossed and otherwise, next Tuesday.

Puzzle: The Mystery of the Tired Theme
Rating: XW-18A
Download the PDF and PUZ files here, or solve or download the Across Lite puzzle and/or software from the Java app below.


Alex said...

You are of course welcome to join the Collaborative Word List as well.

acme said...

Might you get in touch with me? andreacarla.michaels(at)gmail
I'd like to include you in on a constructors lunch in SF next month