The approach of the OneAcross clue solver is similar to that of the meta-search
engines on the web. We ask a number of sources the same question, and
then combine these results to produce a unified answer.
Crossword Database
The main source of
knowledge for our system is a large collection of previously
seen crossword puzzles. These puzzles come from a variety of sources
and span a number of years. Currently the database contains over 9000
puzzles, or around 750,000 clues.
Candidate Generation
The clue and length that you entered are passed to a handful of
Expert Modules. Each expert module suggests possible answers to the
clue, weighted by confidence. These expert modules can each do a different kind
of search. For example, one might simply return all words in the
dictionary of the correct length (ignoring the clue), while another
might look for keywords in the clue that match previously seen clues.
Partial list of current Expert Modules:
- Exact: Looks for word-for-word matches with previous clues.
- Partial: Looks for words in common with previous clues.
- Transformation: Looks for ways of changing your clue into a previous clue.
- Word Path: Looks for a chain of associations from a word in your clue to a word that fits your pattern.
- Crossword Dictionary: Returns common crossword answers that fit the pattern.
- Dictionary: Returns words that fit the pattern.
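As a rough illustration of the idea, here is a minimal Python sketch of two such modules, loosely modeled on Exact and Dictionary. It is not the OneAcross code: the function names, the tiny PAST_CLUES and WORDS data sets, the use of "?" for an unknown letter, and the confidence values are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Candidate = Tuple[str, float]  # (answer, confidence)

# Hypothetical toy data standing in for the clue database and word list.
PAST_CLUES: Dict[str, List[str]] = {
    "opposite of west": ["east"],
    "feline pet": ["cat"],
}
WORDS: List[str] = ["east", "easy", "cat", "nest"]


def fits(word: str, pattern: str) -> bool:
    """True if word matches a pattern such as 'ea??' ('?' = unknown letter)."""
    return len(word) == len(pattern) and all(
        p == "?" or p == c for c, p in zip(word, pattern)
    )


def exact_module(clue: str, pattern: str) -> List[Candidate]:
    """Answers of previously seen clues that match word for word."""
    answers = PAST_CLUES.get(clue.strip().lower(), [])
    return [(a, 1.0) for a in answers if fits(a, pattern)]


def dictionary_module(clue: str, pattern: str) -> List[Candidate]:
    """Every word that fits the pattern, ignoring the clue entirely."""
    matches = [w for w in WORDS if fits(w, pattern)]
    # Assumed weighting: spread confidence evenly over all matches.
    return [(w, 1.0 / len(matches)) for w in matches]


MODULES: Dict[str, Callable[[str, str], List[Candidate]]] = {
    "exact": exact_module,
    "dictionary": dictionary_module,
}


def generate(clue: str, pattern: str) -> Dict[str, List[Candidate]]:
    """Ask every module the same question and keep one list per module."""
    return {name: module(clue, pattern) for name, module in MODULES.items()}


results = generate("Opposite of west", "ea??")
```

Each module answers the same question independently, so a new kind of search can be added by registering one more function in MODULES.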
In our research system Proverb, we had 33
different expert modules, including expert modules for movies and
fill-in-the-blank clues.
Reweighting
Once the system has these weighted lists of candidates, it must
combine them into a single list for presentation. It first scales each
expert module's list to account for its historic success. That is, if
an expert module has been successful in the past, we might increase
the weight of all its guesses. Finally, the scaled lists are merged
into one and sorted by confidence.
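The scaling and merging step might look roughly like the Python sketch below. The text above does not say how a candidate proposed by several modules is scored, so summing the scaled confidences is an assumption here, as are the function name, the module weights, and the sample lists.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Candidate = Tuple[str, float]  # (answer, confidence)


def merge_candidates(
    lists: Dict[str, List[Candidate]],
    module_weight: Dict[str, float],
) -> List[Candidate]:
    """Scale each module's list by its weight, then merge and sort."""
    combined: Dict[str, float] = defaultdict(float)
    for name, candidates in lists.items():
        # Hypothetical weight reflecting the module's historic success.
        scale = module_weight.get(name, 1.0)
        for answer, confidence in candidates:
            # Assumption: scores for the same answer are summed.
            combined[answer] += scale * confidence
    # Highest confidence first; ties broken alphabetically.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Sample per-module lists and weights, invented for the example.
lists = {
    "exact": [("east", 1.0)],
    "dictionary": [("east", 0.5), ("easy", 0.5)],
}
weights = {"exact": 0.9, "dictionary": 0.1}

ranked = merge_candidates(lists, weights)
```

With these sample weights, "east" ranks above "easy" because the historically more successful module supports it.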