Original SCOWL and its Artifacts
The original SCOWL (SCOWLv1) was a compilation of the information in the database into a set of simple word lists that can be combined to create speller dictionaries of various sizes and dialects (American, British (both -ise and -ize), Canadian and Australian).
SCOWLv2 instead combines all that information into a single text file and SQLite3 database.
This page contains information on SCOWLv1 and the artifacts from it.
Historic releases are available on SourceForge.net at https://sourceforge.net/projects/wordlist/files/.
Original SCOWL
SCOWL (Spell Checker Oriented Word Lists) and Friends is a database of information on English words useful for creating high-quality word lists suitable for use in spell checkers of most dialects of English. The database primarily contains information on how common a word is, differences in spelling between the dialects of English, spelling variant information, and (basic) part-of-speech and inflection information.
SCOWL itself is a compilation of the information in the database into a set of simple word lists that can be combined to create speller dictionaries of various sizes and dialects (American, British (both -ise and -ize), Canadian and Australian).
View readme. Download Version 2020.12.07 as: tar.gz (Unix EOL), zip (DOS/Windows EOL). Get source.
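As a rough illustration of how the word lists can be combined, the Python sketch below merges every list whose size level is at or below a chosen cutoff. It assumes the lists sit in one directory, are named `<category>.<size>` (for example `english-words.35`), and are encoded as ISO-8859-1; check the readme for the actual layout of the release you download. The function name `combine_word_lists` and its parameters are hypothetical.

```python
from pathlib import Path


def combine_word_lists(list_dir, categories, max_size, encoding="iso-8859-1"):
    """Merge all '<category>.<size>' files with size <= max_size.

    Assumption: file names follow the '<category>.<size>' pattern and
    contain one word per line. Adjust if your release differs.
    """
    words = set()
    for path in Path(list_dir).iterdir():
        if not path.is_file():
            continue
        # Split 'english-words.35' into ('english-words', '.', '35').
        category, _, size = path.name.rpartition(".")
        if category in categories and size.isdigit() and int(size) <= max_size:
            with path.open(encoding=encoding) as handle:
                words.update(line.strip() for line in handle if line.strip())
    return sorted(words)
```

Under those assumptions, a call such as `combine_word_lists("final", {"english-words", "american-words"}, 60)` would return a sorted American list up to size 60. Swapping the category set selects a different dialect.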
VarCon
VarCon (Variant Conversion Info) is a database to convert between American, British (both “ise” and “ize” spellings), Canadian and Australian spellings and vocabulary, as well as a table listing the equivalent forms of other variants.
readme, tar.gz, zip (2020.12.07) source
VarCon is no longer being maintained. Variant information from VarCon is now part of SCOWL.
AGID
AGID is an Automatically Generated Inflection Database from an insanely large word list. My goal is for the non-questionable entries to be 100% accurate.
readme, tar.gz, zip (2016.01.19) source
AGID is no longer being maintained.
Unofficial Jargon File Word Lists
The Unofficial Jargon File Word Lists is a collection of useful word lists created from the Jargon File.
Part Of Speech Database
The Part Of Speech Database is a combination of “Moby (tm) Part-of-Speech II” and the WordNet database.
Please note that this is not a very high quality source. For common words, the 2of12id.txt file from the Alternate 12 Dicts Package is better.
Ispell English Word Lists
This package contains the contents of the Ispell (ver 3.1.20) word list after being expanded from the affix-compressed form used by Ispell.
This package is provided for historical purposes. The Ispell lists are no longer directly used by SCOWL.
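To show what expanding an affix-compressed entry means, here is a toy Python sketch. It assumes entries of the form `root/FLAGS`, where each flag stands for a suffix rule. The rule table `TOY_RULES` and the function `expand_entry` are hypothetical and deliberately incomplete. They are not Ispell's actual affix file, which also covers prefixes and many more conditions.

```python
import re

# Hypothetical, simplified suffix rules: flag -> ordered list of
# (condition regex on the root, text to strip, text to add).
# The first rule whose condition matches is applied.
TOY_RULES = {
    "S": [
        (r"[^aeiou]y$", "y", "ies"),  # party -> parties
        (r"[sxz]$", "", "es"),        # box -> boxes
        (r"", "", "s"),               # fallback: walk -> walks
    ],
    "D": [
        (r"e$", "", "d"),             # bake -> baked
        (r"", "", "ed"),              # fallback: walk -> walked
    ],
}


def expand_entry(entry, rules=TOY_RULES):
    """Return the root plus one derived form per recognised flag."""
    root, _, flags = entry.partition("/")
    forms = [root]
    for flag in flags:
        for condition, strip, add in rules.get(flag, []):
            if re.search(condition, root):
                stem = root[: len(root) - len(strip)] if strip else root
                forms.append(stem + add)
                break
    return forms
```

With these toy rules, `expand_entry("walk/SD")` yields the root followed by its plural and past-tense forms, and an entry with no flags yields only the root.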
Other Sources
- MWords
- UK English Wordlist With Frequency Classification
- ENABLE, YAWL
- UK Advanced Cryptics Dictionary (UKACD)
- 1990 Census report names file
You can also find the source of these lists in the v1 branch.