Spell Checker

Mike C. Fletcher mcfletch at rogers.com
Sat Aug 9 10:20:02 EDT 2003


Jarek Zgoda wrote:
...

>>http://sourceforge.net/projects/pyspelling/
>
>This one doesn't have too much to offer... According to SF: "this
>project did not release any files" and its state is "pre-alpha".
Yup.  I suspended work on it when it became obvious that it would be 
months and months (years, maybe) before Chandler actually needed a 
spell-check engine.  The engine is fully functional (I use it to look up 
words I don't know how to spell all the time), but the focus of the 
project is on having a *toolkit* from which to build spell-checking 
services into applications, rather than having a particular 
spell-checking service.

In particular, it focuses on making it possible to use large numbers of 
potentially dynamic word-sets.  The idea is to allow IDE (or Chandler) 
developers to construct per-document/document-fragment and 
per-project/library/programming-language/user grammars and use them 
all through the same query interface, swapping in grammars as 
appropriate for any given lookup.  The engine can store the grammars 
on-disk (2 or 3 formats) or in-memory, or you can provide your own 
storage mechanism.
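
As a rough illustration of the "many word-sets, one query interface" 
idea (not the project's actual API; WordSet, Checker, check() and the 
sample words are all hypothetical names made up for this sketch), 
swapping grammars in per lookup might look something like this:

```python
class WordSet:
    """Hypothetical in-memory word-set; a real one could live on disk."""

    def __init__(self, name, words):
        self.name = name
        self.words = {w.lower() for w in words}

    def __contains__(self, word):
        return word.lower() in self.words


class Checker:
    """Hypothetical front end: one query interface over many word-sets."""

    def __init__(self, *word_sets):
        self.word_sets = list(word_sets)

    def check(self, word, extra=()):
        """Return the name of the first word-set containing word, else None.

        Word-sets passed in `extra` are consulted for this lookup only,
        before the checker's permanent ones.
        """
        for word_set in list(extra) + self.word_sets:
            if word in word_set:
                return word_set.name
        return None


# Made-up sample data: a base language plus per-language and
# per-document word-sets.
english = WordSet("english", ["the", "spell", "checker"])
python_kw = WordSet("python", ["def", "lambda", "yield"])
document = WordSet("document", ["mcfletch", "pyspelling"])

checker = Checker(english)
checker.check("lambda")                     # None: not an English word
checker.check("lambda", extra=[python_kw])  # "python"
checker.check("PySpelling", extra=[document, python_kw])  # "document"
```

Anything that supports "word in word_set" can be passed in here, which 
is one simple way to leave room for a custom storage mechanism.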

At the moment the project has a fairly good phonetic compression 
algorithm (roughly equivalent to the Aspell one at the time; it was 
constructed by reading the Aspell documentation and produces fairly 
similar results), and can do queries across "normal"-sized on-disk 
grammars fairly quickly (the in-memory ones are extremely fast, of 
course), though the 3/4 million word grammars can be noticeably slow.  
It can read the grammar lists from Aspell dictionaries, btw.
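
To give a feel for what phonetic compression buys you, here is a 
minimal sketch that uses classic Soundex as a stand-in.  This is *not* 
the Aspell-derived algorithm described above, which is considerably 
more elaborate; the function names and the sample word list are 
assumptions made up for the example.  Words are reduced to a short 
sound code, and suggestions are the known words sharing the code of the 
misspelling:

```python
from collections import defaultdict

# Classic Soundex consonant classes: letter -> digit.
CODES = {}
for digit, letters in enumerate(
        ["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word):
    """Return the 4-character Soundex code for word ("" if no letters)."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = [letters[0].upper()]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result.append(code)
        prev = code  # a vowel resets prev, so a repeated code counts again
    return "".join(result)[:4].ljust(4, "0")


def build_index(words):
    """Map each phonetic code to the set of words that compress to it."""
    index = defaultdict(set)
    for word in words:
        index[soundex(word)].add(word)
    return index


def suggest(index, word):
    """Return known words that sound like word, sorted alphabetically."""
    return sorted(index.get(soundex(word), ()))


index = build_index(["noticeably", "notably", "language", "documentation"])
suggest(index, "noticably")  # ['noticeably']
suggest(index, "langauge")   # ['language']
```

A real engine would then rank the candidates (by edit distance, for 
instance) rather than returning everything that shares a code.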

If what you're looking for is an out-of-the-box spell checker, you'll 
probably want to look at one of the aspell/ispell wrappers (e.g. 
snakespell).  They are designed for looking up words in fairly static 
(compiled) word-sets, and for simple "is this similar to any 
English/French/German word" queries they're the easiest thing to manage.  
The Python Spelling Construction Set really is a construction set, not a 
packaged service.  That said, I should probably just run setup.py sdist 
bdist_wininst so that people can check it out without needing a CVS 
checkout.

Enjoy,
Mike

_______________________________________
  Mike C. Fletcher
  Designer, VR Plumber, Coder
  http://members.rogers.com/mcfletch/







