Spell-check engine?

Terry Hancock hancock at anansispaceworks.com
Fri Oct 18 16:57:56 EDT 2002


On Friday 18 October 2002 01:00 pm, python-list-request at python.org wrote:
> Anyone have a fairly standard spell-check engine available for use from
> Python?  That is, something with an interface something like this:

I don't know of any ready-made solutions, and could use one myself, so
if you find something, please post. The obvious approach is a wrapper
for ispell, aspell, or the pspell library, though I don't know how to
write one.  I was able to find a few interesting leads, though:

ispell: http://fmg-www.cs.ucla.edu/geoff/ispell.html

aspell: http://aspell.sourceforge.net

pspell: http://sourceforge.net/project/showfiles.php?group_id=2791
            http://www.fifi.org/doc/libpspell-dev/man-html/manual.html

The last is a library, which suggests using SWIG, Pyrex, or some
other Python extension mechanism to make the C++ library available
to Python.

Certainly these can do the things you want, and I have seen
programs that use the libraries.
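
One low-tech way to wrap these without writing an extension module is to
run the command-line checker in its pipe mode ("ispell -a", which aspell
also accepts) and parse what comes back.  The following is only a sketch
under assumptions: the function names are made up for illustration, the
line format ("*" for a correct word, "& word count offset: suggestions"
for a miss, "# word offset" for a miss with no suggestions) is the one
described in the ispell documentation, and check_words() assumes an
"aspell" binary is on the PATH.

```python
import subprocess


def parse_pipe_line(line):
    """Parse one result line of pipe-mode output.

    Returns (status, word, suggestions), or None for lines that carry
    no result (blank separators, the "@(#)" version banner).
    """
    line = line.rstrip("\n")
    if not line:
        return None
    flag = line[0]
    if flag in ("*", "+", "-"):
        # Word accepted: exact match, affix match, or compound.
        return ("ok", None, [])
    if flag in ("&", "?"):
        # "& word count offset: first, second, ..."
        head, _, tail = line.partition(":")
        word = head.split()[1]
        suggestions = [s.strip() for s in tail.split(",") if s.strip()]
        return ("miss", word, suggestions)
    if flag == "#":
        # "# word offset" means unknown word, nothing to suggest.
        return ("miss", line.split()[1], [])
    return None


def check_words(words, command=("aspell", "-a")):
    """Send one word per line to the checker; return the parsed results.

    Assumes the given command is installed and speaks the pipe protocol.
    """
    # A leading "^" marks the rest of the line as text rather than a command.
    text = "".join("^" + w + "\n" for w in words)
    result = subprocess.run(list(command), input=text,
                            capture_output=True, text=True, check=True)
    parsed = (parse_pipe_line(line) for line in result.stdout.splitlines())
    return [p for p in parsed if p is not None]
```

With a checker installed, check_words(["hello", "helo"]) should give one
tuple per word, in order.  A real wrapper around the pspell library would
avoid the cost of starting a process for each call.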

> I'm interested primarily in the _engine_, I know where to get various
> dictionaries of words, but I don't know of any free fuzzy-match engines
> that give you suggestions given a body of words.  I'd really like the
> ability to have multiple loaded word-sets (with the ability to create
> unions of the sets as seen above), but I can, I suppose, handle that at
> the application level.

Alternatively, if you want a pure-Python approach, I'm pretty sure that
the standard library module "difflib" could be persuaded to fuzzy-match
words.  It fuzzy-matches sequences, and a word can be regarded as
a sequence of characters. E.g.:

>>> l = list('hello')
>>> l
['h', 'e', 'l', 'l', 'o']

But that's a long way from what you need, of course.
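
To get a bit closer, difflib also provides get_close_matches(), which
ranks candidate strings by similarity to a given word.  Here is a minimal
sketch, assuming small in-memory word-sets; the set names, their contents
and the suggest() helper are invented for illustration, and a real
dictionary would be far larger (and a linear scan like this correspondingly
slow).

```python
import difflib

# Hypothetical word-sets; a real application would load these from files.
SYSTEM_WORDS = {"hello", "help", "held", "world", "word", "python"}
USER_WORDS = {"pspell", "aspell", "ispell"}


def suggest(word, words, n=3, cutoff=0.6):
    """Return up to n entries of `words` that resemble `word`, best first.

    A word that is already in the set is returned on its own.
    """
    if word in words:
        return [word]
    # sorted() hands difflib a list in a stable order.
    return difflib.get_close_matches(word, sorted(words), n=n, cutoff=cutoff)


# Multiple loaded word-sets are combined with an ordinary set union.
combined = SYSTEM_WORDS | USER_WORDS

print(suggest("helo", combined))
print(suggest("aspel", combined))
```

The cutoff is a similarity ratio between 0 and 1; raising it gives fewer,
closer suggestions.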

> This is more curiosity than a real project issue (I just downloaded a
> buggy spell-check for Mozilla).  I was just wondering if it would be
> feasible to wrap such a thing in Corba and COM wrappers and make it a
> generic cross-platform OS service so that projects such as Open Office,
> Mozilla and even proprietary software could share the same engine (and
> the same dictionaries).  Would want to provide some management UI stuff
> (control-panel app on Windows, not sure what on Linux) for
> adding/removing word-sets, exporting word-sets for backup, sharing and
> the like, but that'd be pretty simple compared to the services
> themselves.  Might also need policy control for user versus system
> dictionaries I suppose.

Sounds ambitious, but I think a pspell wrapper would probably fill
the bill.  I did a brief Google search without finding such a thing
already written, but it wasn't exhaustive -- you might want to look
a little harder before deciding to implement one.  But it's probably
not *that* hard to implement, either.  I suppose I might attempt it
myself someday -- but I have a big stack of higher priorities to get
through before I could even consider it. Best of luck.

Cheers,
Terry

--
Terry Hancock ( hancock at anansispaceworks.com )
Anansi Spaceworks  http://www.anansispaceworks.com



