Spell-check engine?
Terry Hancock
hancock at anansispaceworks.com
Fri Oct 18 16:57:56 EDT 2002
On Friday 18 October 2002 01:00 pm, python-list-request at python.org wrote:
> Anyone have a fairly standard spell-check engine available for use from
> Python? That is, something with an interface something like this:
I don't know of any ready-made solutions, and could use one myself, so
if you find something, please post. My first thought is a wrapper around
ispell, aspell, or the pspell library, though I don't know how to write
one. I did find a few interesting leads:
ispell: http://fmg-www.cs.ucla.edu/geoff/ispell.html
aspell: http://aspell.sourceforge.net
pspell: http://sourceforge.net/project/showfiles.php?group_id=2791
pspell manual: http://www.fifi.org/doc/libpspell-dev/man-html/manual.html
pspell is a library, which suggests using SWIG, Pyrex, or some other
Python extension mechanism to make the C++ library available to Python.
Certainly these can do the things you want, and I have seen
programs that use the libraries.
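
Short of a real extension module, one cheap way to try this out is to
drive the command-line tool through a pipe. Both ispell and aspell have a
"-a" mode meant for exactly that: you feed words on stdin and get one
result line per word back. What follows is only a rough sketch under
assumptions: the function names are my own invention, it assumes an
"ispell" (or "aspell") binary is on your PATH, and it handles only the
common result lines (*, +, -, &, ?, #).

```python
import subprocess


def parse_ispell_line(line):
    """Parse one result line of `ispell -a` output.

    Returns (is_correct, suggestions), or None for lines that carry no
    verdict (the version banner, blank separator lines).
    """
    line = line.rstrip("\n")
    if not line:
        return None
    flag = line[0]
    if flag in "*+-":
        # '*' found in dictionary, '+' found via a root word, '-' compound.
        return True, []
    if flag in "&?":
        # "& original count offset: guess1, guess2, ..."
        _, _, tail = line.partition(": ")
        return False, [s.strip() for s in tail.split(",") if s.strip()]
    if flag == "#":
        # "# original offset": misspelled, no suggestions offered.
        return False, []
    return None


def check_word(word, command="ispell"):
    """Ask an external ispell-compatible checker about a single word.

    `command` is an assumption; pass "aspell" if that is what you have.
    """
    proc = subprocess.run(
        [command, "-a"],
        # The leading '^' tells the checker to treat the rest of the line
        # as text even if it begins with a command character.
        input="^" + word + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    for line in proc.stdout.splitlines():
        result = parse_ispell_line(line)
        if result is not None:
            return result
    return None
```

Starting a new process per word is slow, so a real wrapper would keep
one process open and stream words to it, but this is enough to see
whether the suggestions are any good.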
> I'm interested primarily in the _engine_, I know where to get various
> dictionaries of words, but I don't know of any free fuzzy-match engines
> that give you suggestions given a body of words. I'd really like the
> ability to have multiple loaded word-sets (with the ability to create
> unions of the sets as seen above), but I can, I suppose, handle that at
> the application level.
Alternatively, if you want a pure-python approach, I'm pretty sure that
the standard library module "difflib" could be persuaded to fuzzy-match
words. It will fuzzy-match sequences, and a word can be regarded as
a sequence of characters. E.g.:
>>> l = list('hello')
>>> l
['h', 'e', 'l', 'l', 'o']
But that's a long way from what you need, of course.
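
To get a bit closer: difflib has a get_close_matches() function that takes
a word and a list of candidates and returns the best fuzzy matches, and
since strings are already sequences there is no need to convert them to
lists first. Here is a minimal sketch of a suggestion function built on
it. The suggest() name, the tiny word sets, and the choice to merge
several word sets with a set union are all just my illustration of how
the "multiple loaded word-sets" idea might be handled at the application
level, not a tested design.

```python
import difflib


def suggest(word, *word_sets, n=3, cutoff=0.6):
    """Return up to n close matches for word from the union of word_sets.

    An empty list means either the word is already known or nothing in
    the vocabulary is similar enough (similarity ratio below cutoff).
    """
    vocabulary = set().union(*word_sets)
    if word in vocabulary:
        return []
    # Sort so the candidate order (and tie-breaking) is reproducible.
    return difflib.get_close_matches(word, sorted(vocabulary), n=n, cutoff=cutoff)


# Hypothetical word sets standing in for real dictionaries.
english = {"hello", "help", "world"}
jargon = {"python", "difflib"}

print(suggest("helo", english, jargon))
print(suggest("pyhton", english, jargon))
```

This compares the word against every entry in the vocabulary, so it will
crawl on a full-size dictionary; a real engine would want some kind of
index to narrow the candidates first.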
> This is more curiosity than a real project issue (I just downloaded a
> buggy spell-check for Mozilla). I was just wondering if it would be
> feasible to wrap such a thing in Corba and COM wrappers and make it a
> generic cross-platform OS service so that projects such as Open Office,
> Mozilla and even proprietary software could share the same engine (and
> the same dictionaries). Would want to provide some management UI stuff
> (control-panel app on Windows, not sure what on Linux) for
> adding/removing word-sets, exporting word-sets for backup, sharing and
> the like, but that'd be pretty simple compared to the services
> themselves. Might also need policy control for user versus system
> dictionaries I suppose.
Sounds ambitious, but I think a pspell wrapper would probably fill
the bill. I did a brief Google search without finding such a thing
already written, but it wasn't exhaustive -- you might want to look
a little harder before deciding to implement one. But it's probably
not *that* hard to implement, either. I suppose I might attempt it
myself someday -- but I have a big stack of higher priorities to get
through before I could even consider it. Best of luck.
Cheers,
Terry
--
Terry Hancock ( hancock at anansispaceworks.com )
Anansi Spaceworks http://www.anansispaceworks.com