[Tutor] spell/grammar checking

Alan Trautman ATrautman@perryjudds.com
Tue May 20 17:05:02 2003


Peter,

It's not just a list of words you need, and a list is something you pretty
much have to pay for, either in lost time or by buying a dictionary file.
The other technique is to parse several long, well-typed manuscripts and
load every distinct word into your dictionary. This can be very effective
for building a specialized checker for a limited field (medical specialties
and law come to mind). However, if this is a learning experience, the
formulas and parsing techniques for selecting and/or indexing related words
are really interesting. Working out the suggestions is interesting too: you
can't just look for the words with the fewest letter differences and expect
them to be useful. There are common misspellings to consider, etc. The ideal
is context-sensitive correction, with suggestions based not only on common
errors and letter closeness but also on related words.
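
As a rough illustration of the manuscript-parsing idea, here is a minimal
sketch. The function name build_dictionary and the sample sentences are made
up for this example, and the regular expression is only one simple guess at
what counts as a word.

```python
import re

def build_dictionary(texts):
    """Collect the distinct lowercase words found in an iterable of strings."""
    words = set()
    for text in texts:
        # Treat runs of letters, optionally joined by apostrophes, as words.
        words.update(re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower()))
    return words

# Made-up sample text standing in for the long manuscripts.
sample = ["The patient's femur was fractured.", "The femur is the thigh bone."]
dictionary = build_dictionary(sample)
print(sorted(dictionary))
```

A set drops the duplicates for free, so the same function works whether you
feed it two sentences or several whole files read in as strings.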

In short, your dictionary doesn't need a lot of words for this to be a great
project. To write an interesting program you need several words that are
extremely close under all the conditions above, plus any more conditions I'm
sure you can come up with. You can also use Word or the M-W site to find
good clusters of words to add to your checker.
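
One possible way to combine two of those conditions, a table of common
misspellings plus letter closeness, is sketched below. The COMMON_MISSPELLINGS
table and the suggest function are hypothetical names invented for this
example; the closeness ranking comes from difflib.get_close_matches in the
standard library, which is just one of many similarity measures you could
try. Context and related words are not handled here.

```python
import difflib

# Hypothetical table of known misspellings; a real one would be much larger.
COMMON_MISSPELLINGS = {"teh": "the", "recieve": "receive"}

def suggest(word, dictionary, limit=3):
    """Return up to `limit` suggestions for a word missing from the dictionary."""
    word = word.lower()
    if word in dictionary:
        return []  # spelled correctly, nothing to suggest
    suggestions = []
    # A known common misspelling goes first, ahead of letter closeness.
    known = COMMON_MISSPELLINGS.get(word)
    if known in dictionary:
        suggestions.append(known)
    # Then add the dictionary words that difflib scores as most similar.
    close = difflib.get_close_matches(word, sorted(dictionary), n=limit, cutoff=0.6)
    for candidate in close:
        if candidate not in suggestions:
            suggestions.append(candidate)
    return suggestions[:limit]

words = {"the", "then", "receive", "femur"}
print(suggest("teh", words))
print(suggest("femer", words))
```

Putting the known misspellings first is a design choice, not a rule: it
reflects the point that the closest word by letters is not always the most
useful suggestion.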

HTH, and I hope it keeps you interested. Fuzzy logic and limited AI are
favorite subjects of mine.

Alan

Hi,

I've been playing around with the guessing game and word counting case 
studies in Alan Gauld's book, expanding them, and I am wondering where 
people get the words for their spell checkers. Do programmers sit down and 
actually write huge dictionary files, or are there things like that 
available online? Thanks, Peter


_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor