Keeping python code and database in sync

Chris Angelico rosuav at gmail.com
Fri Aug 29 22:28:54 EDT 2014


On Sat, Aug 30, 2014 at 12:14 PM, Skip Montanaro <skip at pobox.com> wrote:
> Yes, "words" are skipped if they contain anything other than lower
> case alphabetic characters. Really simple words = text.split(), then
> discard words not meeting the criteria.

Easy way to catch a few more: just .strip() a few common items of
punctuation (quotes of all types, full stop, comma, brackets of all
types, etc.) off each word before testing it. If any punctuation is
left inside the word, discard the word; punctuation at one end or the
other isn't a problem.
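
Something like this, as a rough sketch rather than a drop-in for your
code. The names STRIP_CHARS and simple_words are made up for
illustration, and I'm assuming "lower case alphabetic" means ASCII a-z
and that string.punctuation plus the four curly quotes is a reasonable
set to strip:

```python
import string

# Assumption: ASCII punctuation plus the four common curly quotes.
STRIP_CHARS = string.punctuation + "\u2018\u2019\u201c\u201d"


def simple_words(text):
    """Return the whitespace-separated words that consist solely of
    lowercase ASCII letters once punctuation at either end is stripped."""
    result = []
    for raw in text.split():
        # strip() only removes characters from the two ends of the word.
        word = raw.strip(STRIP_CHARS)
        # Anything left inside the word (apostrophe, hyphen, digit,
        # capital letter) means the word is discarded.
        if word and all(c in string.ascii_lowercase for c in word):
            result.append(word)
    return result


if __name__ == "__main__":
    print(simple_words('He said "hello, world" (twice).'))
```

Words like "don't" or "e.g." still get thrown out, since the
punctuation is inside the word rather than at an end.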

ChrisA


