Comparing two book chapters (text files)

Tino Wildenhain tino at wildenhain.de
Thu Feb 5 05:21:44 EST 2009


andrew cooke wrote:
> On Feb 4, 10:20 pm, Nick Matzke <mat... at berkeley.edu> wrote:
>> So I have an interesting challenge.  I want to compare two book
>> chapters, which I have in plain text format, and find out (a) percentage
>> similarity and (b) what has changed.
> 
> no idea if it will help, but i found this yesterday - http://www.nltk.org/
> 
> it's a python toolkit for natural language processing.  there's a book
> at http://www.nltk.org/book with much more info.

There is also difflib in the standard library, which can be used
depending on your exact definition of "similarity".
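
A rough sketch of how that might look, assuming "similarity" means the
share of matching lines between the two files. The function name
compare_chapters and the sample strings are made up for illustration;
SequenceMatcher and unified_diff are the actual difflib calls.

```python
import difflib


def compare_chapters(old_text, new_text):
    """Return (percent_similarity, diff_lines) for two plain-text chapters.

    Hypothetical helper: compares line by line, which is only one
    possible definition of "similarity".
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()

    # ratio() is 2*M/T: M = number of matching lines,
    # T = total number of lines in both sequences.
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    percent = 100.0 * matcher.ratio()

    # unified_diff() yields the changed lines prefixed with "-" and "+".
    diff_lines = list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile="old", tofile="new", lineterm=""))
    return percent, diff_lines


# Made-up sample data standing in for the two chapter files.
old = "It was a dark night.\nThe rain fell.\nNobody came.\n"
new = "It was a dark night.\nThe rain fell hard.\nNobody came.\n"

percent, diff = compare_chapters(old, new)
print("%.1f%% similar" % percent)
for line in diff:
    print(line)
```

Comparing words instead of lines (split() rather than splitlines())
gives a finer-grained measure, at the cost of more work on long chapters.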

Regards
Tino

