converting file formats to txt

BartlebyScrivener rpdooling at gmail.com
Tue Jul 4 20:48:56 EDT 2006


I suspect you will have to process those formats separately. But the
good news, at least for doc files, is that there is a recipe in the
Python Cookbook, 2nd edition, that does what you want for MS Word docs
and another recipe that does it for OpenOffice docs.

The recipes are 2.26 and 2.27, on pages 101-102.

I think you can probably find them at the ActiveState repository also.

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/279003

In the book, the title of the Word recipe is "Extracting Text from
Microsoft Word Documents".

It uses the PyWin32 extension and COM to perform the conversion, so it
needs Windows with Word installed.
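
To give a rough idea of the COM approach, here is a minimal sketch. It is
not the Cookbook recipe itself, just my own illustration of the same
technique. The function names txt_path_for and word_to_text are made up
for this example, and it assumes Windows, an installed copy of Word, and
PyWin32. The value 2 is Word's wdFormatText constant.

```python
import os

WD_FORMAT_TEXT = 2  # Word's wdFormatText: save as plain text


def txt_path_for(doc_path):
    """Return the absolute .txt path that sits next to doc_path."""
    base, _ext = os.path.splitext(os.path.abspath(doc_path))
    return base + ".txt"


def word_to_text(doc_path):
    """Save a Word document as plain text via COM (Windows + Word only)."""
    # Imported here so the module still loads where PyWin32 is missing.
    import win32com.client

    out_path = txt_path_for(doc_path)
    word = win32com.client.Dispatch("Word.Application")
    try:
        # Word wants absolute paths; relative ones resolve unpredictably.
        doc = word.Documents.Open(os.path.abspath(doc_path))
        try:
            doc.SaveAs(out_path, WD_FORMAT_TEXT)
        finally:
            doc.Close(False)  # close without prompting to save changes
    finally:
        word.Quit()  # always shut down the Word process we started
    return out_path
```

Calling word_to_text("report.doc") should leave a report.txt next to the
original. Starting Word is slow, so for a whole directory you would
probably want to open Word once and loop over the files inside the try
block rather than calling this function per file.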

rd



