using python to edit a word file?

John Salerno johnjsal at NOSPAMgmail.com
Fri Aug 11 10:08:16 EDT 2006


Anthra Norell wrote:
> John,
> 
>       I have a notion about translating stuff in a mess and could help you with the translation. But it may be that the conversion
> from DOC to formatted text is a bigger problem. Loading the files into Word and saving them in a different format may not be a
> practical option if you have many files to do. Googling for batch DOC-to-RTF converters, I couldn't find anything.
>       If you can solve the conversion problem, pass me a sample file. I'll solve the translation problem for you.
> 
> Frederic

What I ended up doing was just saving the Word file as an XML file and
then writing a little script to process that file as plain text. When it
is opened back in Word, all the formatting remains. The script isn't
ideal, but it did the bulk of changing the numbers, and then I fixed a
few things by hand. I love having Python for these chores! :)



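import re

# Read the calendar that Word saved as XML.
with open('calendar.xml') as xml_file:
    xml_data = xml_file.read()

# A text run whose entire content is a number, e.g. <w:t>15</w:t>.
pattern = re.compile(r'<w:t>(\d+)</w:t>')

def subtract(match_obj):
    # Move the date back by one and rebuild the tag around it.
    date = int(match_obj.group(1)) - 1
    return '<w:t>%s</w:t>' % date

new_data = pattern.sub(subtract, xml_data)

# Write to a new file so the original stays untouched.
with open('calendar2007.xml', 'w') as new_file:
    new_file.write(new_data)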
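
To check the substitution before touching the real file, the same idea can
be tried on a short string. This is only a sketch: the function name
shift_numbers, the delta parameter, and the sample XML are made up for
illustration and are not part of the script above. Like the script, it
only matches a bare <w:t> tag whose entire text is digits.

```python
import re

# Matches a Word text run that contains nothing but digits, e.g. <w:t>15</w:t>.
NUMBER_RUN = re.compile(r'<w:t>(\d+)</w:t>')

def shift_numbers(xml_text, delta=-1):
    """Return xml_text with every all-digit <w:t> run shifted by delta."""
    def replace(match_obj):
        # Hypothetical helper: rebuild the tag around the shifted number.
        return '<w:t>%d</w:t>' % (int(match_obj.group(1)) + delta)
    return NUMBER_RUN.sub(replace, xml_text)

# Made-up sample: two numeric runs and one text run that must be left alone.
sample = ('<w:r><w:t>15</w:t></w:r>'
          '<w:r><w:t>Monday</w:t></w:r>'
          '<w:r><w:t>1</w:t></w:r>')
result = shift_numbers(sample)
print(result)
```

In this sample the 15 becomes 14 and "Monday" is untouched, but the 1
becomes 0, which is the sort of case that still has to be fixed by hand.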


