can't open word document after string replacements

Bruno Desthuilliers onurb at xiludom.gro
Tue Oct 24 04:41:26 EDT 2006


Antoine De Groote wrote:
> Hi there,
> 
> I have a Word document containing pictures and text. This document
> holds several 'ABCDEF' strings which serve as placeholders for names.
> Now I want to replace these occurrences with names in a list (members).

Do you know that MS Word already provides this kind of feature?

> I
> open both input and output file in binary mode and do the
> transformation. However, I can't open the resulting file, Word just
> tells me that there was an error. Does anybody know what I am doing wrong?

Hand-editing an undocumented binary format may lead to undesirable
results...

> Oh, and is this approach pythonic anyway? 

The pythonic approach is usually to start looking for existing
solutions... In this case, using Word's builtin features and Python/COM
integration would be a better choice IMHO.
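
A rough sketch of what the Python/COM route could look like, assuming
Windows, an installed copy of Word and the pywin32 package. The function
name fill_placeholders and its parameters are made up for illustration;
the two numeric constants are Word's wdFindStop and wdReplaceOne values,
hard-coded here. Untested against any particular Word version, so treat
it as a starting point only:

```python
# Word constants, hard-coded so the sketch does not depend on generated
# type-library wrappers.
WD_FIND_STOP = 0    # wdFindStop: do not wrap around at the end of the document
WD_REPLACE_ONE = 1  # wdReplaceOne: replace only the first match


def fill_placeholders(path, out_path, members, placeholder="ABCDEF"):
    """Let Word replace successive placeholders with successive names.

    Hypothetical helper: path and out_path should be absolute paths.
    """
    # Imported lazily because pywin32 only exists on Windows.
    import win32com.client

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        doc = word.Documents.Open(path)
        for name in members:
            # A fresh Find object searches from the start of the document,
            # so each call replaces the first placeholder still present.
            find = doc.Content.Find
            found = find.Execute(FindText=placeholder, Forward=True,
                                 Wrap=WD_FIND_STOP, ReplaceWith=name,
                                 Replace=WD_REPLACE_ONE)
            if not found:
                break  # no placeholder left
        doc.SaveAs(out_path)
        doc.Close()
    finally:
        word.Quit()
```

Word does the editing here, so pictures, formatting and the internal
file structure stay consistent. Placeholders left over once the names
run out are not touched by this sketch.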

> (I have a strong Java
> background.)

Nobody's perfect !-)

> Regards,
> antoine
> 
> 
> import os
> 
> members = somelist
> 
> os.chdir(somefolder)
> 
> doc = file('ttt.doc', 'rb')
> docout = file('ttt1.doc', 'wb')
> 
> counter = 0
> 
> for line in doc:

Since you opened the file in binary mode, iterating over it line by line
makes little sense; use file.read() instead. Ever wondered what your
'lines' look like ?-)
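
To see what those 'lines' are, here is a small illustration on a made-up
byte string (not real .doc content), using io.BytesIO as a stand-in for
a file opened in 'rb' mode. Written so that it also runs on current
Python versions:

```python
import io

# Made-up binary payload; the byte 0x0A happens to occur three times in it.
data = b"\x01\x02\x0aABCDEF\x00\x0a\x0a\x7f\xff"

# Iterating over a binary file cuts the data after every 0x0A byte,
# whether or not that byte was ever meant as a line ending.
lines = list(io.BytesIO(data))
for chunk in lines:
    print(repr(chunk))

# Reading the content in one go returns it unchanged, in one piece.
whole = io.BytesIO(data).read()
```

The four chunks printed above are arbitrary slices of the data, not
lines of text.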

>     while line.find('ABCDEF') > -1:

.doc is a binary format. You may find such a byte sequence in its
content in places that are *not* text content.
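
A related hazard, shown on a toy format invented for this example (a
4-byte length field followed by the payload; the real .doc layout is far
more involved): a byte-level replace knows nothing about the structure
around the text, so changing the length of the text can invalidate the
bookkeeping stored next to it. The names pack_record and unpack_record
are hypothetical:

```python
import struct


def pack_record(payload):
    """Toy format: little-endian 4-byte length, then the payload bytes."""
    return struct.pack("<I", len(payload)) + payload


def unpack_record(blob):
    """Return the payload, or raise ValueError if the length field is off."""
    (length,) = struct.unpack_from("<I", blob, 0)
    if len(blob) != 4 + length:
        raise ValueError("length field says %d, found %d payload bytes"
                         % (length, len(blob) - 4))
    return blob[4:]


original = pack_record(b"Dear ABCDEF,")

# Blind replacement: the payload grows, the length field stays the same.
patched = original.replace(b"ABCDEF", b"Antoine De Groote")

try:
    unpack_record(patched)
    patched_is_valid = True
except ValueError:
    patched_is_valid = False

print(patched_is_valid)
```

The original record unpacks fine; the patched one is rejected because
its length field no longer matches its content.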

>         try:
>             line = line.replace('ABCDEF', members[counter], 1)
>             docout.write(line)

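You're writing the whole chunk to the output on each iteration of the
while loop, so a chunk holding several placeholders ends up in the
output several times. No surprise the resulting document is corrupted.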
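
A small reproduction of the duplication on a plain byte string, with
io.BytesIO standing in for the output file. buggy_fill mimics the quoted
loop (including its else clause, which also runs when the while loop
ends normally); fixed_fill is one possible correction that replaces
first and leaves the single write to the caller. Both names are made up
for this example, and the fix only addresses the duplication, not the
binary-format problem:

```python
import io


def buggy_fill(line, members):
    """Mimic the quoted loop: the write happens inside the while loop."""
    out = io.BytesIO()
    counter = 0
    while line.find(b"ABCDEF") > -1:
        line = line.replace(b"ABCDEF", members[counter], 1)
        out.write(line)  # written once per placeholder
        counter += 1
    else:
        out.write(line)  # runs after the loop too, so one more copy
    return out.getvalue()


def fixed_fill(line, members):
    """Replace every placeholder first; the caller writes the result once."""
    names = iter(members)
    while b"ABCDEF" in line:
        # Placeholders left over once the names run out become empty.
        line = line.replace(b"ABCDEF", next(names, b""), 1)
    return line


line = b"ABCDEF and ABCDEF\n"
members = [b"Ann", b"Bob"]

buggy = buggy_fill(line, members)
fixed = fixed_fill(line, members)
print(repr(buggy))
print(repr(fixed))
```

With two placeholders on one line the buggy version emits that line
three times, the first copy still holding an unreplaced placeholder.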

>             counter += 1

No need to maintain a counter by hand, enumerate() does it for you:

seq = list("abcd")
for index, item in enumerate(seq):
    print("%02d : %s" % (index, item))


>         except:
>             docout.write(line.replace('ABCDEF', '', 1))
>     else:
>         docout.write(line)
> 
> doc.close()
> docout.close()
> 



-- 
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
p in 'onurb at xiludom.gro'.split('@')])"


