Editing MS Word
tonylabarbara at aol.com
Fri Oct 26 15:49:06 EDT 2007
Hi;
I'm trying to edit MS Word tables with a python script. Here's a snippet:

    import string

    def msw2htmlTables():
        input = "/usr/home/me/test.doc"
        input = open(input,'r')
        word = "whatever"
        inputFlag = 0
        splitString = []
        for line in input:
            # Check first the inputFlag, since we only want to delete the top
            if inputFlag == 0:
                splitString = line.split(word)
                try:
                    keep = splitString[1]
                except:
                    keep = "nada"
                print len(splitString)
                inputFlag = 1
            elif inputFlag == 1:
                # This means we've deleted the top junk. Let's search for the bottom junk.
                splitString = line.split(word)
                try:
                    keep = splitString[0]
                    inputFlag = 2
                    print len(splitString)
                except:
                    keep += line
            elif inputFlag == 2:
                # This means everything else is junk.
                pass

Now, if the variable "word" is "orange", the script never prints the length of splitString. If it's "dark", it does. The only difference is how the two words appear in the document: "orange" has a space character to its left and some MS garbage character to its right, while "dark" has a space character to its left and a comma to its right.

Furthermore, if I use MSW junk characters as the definition of "word" (such as " Ù ", which is what I really need to search for), the script never even compiles (it complains of an unpaired quote). It appears that python doesn't like MSW's junk characters. What shall I do?
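A minimal sketch of one way such a split might be attempted, not a tested fix for real .doc files. It assumes the data is read as raw bytes (as open(path, 'rb') would return it), that the junk character is the single byte 0xD9 (which is "Ù" in Latin-1; the byte actually stored in the document may differ), and that Python 3 is used. The helper name split_on_marker and the sample data are hypothetical. Spelling the marker as an escape keeps the script itself pure ASCII, so the junk character never has to be typed inside a quoted string.

```python
def split_on_marker(data, marker):
    """Return the pieces of `data` (bytes) that are separated by `marker` (bytes)."""
    return data.split(marker)


# Hypothetical sample standing in for a chunk of a .doc file read in binary mode.
sample = b"top junk \xd9 table text \xd9 bottom junk"

# Assumption: the junk character is the byte 0xD9, with a space on either side.
marker = b" \xd9 "

pieces = split_on_marker(sample, marker)
print(len(pieces))  # 3
print(pieces[1])    # the part between the two markers

# When the marker is absent, split() returns a one-element list,
# so indexing [1] on the result would raise IndexError.
print(len(split_on_marker(b"no marker here", marker)))  # 1
```

Note that split() only matches the exact byte sequence, so if the bytes around the word in the file differ from what is displayed on screen, the marker will not be found.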
TIA,
Tony