Analysing Word documents (slow) What's wrong with this code please!

jmdeschamps jmdeschamps at cvm.qc.ca
Fri Jan 16 16:51:35 EST 2004


Does anyone have a hint on how else to get faster results?
(The goal is to find out what was bold in the document, in order to take
documents produced in Word and generate HTML (web pages) and XML
(straight data) versions.)

# START ========================
import win32com.client
import tkFileDialog, time

# Launch Word
MSWord = win32com.client.Dispatch("Word.Application")

myWordDoc = tkFileDialog.askopenfilename()

MSWord.Documents.Open(myWordDoc)

boldRanges=[]  #list of bold ranges
boldStart = -1
boldEnd = -1
t1= time.clock()
for i in range(len(MSWord.Documents[0].Content.Text)):
    if MSWord.Documents[0].Range(i,i+1).Bold:  # testing for bold property
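        if boldStart == -1:
            boldStart = i   # first character of a new bold run
        boldEnd = i         # last bold character seen so far (inclusive)
    else:
        if boldStart != -1:
            # a bold run just ended; this also catches one-character runs
            boldRanges.append((boldStart, boldEnd))
            boldStart = -1
            boldEnd = -1
if boldStart != -1:
    # the document ended while still inside a bold run
    boldRanges.append((boldStart, boldEnd))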
t2 = time.clock()
MSWord.Quit()

print boldRanges  #see what we got
print "Analysed in ",t2-t1
# END =====================================
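
For what it's worth, the run-grouping part of the loop above can be pulled
out and checked without Word at all. This is only a minimal sketch:
bold_runs is a made-up helper name (not part of win32com or Word), and it
assumes one truthy/falsy flag per character, which is what Range(i,i+1).Bold
appears to give for a single character. It does not by itself make anything
faster; the per-character COM calls are probably where the time goes.

```python
def bold_runs(flags):
    """Group per-character bold flags into inclusive (start, end) pairs.

    `flags` is any iterable of truthy/falsy values, one per character.
    (Hypothetical helper for illustration; not a win32com or Word API.)
    """
    runs = []
    start = -1   # index where the current bold run began, -1 if none
    end = -1     # index of the last bold character seen in that run
    for i, is_bold in enumerate(flags):
        if is_bold:
            if start == -1:
                start = i
            end = i
        elif start != -1:
            # a bold run just ended (one-character runs included)
            runs.append((start, end))
            start = -1
    if start != -1:
        # input ended while still inside a bold run
        runs.append((start, end))
    return runs
```

With the document fetched once, e.g. doc = MSWord.Documents[0], it could be
fed as bold_runs(doc.Range(i, i+1).Bold for i in range(n)), where n is the
length of the document text.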

Thanks in advance


