[python-win32] Compiling a Python Windows application
Bob Gailer
bgailer at alum.rpi.edu
Mon Nov 27 23:56:53 CET 2006
Bokverket wrote:
> I did program a lot in VB's earlier versions, but it has grown... My reason
> for not considering VB was that the actual processing would make excellent
> use of the Python collection objects /dictionaries/, which in my mind would
> hold words of the Microsoft Word document.
Are you aware that VBA with the Scripting Runtime offers a Dictionary
object almost identical to Python's dict? Here is a VB Sub that counts
the words in the document:
Sub test()
    Dim d As Document, w As Variant, w2 As String
    Dim dict As Object
    Set dict = CreateObject("Scripting.Dictionary")
    Set d = Documents(1)
    For Each w In d.Range().Words
        w2 = Trim(w)
        If dict.Exists(w2) Then
            dict(w2) = dict(w2) + 1
        Else
            dict(w2) = 1   ' first occurrence counts as 1
        End If
    Next
End Sub
For my test document (about 21000 occurrences of 120 words) this took
about 5 seconds. The Python equivalent takes 0.15 seconds.
>>> import time
>>> import win32com.client
>>> a = win32com.client.Dispatch("word.application")
>>> d = a.Documents(1)
>>> # wrap the process to get the text from the document, split it into
>>> # words and build the dictionary
>>> def f():
...     t = time.time()
...     s = d.Range().Text
...     w = s.split()
...     wd = {}
...     for i in w:
...         wd[i] = wd.setdefault(i, 0) + 1
...     print(time.time() - t)
...
>>> f()
0.15700006485
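The counting step itself does not depend on Word, so it can be tried on any string. Here is a minimal sketch, assuming a hypothetical helper named count_words (my own name, not part of pywin32) and a synthetic string standing in for d.Range().Text:

```python
import time

def count_words(text):
    """Return a dict mapping each whitespace-separated word to its count."""
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts

# Synthetic stand-in for d.Range().Text (an assumption, not real Word output):
# 21000 occurrences of 120 distinct words.
sample = " ".join("word%d" % (i % 120) for i in range(21000))

start = time.time()
counts = count_words(sample)
elapsed = time.time() - start
print("%d distinct words, %d occurrences, %.3f seconds"
      % (len(counts), sum(counts.values()), elapsed))
```

Keeping the counting in its own function means the only COM call is the single read of .Text, and the rest can be tested without Word running.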
> (The app's purpose is to analyze words of possibly very large Word
> documents.) Plus I suppose that a macro which would loop with a few lines
> over each word of the doc will be slow, although I don't know if there is a
> compiling or byte-code mechanism. Am I wrong?
>
> I don't know if having a VB as glue to shelling Python is perfectly fine
> performance-wise, and it certainly would be a simple way to handle the
> dialog boxes that collect the parameters. Maybe that is a much better way
> than wondering about calling Python /shelling, calling a DLL, whatever/
> directly.
>
> Next question: Is Microsoft Word's API for Python published like for VB and
> easy to use?
>
Word has one API. It is what is published for VB. Your Python program
would use win32com.client to launch Word as a COM server, then interact
with it the same as a VB program (well, almost the same). For this you
need pywin32 http://sourceforge.net/projects/pywin32/.
import win32com.client
application = win32com.client.Dispatch("word.application")
# application is the same as the Application object you see at the top
# of the Word DOM in VB.
document = application.Documents.Add()  # creates a new document, OR use
# .Documents.Open(filename) to open an existing document.
# OR, if Word is already running, you can access an existing document
# using .Documents(indexno OR name)
# How is this different from VB? Objects do not have default properties; you
# must be explicit. There is no Set statement. Functions and subs must have
# the () appended.
Hope that's enough to get you started.
Since your goal seems to be text processing, I'd think you'd want to read
the entire document text into a Python string, then manipulate that.
    text = document.Range().Text
will get all the text of the document body (excluding headers and
footers). Note that paragraph breaks are \r, and that table cells end in
\r\x07.
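Those markers are worth cleaning up before splitting into words: str.split() treats \r as whitespace but not \x07, so the cell marker would otherwise stay attached to the following word. Here is a minimal sketch, assuming a hypothetical helper named normalize_word_text (my own name) and a hand-made sample string rather than real Word output:

```python
def normalize_word_text(text):
    """Convert the \\r and \\r\\x07 markers in Word's Range.Text to plain newlines."""
    # Table cells end in \r\x07; drop the \x07 marker first.
    text = text.replace("\x07", "")
    # Paragraph breaks are \r; turn them into \n.
    return text.replace("\r", "\n")

# Hand-made sample: one paragraph followed by two table cells.
sample = "First paragraph.\rcell one\r\x07cell two\r\x07"
cleaned = normalize_word_text(sample)
print(repr(cleaned))
print(cleaned.split())
```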
--
Bob Gailer
510-978-4454