[python-win32] can't read WORD document's header correctly
oyster
lepto.python at gmail.com
Sat Jun 27 00:30:05 EDT 2020
Hi, all
p.s. The script and test.docx can be found on
https://github.com/retsyo/read_word_header_win32com .
I have a [`DOCX`
file](https://github.com/retsyo/read_word_header_win32com), which has
the following info
| | pages of section | header
|
| ---------- | ------------------ |
------------------------------------------------- |
| section 1 | 1 | None
|
| section 2 | 2 | Abstract. current page ??,
total IV pages |
| section 3 | 5 | paper body, current page ??,
total 35 pages |
Then if the line `oSec.Headers(1).Range.Fields.Update` is used in the
following `VBA` code, the corrected header text will be shown by
calling `displayHeader`
```vb
Function myTrim(s)
a = Replace(s, vbLf, "")
myTrim = Trim(a)
End Function
Sub displayHeader()
idx = 1
For Each oSec In ActiveDocument.Sections
oSec.Headers(1).Range.Fields.Update 'this line must be called
MsgBox "sec " & idx & " " & myTrim(oSec.Headers(1).Range.Text)
idx = idx + 1
Next
End Sub
```
Then I coined the `Python` version, as we all know it looks like the
original `VBA` one
```python
import win32com
from win32com.client import Dispatch, constants
#~ word = win32com.client.Dispatch('Word.Application')
word = win32com.client.gencache.EnsureDispatch('Word.Application')
word.Visible = 1
word.DisplayAlerts = 0
word.Documents.Open('r:/test.docx')
for idx, oSec in enumerate(word.ActiveDocument.Sections):
#~ oSec.Headers(1).Range.Fields.Update()
print(f'sec {idx+1}', oSec.Headers(1).Range.Text.strip())
word.Documents.Close(constants.wdDoNotSaveChanges)
word.Quit()
```
However, the `Python` code does not give the same corrected header
text no matter I use `Dispatch('Word.Application')` or
`gencache.EnsureDispatch('Word.Application') `, and no matter I use
`Range.Fields.Update()` or not. You can read a much well-presented
version on https://github.com/retsyo/read_word_header_win32com, but in
one word, I can't get the expected result with any of the 4 different
approaches.
So, what is the problem, and how to fix it? Thank you in advance.
More information about the python-win32
mailing list