about Python doc reader

norseman norseman at hughes.net
Thu May 14 12:23:15 EDT 2009


Tim Golden wrote:
> norseman wrote:
>> I did try these.
>>
>> Doc at once:
>> outputs two x'0D' and the file.  Then it appends x'0D' x'0D' x'0A' 
>> x'0D' x'0A' to end of file even though source file itself has no EOL.
>> ( EOL is EndOfLine  aka newline )
>>
>> That's  cr cr             There are two blank lines at beginning.
>>         cr cr lf cr lf    There is no EOL in source
>>                           Any idea what those are about?
>> One crlf is probably from python's print text, but the other?
>>
>> The lines=
>> appends   [u'\r', u'\r', u"  to beginning of output
>> and   \r"]x'0D'x'0A'   to the end even though there is no EOL in source.
>>
>> output is understood:    u'\r'  is Apple EOL
>> the crlf is probably from print lines.
> 
> Not clear what you're doing to get there. This is the (wrapped) output 
> from my interpreter, using Word 2003. As you can see, new
> doc: one "\r", nothing more.
> 
> <dump>
> Python 2.6.1 (r261:67517, Dec  4 2008, 16:51:00) [MSC v.1500 32 bit 
> (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> import win32com.client
>  >>> word = win32com.client.gencache.EnsureDispatch ("Word.Application")
>  >>> doc = word.Documents.Add ()
>  >>> print repr (doc.Range ().Text)
> u'\r'
>  >>>
> 
> </dump>
> 
==============
The original "do it this way" snippets were:
<code>
import win32com.client

doc = win32com.client.GetObject ("c:/temp/temp.doc")
text = doc.Range ().Text

</code>

Note that this will give you a unicode object with \r line-delimiters.
You could read para by para if that were more useful:

<code>
import win32com.client

doc = win32com.client.GetObject ("c:/temp/temp.doc")
lines = [p.Range () for p in doc.Paragraphs]

</code>
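
Since the text comes back with \r delimiters, here is a minimal sketch of
splitting it into paragraphs without going through Word. The sample string
and the helper name split_paragraphs are assumptions made for illustration;
they stand in for whatever doc.Range ().Text actually returns.

```python
# Hypothetical stand-in for doc.Range().Text: two empty paragraphs,
# then two paragraphs of text, each ended by a carriage return.
sample_text = u"\r\rFirst paragraph\rSecond paragraph\r"


def split_paragraphs(text):
    # Split on the carriage-return delimiter. A trailing delimiter
    # leaves one empty piece at the end of the list, which is dropped.
    parts = text.split(u"\r")
    if parts and parts[-1] == u"":
        parts.pop()
    return parts


paragraphs = split_paragraphs(sample_text)
print(repr(paragraphs))
```

Printing the repr rather than the text itself keeps the \r characters
visible instead of sending them raw to the output.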


and I added:

print text    after the "text =" line above

print lines   after the "lines =" line above

then ran the file using   python test.py >letmesee
and viewed letmesee in hex
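
For reference, a rough sketch of which bytes a print to a redirected,
text-mode stdout on Windows would be expected to write. It assumes print
appends one "\n" and that the stream writes each "\n" as "\r\n". The sample
text and the helper names are hypothetical. This accounts for one trailing
0d 0a pair only; it does not by itself explain a second one.

```python
import binascii


def bytes_written_by_print(text, encoding="ascii"):
    # print appends "\n"; a text-mode stream on Windows then writes
    # each "\n" as "\r\n". Existing "\r" characters pass through as-is.
    data = (text + u"\n").encode(encoding)
    return data.replace(b"\n", b"\r\n")


def hex_view(data):
    # Space-separated two-digit hex, similar to a hex viewer's display.
    hexed = binascii.hexlify(data).decode("ascii")
    return " ".join(hexed[i:i + 2] for i in range(0, len(hexed), 2))


# Hypothetical document text: two empty paragraphs, "abc", final "\r".
sample_text = u"\r\rabc\r"
dump = hex_view(bytes_written_by_print(sample_text))
print(dump)
```

To see exactly what is in the text with nothing added, printing
repr (text) or writing the encoded text to a file opened in "wb" mode
avoids the newline translation.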



Steve





