reading character in word

sebastien s.thuriez at quantaflow.com
Fri Dec 7 06:41:55 EST 2001


"Steve Holden" <sholden at holdenweb.com> wrote in message news:<ElpP7.4827$P_.178613 at atlpnn01.usenetserver.com>...
> "sebastien" <s.thuriez at quantaflow.com> wrote ...
> > Hi,
> >
> > Maybe you have encountered my problem: I cannot manage to read
> > accented characters such as éèà ... from a Microsoft Word
> > application using win32com.client.
> >
> > Here is the program that I used:
> >
> > from win32com.client import constants, Dispatch
> >
> > word= Dispatch('Word.Application')
> > mondoc=word.Documents.Open(r"c:\test.doc")
> > nombre_caracteres=mondoc.Characters.Count
> > for numero_caractere in range (1,int(nombre_caracteres)+1):
> >     caractere=mondoc.Characters.Item(numero_caractere)
> >     try:
> >         print caractere
> >     except:
> >         print("cannot read character")
> >         pass
> >
> The problem here is a lack of information: you are catching all errors, then
> not printing out enough information to see what the exact error is! I
> suspect it will be complaining about characters with ordinal values greater
> than 127 -- the print statement notoriously fails to handle these.
> >
> > Everything is fine as long as I do not use accented characters
> > such as é à è in the document...
> >
> >
> > I also tried on my PC to do :
> >
> > test_input=raw_input("entrer phrase avec accents : ")
> > print str(test_input)
> > for char in test_input:
> >     print char,ord(char)
> >
> > After entering a word using éàè, there is no problem printing the
> > correct letters (for example été).
> >
> > Maybe there is some option in Word that I should set?
> >
> > I hope that someone will have some leads ...
> 
> You need to do a search on Google for something like "Python print Unicode
> string" to get advice on conversion to an appropriate encoding before you
> print. If this isn't the problem, then remove the try/except and post the
> exact message you see.
> 
> regards
>  Steve
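
Steve's point about the bare except can be sketched as follows. This is only an illustration with no Word involved: describe_failure and encode_as_ascii are hypothetical names, and the ASCII conversion merely stands in for whatever the print statement does internally.

```python
def describe_failure(action):
    """Run action(); return None on success, else 'ExceptionType: message'."""
    try:
        action()
    except Exception as exc:
        # Keep the exception type and message instead of hiding them.
        return "%s: %s" % (type(exc).__name__, exc)
    return None


def encode_as_ascii():
    # Stand-in for the failing print: forces an ASCII conversion of accented text.
    u"\u00e9t\u00e9".encode("ascii")


print(describe_failure(encode_as_ascii))
```

Reporting the exception type and message this way, rather than a fixed "cannot read character", is what makes the real cause visible.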

There is indeed a Unicode problem. If I execute the following code:
____________________________________________________________

from win32com.client import constants, Dispatch

word= Dispatch('Word.Application')
mondoc=word.Documents.Open(r"c:\testseb7.doc")
nombre_caracteres=mondoc.Characters.Count
caractere=""
for numero_caractere in range (1,int(nombre_caracteres)+1):
    caractere=mondoc.Characters.Item(numero_caractere)
    print caractere
    #print caractere.encode('latin-1')
______________________________________________________

the error is:

File "win32com\gen_py\00020905-0000-0000-c000-000000000046x0x8x1.py",
line 9331, in __str__
return str(apply(self.__call__, args))
UnicodeError: ASCII encoding error: ordinal not in range(128)
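
As far as I understand the traceback, the generated wrapper calls str() on the text, and that conversion goes through the ASCII codec, which only covers ordinals 0 to 127. The sketch below reproduces just that step in plain Python, without Word; the sample word and the variable names are my own assumptions.

```python
text = u"\u00e9t\u00e9"  # the word "été", written with escapes

# Explicitly choosing an encoding that covers the accents works.
latin1_bytes = text.encode("latin-1")
utf8_bytes = text.encode("utf-8")

# ASCII stops at ordinal 127, so this is the conversion that fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeError:
    ascii_ok = False

print([ord(c) for c in text])
print(ascii_ok)
```

The ordinal of é is 233, which is why the message says "ordinal not in range(128)".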


Then I tried to convert it to printable characters or a sequence of
numbers using:

____________________________________________________
from win32com.client import constants, Dispatch

word= Dispatch('Word.Application')
mondoc=word.Documents.Open(r"c:\testseb7.doc")
nombre_caracteres=mondoc.Characters.Count
caractere=""
for numero_caractere in range (1,int(nombre_caracteres)+1):
    caractere=mondoc.Characters.Item(numero_caractere)
    #print caractere
    print caractere.encode('utf-8')
_______________________________________________________

The error message is:

File "E:\Python21\win32com\client\__init__.py", line 348, in
__getattr__
AttributeError: encode
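
My reading of this AttributeError is that Characters.Item() does not return a string but a COM wrapper (a Word Range object), so encode is looked up on the wrapper and not found. The sketch below uses a hypothetical FakeRange class in place of the real COM object, and assumes the text is exposed through a Text property as it is on a Word Range.

```python
class FakeRange(object):
    """Hypothetical stand-in for the object returned by Characters.Item()."""

    def __init__(self, text):
        self.Text = text  # mirrors the Text property of a Word Range


def encode_character(range_object, encoding="latin-1"):
    # Read the Unicode text first, then encode the string, not the wrapper.
    return range_object.Text.encode(encoding)


item = FakeRange(u"\u00e9")
print(hasattr(item, "encode"))  # the wrapper itself has no encode method
print(encode_character(item))
```

With the real document this would presumably become mondoc.Characters.Item(numero_caractere).Text.encode('latin-1'), but I have not verified that against Word here.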


If I try to use the code :

__________________________________________________________

from win32com.client import constants, Dispatch

word= Dispatch('Word.Application')
mondoc=word.Documents.Open(r"c:\testseb9.doc")
nombre_caracteres=mondoc.Characters.Count
caractere=""
for numero_caractere in range (1,int(nombre_caracteres)+1):
    caractere=mondoc.Characters.Item(numero_caractere)
    print unicode(caractere)
__________________________________________________________

I get the following error :
....
UnicodeError: Ascii encoding error: ordinal not in range(128)
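
This looks like the same ASCII conversion failing again, presumably because unicode() also goes through the wrapper's __str__. One thing that might be worth trying is to encode each character explicitly and let the codec substitute anything it cannot represent. The sketch below is an assumption on my part: to_console_bytes is a hypothetical helper, and the sample list stands in for the characters of the document.

```python
def to_console_bytes(text, encoding="latin-1"):
    # errors="replace" substitutes "?" for anything the target encoding lacks,
    # so one unusual character cannot abort the whole loop.
    return text.encode(encoding, "replace")


sample = [u"\u00e9", u"t", u"\u20ac"]  # e-acute, t, euro sign
encoded = [to_console_bytes(ch) for ch in sample]
print(encoded)
```

The encoding name would have to match what the console actually displays, which I do not know for certain on my machine.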

I am sure there is a problem with the Unicode settings, but I do not
see how to modify them.

Sebastien.


