Unicode drives me crazy...

John Roth newsgroups at jhrothjr.com
Mon Jul 4 09:08:33 EDT 2005


<fowlertrainer at citromail.hu> wrote in message 
news:mailman.1311.1120477323.10512.python-list at python.org...
> Hi!
>
> I want to get WMI information from Windows machines.
> I use Python with the Hungarian (iso-8859-2) charset.
>
> I wrote a small utility for this, because I want to write the results
> to an XML file.
>
> def ToHU(s,NoneStr='-'):
>    if s==None: s=NoneStr
>    if not (type(s) in [type(''),type(u'')]):
>       s=str(s)
>    if type(s)<>type(u''):
>       s=unicode(s)
>    s=s.replace(chr(0),' ');
>    s=s.encode('iso-8859-2')
>    return s
>
> This function works, but I got an error with this value:
> 'Kommunik\xe1ci\xf3s port (COM1)'
>
> This routine demonstrates the problem:
>
> s='Kommunik\xe1ci\xf3s port (COM1)'
> print s
> print type(s)
> print type(u'aaa')
> s=unicode(s) # error !
>
> This makes me mad.
> How do I convert any object to a string, and encode it as
> iso-8859-2 (if needed)?
>
> Please help me!

As Tim Golden already explained, you're getting a unicode
object from the WMI interface. The best design advice I can
give is to either convert it to iso-8859-2 at the point you
get the object and write your entire program with iso-8859-2
encoded strings, or write your entire program with unicode
objects and encode them as iso-8859-2 strings whenever
you want to write them out. Trying to do your conversion
in the middle will lead to excessive complexity, with the
resulting debugging problems.
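
A minimal sketch of the second option (unicode inside the
program, iso-8859-2 only at the output boundary) might look
like the following. The helper names to_text and to_output
are hypothetical, not from the original post, and the sketch
assumes that any byte string coming in is already encoded as
iso-8859-2. It is written so that it should run on later
Pythons as well (the b'' literal needs 2.6 or newer).

```python
# Sketch only: to_text and to_output are hypothetical helper names.
# Assumption: incoming byte strings are iso-8859-2 encoded.
ENCODING = 'iso-8859-2'


def to_text(value, none_str=u'-'):
    """Turn any incoming value into a unicode object (input boundary)."""
    if value is None:
        return none_str
    if isinstance(value, bytes):
        # Decode byte strings with an explicit codec instead of
        # relying on the default ASCII codec.
        return value.decode(ENCODING)
    if isinstance(value, type(u'')):
        # Already unicode, e.g. what the WMI interface hands back.
        return value
    # Numbers and other objects: format them as unicode text.
    return u'%s' % (value,)


def to_output(text):
    """Encode a unicode object as iso-8859-2 bytes (output boundary)."""
    # NUL characters are replaced by spaces, as in the original ToHU.
    return text.replace(u'\x00', u' ').encode(ENCODING)


raw = b'Kommunik\xe1ci\xf3s port (COM1)'
text = to_text(raw)          # unicode everywhere inside the program
encoded = to_output(text)    # bytes again, only when writing out
```

Everything between to_text and to_output then deals with
unicode objects only, so there is a single place where each
conversion can go wrong.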

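If you do go the unicode route, you must remember that
any method or function that's defined to return a byte
string will most likely raise an exception when it meets
non-ASCII characters, because it implicitly encodes with
the default ASCII codec. This includes str()! Whether
or not the print statement will work depends on a number
of factors in how your Python installation was set up,
such as the encoding of the console you print to.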
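
To illustrate the failure mode, here is a small sketch. In
the Python of this thread, str() on a unicode object behaves
roughly like an encode with the default ASCII codec; the
sketch spells that encode out explicitly so that it behaves
the same on later Pythons. try_encode is a hypothetical
helper name used only for this illustration.

```python
# Sketch only: try_encode is a hypothetical helper name.
text = u'Kommunik\xe1ci\xf3s port (COM1)'


def try_encode(value, codec):
    """Return the encoded bytes, or None if the codec cannot represent value."""
    try:
        return value.encode(codec)
    except UnicodeEncodeError:
        return None


# The two accented letters are not ASCII, so this encode fails.
ascii_result = try_encode(text, 'ascii')

# With an explicit iso-8859-2 codec the same text encodes cleanly.
latin2_result = try_encode(text, 'iso-8859-2')
```

The safe habit is to always name the codec yourself rather
than let a function pick the default for you.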

HTH

John Roth


> Thanks for the help:
> ft



