Just TOO easy.... Re: Q: a simple(?) raw-utf-8 conversion to internal type unicode "\304\246\311\231\316\257\316\271\303\222"

NevilleDNZ nevillednz at gmail.com
Sun Dec 31 21:28:47 EST 2006


It was just TOO easy... After posting my message to Google Groups, I
re-read the posting there and found that Google had pointed me
to a python-unicode tutorial:
www.reportlab.com/i18n/python_unicode_tutorial.html - exercise one :-)

Gosh, sometimes a google is worth so much more than ₁₀¹⁰⁰!

Happy New Year
NevilleD

It works now:
$ ./uc.py
English/ASCII quoting: "ĦəίιÒ ώσŔĹĐ" SUCCEEDS :-)
German/ALCOR quoting: ᛭test᛭ AOK :-)
German/ALCOR quoting: ᛭ĦəίιÒ ώσŔĹĐ᛭ FAILS :-(
nevilled@alfa:/root0/home/nevilled/Project/20 $ vi ./uc.py
nevilled@alfa:/root0/home/nevilled/Project/20 $ cat ./uc.py
#!/usr/bin/env python
imported=unicode("\304\246\311\231\316\257\316\271\303\222 \317\216\317\203\305\224\304\271\304\220","utf-8")
print "English/ASCII quoting:",'"'+imported+'"',"SUCCEEDS :-)" # xterm encoding if UTF8
print "German/ALCOR quoting:",u"\N{runic cross punctuation}test\N{runic cross punctuation}","AOK :-)"
print "German/ALCOR quoting:",u"\N{runic cross punctuation}"+imported+u"\N{runic cross punctuation}","Just TOO easy :-)"

$ ./uc.py
English/ASCII quoting: "ĦəίιÒ ώσŔĹĐ" SUCCEEDS :-)
German/ALCOR quoting: ᛭test᛭ AOK :-)
German/ALCOR quoting: ᛭ĦəίιÒ ώσŔĹĐ᛭ Just TOO easy :-)
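
For anyone reading this on Python 3, where the unicode() builtin no longer exists, the same fix might look roughly like the sketch below. It assumes the bytes arrive as a bytes object; the names raw, cross and quoted are made up for illustration and are not part of uc.py.

```python
# Python 3 sketch: the same octal escapes as in uc.py, written as a bytes literal.
raw = (b"\304\246\311\231\316\257\316\271\303\222 "
       b"\317\216\317\203\305\224\304\271\304\220")

# bytes.decode plays the role of Python 2's unicode(raw, "utf-8").
imported = raw.decode("utf-8")

# U+16ED, spelled by name as in the script above.
cross = "\N{RUNIC CROSS PUNCTUATION}"
quoted = cross + imported + cross

# 21 bytes of UTF-8 become 11 characters once decoded.
print(len(raw), len(imported))
```

Once decoded, imported is an ordinary text string, so concatenating it with the runic quotes needs no further conversion.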

NevilleDNZ wrote:
> Hi,
>
> Apologies first as I am not a unicode expert.... indeed the details
> probably totally elude me.  Notwithstanding:  how can I convert a
> binary string containing UTF-8 binary into a python unicode string?
>
> cutdown example:
> $ cat ./uc.py
> #!/usr/bin/env python
> imported="\304\246\311\231\316\257\316\271\303\222 \317\216\317\203\305\224\304\271\304\220"
> print "English/ASCII quoting:",'"'+imported+'"',"SUCCEEDS :-)" # xterm encoding if UTF8
> print "German/ALCOR quoting:",u"\N{runic cross punctuation}"+"test"+u"\N{runic cross punctuation}","AOK :-)"
> print "German/ALCOR quoting:",u"\N{runic cross punctuation}"+imported+u"\N{runic cross punctuation}","FAILS :-("
>
> $ ./uc.py
> English/ASCII quoting: "ĦəίιÒ ώσŔĹĐ" SUCCEEDS :-)
> German/ALCOR quoting: ᛭test᛭ AOK :-)
> German/ALCOR quoting:
> Traceback (most recent call last):
>   File "./uc.py", line 5, in <module>
>     print "German/ALCOR quoting:",u"\N{runic cross punctuation}"+imported+u"\N{runic cross punctuation}","FAILS :-("
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 0: ordinal not in range(128)
>
> The last print statement fails because the "imported" string holds
> 8-bit UTF-8 encoded bytes, but Python treats it as ascii! How do I tell
> Python that "imported" is actually already UTF-8 encoded unicode?
> 
> Cheers
> NevilleDNZ
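
The traceback quoted above arises because Python 2, when asked to add a byte string to a unicode string, first decodes the byte string with the default ascii codec. A rough Python 3 sketch of that hidden step follows; Python 3 itself refuses to mix bytes and str with "+" and raises a TypeError instead, so the ascii decode is written out explicitly here. The names raw, message and text are illustrative only.

```python
# First two characters of the quoted string, as UTF-8 bytes.
raw = b"\304\246\311\231"

# Roughly what Python 2 did behind the scenes for u"..." + imported.
try:
    raw.decode("ascii")
except UnicodeDecodeError as err:
    # Byte 0xc4 is outside the 7-bit ascii range, so the decode fails.
    message = str(err)
    print(message)

# Naming the real encoding makes the decode succeed.
text = raw.decode("utf-8")
print(len(text))
```

Stating the codec explicitly, as the corrected uc.py does with unicode(..., "utf-8"), is what avoids the implicit ascii step.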



