How do I display unicode value stored in a string variable using ord()

Dave Angel d at davea.name
Thu Aug 16 18:47:17 EDT 2012


On 08/16/2012 06:09 PM, Charles Jensen wrote:
> Everyone knows that the python command
>
>      ord(u'…')
>
> will output the number 8230 which is the unicode character for the horizontal ellipsis.
>
> How would I use ord() to find the unicode value of a string stored in a variable?  
>
> So the following 2 lines of code will give me the ascii value of the variable a.  How do I specify ord to give me the unicode value of a?
>
>      a = '…'
>      ord(a)

You omitted the print statement.  You also didn't specify what version
of Python you're using;  I'll assume Python 2.x because in Python 3.0
through 3.2, the u"xx" notation would have been a syntax error.

To get the ord of a unicode variable, you do it the same as a unicode
literal:

       # Note: for a non-ASCII literal to work reliably, you probably
       # need the correct encoding declaration in line 1 or 2 of the file.
       a = u"j"
       print ord(a)
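
If you would rather not depend on the file's encoding declaration at
all, one option is to write the character as an escape.  This is only a
sketch; the names value and values are made up for illustration, and it
assumes Python 2.x or Python 3.3+, where the u prefix is accepted.

```python
# Sketch only: the escape keeps the source file pure ASCII.
a = u"\u2026"            # HORIZONTAL ELLIPSIS
value = ord(a)
print(value)             # 8230

# ord() takes exactly one character, so loop over longer strings.
values = [ord(ch) for ch in u"a\u2026"]
print(values)            # [97, 8230]
```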

But if you have a byte string containing some binary bits, and you want
to get a unicode character value out of it, you'll need to explicitly
convert it to unicode.

First, decide which encoding the byte string was encoded with.  If you
specify the wrong one, you're likely to get an exception, or maybe just
a nonsense answer.

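       # These three bytes are the UTF-8 encoding of U+2026, the
       # horizontal ellipsis.  Invalid UTF-8 (for example "\xc1\xc1")
       # would make decode() raise UnicodeDecodeError.
       a = "\xe2\x80\xa6"
       b = a.decode("utf-8")     # b is now a one-character unicode string
       print ord(b)              # prints 8230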
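
The warning about picking the wrong encoding can be made concrete with a
rough sketch.  The byte values and variable names here are my own
choices for illustration, and the code is written so it should run on
Python 2.6+ as well as Python 3.

```python
# Sketch only: the same bytes decoded three different ways.
raw = b"\xe2\x80\xa6"    # UTF-8 bytes for U+2026, the horizontal ellipsis

# Right encoding: one character, code point 8230.
right_value = ord(raw.decode("utf-8"))

# Wrong encoding that accepts every byte: no error, just nonsense.
wrong_values = [ord(ch) for ch in raw.decode("latin-1")]

# Wrong encoding that rejects these bytes: an exception.
try:
    raw.decode("ascii")
    failed = False
except UnicodeDecodeError:
    failed = True

print(right_value)       # 8230
print(wrong_values)      # [226, 128, 166]
print(failed)            # True
```

Note that the latin-1 result is three characters long, so calling ord()
on it directly would raise TypeError; that is often the first visible
sign of a wrong guess.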



-- 

DaveA



