python3 raw strings and \u escapes

Jason Friedman jason at powerpull.net
Fri Jun 15 22:14:40 EDT 2012


>> This is a related question.
>>
>> I perform an octal dump on a file:
>> $ od -cx file
>> 0000000   h   e   l   l   o       w   o   r   l   d  \n
>>            6568    6c6c    206f    6f77    6c72    0a64
>>
>> I want to output the names of those characters:
>> $ python3
>> Python 3.2.3 (default, May 19 2012, 17:01:30)
>> [GCC 4.6.3] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> import unicodedata
>> >>> unicodedata.name("\u0068")
>> 'LATIN SMALL LETTER H'
>> >>> unicodedata.name("\u0065")
>> 'LATIN SMALL LETTER E'
>>
>> But how do I do this programmatically?
>> >>> first_two_letters = "6568    6c6c    206f    6f77    6c72    0a64".split()[0]
>> >>> first_two_letters
>> '6568'
>> >>> first_letter = "00" + first_two_letters[2:]
>> >>> first_letter
>> '0068'
>>
>> Now what?

> >>> hex_code = "65"
> >>> unicodedata.name(chr(int(hex_code, 16)))
> 'LATIN SMALL LETTER E'

Very helpful, thank you MRAB.

The finished product:  http://pastebin.com/4egQcke2.
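
For readers who cannot reach the pastebin, here is a minimal sketch of how
MRAB's chr(int(hex_code, 16)) suggestion might be applied to the whole hex
line from the od output. It is not the pastebin script. The function name
names_from_od_words and the "<unnamed U+...>" fallback label are made up for
illustration, and the byte swap assumes od -x ran on a little-endian machine,
which matches the dump above ("6568" for "he").

```python
import unicodedata

def names_from_od_words(od_hex_line):
    """Return Unicode names for the bytes in one hex line of `od -x` output.

    Hypothetical helper, written only for illustration.
    """
    names = []
    for word in od_hex_line.split():
        # od -x prints 16-bit words. Assuming a little-endian machine,
        # the second hex pair is the byte that comes first in the file.
        for hex_code in (word[2:], word[:2]):
            char = chr(int(hex_code, 16))
            # Control characters such as "\n" have no Unicode name, so pass
            # a default instead of letting unicodedata.name raise ValueError.
            fallback = "<unnamed U+%04X>" % ord(char)
            names.append(unicodedata.name(char, fallback))
    return names

names = names_from_od_words("6568    6c6c    206f    6f77    6c72    0a64")
for name in names:
    print(name)
```

The second argument to unicodedata.name is only there so the trailing newline
does not abort the loop; a file with an odd number of bytes would also need
its zero padding byte handled, which this sketch does not attempt.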


