How do I display unicode value stored in a string variable using ord()

Steven D'Aprano steve+comp.lang.python at pearwood.info
Mon Aug 20 01:56:10 EDT 2012


On Mon, 20 Aug 2012 00:44:22 -0400, Roy Smith wrote:

> In article <5031bb2f$0$29972$c3e8da3$5496439d at news.astraweb.com>,
>  Steven D'Aprano <steve+comp.lang.python at pearwood.info> wrote:
> 
>> > So it may be with utf-8 someday.
>> 
>> Only if you believe that people's ability to generate data will remain
>> lower than people's ability to install more storage.
> 
> We're not talking *data*, we're talking *text*.  Most of those
> whatever-bytes people are generating are images, video, and music.  Text
> is a pittance compared to those.

Paul Rubin already told you about his experience using OCR to generate 
multiple terabytes of text, and how he would not be happy if that were 
stored in UCS-4.

HTML is text. XML is text. SVG is text. Source code is text. Email is 
text. (Well, it's actually bytes, but it looks like ASCII text.) Log 
files are text, and they can fill a hard drive pretty quickly. Lots of 
data is text.

Pittance or not, I do not believe that people will widely abandon compact 
storage formats like UTF-8 and Latin-1 for UCS-4 any time soon. Given 
that we're still trying to convince people to use UTF-8 over ASCII, I 
reckon it will be at least 40 years before there's even a slim chance of 
migrating from UTF-8 to UCS-4 in a widespread manner. In the IT world, 
that's close enough to "never" -- we might not even be using Unicode in 
2052.
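
To put a rough number on "compact", here is a minimal sketch comparing the 
size of the same text in several encodings. The sample string is an 
arbitrary, mostly-ASCII illustration rather than real-world data, and 
Python's "utf-32-le" codec is used as a stand-in for UCS-4 (four bytes per 
code point, no byte-order mark).

```python
# Compare the storage cost of the same text in several encodings.
# The sample is an arbitrary illustration, not measured real-world data.
sample = "HTML is text. XML is text. Log files are text. " * 1000

# "utf-32-le" stands in for UCS-4: a fixed four bytes per code point,
# with no byte-order mark prepended.
sizes = {
    codec: len(sample.encode(codec))
    for codec in ("latin-1", "utf-8", "utf-32-le")
}

for codec, nbytes in sizes.items():
    print(f"{codec:10} {nbytes:>8} bytes")
```

For pure-ASCII text like this, UTF-8 and Latin-1 both use one byte per 
character while the fixed-width encoding uses four. The ratio shrinks for 
text dominated by non-Latin scripts, so treat this as the best case for 
UTF-8, not a general figure.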

In any case, time will tell who is right.



-- 
Steven


