printing list containing unicode string

Xah Lee xah at xahlee.org
Tue Sep 11 06:46:50 EDT 2007


J. Cliff Dyer wrote:
"  ...UCS-2, for example, is a fixed width, 2-byte encoding that can
handle any unicode code point up to 0xffff, but cannot handle the 3
and 4 byte extension sets. "

I was going to reply to say that this is a good point. But on the way i
looked it up on Wikipedia:
http://en.wikipedia.org/wiki/UTF-16/UCS-2

quote:
" In computing, UTF-16 (16-bit Unicode Transformation Format) is a
variable-length character encoding for Unicode, capable of encoding
the entire Unicode repertoire. "

and
" UCS-2 (2-byte Universal Character Set) is an obsolete character
encoding which is a predecessor to UTF-16. The UCS-2 encoding form is
nearly identical to that of UTF-16, except that it does not support
surrogate pairs and therefore can only encode characters in the BMP
range U+0000 through U+FFFF. "

So, the matter isn't simple. (i.e. this point alone does not settle
that i was incorrect in my original criticism of that article's
statement on utf-8.)
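
To make the quoted UCS-2 vs UTF-16 difference concrete, here is a
minimal Python 3 sketch. The helper names (utf16_code_units,
fits_in_ucs2) are made up for illustration and are not part of any
library; the sketch assumes Python 3, where str holds code points and
there is no built-in "ucs-2" codec, so the BMP check is done by hand.

```python
import struct

def utf16_code_units(ch):
    """Return the 16-bit UTF-16 code units (as ints) for one character."""
    data = ch.encode("utf-16-be")          # big-endian, no BOM
    return list(struct.unpack(">%dH" % (len(data) // 2), data))

def fits_in_ucs2(ch):
    """Hypothetical helper: True if the code point lies in the BMP."""
    return ord(ch) <= 0xFFFF

bmp_char = "\u767e"         # 百, inside the BMP
astral_char = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the BMP

# One code unit inside the BMP, a surrogate pair outside it.
print([hex(u) for u in utf16_code_units(bmp_char)])     # ['0x767e']
print([hex(u) for u in utf16_code_units(astral_char)])  # ['0xd834', '0xdd1e']
print(fits_in_ucs2(bmp_char), fits_in_ucs2(astral_char))  # True False
```

The second character needs two 16-bit units in UTF-16, which is
exactly the case UCS-2 has no way to express.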

------------

Btw, i think i should mention that i read the Unicode 3 specification
from cover to cover in 2002. (one heavy, thick, large, deep blue
colored book)

Another resource that contributed to my understanding of Unicode is
the book "CJKV Information Processing" by Ken Lunde, which i read in
the same year.

Also of interest: i learned about a year ago that the Chinese
encoding GB 18030
http://en.wikipedia.org/wiki/GB_18030
which all computers sold in China are required by law to support,
is actually a Unicode encoding. Specifically, it encompasses all the
chars in Unicode.
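
A quick way to see this for yourself, as a sketch assuming Python 3
and its standard "gb18030" codec: characters from ASCII, from the BMP,
and from outside the BMP all survive a round trip, using 1, 2, and 4
bytes respectively. The sample strings are arbitrary picks for
illustration.

```python
samples = [
    "A",            # ASCII
    "\u767e",       # 百, a BMP character also found in older GB encodings
    "\U0001D11E",   # MUSICAL SYMBOL G CLEF, outside the BMP
]

encoded_lengths = []
for s in samples:
    raw = s.encode("gb18030")
    # Round trip: decoding must give back the original string.
    assert raw.decode("gb18030") == s
    encoded_lengths.append(len(raw))

print(encoded_lengths)  # [1, 2, 4]
```

This only spot-checks three characters; it does not by itself prove
coverage of the whole Unicode repertoire.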

Also relevant to our discussion: recently i was looking at
alexa.com's web ranking:

http://alexa.com/site/ds/top_sites?ts_mode=global&lang=none

and noticed that several purely Chinese-language websites are among
the top 100.

Baidu.com (百度) is ranked 8 today, followed by
腾讯网 (http://www.qq.com) at 12, and
新浪 sina.com.cn at 19, etc.

It is somewhat amazing in the context of computing and languages. No
other non-English lang comes close.

(Note here also that Chinese, as measured by number of speakers, is
roughly 4 times the size of English.
http://en.wikipedia.org/wiki/Ethnologue_list_of_most_spoken_languages
This fact, coupled with the development and commercialization of China
in the past decade, is the reason for the above web ranking result.
)

Not relevant to our discussion, but I happened to also notice a site
named youporn.com (was ranked 69 a few weeks ago). youporn.com is
basically like youtube.com, but with porn vids. It has long been my
thought that the progress of humanity in a society can be measured
by its popularity and acceptance of porn. (in fact i recall seeing
some academic (or not) report about this a few months ago... couldn't
remember where now) Society as a whole has improved dramatically
since the communication revolution, in particular the one started by
the web.

(see Xah's Porn Outspeak
http://xahlee.org/PageTwo_dir/Personal_dir/porn_movies.html

For more info about youporn.com, see:

http://en.wikipedia.org/wiki/Youporn

curious party might also check out

http://en.wikipedia.org/wiki/Youtube

which is a major phenomenon that, in my opinion, has contributed to
the progress of humanity far more than, say, any university or
educational institution.

(my thesis in general in this direction is that communication, the
main medium of knowledge, is the utmost factor in the human animal's
progress with respect to what's generally considered humanitarianism.
More important than, say, the need to decry war, have laws, maintain
peace, spread gospels, aid the poor, ... etc. (and in fact, in this
thesis, i consider that what are commonly considered good activities,
such as aiding the poor, or any moral attitude and activities about
the good of humanity (such as OpenSource), are in fact criminal in
their effects and almost in their intention too ...)) )

PS for some reason, messages posted thru the google groups service
for the past week or so have had the unicode double angle bracket
chars (U+00AB and U+00BB) stripped off. For that reason, in this msg
i've also used double curly quotes "" wherever i would have used
double angle brackets.

  Xah
  xah at xahlee.org
  http://xahlee.org/



