Using non-ascii symbols

Runsun Pan python.pan at gmail.com
Sat Jan 28 00:57:16 EST 2006


On 1/27/06, Terry Hancock <hancock at anansispaceworks.com> wrote:
> Well, let's just say, I think there should be different
> standards for "write once / read once" versus "write once /
> read many".  The mere use of written language once implied
> the latter, but I suppose text messaging breaks that rule.

Since we are on this, let me share with you guys a little 'tip of the
iceberg' of how the younger generation in Taiwan communicates:

  A: why did you tell av8d that I am a bmw ?
  B: Well, you are just like one of those ogs or obs ...
  A: oic, you think you are much q than I ?
  B: ...
  A: I would 3q if you stop doing so.
  B: ok.
  A: Orz
  B: 88
  A: 881

Can you guys figure out the details ?

Here is the decoded version:

  A: why did you tell av8d that I am a bmw ?
[8 in our language is pronounced as "ba", so av8d = everybody]

  B: Well, you are just like one of those ogs or obs ...
[ogs = oh-ji-sang, obs = oh-ba-sang; Japanese for old guy and old
woman, respectively]

  A: oic, you think you are much q than I ?
[oic=Oh I see; q = cute]

  A: I would 3q if you stop doing so.
[ 3q = thank you ]

  B: ok.

  A: Orz
[ appreciate very much --- it looks like a guy kneeling down before an Emperor ]

  B: 88
[ bye-bye ]

  A: 881
[ bye-bye with a tone, sometimes 886 = bye-bye-loh ]

The above example is just an extremely simple one. In the real world,
they combine all sorts of language sources --- Mandarin, Japanese,
English, Taiwanese ... as well as "shapes" like Orz.

This kind of mixture-of-everything is widely used among the younger
generation, sometimes called "net terms", sometimes called "Martian
words". It facilitates online activities among young people, but
creates huge 'generation gaps' --- some dictionaries have been
published for high school teachers to study so that they can talk to
and understand their students.

IMO, a language is a living organism: it has its own life and often
evolves with unexpected turns. Maybe in the future some of those
Martian words will become part of formal Taiwanese, who knows ? :)

> First of all, they are, much more than Western alphabets,
> strict about stroke order and direction (technically the
> Roman alphabet is supposed to be drawn a certain way, but
> many people "cheat" -- I think that's harder to get away
> with with Asian characters, because they tend not to look
> right when drawn wrong).  And when you have the actual
> stroke sequence data as input, recognition is easier and
> more reliable (I think that was the point behind the
> "graffiti" system for the Palm Pilot).

But ... to my knowledge, all of the input tablets that use OCR have a
training feature. You can teach the program to recognize your own
order of strokes. The ability to train (or be trained) is a key
element of such an input device.

--
~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~
Runsun Pan, PhD
python.pan at gmail.com
Nat'l Center for Macromolecular Imaging
http://ncmi.bcm.tmc.edu/ncmi/
~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~



More information about the Python-list mailing list