hex dump w/ or w/out utf-8 chars
wxjmfauth at gmail.com
Tue Jul 9 05:34:08 EDT 2013
On Tuesday, 9 July 2013 09:00:02 UTC+2, Steven D'Aprano wrote:
> On Mon, 08 Jul 2013 10:53:18 -0700, ferdy.blatsco wrote:
>
> > Not using python 3, for me (a programmer which was present at the
> > beginning of computer science, badly interacting with many languages
> > from assembler to Fortran and from c to Pascal and so on) it was an hard
> > job to arrange the abrupt transition from characters only equal to bytes
>
> Characters have *never* been equal to bytes. Not even Perl treats the
> character 'A' as equal to the byte 0x0A:
>
> if (0x0A eq 'A') {print "Equal\n";}
> else {print "Unequal\n";}
>
> will print Unequal, even if you replace "eq" with "==". Nor does Perl
> consider the character 'A' equal to 65.
>
> If you have learned to think of characters being equal to bytes, you have
> learned wrong.
>
> > to some special characters defined with 2, 3 bytes and even more. I
> > should have preferred another solution... but i'm not Guido....!
>
> What's a special character?
>
> To an Italian, the characters J, K, W, X and Y are "special characters"
> which do not exist in the ordinary alphabet. To a German, they are not
> special, but S is special because you write SS as ß, but only in
> lowercase.
>
> To a mathematician, σ is just as ordinary as it would be to a Greek; but
> the mathematician probably won't recognise ς unless she actually is
> Greek, even though they are the same letter.
>
> To an American electrician, Ω is an ordinary character, but ω isn't.
>
> To anyone working with angles, or temperatures, the degree symbol ° is an
> ordinary character, but the radian symbol is not. (I can't even find it.)
>
> The English have forgotten that W used to be a ligature for VV, and
> consider it a single ordinary character. But the ligature Æ is considered
> an old-fashioned way of writing AE.
>
> But to Danes and Norwegians, Æ is an ordinary letter, as distinct from AE
> as TH is from Þ. (Which English used to have.) And so on...
>
> I don't know what a special character is, unless it is the ASCII NUL
> character, since that terminates C strings.
--------
The concept of "special characters" does not exist.
However, the definition of a "character" is a problem
in itself (character, glyph, grapheme, ...).
You are confusing Unicode, typography and linguistics.
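
To make the character/code point/byte distinction concrete, here is a
small Python 3 sketch. The helper name describe() and the choice of
"é" as the sample are mine, purely for illustration; the same visible
character is written once as a single code point and once as a base
letter plus a combining accent.

```python
import unicodedata

composed = "\u00e9"      # é as a single code point
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT


def describe(text):
    """Return (number of code points, number of UTF-8 bytes) for text."""
    return len(text), len(text.encode("utf-8"))


print(describe(composed))      # (1, 2)
print(describe(decomposed))    # (2, 3)
print(composed == decomposed)  # False: different code point sequences
# NFC normalization composes the pair back into the single code point.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

Both strings usually render as the same glyph, yet they differ in code
point count and in byte count, which is why "character" needs a
definition before anything can be said about its size.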
There is no symbol for the radian because, mathematically,
a radian is a pure, unitless number. You can,
however, specify a = ... in radians (rad).
Note the difference between SS and ẞ
'FRANZ-JOSEF-STRAUSS-STRAẞE'
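
A short Python 3 sketch of that difference, assuming an interpreter
with a reasonably recent Unicode database (the variable names are
only illustrative):

```python
import unicodedata

small = "\u00df"    # ß
capital = "\u1e9e"  # ẞ

print(unicodedata.name(small))    # LATIN SMALL LETTER SHARP S
print(unicodedata.name(capital))  # LATIN CAPITAL LETTER SHARP S

# Uppercasing ß gives the two-letter sequence SS, not ẞ.
print(small.upper())              # SS
# Lowercasing ẞ gives ß, whereas lowercasing SS gives ss.
print(capital.lower())            # ß
print("STRASSE".lower())          # strasse
print("STRA\u1e9eE".lower())      # straße
```

So 'STRASSE' and 'STRAẞE' do not round-trip to the same lowercase
string.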
jmf