Multibyte Character Support for Python
Chris Liechti
cliechti at gmx.net
Wed May 8 19:45:33 EDT 2002
martin at v.loewis.de (Martin v. Loewis) wrote in
news:m3vg9yjthd.fsf at mira.informatik.hu-berlin.de:
> PEP 263 will introduce the notion of source encodings - without this,
> it wouldn't even be possible to parse the source code, anymore. The
> PEP, over months, had a question in it asking whether non-ASCII
> identifiers should be allowed (the follow-up question would then be:
> which ones?), and nobody ever spoke up requesting such a feature.
i wouldn't allow non-ASCII chars. not because i don't like them - i write
german, so i need äöü - but think of someone in a foreign country who just
does not have those keys on his keyboard. how is he supposed to enter a
variable name with such characters?
or take chinese symbols - i don't know what they mean, let alone how to
pronounce them. should i enter variable names as pictures, using my
digicam because i can't draw that well by hand?
also note Alex's comment about the natural language. how many languages
must a programmer learn to work on sources if english isn't sufficient?
of course that restriction on characters doesn't need to apply to strings
and comments. (some comments aren't readable anyway, even if you know the
language the words are taken from ;-)
(the PEP restricts identifiers to ASCII only - good)
and how many encodings will be allowed? do i need to have a zillion code
pages on my machine to run modules i find on the net? ok, much of the unicode
stuff can be reused, but what about smaller targets, startup time etc.?
regarding PEP 263:
- i don't think i like "coding"; it's not the obvious name to me.
i'm more used to "encoding", as used in HTML and MIME.
- why use ASCII as the default encoding in the future and not UTF-8 (or
Latin-1)? ASCII is a subset of UTF-8, and it would allow the rest of the
world to leave the default when using a unicode-aware editor. i think it will
become very nasty if you must write the correct encoding in each source file...
or is it intentional that the smallest available encoding of all is chosen,
to enforce more typing?
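to make the two points above concrete, here is a minimal sketch. it assumes a current Python 3 interpreter (not the Python of 2002), and the helper names can_decode and declared_encoding are made up for illustration. it checks that ASCII-only source reads the same under ASCII, UTF-8 and Latin-1, that Latin-1 umlauts are not valid UTF-8, and which encoding the standard tokenize module reports for a given declaration line.

```python
import io
import tokenize

# ASCII-only source bytes decode to the same text under all three encodings,
# which is what "ASCII is a subset of UTF-8" means in practice.
ascii_src = b"name = 'abc'\n"
same_everywhere = (
    ascii_src.decode("ascii")
    == ascii_src.decode("utf-8")
    == ascii_src.decode("latin-1")
)

# The same source text with umlauts gives different bytes per encoding.
text = "s = '\u00e4\u00f6\u00fc'\n"   # s = 'äöü'
latin1_bytes = text.encode("latin-1")
utf8_bytes = text.encode("utf-8")


def can_decode(data, encoding):
    """Hypothetical helper: True if the bytes are valid in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def declared_encoding(source_bytes):
    """Hypothetical helper: encoding the tokenizer would use for these bytes."""
    readline = io.BytesIO(source_bytes).readline
    encoding, _lines = tokenize.detect_encoding(readline)
    return encoding


if __name__ == "__main__":
    print(same_everywhere)
    print(can_decode(latin1_bytes, "utf-8"))
    print(declared_encoding(b"# -*- coding: latin-1 -*-\n"))
    print(declared_encoding(b"# encoding: utf-8\n"))
```

as far as i can tell, the declaration is matched by looking for "coding" followed by ":" or "=", so a comment spelled "encoding: utf-8" is picked up as well. without any declaration, today's tokenize falls back to UTF-8 rather than ASCII.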
but basically i think the PEP is a good idea.
chris
--
Chris <cliechti at gmx.net>