[Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings

Guido van Rossum guido@python.org
Mon, 15 Jul 2002 16:41:15 -0400


> > Who's gonna make the necessary changes to IDLE?
> 
> I am. idlefork patch #508973 implements most of that, but doesn't
> support UTF-8 signatures. It also doesn't give good diagnostics if the
> user did not declare an encoding but uses non-ASCII.

Cool.

> > > Allowing arbitrary Unicode in identifiers is no challenge, either,
> > > except that __dict__ dictionaries may suddenly find Unicode as keys.
> > > I'm not sure what other implications this would have, so it definitely
> > > is a separate issue.
> > 
> > As long as the only use of 8-bit strings is to contain pure ASCII,
> > this shouldn't be a problem.
> 
> I thought we were talking about non-ASCII in identifiers.

Yes, but all the non-ASCII has to be represented as Unicode strings.
I.e. no Latin-1 in 8-bit strings!
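
A minimal sketch of that constraint, written against a present-day
Python 3 interpreter (an assumption; the thread itself concerns the
8-bit and Unicode string types of the time). There every attribute name
is a Unicode string, so a non-ASCII key in __dict__ is an ordinary str.
The class name Namespace and the attribute name are made up for
illustration.

```python
class Namespace:
    """Hypothetical empty class used only to get a __dict__."""
    pass

ns = Namespace()
name = "gr\u00f6\u00dfe"   # non-ASCII attribute name, spelled with escapes
setattr(ns, name, 42)

# The key stored in the instance dictionary is a Unicode string.
key = next(iter(vars(ns)))
print(type(key).__name__, key == name, name.isidentifier())
```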

> > > Another issue with allowing Unicode is that a good definition of
> > > "letter" must be given (it clearly should not depend on the
> > > locale). The Unicode consortium gives guidelines, but those depend on
> > > the Unicode version.
> > 
> > I'd just use the isalpha() method of Unicode string objects.
> 
> That might vary across platforms (which I consider a bug) and across
> Python releases.

Really?  I thought Unicode's isalpha() was built on the Unicode
character database?
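
A rough way to check this, assuming a present-day Python 3 where str
plays the role of the Unicode string type: compare isalpha() against the
general category reported by the unicodedata module. The helper
is_letter() and the sample characters are illustrative, not part of any
proposal in this thread. The last line prints the version of the Unicode
database the interpreter was built with, which is the part that can
differ across Python releases.

```python
import unicodedata

# General categories that count as "letter" in the Unicode database.
LETTER_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo"}

def is_letter(ch):
    """Hypothetical helper: True if ch has a letter general category."""
    return unicodedata.category(ch) in LETTER_CATEGORIES

samples = ["a", "\u00e9", "\u00df", "\u03bb", "\u4e2d", "1", "_", " "]
for ch in samples:
    # Show isalpha() next to the database-derived answer.
    print(repr(ch), ch.isalpha(), is_letter(ch), unicodedata.category(ch))

# The database version is fixed per interpreter build, not per locale.
print(unicodedata.unidata_version)
```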

--Guido van Rossum (home page: http://www.python.org/~guido/)