[Python-Dev] directive statement (PEP 244)

Guido van Rossum guido@digicool.com
Mon, 16 Jul 2001 15:56:16 -0400


> > but last I looked if there was a docstring before the directive you
> > couldn't guarantee that the directive applied.
> 
> That was due to a misunderstanding of how the implementation could
> work... after reading your explanation below, here's a way which
> would work around this "requirement":
> 
> If the tokenizer gets to do the directive processing
> (rather than the compiler), then the placement of the directive 
> becomes irrelevant: it may only appear once per file and the tokenizer
> will see it before the compiler, so the encoding setting will already 
> have been made before the compiler even starts to compile the
> first doc-string.

Sure.  (Technically, it's not the tokenizer that interprets the
directives, but a pass that runs before the code generator runs.  The
compiler has sprouted quite a few passes lately... :-)

> No, I was never talking about editors. Paul brought that up.
> I am only concerned about telling the Python interpreter which
> encoding to assume when converting Unicode literals into
> Unicode objects -- that's all.

Well, I believe that for XML everybody (editors and other processors)
looks in the same place, right?

> He posted a clarification of what he thinks is the way to go.
> I think this settles the argument.

I agree.

> Let's put it this way: are you expecting that all editors out
> there will be able to parse the Python way of defining the
> encoding of Unicode literals ?

Not right away, but this is what I would hope would happen eventually.

> My point is that I don't see editors as an issue in this discussion.

Well, anything we can do to make parsing the encoding indicator easier
for editors helps.

> > > About the magic comment: Unicode literals are translated into
> > > Unicode objects at compile time. The encoding information is
> > > vital for the decoding to succeed. If you place this information
> > > into a comment of the Python source code and have the compiler
> > > depend on it, removing the comment would break your program.
> > 
> > Yes, and so would removing a directive.  I don't see the point at
> > all.
> 
> Sure, but a user would normally not expect his program to
> fail just because he removes a comment...

Weak argument.  A magic comment is specially marked as such, e.g.

    #*encoding utf-8

You might as well say that users are prone to remove the #! comment...
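
To make the idea concrete, here is a rough sketch of how a pass that runs
before the code generator might pick up such a marker.  The regex, the
function name find_encoding, the "ascii" default and the two-line search
window are all assumptions made for illustration; none of them is part of
any actual proposal.

```python
import re

# Hypothetical marker syntax, modelled on the "#*encoding utf-8" example.
MAGIC = re.compile(r"^#\*encoding\s+([-\w.]+)$")


def find_encoding(source_lines, default="ascii", max_lines=2):
    """Return the encoding named by a magic comment, or the default.

    Only the first max_lines lines are searched (an assumption), so a
    leading #! line may come before the marker.
    """
    for line in source_lines[:max_lines]:
        match = MAGIC.match(line.strip())
        if match:
            return match.group(1)
    return default


if __name__ == "__main__":
    example = ["#!/usr/bin/env python", "#*encoding utf-8", "x = 1"]
    print(find_encoding(example))
```

Since the marker is found before any code is compiled, a doc-string that
follows it would already be decoded with the right encoding.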

> Hmm, are you suggesting to use something like the following
> instead:
> 
> __unicodeencoding__ = 'utf-8'

Not in this particular case, but for other cases where directives have
been suggested.  In this case (encoding) I'd prefer a magic comment.
I still haven't seen a good example of something for which directives
are the best solution.  Of course, it should be '__fileencoding__'. :-)

> Please see the correction I gave above and my reply to Martin which has 
> the specification of my proposed amendment.

I've seen them now.

--Guido van Rossum (home page: http://www.python.org/~guido/)