[Python-Dev] directive statement (PEP 244)
Guido van Rossum
guido@digicool.com
Mon, 16 Jul 2001 15:56:16 -0400
> > but last I looked if there was a docstring before the directive you
> > couldn't guarantee that the directive applied.
>
> That was due to a misunderstanding of how the implementation could
> work... after reading your explanation below, here's a way which
> would work around this "requirement":
>
> If the tokenizer gets to do the directive processing
> (rather than the compiler), then the placement of the directive
> becomes irrelevant: it may only appear once per file and the tokenizer
> will see it before the compiler, so the encoding setting will already
> have been made before the compiler even starts to compile the
> first doc-string.
Sure. (Technically, it's not the tokenizer that interprets the
directives, but a pass that runs before the code generator runs. The
compiler has sprouted quite a few passes lately... :-)
> No, I was never talking about editors. Paul brought that up.
> I am only concerned about telling the Python interpreter which
> encoding to assume when converting Unicode literals into
> Unicode objects -- that's all.
Well, I believe that for XML everybody (editors and other processors)
looks in the same place, right?
> He posted a clarification of what he thinks is the way to go.
> I think this settles the argument.
I agree.
> Let's put it this way: are you expecting that all editors out
> there will be able to parse the Python way of defining the
> encoding of Unicode literals ?
Not right away, but this is what I would hope would happen eventually.
> My point is that I don't see editors as an issue in this discussion.
Well, anything we can do to make parsing the encoding indicator easier
for editors helps.
> > > About the magic comment: Unicode literals are translated into
> > > Unicode objects at compile time. The encoding information is
> > > vital for the decoding to succeed. If you place this information
> > > into a comment of the Python source code and have the compiler
> > > depend on it, removing the comment would break your program.
> >
> > Yes, and so would removing a directive. I don't see the point at
> > all.
>
> Sure, but a user would normally not expect his program to
> fail just because he removes a comment...
Weak argument. A magic comment is specially marked as such, e.g.
    #*encoding utf-8
You might as well say that users are prone to remove the #! comment...
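
To make the pre-pass idea concrete, here is a minimal sketch in present-day Python of how a compiler pass or an editor might locate such a marker before any literal is decoded. The "#*encoding" spelling is only the example given above, not an agreed syntax; the names find_encoding and decode_source, the "ascii" default, and the once-per-file check are assumptions made for illustration.

```python
import re

# Hypothetical marker syntax, borrowed from the "#*encoding utf-8" example;
# nothing in this thread fixes the final spelling.
_MAGIC = re.compile(rb"^[ \t]*#\*encoding[ \t]+([-\w.]+)[ \t]*\r?$", re.MULTILINE)


def find_encoding(source: bytes, default: str = "ascii") -> str:
    """Return the encoding named by the marker, wherever it sits in the file."""
    matches = _MAGIC.findall(source)
    if len(matches) > 1:
        # Assumed rule: the marker may appear only once per file.
        raise SyntaxError("encoding marker may appear only once per file")
    return matches[0].decode("ascii") if matches else default


def decode_source(source: bytes) -> str:
    """Decode the raw file using the declared (or default) encoding."""
    return source.decode(find_encoding(source))
```

Because the scan runs over the raw bytes before anything else, a doc-string placed ahead of the marker makes no difference. The sketch is deliberately naive: it would also match a line inside a string literal that happens to look like the marker, which a real implementation would have to rule out.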
> Hmm, are you suggesting to use something like the following
> instead:
>
> __unicodeencoding__ = 'utf-8'
Not in this particular case, but for other cases where directives have
been suggested. In this case (encoding) I'd prefer a magic comment.
I still haven't seen a good example of something for which directives
are the best solution. Of course, it should be '__fileencoding__'. :-)
> Please see the correction I gave above and my reply to Martin which has
> the specification of my proposed amendment.
I've seen them now.
--Guido van Rossum (home page: http://www.python.org/~guido/)