[Python-Dev] directive statement (PEP 244)

M.-A. Lemburg mal@lemburg.com
Mon, 16 Jul 2001 21:14:43 +0200


Guido van Rossum wrote:
> 
> > > MAL seems to want two other changes: directive should be allowed
> > > (required???)
> >
> > "allowed" not "required".
> 
> but last I looked if there was a docstring before the directive you
> couldn't guarantee that the directive applied.

That was due to a misunderstanding of how the implementation could
work... after reading your explanation below, here's a way which
would work around this "requirement":

If the tokenizer gets to do the directive processing
(rather than the compiler), then the placement of the directive 
becomes irrelevant: it may only appear once per file and the tokenizer
will see it before the compiler, so the encoding setting will already 
have been made before the compiler even starts to compile the
first doc-string.
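
To illustrate the idea, here is a minimal sketch of such a pre-scan. It is
not taken from PEP 244 or from any actual tokenizer: the regular expression,
the function name scan_directives and the key "unicodeencoding" are all
assumptions made for the example, loosely following the "directive key =
value" syntax mentioned in this thread. A plain regex scan would also match
text inside a multi-line string, which a real tokenizer would not.

```python
import re

# Hypothetical syntax: directive <key> = "<value>"
DIRECTIVE_RE = re.compile(
    r'^[ \t]*directive[ \t]+(\w+)[ \t]*=[ \t]*["\']([^"\']+)["\'][ \t]*(?:#.*)?$',
    re.MULTILINE,
)

def scan_directives(source_bytes):
    """Collect directives from raw source before any compilation happens."""
    # Latin-1 maps every byte to a character, so this scan cannot fail,
    # whatever encoding the file really uses.
    text = source_bytes.decode("latin-1")
    settings = {}
    for key, value in DIRECTIVE_RE.findall(text):
        if key in settings:
            raise SyntaxError("directive %r may only appear once per file" % key)
        settings[key] = value
    return settings

# The directive comes after the doc-string, yet it is found before the
# doc-string would ever be compiled.
source = (
    b'"""Module doc-string placed before the directive."""\n'
    b'directive unicodeencoding = "utf-8"\n'
    b'name = u"..."\n'
)
settings = scan_directives(source)
```

Because the whole file is scanned up front, the position of the directive
relative to the doc-string no longer matters in this sketch.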
 
> > > before the module docstring, and it should support the
> > > syntax from his proto-PEP (directive key = value).
> > >
> > > But MAL and PaulP don't seem to agree on the semantics of this
> > > directive, and I haven't gotten a good answer why we can't do that
> > > with a magic comment.
> >
> > We don't ?
> 
> It seems to me that each post from you gets a response from Paul with
> some kind of objection, and vice versa.  Maybe you're converging, but
> I don't see where you are converging yet.  Also, your arguments
> sometimes seem contradictory.  For example, Paul has said that you may
> need a comment with an editor-specific encoding indicator, while you
> were expecting editors to look at the directive and made this a reason
> why the directive should precede the docstring.

No, I was never talking about editors. Paul brought that up.
I am only concerned about telling the Python interpreter which
encoding to assume when converting Unicode literals into
Unicode objects -- that's all.
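
A small illustration of why the assumed encoding matters (written with
present-day Python bytes/str semantics, which is an assumption of this
sketch, not the behaviour of the interpreter discussed in the thread): the
same two bytes between the quotes of a literal yield different Unicode
objects depending on the encoding used to decode them.

```python
# Two raw bytes as they might sit inside a Unicode literal in a source file.
raw = b"\xc3\xa4"

# Decoded as UTF-8, they form a single character (U+00E4).
as_utf8 = raw.decode("utf-8")

# Decoded as Latin-1, each byte becomes its own character (U+00C3, U+00A4).
as_latin1 = raw.decode("latin-1")
```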
 
> > Paul suggested adding encoding directives for 8-bit
> > strings and comments, but these cannot be used by the Python
> > compiler in any way and would only be for the benefit of an
> > editor, so I don't really see the need for them.
> 
> Another indication you two aren't on the same page just yet.

He posted a clarification of what he thinks is the way to go.
I think this settles the argument.
 
> > A programmer
> > can still add some editor specific comment to the source file
> > to tell the editor in what encoding to display the file, but this
> > information is really only useful for the editor, not the
> > Python compiler.
> 
> This redundancy worries me though.  Are we going to encourage people
> to use an editor-specific comment for each editor out there that could
> be used to touch the file?

Let's put it this way: are you expecting that all editors out
there will be able to parse the Python way of defining the
encoding of Unicode literals ?

My point is that I don't see editors as an issue in this discussion.
 
> > About the magic comment: Unicode literals are translated into
> > Unicode objects at compile time. The encoding information is
> > vital for the decoding to succeed. If you place this information
> > into a comment of the Python source code and have the compiler
> > depend on it, removing the comment would break your program.
> 
> Yes, and so would removing a directive.  I don't see the point at
> all.

Sure, but a user would normally not expect his program to
fail just because he removes a comment...
 
> > I don't think that's good language design (besides, we already
> > have enough Unicode magic in Python...), but then
> > people may feel differently about this.
> 
> Directives come with their own set of magic.
> 
> > > In the mean time, I've decided to enable the yield keyword with a
> > > future statement.  In general I now prefer using future statements for
> > > enabling future features over the directive statement.
> > >
> > > So it's still unclear if we want a directive...
> >
> > One way or another we need a way to specify compiler parameters
> > and settings on a per-source file basis. Whether you call it
> > directive, pragma or magic comment is really secondary and only
> > a matter of language design.
> 
> I still haven't seen this need demonstrated.  Most purported uses of
> these are better done with existing mechanisms.  For example, in PEP
> 253 I propose an assignment to a global __metaclass__ to set the
> default class for a baseless class statement.

Hmm, are you suggesting that we use something like the following
instead?

__unicodeencoding__ = 'utf-8'
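
As a rough sketch of how such a module-level setting could be read without
executing the module: the code below uses the present-day ast module, which
did not exist when this was written, and the helper name
find_module_encoding is hypothetical. It only shows that a plain assignment
can be located statically; it does not address how the file would be decoded
before it is parsed.

```python
import ast

def find_module_encoding(source_text, name="__unicodeencoding__"):
    """Return the string assigned to `name` at module level, or None."""
    tree = ast.parse(source_text)
    # Only look at top-level statements; nested assignments are ignored.
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == name:
                value = node.value
                if isinstance(value, ast.Constant) and isinstance(value.value, str):
                    return value.value
    return None

example = "__unicodeencoding__ = 'utf-8'\nprint('hello')\n"
encoding = find_module_encoding(example)
```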

> > I've only chosen PEP 244 as basis for the PEP because it seemed
> > to fit the need. If you decide to go down some other path,
> > then I'll happily update the PEP to whatever becomes part of
> > Python.
> 
> But you're implying without clearly specifying all sorts of amendments
> to PEP 244, which weakens your position.
>
> For example, PEP 244 allows a doc string before the directive, but you
> indicated that the directive can only affect strings that occur after
> it.  I don't think this is true: the creation of actual string objects
> is done after the whole file has been parsed, so it wouldn't be hard
> to collect and interpret all directives before creating code objects.

Please see the correction I gave above and my reply to Martin, which has
the specification of my proposed amendment.
 
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/