PEP: Defining Unicode Literal Encodings (revision 1.1)

John W. Baxter jwbaxter at spamcop.com
Sun Jul 15 18:46:39 EDT 2001


In article <mailman.995126666.3898.python-list at python.org>, M.-A.
Lemburg <mal at lemburg.com> wrote:

> I already mentioned allowing directives in comments to work around
> the problem of directive placement before the first doc-string.
> 
> The above would then look like this:
> 
> #!/usr/local/bin/python
> # directive unicodeencoding='utf-8'
> u""" UTF-8 doc-string """
> 
> The downside of this is that parsing comments breaks the current
> tokenizing scheme in Python: the tokenizer removes comments before
> passing the tokens to the compiler ...wouldn't be hard to 
> fix though ;-) (note that tokenize.py does not)

I don't like the idea that removal of a comment changes the meaning of
the non-comment part of the source text...program text really should
mean the same thing without the comments.

But, I dislike this less than most of the previous ideas for dealing
with unicode docstrings.
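
A minimal sketch of the two points above, assuming present-day Python 3
and its standard tokenize and ast modules (the thread itself predates
both, so this is an illustration and not what the 2001 tokenizer did).
It shows that tokenize still reports comments as tokens, while the
parsed program is the same with or without the comment line. The
sample source and the names WITH_COMMENT and WITHOUT_COMMENT are made
up for the example.

```python
import ast
import io
import tokenize

# Hypothetical module text using the comment-style directive from the quote.
WITH_COMMENT = (
    "#!/usr/local/bin/python\n"
    "# directive unicodeencoding='utf-8'\n"
    "u''' UTF-8 doc-string '''\n"
)
# The same text with the directive comment removed.
WITHOUT_COMMENT = (
    "#!/usr/local/bin/python\n"
    "u''' UTF-8 doc-string '''\n"
)

# tokenize keeps comments: each one is reported as a COMMENT token.
comments = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(WITH_COMMENT).readline)
    if tok.type == tokenize.COMMENT
]

# The parser never sees comments, so both texts yield the same syntax tree
# (ast.dump omits line numbers unless asked to include them).
same_program = (
    ast.dump(ast.parse(WITH_COMMENT)) == ast.dump(ast.parse(WITHOUT_COMMENT))
)
```

A comment-based directive would have to break exactly this property:
deleting the comment would change how the string literal is decoded.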

Elsewhere, in article <mailman.995139510.24509.python-list at python.org>,
Tim Peters <tim.one at home.com> wrote:

> Another alternative:
> 
> #!/usr/local/python
> directive unicodeencoding 'utf-8'
> 
> __doc__ = u"""
>         This is a Unicode doc-string
> """
> 
> That is, the module docstring is just the module's __doc__ attr, and that
> can be bound explicitly (a trick I've sometimes used for *computed* module
> docstrings).

Ah...this may well be the way out.  It doesn't hurt the docstring tools
that work by importing the module.  Since it's already legal Python
(isn't it?) and Tim sometimes uses it, it doesn't break any docstring
tool that parses the source and isn't already broken (although it may
make the problem show up more often).  And it doesn't affect modules
whose docstrings aren't Unicode and are written the usual way.
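
A rough sketch of that difference, assuming present-day Python 3 (where
the u prefix is accepted and the proposed directive statement is left
out because it is not valid syntax).  The module name "demo" and the
SOURCE string are hypothetical.  An import-style tool runs the module
and reads its __doc__ attribute; a parsing tool, here ast.get_docstring,
looks only for a leading string literal.

```python
import ast
import types

# Hypothetical module source that binds __doc__ explicitly instead of
# opening with a bare string literal.
SOURCE = '''
__doc__ = u"""
        This is a Unicode doc-string
"""
'''

# Import-style tool: execute the module body, then read the attribute.
module = types.ModuleType("demo")
exec(compile(SOURCE, "demo.py", "exec"), module.__dict__)
imported_doc = module.__doc__

# Parsing tool: the first statement is an assignment, not a string
# expression, so no docstring is found.
parsed_doc = ast.get_docstring(ast.parse(SOURCE))
```

Here imported_doc holds the assigned text while parsed_doc is None,
which is the "already broken" case for tools that only parse.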

The __doc__ = ... method seems better than any of the things *I*
muttered about in earlier posts, which are now "inoperative"
(USA-centric joke) and never were particularly sensible.

  --John


