2.3 encoding parsing bug

Edward K. Ream edreamleo at charter.net
Tue Feb 17 18:09:48 EST 2004


> > To make myself perfectly clear: Python has absolutely no right to
> > complain about comment lines that do not have the form:
> >
> > # -*- coding: <encoding> -*-
>
> It does. Please see
>
> http://www.python.org/doc/current/ref/encodings.html
>
> This is the precise specification; Python looks for a certain regular
> expression.

Ah jeez :-)

The regular expression 'coding[=:]\s*([\w-_.]+)' matches so much more than
the "recommended" lines,

# -*- coding: <encoding> -*-

and

# vim:fileencoding=<encoding-name>
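
To illustrate how broadly the pattern matches, here is a minimal sketch. It assumes the interpreter effectively searches each candidate line for the pattern. The character class is written as [-\w.] because the `re` module may reject the documented [\w-_.] as a bad range; the intent should be the same. The helper name declared_encoding and the last two sample comments are hypothetical, made up for illustration.

```python
import re

# Assumption: [-\w.] stands in for the documented [\w-_.], which the
# re module may refuse to compile ("bad character range").
CODING_RE = re.compile(r"coding[=:]\s*([-\w.]+)")

def declared_encoding(line):
    """Return the encoding name the pattern extracts from a line, or None."""
    m = CODING_RE.search(line)
    return m.group(1) if m else None

samples = [
    "# -*- coding: utf-8 -*-",         # recommended Emacs-style form
    "# vim:fileencoding=latin-1",      # recommended Vim-style form
    "# Transcoding: see notes below",  # hypothetical unrelated comment
    "#@+leo-encoding=iso-8859-1.",     # hypothetical sentinel-style comment
]

for s in samples:
    print(repr(s), "->", declared_encoding(s))
```

The last two samples are not intended as encoding declarations, yet the pattern still extracts a "name" from each of them.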

This is most annoying.  It looks like Leo will have to change file formats
to accommodate this.  I could hack a special case for .py files, I suppose,
but any such hack still amounts to a change in file format.

Is there any chance of modifying the re to reduce the possibility of
confusion and "false matches"?  For example, matching only 'coding' and
'fileencoding' (as words).
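
A rough sketch of what such a restricted pattern might look like, using a word boundary so that only the whole words 'coding' and 'fileencoding' count. This is only an illustration of the suggestion, not anything Python implements; the names STRICT_RE and strict_encoding are hypothetical.

```python
import re

# Hypothetical stricter pattern: the keyword must start at a word boundary
# and be exactly 'coding' or 'fileencoding'.
STRICT_RE = re.compile(r"\b(?:coding|fileencoding)[=:]\s*([-\w.]+)")

def strict_encoding(line):
    """Return the declared encoding under the stricter rule, or None."""
    m = STRICT_RE.search(line)
    return m.group(1) if m else None

print(strict_encoding("# -*- coding: utf-8 -*-"))         # recommended form
print(strict_encoding("# vim:fileencoding=latin-1"))      # recommended form
print(strict_encoding("# Transcoding: see notes below"))  # no longer matches
print(strict_encoding("#@+leo-encoding=iso-8859-1."))     # no longer matches
```

One trade-off: a line such as "# -*- encoding: utf-8 -*-", which the general pattern accepts because it contains "coding:", would no longer be recognized under this rule.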

Thanks for your clarification of the situation.  I suppose I'll have to look
more closely at PEPs in the future.  These over-general encoding
declarations seem like a pretty low blow.

Edward

P.S.  I just looked at PEP 263:

[quote]
To define a source code encoding, a magic comment must
be placed into the source files either as first or second
line in the file:

      #!/usr/bin/python
      # -*- coding: <encoding name> -*-

More precise, the first or second line must match the regular
expression "coding[:=]\s*([\w-_]+)".
[end quote]
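
The rule as quoted can be sketched like this. It follows only the wording above (first or second line, pattern found anywhere in the line) and is not a claim about how the interpreter actually implements the check. The class is written as [-\w] since the `re` module may reject [\w-_] as a bad range, and find_declaration is a hypothetical name.

```python
import re

# Assumption: [-\w] stands in for the quoted [\w-_].
CODING_RE = re.compile(r"coding[:=]\s*([-\w]+)")

def find_declaration(source_text):
    """Return (line_number, encoding) from the first two lines, or None."""
    for lineno, line in enumerate(source_text.splitlines()[:2], start=1):
        m = CODING_RE.search(line)
        if m:
            return lineno, m.group(1)
    return None

print(find_declaration("#!/usr/bin/python\n# -*- coding: latin-1 -*-\n"))
print(find_declaration("x = 1\ny = 2\n# coding: utf-8\n"))
print(find_declaration("# my encoding=foo notes\n"))
```

Under this reading, any comment on line one or two that happens to contain "coding=" or "coding:" is treated as a declaration, whatever its author meant by it.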

This was just a really bad idea, put forward in stealth, buried in an re.
Having a _restricted_ kind of special-purpose comment is one thing:  having
a way-too-general kind of special-purpose comment is wrong, wrong, wrong.
It needlessly invalidates comments that _should_ have been none of Python's
business.

My guess is that I could have read this many times without having the
slightest hint of danger: the re bears almost no relation to the English
words.  I'm gnashing my teeth.

EKR
--------------------------------------------------------------------
Edward K. Ream   email:  edreamleo at charter.net
Leo: Literate Editor with Outlines
Leo: http://webpages.charter.net/edreamleo/front.html
--------------------------------------------------------------------




