Source syntax escapes, new raw string representation. Was: Re: PEP 263 comments

Bengt Richter bokr at oz.net
Wed Feb 27 19:32:17 EST 2002


On Mon, 25 Feb 2002 06:20:20 +0100, "Martin v. Loewis" <martin at v.loewis.de> wrote:

>To make some progress on PEP 263, I suggest that some of the open issues
>are resolved as follows:
>
>- Comment syntax: I suggest to use the form
>  -*- coding: <coding name> -*-
>  Emacs already recognizes this syntax, as does patch #508973 
>  on IDLEfork. The other proposed syntaxes should be removed from the
>  PEP.
>
Is this to be a *special purpose* (coding declaration) comment-context-limited
escape mechanism, or an *open ended* one? If special, I think Emacs can just
as well adapt to Python as vice versa. ISTM ad hoc special escapes in comments
are often the beginning of a convention for alternative out-of-band info, and
should be considered in that light. Cf. HTML ;-/

Perhaps it is time to (re?)consider a standard Python mechanism for embedding OOB
info at arbitrary places in Python source, analogous to XML's <![CDATA[ ... ]]> and
<? ... ?> (BTW, XML CDATA has a HUGE**N wart (IMO) in that it is not nestable.
They should have taken a cue from MIME delimiting IMO).

I prefer orthogonal general purpose mechanisms to ad hoc syntactic escape warts,
so I would prefer <? ... ?> as a vehicle for defining encoding. <?py blah blah?>
could mean doing eval('blah blah') in some defined Python environment, replacing
the <? ...?> with any string returned ('' if None), and then continuing processing,
starting with the replaced text. This could be used for wholesale preprocessing
or for simple mode flag control via side effects, as in <?py set_special_flag(1)?>
(assuming that would get eval'd in a useful context).
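
As a rough sketch of the eval-and-replace step described above (not a proposed
implementation), something like the following would do. The names
expand_py_pis, PI_PATTERN, max_expansions, and the sample context are all made
up for illustration, and the sketch naively assumes that '?>' never occurs
inside the expression itself (e.g. in a string literal).

```python
import re

# Matches <?py ... ?> non-greedily; DOTALL lets the expression span lines.
PI_PATTERN = re.compile(r"<\?py\s+(.*?)\?>", re.DOTALL)

def expand_py_pis(source, context, max_expansions=1000):
    """Replace each <?py expr?> with the string eval(expr) returns ('' if None)."""
    pos = 0
    for _ in range(max_expansions):
        match = PI_PATTERN.search(source, pos)
        if match is None:
            return source
        result = eval(match.group(1), context)
        text = "" if result is None else str(result)
        # Splice in the result and resume scanning at its start, so that
        # any <?py ...?> in the returned text gets processed too.
        source = source[:match.start()] + text + source[match.end():]
        pos = match.start()
    raise RuntimeError("too many expansions (self-reproducing <?py ...?>?)")

# Hypothetical execution context for <?py ...?>.
flags = {}

def set_special_flag(value):
    flags["special"] = value  # side effect only; returns None

context = {
    "set_special_flag": set_special_flag,
    "now": lambda: "2002-02-27",
    "wrap": lambda: "<?py now()?>",  # returns text that itself needs expanding
}

expanded = expand_py_pis("<?py set_special_flag(1)?>date = '<?py now()?>'", context)
rescanned = expand_py_pis("<?py wrap()?>", context)
```

The max_expansions cap is only there because replaced text is rescanned, so a
function returning its own <?py ...?> call would otherwise loop forever.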

In conjunction with a pythonic CDATA [1], it would permit wrapping a whole source
file (or starting at the second line):

#!/usr/bin/python
<?py recode(q'--unique delimiter[1]--'
... rest of source ...
--unique delimiter[1]--)?>

('recode' here has no special meaning. It would depend on the configured execution
context for <?py ...?>)

Thus the source could be arbitrarily reprocessed before being normally interpreted.
If you want to set a special effect flag, you could just say <?py set_special(1)?>.
If you want today's date embedded in place, write <?py now()?>, assuming that
function was appropriately defined in the execution context for <?py ...?>.
BTW, if we had an alternate assignment operator (perhaps ':=') that would make
flag=1 an expression when written as flag := 1, then we could write
<?py flag:=1?> instead of <?py eval(compile('flag=1','','exec'))?>
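
To make the difference between the two spellings concrete, here is a small
sketch run against an explicit namespace dict (ns is just an illustrative
stand-in for the <?py ...?> execution context). It assumes an interpreter that
accepts := as an assignment expression, which Python 3.8 and later do; those
interpreters want the expression parenthesized when it is handed to eval.

```python
# Hypothetical namespace standing in for the <?py ...?> execution context.
ns = {}

# Statement form: has to be compiled in 'exec' mode first;
# eval of the resulting code object returns None.
stmt_result = eval(compile('flag=1', '', 'exec'), ns)

# Expression form: assigns and also yields the assigned value.
expr_result = eval('(other_flag := 1)', ns)
```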

There are side effect and execution environment issues to think about, but I think
the general mechanism could be very powerful, and ways could be created to configure
its operation via site.py etc.

[1] I am proposing a special q'...' string, very like an r'...' string, except that
the string in quotes following the q is interpreted as a MIME-style arbitrary delimiter
string, and the value of the whole q'delim'...delim representation is exactly
the characters between the delimiters. E.g., you could write

    assert q'<-=delim=->'content here<-=delim=-> == 'content here' # this would be true

without getting an error.

Note the lack of quotes around the final delimiter string, since it itself is
the final delimiter. This can also be used to solve the final unescaped
backslash problem for quoting windows paths:

    q'|'c:\foo\bar\|

Also note nestability, assuming you guarantee unique delimiter strings:

    q'<::outside::>'q'|'c:\foo\bar\|<::outside::>
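
A minimal sketch of how such a literal might be scanned, just to show the
delimiter logic on the three examples above. parse_q_string is a made-up name,
and the sketch makes the simplifying assumption that the delimiter contains no
single quote, so it ignores the raw-string escape rules inside the quotes.

```python
def parse_q_string(text, start=0):
    """Scan q'delim'content delim at text[start:]; return (content, end_index)."""
    if not text.startswith("q'", start):
        raise ValueError("expected q' at position %d" % start)
    delim_start = start + 2
    delim_end = text.find("'", delim_start)
    if delim_end == -1:
        raise ValueError("unterminated delimiter string")
    delim = text[delim_start:delim_end]
    content_start = delim_end + 1
    if delim == "":
        # Null delimiter: content runs to the end of the container.
        return text[content_start:], len(text)
    content_end = text.find(delim, content_start)
    if content_end == -1:
        raise ValueError("closing delimiter %r not found" % delim)
    return text[content_start:content_end], content_end + len(delim)

# The examples from the text, held in raw strings so the backslashes survive.
simple, _ = parse_q_string("q'<-=delim=->'content here<-=delim=->")
path, _ = parse_q_string(r"q'|'c:\foo\bar\|")

# Nesting: the outer content is itself a q string, scanned in a second pass.
outer, _ = parse_q_string(r"q'<::outside::>'q'|'c:\foo\bar\|<::outside::>")
inner, _ = parse_q_string(outer)
```

Because the content is never inspected for anything but the closing delimiter,
the trailing backslash in the path needs no special treatment.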

A null q delimiter could be defined to imply delimiting by the end of the file or
other representation container. I.e., q''<-- content up to EOF -->

Escapes are recognized according to raw string rules inside the quotes of the
q delimiter string, so it has a final backslash problem, but that shouldn't be
too hard to live with.

q'delim' ... delim should be able to contain *anything*, including <? and """ and #... and
r'whatever', etc, but <? ?> should perhaps not take precedence over ordinary string
and comment contexts. You could argue both ways. Also, a different target than py in the
xxx slot of <?xxx ...?> should act per xml PI specs, probably.

A q string could theoretically allow putting unescaped arbitrary binary data in
a source file, though many editors would have problems dealing with it. Even so,
there might be some use for that.

Thinking about it, <? ..?> and q'delim'...delim might be worth separate PEPs
irrespective of PEP 263 relevance? Opinions?

(I decided to retitle this post, since that is the actual focus here).

Regards,
Bengt Richter



