RFC PEP candidate: q'<delim>'quoted<delim> ?

Bengt Richter bokr at oz.net
Fri Mar 8 16:39:56 EST 2002


On Fri, 08 Mar 2002 14:45:35 GMT, "Terry Reedy" <tejarex at yahoo.com> wrote:

>
>"Bengt Richter" <bokr at oz.net> wrote in message
>news:a69lml$b2v$0 at 216.39.172.122...
>> On Fri, 08 Mar 2002 15:54:29 +1300, Greg Ewing
><greg at cosc.canterbury.ac.nz> wrote:
>> >Even something like q'delim' doesn't allow you to
>> >easily use completely arbitrary text, because you
>> >still have to pick *some* string that doesn't occur
>> >in the text. Although you don't have to modify the
>> >text, you do have to inspect it in order to choose
>> >a suitable delimiter.
>
>> Unless you choose one with vanishingly small probability
>> of being included, like a generated guid.
>
>This was Guido's intention when he chose triple quotes and I think he
>did excellently well.  I don't believe I had even seen """ and quite
>possibly not ''' ever before learning Python.  I do know that I found
>""" to be quite jarring, as if I had never seen it before.
>
Of course, now that Python sources abound, the probability of
encountering triple quotes is no longer vanishingly small, so
what will Guido do next, if he finds a motivation to do at the next
level what he did with triple quotes?

>> A smart editor could do this for you.
>
>A smart editor could also scan for triple quotes in pasted text and
>select one that was not found as the enclosing quotes and do the octal
>quoting fixup if both were.
>
True, but on second thought, you'd more likely use a separate utility
to generate a GUID, since it has to go look for NICs etc., and just
paste the result into the editor. VC++ comes with such a utility.
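
A minimal sketch of the GUID-as-delimiter idea, assuming the standard uuid
module is available. The helper name pick_delimiter and the sample payload are
hypothetical, and the q'<delim>'...<delim> layout is only the syntax proposed
in the subject line, not anything Python accepts. Note that uuid4() is random
rather than NIC-based, which is enough for illustration.

```python
import uuid


def pick_delimiter(text):
    """Return a random hex delimiter that does not occur in text."""
    while True:
        delim = uuid.uuid4().hex  # 32 hex digits, random (no NIC lookup)
        if delim not in text:     # collision is wildly unlikely, but check anyway
            return delim


# Hypothetical pasted text containing both kinds of triple quotes.
payload = 'any text, even with """ or \'\'\' inside'
delim = pick_delimiter(payload)

# Lay the text out in the proposed (hypothetical) q'<delim>'quoted<delim> form.
quoted = "q'%s'%s%s" % (delim, payload, delim)
print(quoted)
```

The check against the text is cheap, so the sketch keeps it even though the
whole point of a GUID is that the loop should practically never repeat.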

My thought of pasting arbitrary binary octets into a quoted context
potentially involves multiple encodings, though: the encoding of the source,
the encoding used by the editor capturing and displaying the source, the
encoding used by the editor for the Python source, the encoding for its
display, the encoding used for Python's internal representation, and the
encoding used for Python's interactive display, to name a few. (I'd guess MvL
has thought more deeply on that than I ;-)

The thing that occurs to me is that pasting into a raw-string context might
involve a contradiction, even with current Python r'strings':

Suppose what you pasted contained a single ^G ('\x07'). What would happen
when it came time to render it on the screen? You couldn't use the normal
escapes (e.g. \x07), because in the context of r'...' that would be four
characters, not one. So how would '\x07' (a single character) be represented
if it were part of pasted text?
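
As a small illustration of the rendering side of that question, here is a
sketch assuming only the built-in repr(). The variable names are made up for
the example. It suggests that the display form is always a non-raw literal
with escapes, whatever notation was used to write the string.

```python
# A string holding one real control character (^G) between two letters.
with_bel = "a" + chr(7) + "b"

# A raw literal: backslash, x, 0, 7 are four ordinary characters.
four_chars = r'\x07'

# repr() never produces an r'...' form; it escapes what cannot be shown as-is.
shown_bel = repr(with_bel)
shown_raw = repr(four_chars)

print(len(with_bel), shown_bel)
print(len(four_chars), shown_raw)
```

So the "raw" property seems to belong to the notation in the source text, not
to the resulting string object, and the display side does not try to preserve
it.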

I guess I ought to try eval("r'\x07'") and see what happens ;-)

 >>> eval("r'\x07'")
 '\x07'
 >>> list(eval("r'\x07'"))
 ['\x07']
 >>> list("r'\x07'")
 ['r', "'", '\x07', "'"]
 >>> list(r'\x07')
 ['\\', 'x', '0', '7']

Is that correct? Shouldn't the expression r'\x07' return 4 characters
as it does if you list them with list(r'\x07')?

I.e., shouldn't list(eval("r'\x07'")) return the same as list(r'\x07'),
and shouldn't eval("r'\x07'") raise an illegal-representation (bad
raw-string syntax) exception?

Would someone explain this:

 >>> eval("list(r'\x07')")
 ['\x07']
 >>> list(r'\x07')
 ['\\', 'x', '0', '7']

(Python 2.2 (#28, Dec 21 2001, 12:21:22) [MSC 32 bit (Intel)] on win32)

also:
 Python 1.5.2 (#1, May 28 2000, 18:04:10)  [GCC egcs-2.91.66 19990314/Linux (egcs
 - on linux2
 Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
 >>> eval("list(r'\x07')")
 ['\007']
 >>> list(r'\x07')
 ['\\', 'x', '0', '7']       

Pretty consistent across versions. What's the difference between evaluating
the expression directly at the interactive prompt and evaluating it through eval()?
(I guess we have a new subject ;-)
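
A sketch of what appears to be going on, using only built-ins (the variable
names are made up for the example): the argument to eval() is itself an
ordinary, non-raw string literal, so \x07 is already collapsed to one
character before eval() sees the text. Making the outer literal raw, or
doubling the backslash, keeps the four characters.

```python
# Outer literal is NOT raw: the parser turns \x07 into one BEL character
# first, so eval() receives the source text  r'<BEL>'.
outer_plain = "r'\x07'"

# Outer literal IS raw: the backslash survives, so eval() receives the
# source text  r'\x07'  exactly as typed at the prompt.
outer_raw = r"r'\x07'"

print(len(outer_plain), list(eval(outer_plain)))
print(len(outer_raw), list(eval(outer_raw)))

# Doubling the backslash in a non-raw outer literal has the same effect.
print(eval("list(r'\\x07')") == list(r'\x07'))
```

If that reading is right, there is no difference between interactive and
programmed evaluation as such; the two tests simply hand different source
text to the raw-string parser.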

Regards,
Bengt Richter






