string repr in 2.1

Robin Becker robin at jessikat.fsnet.co.uk
Tue May 29 07:42:46 EDT 2001


In article <mailman.991124742.8047.python-list at python.org>, Thomas Wouters <thomas at xs4all.net>
writes
>On Tue, May 29, 2001 at 12:47:39AM +0100, Robin Becker wrote:
>> In article <slrn9h5m4o.1hk.scarblac at pino.selwerd.nl>, Remco Gerlich
>> <scarblac at pino.selwerd.nl> writes
>
>> >Since 2.1, string repr uses hexadecimal escapes instead of octal ones.
>
>> yes I guess all those *nix tools that like octal should be whipped and
>> made to obey the malevolent dictator.
>
>Do you have tools you use to parse quoted (repr'd) Python strings that
>handle octal correctly, but don't handle \x and \n\r escape codes ? Which
>ones ? And were you aware that they were going to break sooner or later,
>just because someone can prefer 'readable' escape codes and feed it that
>instead ? :)
>
On a related note: if repr no longer emits octal escapes, do string literals still accept them?
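
A quick check along the following lines should settle the question on whichever interpreter is to hand. It is only a sketch: the `samples` table and the `octal_literals_ok` name are made up for illustration, and it assumes the change affects what repr prints rather than what the parser accepts.

```python
# Compare octal and hexadecimal spellings of the same characters.
# If the parser still accepts octal escapes, each pair compares equal.
samples = {
    "newline": ("\012", "\x0a"),
    "unit separator": ("\037", "\x1f"),
    "delete": ("\177", "\x7f"),
}

# Hypothetical flag name, used only in this sketch.
octal_literals_ok = all(octal == hexa for octal, hexa in samples.values())
print(octal_literals_ok)
```

If the flag comes out true, only the output side (repr) needs working around, not the input side.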

Also, my trial of the 2.0 code against a first attempt at 2.1-safe code shows that an re-based
approach is very slow :( Does anyone know, in simple terms, what's involved in writing a codec?

My timing test looks like this:

######################################
import sys, string
from time import time
_ESCAPELIST=256*[')']
for c, n in {'\\':'\\\\', '(':'\\(',')':'\\)'}.items():
     _ESCAPELIST[ord(c)]=n
for c in range(0,32)+range(127,256):
     n = '\\'
     for i in (6,3,0):
          n += "01234567"[(c>>i)&7]
     _ESCAPELIST[c] = n
import re
_ESCAPEPAT = re.compile(r'[\\\(\)\000-\037\177-\377]')
del i, n, c

def _ESCAPESUB(m):
     return _ESCAPELIST[ord(m.group())]

def _escape21(self,s):
     return re.sub(_ESCAPEPAT,_ESCAPESUB,s)

def _escape20(self, s):
     """PDF escapes are like Python ones, but brackets need slashes before them too.
     Use Python's repr function and chop off the quotes first"""
     s = repr(s)[1:-1]
     s = string.replace(s, '(','\(')
     s = string.replace(s, ')','\)')
     return s

SIN=['absncedfgijklmno \n \177 1\037 () \\ \\ ()',
'The quick brown fox jumped over the lazy fox',
'\000\243', '(                 )']
N=len(sys.argv)<=1 and 10000 or int(sys.argv[1])

def do_time(f,N,SIN):
     n = len(SIN)
     J = xrange(n)
     SOUT=range(n)
     t0 = time()
     for i in xrange(N):
          for j in J:
                SOUT[j] = f(None,SIN[j])
     print 'Time for %dx%d uses of %s = %.2f"' % (N,n,f,(time()-t0))
     return SOUT

SOUT20 = do_time(_escape20,N,SIN)
SOUT21 = do_time(_escape21,N,SIN)

print SOUT20
print SOUT21
print SOUT20==SOUT21
######################################
C:\Tmp>doit.py
Time for 10000x4 uses of <function _escape20 at 007E71E4> = 1.75"
Time for 10000x4 uses of <function _escape21 at 007EE8D4> = 7.80"
['absncedfgijklmno \\012 \\177 1\\037 \\(\\) \\\\ \\\\ \\(\\)', 'The quick brown fox jumped over the lazy fox', '\\000\\243', '\\(                 \\)']
['absncedfgijklmno \\012 \\177 1\\037 \\(\\) \\\\ \\\\ \\(\\)', 'The quick brown fox jumped over the lazy fox', '\\000\\243', '\\(                 \\)']
1
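
One alternative that may be worth timing against the two functions above is a table-driven pass with no regular expression at all. The sketch below is written for a current Python, where `str.translate` accepts a dict mapping code points to replacement strings. The names `_PDF_ESCAPES` and `escape_translate` are made up for illustration, and it assumes input restricted to code points below 256, as in the test data. No speed claim is made; it has not been run against the timings shown.

```python
# Build the lookup table once: code point -> replacement string.
# Backslash and both parentheses get a leading backslash.
_PDF_ESCAPES = {ord(c): "\\" + c for c in "\\()"}

# Control characters and 127..255 become three-digit octal escapes.
for code in list(range(32)) + list(range(127, 256)):
    _PDF_ESCAPES[code] = "\\%03o" % code

def escape_translate(s):
    """Escape s for a PDF string in one table-driven pass."""
    return s.translate(_PDF_ESCAPES)

sample = 'absncedfgijklmno \n \177 1\037 () \\ \\ ()'
print(escape_translate(sample))
```

Characters with no entry in the table pass through unchanged, so plain text such as the "quick brown fox" sample comes back as it went in.
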
-- 
Robin Becker


