raw strings
Duncan Booth
duncan at rcp.co.uk
Fri Oct 11 10:00:41 EDT 2002
mis6 at pitt.edu (Michele Simionato) wrote in
news:2259b0e2.0210110528.449ce434 at posting.google.com:
> Suppose for instance I want to substitute regexp1 with regexp2 in a
> text: in sed or perl I would give a command like
>
> s/regexp1/regexp2/
... where regexp1 is a regular expression and regexp2 is a string.
>
> In Python I must write
>
> import re
> re.compile(r'regexp1').sub(r'regexp2',text)
You could try writing re.sub(regexp1, replacement, string), or using
your terminology:
re.sub(r'regexp1', r'regexp2', text)
where regexp2 is not a regular expression.
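As a minimal sketch of that call (the sample text and pattern here are made up for illustration), the first argument is a regular expression and the second is an ordinary replacement string in which only backreferences such as \1 are interpreted:

```python
import re

# Hypothetical sample data, not from the original question.
text = "price: 10 USD, 25 USD"

# The pattern is a regular expression; the replacement is a plain string
# where \1 refers to the first captured group.
result = re.sub(r'(\d+) USD', r'$\1', text)

# The compiled form from the original question gives the same result.
compiled_result = re.compile(r'(\d+) USD').sub(r'$\1', text)

print(result)
```

The module-level re.sub is just a shorter spelling of the compile-then-sub form.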
<snip>
> For this to work I need a raw_string function such that
>
> raw_string('regexp')==r'regexp'
I think you have a fundamental misunderstanding of what a 'raw string'
actually is.
When Python parses your program it converts the characters representing
a string constant into a value of type str (or unicode). There are
several ways to write any given string value; for example, a
single-character string containing a newline could be written as any of:
'\n'
'\x0a'
'\012'
(Not to mention others such as '''
''' or even '\
\n\
').
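A quick way to check this for yourself, as a sketch (the variable names are only for illustration): every one of these spellings produces the same one-character value once the parser has finished with it.

```python
# Five different spellings of the same string constant.
a = '\n'
b = '\x0a'
c = '\012'
d = '''
'''
e = '\
\n\
'

variants = [a, b, c, d, e]

# After parsing, nothing distinguishes them: each is a single newline.
all_same = all(v == '\n' and len(v) == 1 for v in variants)
print(all_same)
```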
You are asking for a function which, given the string, works out how the
original constant was written and returns the string which would have
resulted if the original literal had been prefixed with an r. In
other words:
raw_string('\n') --> '\\n'
raw_string('\x0a') --> '\\x0a'
raw_string('\012') --> '\\012'
but in each case the parameter actually passed to raw_string is the same
value, so there is no way to tell which result is required. The result
for a single newline character could even be '\\\n\\n\\\n'.
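To make that concrete, here is a rough sketch. The helper escape_value is hypothetical (it is not the raw_string you asked for); it shows the most a function can do, which is to pick one canonical escaped spelling for a value it receives. It cannot recover the spelling used in the source.

```python
def escape_value(s):
    # Hypothetical helper: returns one canonical escaped form of the value,
    # using the standard 'unicode_escape' codec.
    return s.encode('unicode_escape').decode('ascii')

# The three raw literals are different two-, four- and four-character strings...
raw_forms = [r'\n', r'\x0a', r'\012']

# ...but the function only ever sees the same one-character value.
escaped = [escape_value('\n'), escape_value('\x0a'), escape_value('\012')]

print(raw_forms)
print(escaped)
```

All three calls return the same answer, because all three arguments are the same string by the time the function runs.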
> Is there somebody else who thinks like me ?
There are other people who misunderstand raw strings.