raw strings

Duncan Booth duncan at rcp.co.uk
Fri Oct 11 10:00:41 EDT 2002


mis6 at pitt.edu (Michele Simionato) wrote in
news:2259b0e2.0210110528.449ce434 at posting.google.com: 

> Suppose for instance I want to substitute regexp1 with regexp2 in a
> text: in sed or perl I would give a command like
> 
> s/regexp1/regexp2/

... where regexp1 is a regular expression and regexp2 is a string.

> 
> In Python I must write 
> 
> import re
> re.compile(r'regexp1').sub(r'regexp2',text)

You could try writing re.sub(regexp1, replacement, string), or using
your terminology:
   re.sub(r'regexp1', r'regexp2', text)
where regexp2 is not a regular expression.
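
A minimal sketch of that call, just to show the two roles of the
arguments. The pattern, the replacement and the sample text are made up
for the illustration; they are not from the original question.

```python
import re

# Hypothetical sample data, chosen only for illustration.
text = "price: 42 dollars"

# The first argument is a regular expression. The second is a replacement
# string, in which \1 refers back to the first group of the match.
result = re.sub(r'(\d+) dollars', r'\1 USD', text)
print(result)  # price: 42 USD
```

The r prefix on both literals only keeps the backslashes in \d and \1
from being interpreted by Python before re sees them.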

<snip>

> For this to work I need a raw_string function such that
> 
> raw_string('regexp')==r'regexp' 

I think you have a fundamental misunderstanding of what a 'raw string'
actually is.

When Python parses your program it converts the characters representing
a string constant into a value of type str (or unicode). There are
several ways to write any given string value; for example, a
single-character string containing a newline could be written as any of:
    '\n'
    '\x0a'
    '\012'
(Not to mention others such as '''
''' or even '\
\n\
').
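
This is easy to check in the interpreter. The sketch below simply
collects those spellings in a list (the name spellings is mine) and
compares each one to '\n':

```python
# Five different source spellings of a one-character string holding a newline.
spellings = [
    '\n',
    '\x0a',
    '\012',
    '''
''',
    '\
\n\
',
]

# Every spelling produces the same value once the source has been parsed.
assert all(s == '\n' for s in spellings)
print([len(s) for s in spellings])  # [1, 1, 1, 1, 1]
```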

You are asking for a function which, given the string, works out how the
original constant was written and returns the string which would have
resulted if the original constant had been prefixed with r. In
other words:

    raw_string('\n') --> '\\n'
    raw_string('\x0a') --> '\\x0a'
    raw_string('\012') --> '\\012'

but in each case the parameter actually passed to raw_string is the same
value, so there is no way to tell which result is required. The result
for a single newline character could even be '\\\n\\n\\\n' (from the
last spelling above).
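
A short sketch of why this cannot work. The helper show is hypothetical
and stands in for any function you might write: all it ever receives is
the finished value, never the source text of the literal.

```python
def show(value):
    # Hypothetical helper: it only sees the string value, not how the
    # literal was spelled in the source file.
    return repr(value)

# Three different spellings, one value: the function cannot tell them apart.
assert show('\n') == show('\x0a') == show('\012') == "'\\n'"

# The r prefix belongs to the literal in the source, and each raw spelling
# yields a different string.
assert r'\n' == '\\n'
assert r'\x0a' == '\\x0a'
assert r'\012' == '\\012'
assert len({r'\n', r'\x0a', r'\012'}) == 3
```

So r'...' is a way of writing a literal, not a kind of string, and there
is nothing left in the value for a function to recover.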


> Is there somebody else who thinks like me ?

There are other people who misunderstand raw strings.
