[Tutor] Raw string

Tue Jul 22 04:41:03 CEST 2008

"Neven Gorsic" <neven.gorsic at gmail.com> wrote in message 
news:8acd28da0807210142j5c599a41waf3b9ae10d3cf91c at mail.gmail.com...
> On Mon, Jul 21, 2008 at 9:44 AM, Monika Jisswel
> <monjissvel at googlemail.com> wrote:
>> Instead of s = 'e:\mm tests\1. exp files\5.MOC-1012.exp'
>> try using: s = r'e:\mm tests\1. exp files\5.MOC-1012.exp'.replace(
>> '\\', '\\\\')
>> For me, here is what it gives:
>>
>>>>> s = r'e:\mm tests\1. exp files\5.MOC-1012.exp'.replace('\\', '\\\\')
>>>>> print s
>> e:\\mm tests\\1. exp files\\5.MOC-1012.exp
>>>>> s.split('\\\\')
>> ['e:', 'mm tests', '1. exp files', '5.MOC-1012.exp']
>>
>>
>> Why \\ if you only have one \ ? You need to escape it because it is a
>> special character to the Python interpreter.
>>
>>
>> The r prefix is important in the expression s = r'e:\mm tests\1. exp
>> files\5.MOC-1012.exp'.replace('\\', '\\\\') because it tells the
>> interpreter to take the string as RAW and thus leave it unchanged even if
>> it contains special characters that would otherwise be treated specially.
>>
>> Now, to use the r or not to use it? When to use it? How to use it? I
>> always use it and have always had the result I expected, so I would
>> suggest you always use it too, especially with the re module, unless of
>> course the result does not satisfy you,
>>
>> so this code (using r this time works too) :
>>
>>>>> s = r'e:\mm tests\1. exp files\5.MOC-1012.exp'.replace(r'\\', r'\\\\')
>>>>> print s
>> e:\\mm tests\\1. exp files\\5.MOC-1012.exp
>>>>> s.split('\\\\')
>> ['e:', 'mm tests', '1. exp files', '5.MOC-1012.exp']
>>
>
> Thanks,
> I am aware of the goodies that raw strings offer, but my question was how
> to use them with a variable that already contains a string.  :)

What you seem to be missing is that a raw string is just one way to write a 
string literal in source code.  If you already have a string read from a 
file, it has already been created, and there is nothing left for "raw" to 
apply to.  Maybe you are confusing the "representation" of a string with 
the "value" of the string.

    >>> s = 'c:\\abc\\123.txt'
    >>> t = r'c:\abc\123.txt'
    >>> s==t
    True

Two ways to *create* the *same* string.
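
As a small sketch of the same point (the variable names are only 
illustrative, and the code is written with print() so that it also runs on 
Python 3): the r prefix affects how the literal is parsed, not what kind of 
object comes out.

```python
# Two spellings of the same literal; the names are illustrative only.
escaped = 'c:\\abc\\123.txt'   # backslashes doubled in an ordinary literal
raw = r'c:\abc\123.txt'        # same characters written with the r prefix

same_value = (escaped == raw)              # the values compare equal
same_type = (type(escaped) is type(raw))   # both are ordinary str objects
length = len(escaped)                      # each backslash counts once

print(same_value, same_type, length)
```

Once the string exists there is no "raw" flag attached to it, which is why 
there is nothing to convert when the string comes from a variable.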

Note there are two ways to *display* a string as well.  print displays the 
actual value.  If you just evaluate the expression at the interactive prompt 
without print, you get the repr() of the string: a representation that can 
be pasted into code to create the same string.

    >>> print s
    c:\abc\123.txt
    >>> print t
    c:\abc\123.txt
    >>> s
    'c:\\abc\\123.txt'
    >>> t
    'c:\\abc\\123.txt'
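
The two displays can also be requested explicitly with str() and repr(). 
The following is a minimal sketch, assuming Python 3 syntax for print(); 
ast.literal_eval is used here only to demonstrate that the representation 
can be turned back into the same string.

```python
import ast

path = r'c:\abc\123.txt'

value_text = str(path)     # what print shows: the actual characters
repr_text = repr(path)     # what the interactive prompt echoes

# The representation is valid source code for the same string.
round_trip = ast.literal_eval(repr_text)

print(value_text)
print(repr_text)
print(round_trip == path)
```

The representation is longer than the value because it includes the quotes 
and the doubled backslashes; the string itself still holds single 
backslashes.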

Note what happens if you use single backslashes without the r prefix to 
create the string:

    >>> s = 'c:\abc\123.txt'
    >>> s
    'c:\x07bcS.txt'
    >>> print s
    c:bcS.txt

Because it wasn't a 'raw' string, the \a was interpreted as the 
non-printable BEL control character.  \123 was interpreted as an octal 
escape (octal 123 is decimal 83), which is the code for capital S.  The 
representation of the string contains \x07 for the BEL control character: 
since it is unprintable, the representation displays it as a hexadecimal 
escape code.
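
To check the escape handling piece by piece, here is a short sketch (names 
are illustrative; print() is used so it also runs on Python 3):

```python
# Deliberately NOT a raw string, so the escapes are processed.
mangled = 'c:\abc\123.txt'

bel = '\a'        # the BEL control character
octal = '\123'    # an octal escape

print(ord(bel))          # numeric code of BEL
print(ord(octal))        # numeric code produced by \123
print(len(mangled))      # shorter than the 14 characters typed
print(repr(mangled))     # shows \x07 where \a used to be
```

Both escapes collapse to a single character each, which is why the 
backslashes vanish from the resulting string.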

When you read a string from a file, the actual characters in the file end up 
in the string.  No backslash interpretation is performed.  So in your 
example, just read in the file and perform your operations (if the file ends 
with a newline, strip it off first with .strip()):

sample.txt contains:

    c:\abc\123.txt

Code:

    >>> import os
    >>> pathname = open('sample.txt').read()
    >>> pathname
    'c:\\abc\\123.txt'
    >>> print pathname
    c:\abc\123.txt
    >>> print os.path.dirname(pathname)
    c:\abc
    >>> print os.path.basename(pathname)
    123.txt
    >>> os.path.dirname(pathname)
    'c:\\abc'
    >>> os.path.basename(pathname)
    '123.txt'
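
The same steps as a self-contained sketch that can be run anywhere. The 
helper name read_path_from_file and the temporary file are assumptions made 
for the example, and ntpath is used instead of os.path so that backslashes 
are treated as separators even when the code is not run on Windows.

```python
import ntpath
import os
import tempfile


def read_path_from_file(filename):
    """Return the file's contents with surrounding whitespace removed."""
    with open(filename) as handle:
        return handle.read().strip()


# Create a stand-in for sample.txt holding one Windows-style path.
fd, sample = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as handle:
    handle.write('c:\\abc\\123.txt\n')

pathname = read_path_from_file(sample)
os.remove(sample)

# The backslashes arrive untouched; no escape processing happens on read.
directory = ntpath.dirname(pathname)
filename = ntpath.basename(pathname)

print(pathname)
print(directory)
print(filename)
```

Nothing in the reading step needs the r prefix, because the prefix only 
matters for literals typed into source code.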

Does that clear up the confusion?

--Mark