[Tutor] Raw string
Mark Tolonen
metolone+gmane at gmail.com
Tue Jul 22 04:41:03 CEST 2008
"Neven Gorsic" <neven.gorsic at gmail.com> wrote in message
news:8acd28da0807210142j5c599a41waf3b9ae10d3cf91c at mail.gmail.com...
> On Mon, Jul 21, 2008 at 9:44 AM, Monika Jisswel
> <monjissvel at googlemail.com> wrote:
>> instead of s='e:\mm tests\1. exp files\5.MOC-1012.exp'
>> try to use : s = r'e:\mm tests\1. exp files\5.MOC-1012.exp'.replace(
>> '\\', '\\\\')
>> for me here is what it gives:
>>
>>>>> s = r'e:\mm tests\1. exp files\5.MOC-1012.exp'.replace('\\', '\\\\')
>>>>> print s
>> e:\\mm tests\\1. exp files\\5.MOC-1012.exp
>>>>> s.split('\\\\')
>> ['e:', 'mm tests', '1. exp files', '5.MOC-1012.exp']
>>
>>
>> why \\ if you only have one \ ? : you need to escape it because it's a
>> special character to the python interpreter.
>>
>>
>> the r character is important in the expression s = r'e:\mm tests\1. exp
>> files\5.MOC-1012.exp'.replace('\\', '\\\\') because it tells the
>> interpreter
>> to take the string as RAW and thus leave it unchanged even if it contains
>> special characters that should actually be treated special.
>>
>> now to use the r or not to use it ? when to use it ? how to use it ? I
>> always use it ! & always had the result I expected, so I would suggest
>> you use it always too, especially with the re module, of course unless
>> the result is not satisfying you,
>>
>> so this code (using r this time works too) :
>>
>>>>> s = r'e:\mm tests\1. exp files\5.MOC-1012.exp'.replace(r'\\', r'\\\\')
>>>>> print s
>> e:\\mm tests\\1. exp files\\5.MOC-1012.exp
>>>>> s.split('\\\\')
>> ['e:', 'mm tests', '1. exp files', '5.MOC-1012.exp']
>>
>
> Thanks,
> I am aware of goodies that raw string offers, but my question was how to
> use it with variable that already contains string. :)
What it seems you don't understand is that a raw string is just one way to
write a string literal in source code. If you already have a string read
from a file, it is already created, and there is nothing left to make "raw".
Maybe you are confusing the "representation" of a string with the "value"
of the string.
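
As a rough sketch of that point (written in Python 3 syntax, unlike the
Python 2 sessions in this message; the variable names are made up for
illustration), a string built at run time can be split on backslashes
directly, with no raw prefix or extra escaping involved:

```python
# Hypothetical names; Python 3 syntax.
path_from_literal = r'e:\mm tests\1. exp files\5.MOC-1012.exp'

# Build the same value at run time, standing in for a string read from a file.
pieces = ['e:', 'mm tests', '1. exp files', '5.MOC-1012.exp']
path_from_runtime = '\\'.join(pieces)   # '\\' is ONE backslash character

# The variable already holds single backslashes, so split on one backslash.
parts = path_from_runtime.split('\\')
print(path_from_literal == path_from_runtime)
print(parts)
```

The raw prefix only matters while typing the literal; once the value is in
a variable it is an ordinary string.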
>>> s = 'c:\\abc\\123.txt'
>>> t = r'c:\abc\123.txt'
>>> s==t
True
Two ways to *create* the *same* string.
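
A small sketch of the same comparison in Python 3 syntax (an assumption on
my part; the sessions here are Python 2), which also counts the characters
to show that each backslash is stored once:

```python
# Python 3 syntax; s and t mirror the session above.
s = 'c:\\abc\\123.txt'   # ordinary literal, each backslash escaped
t = r'c:\abc\123.txt'    # raw literal, backslashes taken as typed

same_value = (s == t)
length = len(s)          # every backslash counts as a single character
print(same_value, length)
```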
Note there are two ways to *display* a string as well. print displays the
actual value. If you don't use print, you get a representation of the
string in a way that can be used to create the string in code.
>>> print s
c:\abc\123.txt
>>> print t
c:\abc\123.txt
>>> s
'c:\\abc\\123.txt'
>>> t
'c:\\abc\\123.txt'
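
The two displays correspond to the built-in str() and repr() functions. A
minimal sketch, in Python 3 syntax and with illustrative variable names:

```python
import ast

s = r'c:\abc\123.txt'

shown_by_print = str(s)    # what print displays: the actual value
shown_by_echo = repr(s)    # what the interactive prompt echoes

print(shown_by_print)
print(shown_by_echo)

# The representation is valid source code for the same string.
round_trip = ast.literal_eval(shown_by_echo)
print(round_trip == s)
```

The representation is longer than the value only because it adds quotes
and doubles each backslash for display.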
Note what happens if you use single backslashes without the r prefix to
create the string:
>>> s = 'c:\abc\123.txt'
>>> s
'c:\x07bcS.txt'
>>> print s
c:bcS.txt
Because it wasn't a 'raw' string, the \a was interpreted as the
non-printable BEL control character (character code 7). \123 was
interpreted as an octal escape; octal 123 is decimal 83, which is
capital-S. Since BEL is unprintable, the representation of the string
displayed it as the hexadecimal escape code \x07.
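
The escape handling can be checked piece by piece. This is only a sketch,
in Python 3 syntax:

```python
s = 'c:\abc\123.txt'      # NOT raw: \a and \123 are escape sequences

bel = '\a'
bel_is_x07 = (bel == '\x07' and ord(bel) == 7)   # BEL control character
octal_is_S = ('\123' == chr(0o123) == 'S')       # octal 123 == decimal 83

length = len(s)           # shorter than the 14 characters that were typed
print(bel_is_x07, octal_is_S, length)
print(repr(s))
```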
When you read a string from a file, the actual characters in the file end up
in the string. No backslash interpretation is performed. So in your
example, just read in the file and perform your operations:
sample.txt contains:
c:\abc\123.txt
Code:
>>> import os
>>> pathname = open('sample.txt').read()
>>> pathname
'c:\\abc\\123.txt'
>>> print pathname
c:\abc\123.txt
>>> print os.path.dirname(pathname)
c:\abc
>>> print os.path.basename(pathname)
123.txt
>>> os.path.dirname(pathname)
'c:\\abc'
>>> os.path.basename(pathname)
'123.txt'
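
A self-contained variant of the session above, in Python 3 syntax. Two
things here are my assumptions rather than part of the original example:
the file is written to a temporary directory first, and ntpath is used so
that backslashes are treated as separators on any platform (os.path only
does that on Windows). The strip() call guards against a trailing newline
in the file.

```python
import ntpath
import os
import tempfile

# Create the sample file so the sketch runs on its own.
tmpdir = tempfile.mkdtemp()
sample = os.path.join(tmpdir, 'sample.txt')
with open(sample, 'w') as f:
    f.write('c:\\abc\\123.txt\n')    # the file contains: c:\abc\123.txt

# Reading performs no backslash interpretation.
with open(sample) as f:
    pathname = f.read().strip()      # drop the trailing newline

dirname = ntpath.dirname(pathname)
basename = ntpath.basename(pathname)
print(pathname)
print(dirname)
print(basename)
```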
Does that clear up the confusion?
--Mark