question on regular expressions
Darren Dale
dd55 at cornell.edu
Fri Dec 3 13:39:32 EST 2004
Michael Fuhr wrote:
> Darren Dale <dd55 at cornell.edu> writes:
>
>> I'm stuck. I'm trying to make this:
>>
>> file://C:%5Cfolder1%5Cfolder2%5Cmydoc1.pdf,file://C
>> %5Cfolderx%5Cfoldery%5Cmydoc2.pdf
>>
>> (no linebreaks) look like this:
>>
>> ./mydoc1.pdf,./mydoc2.pdf
>>
>> my regular expression abilities are dismal.
>
> This works for the example string you gave:
>
> newstring = re.sub(r'[^,]*%5[Cc]', './', examplestring)
>
> This replaces all instances of zero or more non-commas that are
> followed by '%5C' or '%5c' with './'. Greediness causes the pattern
> to replace everything up to the last '%5C' before a comma or the
> end of the string.
>
> Regular expressions aren't the only way to do what you want. Python
> has standard modules for parsing URLs and file paths -- take a look
> at urlparse, urllib/urllib2, and os.path.
>
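A minimal sketch of the two suggestions quoted above, written for Python 3, where unquote lives in urllib.parse rather than in the older urllib module. The names examplestring, basename_from_url and alternative are illustrative assumptions. The input is the example string from the original question.

```python
import ntpath
import re
from urllib.parse import unquote

# Example string from the original question, joined without line breaks.
examplestring = ("file://C:%5Cfolder1%5Cfolder2%5Cmydoc1.pdf,"
                 "file://C%5Cfolderx%5Cfoldery%5Cmydoc2.pdf")

# Quoted approach: greedily consume non-comma characters up to the last
# '%5C' (or '%5c') in each comma-separated item and replace them with './'.
newstring = re.sub(r'[^,]*%5[Cc]', './', examplestring)


def basename_from_url(url):
    """Hypothetical helper: decode %5C to a backslash, then keep the last
    path component. ntpath is the Windows flavour of os.path and can be
    imported on any platform."""
    return './' + ntpath.basename(unquote(url))


# Standard-library alternative: split on commas and handle each URL.
alternative = ','.join(basename_from_url(u) for u in examplestring.split(','))

print(newstring)
print(alternative)
```

Both print ./mydoc1.pdf,./mydoc2.pdf for this input. The second form does not depend on regex greediness, but it assumes the items are separated by commas and that the directory separators are encoded backslashes.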
Thanks to both of you. I thought regular expressions were appropriate because
the string I gave is buried in an XML file. A more representative example is:
[...snip...]<url>file://C:%5Cfolder1%5Cfolder2%5Cmydoc1.pdf</url>[...snip...
data]<url>file://C%5Cfolderx%5Cfoldery%5Cmydoc2.pdf</url>[...snip...]
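For text shaped like this, the pattern [^,]*%5[Cc] would also swallow whatever precedes each URL, since nothing but a comma stops it. A possible adjustment, sketched below, anchors the match on the <url> tag and stops at the next '<'. The sample string is a stand-in for the snipped file contents: the <record> and <note> elements are invented for illustration, and the sketch assumes the goal is to rewrite the URLs in place.

```python
import re

# Hypothetical stand-in for the XML file; only the two <url> elements
# come from the post, the surrounding elements are made up.
sample = ("<record><url>file://C:%5Cfolder1%5Cfolder2%5Cmydoc1.pdf</url>"
          "<note>other data</note>"
          "<url>file://C%5Cfolderx%5Cfoldery%5Cmydoc2.pdf</url></record>")

# Keep the opening tag (group 1), then greedily drop everything up to the
# last '%5C' or '%5c' that appears before the next '<'.
url_prefix = re.compile(r'(<url>)[^<]*%5[Cc]')
rewritten = url_prefix.sub(r'\1./', sample)

# To collect the shortened names instead of rewriting the text:
names = re.findall(r'<url>([^<]*)</url>', rewritten)

print(rewritten)
print(names)
```

Here names comes out as ['./mydoc1.pdf', './mydoc2.pdf'], and everything outside the <url> elements is left untouched. If the file is well-formed XML, parsing it with an XML library and editing the element text would be a sturdier route than a regular expression.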