question on regular expressions

Michael Fuhr mfuhr at fuhr.org
Fri Dec 3 13:31:18 EST 2004


Darren Dale <dd55 at cornell.edu> writes:

> I'm stuck. I'm trying to make this:
>
> file://C:%5Cfolder1%5Cfolder2%5Cmydoc1.pdf,file://C
> %5Cfolderx%5Cfoldery%5Cmydoc2.pdf
>
> (no linebreaks) look like this:
>
> ./mydoc1.pdf,./mydoc2.pdf
>
> my regular expression abilities are dismal.

This works for the example string you gave:

newstring = re.sub(r'[^,]*%5[Cc]', './', examplestring)

re.sub replaces each run of zero or more non-comma characters that
is followed by '%5C' or '%5c' with './'.  Because [^,]* is greedy,
each match extends to the last '%5C' before the next comma or the
end of the string, so the whole directory part is removed and only
the file name is left.
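
To see the substitution at work, here is a small self-contained
sketch.  The value of examplestring is the string quoted in the
question, joined onto one line exactly as it was posted; the variable
names are only for illustration.

```python
import re

# The example from the question, joined onto a single line as quoted.
examplestring = ('file://C:%5Cfolder1%5Cfolder2%5Cmydoc1.pdf,'
                 'file://C%5Cfolderx%5Cfoldery%5Cmydoc2.pdf')

# [^,]* greedily consumes everything up to the next comma, then
# backtracks to the last '%5C' (or '%5c') within that comma-free run.
newstring = re.sub(r'[^,]*%5[Cc]', './', examplestring)

print(newstring)  # ./mydoc1.pdf,./mydoc2.pdf
```

Note that the pattern relies on commas appearing only between entries;
a comma inside a file name would split the match.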

Regular expressions aren't the only way to do what you want.  Python
has standard modules for parsing URLs and file paths -- take a look
at urlparse, urllib/urllib2, and os.path.
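
As a rough sketch of that approach, the following decodes the
percent-escapes and then takes the last path component.  It assumes
a current Python 3, where unquote lives in urllib.parse, and it uses
ntpath (the Windows flavour of os.path) so that backslashes are
treated as separators on any platform.  The helper name to_relative
is made up for this example.

```python
import ntpath
from urllib.parse import unquote


def to_relative(urls):
    """Hypothetical helper: map 'file://C:%5Cdir%5Cname.pdf,...' to './name.pdf,...'."""
    names = []
    # Assumes commas only separate entries and never occur in file names.
    for url in urls.split(','):
        path = unquote(url)              # '%5C' becomes a backslash
        names.append('./' + ntpath.basename(path))
    return ','.join(names)


example = ('file://C:%5Cfolder1%5Cfolder2%5Cmydoc1.pdf,'
           'file://C%5Cfolderx%5Cfoldery%5Cmydoc2.pdf')
print(to_relative(example))  # ./mydoc1.pdf,./mydoc2.pdf
```

This is longer than the one-line regular expression, but it does not
depend on how the backslash happens to be escaped in the URL.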

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/


