regular expressions

J. Cliff Dyer jcd at sdf.lonestar.org
Wed Nov 7 07:35:27 EST 2007


Paul McGuire wrote:
> On Nov 6, 11:07 am, "J. Clifford Dyer" <j... at sdf.lonestar.org> wrote:
>   
>> On Tue, Nov 06, 2007 at 08:49:33AM -0800, krishnamanenia... at gmail.com wrote regarding regular expressions:
>>
>>
>>
>>     
>>> hi i am looking for pattern in regular expreesion that replaces
>>> anything starting with and betweeen http:// until /
>>> likehttp://www.start.com/startservice/yellow/fdhttp://helo/abcdwill
>>> be replaced as
>>> p/startservice/yellow/ fdp/abcd
>>>       
>> You don't need regular expressions to do that.  Look into the methods that strings have.  Look at slicing. Look at len.  Keep your code readable for future generations.
>>
>> Py>>> help(str)
>> Py>>> dir(str)
>> Py>>> help(str.startswith)
>>
>> Cheers,
>> Cliff
>>     
>
> Look again at the sample input.  Some of the OP's replacement targets
> are not at the beginning of a word, so str.startswith won't be much
> help.
>
> Here are 2 solutions, one using re, one using pyparsing.
>
> -- Paul
>
>
> instr = """
> anything starting with and betweeen "http://" until "/"
> like http://www.start.com/startservice/yellow/ fdhttp://helo/abcd
> will
> be replaced as
> """
>
> REPLACE_STRING = "p"
>
> # an re solution
> import re
> print(re.sub("http://[^/]*", REPLACE_STRING, instr))
>
>
> # a pyparsing solution - with handling of target strings inside quotes
> from pyparsing import SkipTo, replaceWith, quotedString
>
> replPattern = "http://" + SkipTo("/")
> replPattern.setParseAction( replaceWith(REPLACE_STRING) )
> replPattern.ignore(quotedString)
>
> print(replPattern.transformString(instr))
>
>
> Prints:
>
> anything starting with and betweeen "p/"
> like p/startservice/yellow/ fdp/abcd will
> be replaced as
>
>
> anything starting with and betweeen "http://" until "/"
> like p/startservice/yellow/ fdp/abcd will
> be replaced as
>
>   
Interesting.  In both of my email clients (Thunderbird and mutt), the
URLs do show up at the beginning of words, but in your reply they don't.
I wonder if there's some funky Unicode space that your computer isn't
rendering, or if it's something on my end.  There were definitely spaces
in his email as it appears on my computer.
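
One way to test the "funky space" theory is to scan the raw text for
whitespace or invisible format characters other than the ordinary ASCII
ones.  This is only a rough sketch: the helper name odd_spaces and the
sample string are made up for illustration, and the sample deliberately
plants a no-break space and a zero-width space, since the characters (if
any) in the original message are unknown.

```python
import unicodedata

def odd_spaces(text):
    """Return (index, code point, Unicode name) for each whitespace or
    invisible format character other than plain space, tab, CR and LF."""
    found = []
    for i, ch in enumerate(text):
        if ch in " \t\r\n":
            continue  # ordinary ASCII whitespace is not suspicious
        # isspace() catches e.g. NO-BREAK SPACE; category "Cf" catches
        # invisible format characters such as ZERO WIDTH SPACE.
        if ch.isspace() or unicodedata.category(ch) == "Cf":
            found.append((i, "U+%04X" % ord(ch),
                          unicodedata.name(ch, "UNKNOWN")))
    return found

# Hypothetical sample, not the OP's actual bytes.
sample = ("like\u00a0http://www.start.com/startservice/yellow/ "
          "fd\u200bhttp://helo/abcd")

for hit in odd_spaces(sample):
    print(hit)
```

An empty result would suggest the spaces were lost (or added) somewhere
in transit rather than merely going unrendered.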

But if there aren't any spaces, s.startswith() is clearly not the way to go.
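
To make the difference concrete, here is a small sketch comparing a
word-by-word approach built on str.startswith with the re.sub call from
Paul's reply.  The function name replace_by_words and the two sample
strings are illustrative assumptions, not anything from the original
posts.

```python
import re

def replace_by_words(text, prefix="http://", repl="p"):
    """Split on whitespace and rewrite only words that start with prefix."""
    out = []
    for word in text.split():
        if word.startswith(prefix):
            # Drop the host name; keep the first "/" and everything after it.
            host, slash, rest = word[len(prefix):].partition("/")
            word = repl + slash + rest
        out.append(word)
    return " ".join(out)

# Hypothetical inputs: one with a space before the second URL, one without.
spaced = "like http://www.start.com/startservice/yellow/ fd http://helo/abcd will"
fused = "like http://www.start.com/startservice/yellow/ fdhttp://helo/abcd will"

print(replace_by_words(spaced))             # both URLs rewritten
print(replace_by_words(fused))              # "fdhttp://helo/abcd" is missed
print(re.sub("http://[^/]*", "p", fused))   # regex handles the fused case
```

With the space present, plain string methods are enough.  Without it,
the second URL never starts a word, so only the regular expression (or
the pyparsing version) catches it.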

