splitting perl-style find/replace regexp using python

James Stroud jstroud at mbi.ucla.edu
Thu Mar 1 05:09:50 EST 2007


John Pye wrote:
> Hi all
> 
> I have a file with a bunch of perl regular expressions like so:
> 
> /(^|[\s\(])\*([^ ].*?[^ ])\*([\s\)\.\,\:\;\!\?]|$)/$1'''$2'''$3/ # bold
> /(^|[\s\(])\_\_([^ ].*?[^ ])\_\_([\s\)\.\,\:\;\!\?]|$)/$1''<b>$2<\/b>''$3/ # italic bold
> /(^|[\s\(])\_([^ ].*?[^ ])\_([\s\)\.\,\:\;\!\?]|$)/$1''$2''$3/ # italic
> 
> These are all find/replace expressions delimited as '/search/replace/
> # comment' where 'search' is the regular expression we're searching
> for and 'replace' is the replacement expression.
> 
> Is there an easy and general way that I can split these perl-style
> find-and-replace expressions into something I can use with Python, e.g.
> re.sub('search','replace',str) ?
> 
> I thought generally it would be good enough to split on '/' but as you
> see the <\/b> messes that up. I really don't want to learn perl
> here :-)
> 
> Cheers
> JP
> 

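This could be more general. In principle a perl regex could end with a
literal backslash, e.g. "\\/", which a simple "slash not preceded by a
backslash" test would misread as an escaped slash, but I'm guessing that
won't happen here.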
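
For the more general case, one option is to walk the string and treat
every backslash plus the character after it as a single unit, so that
"\\/" is read as an escaped backslash followed by a real delimiter. The
following is only a sketch of that idea; the name split_unescaped is made
up for illustration, and it assumes the delimiter is a single character.

```python
def split_unescaped(text, delim='/'):
    """Split text on delim, skipping any delim preceded by an
    unescaped backslash. Escape sequences are kept as-is in the pieces.
    (Hypothetical helper, not part of the re module.)"""
    parts, current = [], []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == '\\' and i + 1 < len(text):
            # Keep the backslash and the escaped character together.
            current.append(text[i:i + 2])
            i += 2
        elif ch == delim:
            # An unescaped delimiter ends the current piece.
            parts.append(''.join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append(''.join(current))
    return parts


print(split_unescaped(r'/a\/b/c/'))
print(split_unescaped(r'/x\\/y/'))
```

The first call keeps the escaped slash inside the 'a\/b' piece; the
second splits after the doubled backslash.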

py> for p in perlish:
...   print p
...
/(^|[\s\(])\*([^ ].*?[^ ])\*([\s\)\.\,\:\;\!\?]|$)/$1'''$2'''$3/
/(^|[\s\(])\_\_([^ ].*?[^ ])\_\_([\s\)\.\,\:\;\!\?]|$)/$1''<b>$2<\/b>''$3/
/(^|[\s\(])\_([^ ].*?[^ ])\_([\s\)\.\,\:\;\!\?]|$)/$1''$2''$3/
py> import re
py> splitter = re.compile(r'(?<!\\)/')
py> for p in perlish:
...   print splitter.split(p)
...
['', '(^|[\\s\\(])\\*([^ ].*?[^ ])\\*([\\s\\)\\.\\,\\:\\;\\!\\?]|$)', "$1'''$2'''$3", '']
['', '(^|[\\s\\(])\\_\\_([^ ].*?[^ ])\\_\\_([\\s\\)\\.\\,\\:\\;\\!\\?]|$)', "$1''<b>$2<\\/b>''$3", '']
['', '(^|[\\s\\(])\\_([^ ].*?[^ ])\\_([\\s\\)\\.\\,\\:\\;\\!\\?]|$)', "$1''$2''$3", '']

(I'm hoping this doesn't wrap!)
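
To get from the split pieces to something re.sub will accept, the
replacement part still needs its perl-style $1, $2 rewritten as Python
group references and its "\/" unescaped. A rough sketch follows; the
function name perl_to_python is made up, and it assumes each line looks
like '/search/replace/ # comment', that the search part is also valid
Python regex syntax, and that no regex ends in a literal backslash.

```python
import re

_DELIM = re.compile(r'(?<!\\)/')   # a "/" not preceded by a backslash
_GROUP = re.compile(r'\$(\d+)')    # perl-style $1, $2, ...


def perl_to_python(line):
    """Return (pattern, replacement) for use with re.sub.
    (Hypothetical helper; assumes '/search/replace/ # comment' input.)"""
    # At most 3 splits, so slashes inside the trailing comment are ignored.
    parts = _DELIM.split(line, maxsplit=3)
    if len(parts) != 4 or parts[0].strip():
        raise ValueError('not of the form /search/replace/: %r' % line)
    search, replace = parts[1], parts[2]
    # "\/" only needed escaping because "/" was the perl delimiter.
    replace = replace.replace('\\/', '/')
    # $1 -> \g<1>, the unambiguous Python spelling of a group reference.
    replace = _GROUP.sub(r'\\g<\1>', replace)
    return search, replace


line = r"/(^|[\s\(])\*([^ ].*?[^ ])\*([\s\)\.\,\:\;\!\?]|$)/$1'''$2'''$3/ # bold"
pattern, repl = perl_to_python(line)
print(re.sub(pattern, repl, "a *bold* word"))   # a '''bold''' word
```

The search part is passed through untouched, since escapes like "\_" and
"\/" are accepted by Python's re as plain literals.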

James


