re sub help
Bengt Richter
bokr at oz.net
Sat Nov 5 19:06:09 EST 2005
On 4 Nov 2005 22:49:03 -0800, s99999999s2003 at yahoo.com wrote:
>hi
>
>i have a string :
>a =
>"this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
>
>inside the string, there are "\n". I don't want to substitute the '\n'
>in between
>the [startdelim] and [enddelim] to ''. I only want to get rid of the
>'\n' everywhere else.
>
>i have read the tutorial and came across negative/positive lookahead
>and i think it can solve the problem.but am confused on how to use it.
>anyone can give me some advice? or is there better way other than
>lookaheads ...thanks..
>
Sometimes splitting and processing the pieces selectively can be a solution. E.g.,
if the delimiters are properly paired, splitting on them (with parens in the pattern
so the matched delimiters are kept) should give you a list whose items repeat modulo 4 as
<"everywhere else" as you said><first delim><between><second delim> ...
>>> a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
>>> import re
>>> splitter = re.compile(r'(?s)(\[startdelim\]|\[enddelim\])')
>>> sp = splitter.split(a)
>>> sp
['this\nis\na\nsentence', '[startdelim]', 'this\nis\nanother', '[enddelim]', 'this\nis\n']
>>> ''.join([(lambda s:s, lambda s:s.replace('\n',''))[not i%4](s) for i,s in enumerate(sp)])
'thisisasentence[startdelim]this\nis\nanother[enddelim]thisis'
>>> print ''.join([(lambda s:s, lambda s:s.replace('\n',''))[not i%4](s) for i,s in enumerate(sp)])
thisisasentence[startdelim]this
is
another[enddelim]thisis
I haven't checked for corner cases, but HTH
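One corner case that is easy to check is whether the delimiters really are paired.
A minimal sketch, assuming no nesting (the helper name delimiters_paired is just
made up for illustration):

```python
import re

START, END = '[startdelim]', '[enddelim]'
# Same capturing pattern as above, so split() keeps the delimiters.
splitter = re.compile(r'(\[startdelim\]|\[enddelim\])')

def delimiters_paired(text):
    """Return True if the delimiters alternate start, end, start, end, ..."""
    found = splitter.split(text)[1::2]          # odd indices hold the delimiters
    expected = [START, END] * (len(found) // 2)
    # An odd count or a wrong order makes the two lists differ.
    return found == expected
```

If this returns False, the modulo 4 pattern can't be relied on and the input needs
a closer look first.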
Maybe I'll try two pairs of delimiters:
>>> a += "2222\n33\n4\n55555555[startdelim]6666\n77\n8888888[enddelim]9999\n00\n"
>>> sp = splitter.split(a)
>>> print ''.join([(lambda s:s, lambda s:s.replace('\n',''))[not i%4](s) for i,s in enumerate(sp)])
thisisasentence[startdelim]this
is
another[enddelim]thisis222233455555555[startdelim]6666
77
8888888[enddelim]999900
which came from
>>> sp
['this\nis\na\nsentence', '[startdelim]', 'this\nis\nanother', '[enddelim]', 'this\nis\n2222\n33\n4\n55555555', '[startdelim]', '6666\n77\n8888888', '[enddelim]', '9999\n00\n']
The replacing was done only for the items where not i%4 was true:
>>> for i,s in enumerate(sp): print '%6s: %r'%(not i%4,s)
...
  True: 'this\nis\na\nsentence'
 False: '[startdelim]'
 False: 'this\nis\nanother'
 False: '[enddelim]'
  True: 'this\nis\n2222\n33\n4\n55555555'
 False: '[startdelim]'
 False: '6666\n77\n8888888'
 False: '[enddelim]'
  True: '9999\n00\n'
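The same idea could be wrapped up as a function for reuse. This is only a sketch:
the name strip_newlines_outside and its parameters are invented here, and it still
assumes properly paired, non-nested delimiters.

```python
import re

def strip_newlines_outside(text, start='[startdelim]', end='[enddelim]'):
    """Remove newlines everywhere except between the start and end delimiters."""
    # re.escape() protects the brackets; the parens keep the delimiters in the result.
    splitter = re.compile('(%s|%s)' % (re.escape(start), re.escape(end)))
    pieces = splitter.split(text)
    # Items 0, 4, 8, ... lie outside the delimiters; everything else is kept as is.
    return ''.join(
        [piece.replace('\n', '') if i % 4 == 0 else piece
         for i, piece in enumerate(pieces)]
    )
```

Calling strip_newlines_outside(a) on the strings above should give the same results
as the one-liner with the two lambdas.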
Regards,
Bengt Richter