Changing every other instance of <B> in a file
Carel Fellinger
cfelling at iae.nl
Tue Mar 27 10:20:09 EST 2001
Lars Klæboe <larskl at klassekampen.no> wrote:
...
> Blablabla <B> talktalk <B> blabla blabla balbalblabla
> The resulting file.html (html)
> Blablabla <B> talktalk </B> blabla blabla balbalblabla
> As you can tell, every other instance of <B> is to be changed into </B>.
Let's hope those B-tags aren't nested; if they aren't, the following might work:
### define an input
file = """
<BB> talktalk<BB> talktalk
<BB> talktalk
<CC> blabla blabla balbalblabla
<B> talktalk
<B> blabla blabla balbalblabla
<CC> talktalk
<BB> blabla blabla balbalblabla
"""

### define the appropriate html equivalent
result = """
<BB> talktalk</BB> talktalk
<BB> talktalk
<CC> blabla blabla balbalblabla
<B> talktalk
</B> blabla blabla balbalblabla
</CC> talktalk
</BB> blabla blabla balbalblabla
"""

import re

### re.sub can deal with functions instead of simple strings to substitute
### but we need a function with state, so let's use a callable class instead
class Change:
    def __init__(self):
        # maps each tag to the text to emit at its next occurrence
        self.dict = {}

    def __call__(self, matchobj):
        key = matchobj.group()
        val = self.dict.get(key, key)
        if key == val:
            # the opening tag goes out now; the next occurrence becomes the closing tag
            self.dict[key] = val[:1] + '/' + val[1:]
        else:
            # the closing tag goes out now; the next occurrence is an opening tag again
            self.dict[key] = key
        return val

### every tag is toggled separately, so <BB> and <CC> alternate just like <B>
assert result == re.sub(r'(<[^>]+>)', Change(), file)
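
The callable above toggles every tag it meets. If only the plain <B> tags from the original question should alternate, a smaller sketch might do. This is an illustration, not part of the code above: the function name close_every_other_b is made up, and it assumes the tag is always written exactly as <B> and is never nested. The state lives in an itertools.cycle iterator instead of a class:

```python
import itertools
import re


def close_every_other_b(text):
    """Return text with every second <B> replaced by </B> (assumes no nesting)."""
    # cycle() yields '<B>', '</B>', '<B>', ... and hands out one item per match
    replacements = itertools.cycle(['<B>', '</B>'])
    # the pattern matches the literal tag only, so <BB> or <CC> are left alone
    return re.sub(r'<B>', lambda matchobj: next(replacements), text)


line = 'Blablabla <B> talktalk <B> blabla blabla balbalblabla'
print(close_every_other_b(line))
```

Each call creates a fresh iterator, so the count starts over with an opening tag for every string passed in.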
--
groetjes, carel