doing hundreds of re.subs efficiently on large strings

John Machin sjmachin at lexicon.net
Thu Mar 27 06:09:56 EST 2003


nihilo <exnihilo at NOmyrealCAPSbox.com> wrote in message news:<3E82911D.3030807 at NOmyrealCAPSbox.com>...
> 
> I am sorry but I am still not sure how you are suggesting the 
> replacement occur. After I have the iterator, what would I do with each 
> of the match objects in the iterator?

Below is an example of what Anders was talking about.
HTH,
John

=== file raboof.py ===
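import re
# Parallel lists: from_str[i] is replaced by to_str[i].
from_str = ["foo", "bar", "zot"]
to_str = ["oofay", "arbay", "otzay"]
# One capturing group per search string; re.escape keeps any regex
# metacharacters in the search strings literal.
pat = "|".join(["(" + re.escape(x) + ")" for x in from_str])
print pat
text_in = "fee fie foo and bar or zot then foo again"
asm = []              # output pieces, joined once at the end
asmap = asm.append    # bound method, saves a lookup per call
lpos = 0              # end of the previous match
for m in re.finditer(pat, text_in):
   spos, epos = m.span()
   grpnum = m.lastindex   # 1-based number of the group that matched
   print spos, epos, grpnum
   asmap(text_in[lpos:spos])    # unchanged text before this match
   asmap(to_str[grpnum-1])      # replacement for the matched string
   lpos = epos
asmap(text_in[lpos:])   # unchanged tail after the last match
text_out = "".join(asm)
print text_in
print text_out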

=== results ===
C:\junk>python raboof.py
(foo)|(bar)|(zot)
8 11 1
16 19 2
23 26 3
32 35 1
fee fie foo and bar or zot then foo again
fee fie oofay and arbay or otzay then oofay again

C:\junk>
==================
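
The same single-pass idea can also be written with re.sub and a replacement function instead of an explicit finditer loop. This is only a minimal sketch of that variant, not what Anders or the script above does: multi_replace is a hypothetical helper name, and it assumes the search strings are literal text (hence re.escape) and that no search string is empty.

```python
import re

def multi_replace(text, from_strs, to_strs):
    # Hypothetical helper, not from the original post.
    # One capturing group per search string, escaped so that regex
    # metacharacters in the search strings are matched literally.
    pat = re.compile("|".join("(" + re.escape(s) + ")" for s in from_strs))
    # m.lastindex is the 1-based number of the group that matched,
    # so it indexes straight into the parallel list of replacements.
    return pat.sub(lambda m: to_strs[m.lastindex - 1], text)

text_out = multi_replace(
    "fee fie foo and bar or zot then foo again",
    ["foo", "bar", "zot"],
    ["oofay", "arbay", "otzay"],
)
```

Alternatives are tried left to right, so if one search string is a prefix of another, list the longer one first.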