NewB question on text manipulation

ProvoWallis gshepherd281281 at yahoo.com
Wed May 3 13:29:55 EDT 2006


Thanks very much for this; I really appreciate it. I've pasted what I've
got now, thanks to you.

I only have one issue that I can't figure out. When I print the new
string I'm getting all of the values in the lt list rather than just
the one that corresponds to the original entry.

E.g.,

My original data looks like this:

<1><SC>FAM LAW ENF<XC>259-232<LT>-687

<1><SC>APPEAL<XC>40-38; 40-44; 44-18; 45-15<LT>1

I want my output to look like this:

<1><SC>FAM LAW ENF<XC>259-232<LT>-687
<1><SC>APPEAL<XC>40-38<LT>1
<1><SC>APPEAL<XC>40-44<LT>1
<1><SC>APPEAL<XC>44-18<LT>1
<1><SC>APPEAL<XC>45-15<LT>1

But instead I'm getting this -- all of the entries in the lt list are
being added to my string when I just want one. I'm not sure how to
select just the entry in the lt list that I want.

<1><SC>FAM LAW ENF<XC>259-232<LT>-687<LT>1
<1><SC>APPEAL<XC>40-38<LT>-687<LT>1
<1><SC>APPEAL<XC>40-44<LT>-687<LT>1
<1><SC>APPEAL<XC>44-18<LT>-687<LT>1
<1><SC>APPEAL<XC>45-15<LT>-687<LT>1


###


Here's what I've got so far:


import re

# s is assumed to already hold the input text shown above.
s_space = " "  # a single space
s_empty = ""  # empty string

pat = re.compile(r"\s*<SC>([^<]+)<XC>([^<]+)")  # raw string keeps \s intact

lst = []

while True:
    m = pat.search(s)
    if not m:
        break

    title = m.group(1).strip()
    xc = m.group(2)
    xc = xc.replace(s_space, s_empty)
    tup = (title, xc)
    lst.append(tup)
    s = pat.sub(s_empty, s, 1)

lt = s.strip()

for title, xc in lst:
    lst_pp = xc.split(";")
    for pp in lst_pp:
        print "<1><SC>%s<XC>%s%s" % (title, pp, lt)
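
One possible way to pick up only the matching LT value might be to handle
each record separately, so that the <LT> tail is captured together with the
record it belongs to instead of being collected from the whole string at the
end. This is only a minimal sketch, assuming every record sits on a single
line and ends with its <LT> field; record_pat, split_records and sample are
names made up for illustration and are not part of the code above.

```python
import re

# Hypothetical pattern: one whole record per line, capturing the leading
# tag, the title, the XC list, and that record's own <LT> tail.
record_pat = re.compile(r"(<\d+>)<SC>([^<]+)<XC>([^<]+)(<LT>.*)")


def split_records(text):
    """Return one output line per XC entry, each keeping its record's LT."""
    out = []
    for line in text.splitlines():
        m = record_pat.match(line.strip())
        if not m:
            continue  # skip blank lines and lines that are not records
        prefix, title, xc, lt = m.groups()
        for pp in xc.split(";"):
            # lt comes from this record only, never from other records
            out.append("%s<SC>%s<XC>%s%s" % (prefix, title.strip(), pp.strip(), lt))
    return out


# The two records from the example above.
sample = """<1><SC>FAM LAW ENF<XC>259-232<LT>-687

<1><SC>APPEAL<XC>40-38; 40-44; 44-18; 45-15<LT>1"""

for row in split_records(sample):
    print(row)
```

Because the LT value is captured inside the same match as the title and the
XC list, there is no separate lt string left over that could pick up values
from other records.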



