Another re question
Kent Polk
kent at tiamat.goathill.org
Tue Oct 24 14:05:37 EDT 2000
On Mon, 23 Oct 2000 20:43:20 -0400, Stephen Kloder wrote:
---------
>>> findpid_pat = r'\012+0*\w.\w*[\t, ]+([\w_ ]+)'
>>> re.findall(findpid_pat,sid2pid)
['1X4567', '1 4853', '1X0608']
---------
Thanks. It works great in the cases I provided. Unfortunately, I
forgot about one case - where the first name can be blank (spaces).
>>> sid2pid="\n1 1979 1X4567\n00031 1 4853\n1S0959 1X0608\n 3S4267\n"
>>> print sid2pid
1 1979 1X4567
00031 1 4853
1S0959 1X0608
 3S4267
>>> findpid_pat = r'\012+0*\w.\w*[\t, ]+([\w_ ]+)'
>>> re.findall(findpid_pat,sid2pid)
['1X4567', '1 4853', '1X0608']
which misses my (new) last case.
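A minimal sketch of why that last line is skipped, assuming the data is exactly as shown above: after the newline and any leading zeros, the pattern requires a word character (the first \w), so a line that starts with a space can never match. The names blank_line, named_line, blank_hits and named_hits are only for illustration.

```python
import re

findpid_pat = r'\012+0*\w.\w*[\t, ]+([\w_ ]+)'

# One line with a blank (space) first field, one with a normal first field.
blank_line = "\n 3S4267\n"
named_line = "\n1S0959 1X0608\n"

# The first \w after the newline cannot match a space, so nothing is found.
blank_hits = re.findall(findpid_pat, blank_line)
print(blank_hits)   # []

# With a word character right after the newline the pattern matches as before.
named_hits = re.findall(findpid_pat, named_line)
print(named_hits)   # ['1X0608']
```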
I was using the logical or (|) because I couldn't figure out how to
handle both cases in a single pattern. Using your example to clean up
my pattern results in:
>>> findpid_pat = r'\012+\d |0*\w*[\t, ]+([\w ]+)'
>>> re.findall(findpid_pat, sid2pid)
['', '1X4567', '1 4853', '1X0608', '3S4267']
I don't understand how an empty string matches in this last
case. Separately they are:
>>> findpid_pat = r'\012+\d *\w*[\t, ]+([\w ]+)'
>>> re.findall(findpid_pat, sid2pid)
['1X4567', '1 4853', '1X0608']
and
>>> findpid_pat = r'\012+0*\w*[\t, ]+([\w ]+)'
>>> re.findall(findpid_pat, sid2pid)
['1979 1X4567', '1 4853', '1X0608', '3S4267']
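A likely explanation for the empty string, sketched under the assumption that sid2pid is exactly the string shown above: | has the lowest precedence in a regular expression, so the combined pattern reads as "\012+\d " OR "0*\w*[\t, ]+([\w ]+)", and the only group sits in the second branch. When the first branch matches, the group takes no part in the match and findall reports it as ''. The variable names below are illustrative, and the scoped pattern at the end, which confines the alternation to a non-capturing (?:...) group, is just one possible way to combine the two cases, not something taken from this thread.

```python
import re

# Sample data copied from the session above (whitespace as shown there).
sid2pid = "\n1 1979 1X4567\n00031 1 4853\n1S0959 1X0608\n 3S4267\n"

# '|' splits the WHOLE pattern: left branch has no group, right branch has one.
combined = r'\012+\d |0*\w*[\t, ]+([\w ]+)'

# Show the text each match covers and what group 1 captured.
for m in re.finditer(combined, sid2pid):
    print('%r -> %r' % (m.group(0), m.group(1)))

# The first match comes from the left branch, so group 1 did not participate.
first = re.search(combined, sid2pid)
print(first.group(1) is None)   # True

# findall turns that non-participating group into an empty string.
combined_hits = re.findall(combined, sid2pid)
print(combined_hits)   # ['', '1X4567', '1 4853', '1X0608', '3S4267']

# Assumption: limiting the alternation with (?:...) keeps the newline prefix
# and the capturing group common to both cases.
scoped = r'\012+(?:\d |0*)\w*[\t, ]+([\w ]+)'
scoped_hits = re.findall(scoped, sid2pid)
print(scoped_hits)     # ['1X4567', '1 4853', '1X0608', '3S4267']
```

With the alternation confined this way, every match passes through the capturing group, so no empty strings show up in the result.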
Thanks Much!