regex problem

Tim Chase python.list at tim.thechases.com
Wed Nov 22 15:03:25 EST 2006


> line i am trying to match is
> 1959400|Q2BYK3|Q2BYK3_9GAMM Hypothetical outer membra    29.9    0.00011   1
> 
> regex i have written is
> re.compile
> (r'(\d+?)\|((P|O|Q)\w{5})\|\w{3,6}\_\w{3,5}\s+?.{25}\s{3}(\d+?\.\d)\s+?(\d\.\d+?)')
> 
> I am trying to extract the 0.00011 value from the above line.
> why doesn't it match the group(4) item of the match?
> 
> any idea what's wrong with it?

Well, your ".{25}\s{3}" portion only gets you to one space short 
of your 29.9: the 25-character description is followed by four 
spaces, not three, so your "(\d+?\.\d)" is tried against " 29.9" 
and fails on the leading space.  My guess (from only one datum, 
so this could be /way/ off base) would be that you mean "\s{4}" 
or possibly "\s{3,4}".
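
A quick way to check this is to run both variants against the sample 
line. The sketch below assumes the line contains exactly the spacing 
shown in the quoted message (four spaces before 29.9); the names 
"prefix", "suffix", "original", "relaxed" and "greedy" are only for 
illustration. Note that because of the nested (P|O|Q) group, group(4) 
is the 29.9 field and the last field is group(5), and that the lazy 
"\d+?" at the very end of the pattern stops after a single digit.

```python
import re

line = ("1959400|Q2BYK3|Q2BYK3_9GAMM Hypothetical outer membra"
        "    29.9    0.00011   1")

# The quoted pattern, cut in two around the \s{3} in question.
prefix = r'(\d+?)\|((P|O|Q)\w{5})\|\w{3,6}\_\w{3,5}\s+?.{25}'
suffix = r'(\d+?\.\d)\s+?(\d\.\d+?)'

original = re.compile(prefix + r'\s{3}' + suffix)
relaxed = re.compile(prefix + r'\s{3,4}' + suffix)

# Three spaces are one short of "29.9", so nothing matches.
print(original.search(line))

m = relaxed.search(line)
print(m.group(4))   # the 29.9 field
print(m.group(5))   # lazy \d+? takes only one digit after the point

# Same pattern, but with a greedy \d+ on the final group.
greedy = re.compile(prefix + r'\s{3,4}' + r'(\d+?\.\d)\s+?(\d\.\d+)')
g = greedy.search(line)
print(g.group(5))   # the whole 0.00011 field
```

If the full 0.00011 is what you are after, the greedy "\d+" in the 
last group is probably the variant you want.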

It seems like a very overconstrained regexp, but it might be just 
what you need to isolate the single line (or class of lines) 
amongst the chaff of thousands of others of similar form.
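
If it turns out you do not need that much constraint, a looser 
sketch that skips the regexp altogether might look like the 
following. It assumes (and this is only a guess from the one sample 
line) that every line has two "|"-separated leading fields and ends 
in three whitespace-separated columns; "parse_hit" and the field 
names are made up for the example.

```python
line = ("1959400|Q2BYK3|Q2BYK3_9GAMM Hypothetical outer membra"
        "    29.9    0.00011   1")

def parse_hit(line):
    # Hypothetical helper: the layout is assumed from one sample line.
    ident, accession, rest = line.split("|", 2)
    # Split from the right, since the description itself has spaces.
    name_and_desc, score, evalue, count = rest.rsplit(None, 3)
    return ident, accession, name_and_desc, float(score), float(evalue), int(count)

fields = parse_hit(line)
print(fields[4])   # the e-value column as a float
```

This will happily accept lines the regexp would have rejected, so it 
only fits if the input has already been narrowed to lines of this 
form.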

-tkc







