matching patterns after regex?
Martin
mdekauwe at gmail.com
Wed Aug 12 08:12:22 EDT 2009
On Aug 12, 12:53 pm, Bernard <bernard.ch... at gmail.com> wrote:
> On 12 Aug, 06:15, Martin <mdeka... at gmail.com> wrote:
>
> > Hi,
>
> > I have a string (see below) and ideally I would like to pull out the
> > decimal number which follows the bounding coordinate information. For
> > example, ideally from this string I would return...
>
> > s = '\nGROUP = ARCHIVEDMETADATA\n
> > GROUPTYPE = MASTERGROUP\n\n GROUP =
> > BOUNDINGRECTANGLE\n\n OBJECT =
> > NORTHBOUNDINGCOORDINATE\n NUM_VAL = 1\n
> > VALUE = 19.9999999982039\n END_OBJECT =
> > NORTHBOUNDINGCOORDINATE\n\n OBJECT =
> > SOUTHBOUNDINGCOORDINATE\n NUM_VAL = 1\n
> > VALUE = 9.99999999910197\n END_OBJECT =
> > SOUTHBOUNDINGCOORDINATE\n\n OBJECT =
> > EASTBOUNDINGCOORDINATE\n NUM_VAL = 1\n
> > VALUE = 10.6506458717851\n END_OBJECT =
> > EASTBOUNDINGCOORDINATE\n\n OBJECT =
> > WESTBOUNDINGCOORDINATE\n NUM_VAL = 1\n
> > VALUE = 4.3188348375893e-15\n END_OBJECT
> > = WESTBOUNDINGCOORDINATE\n\n END_GROUP
>
> > NORTHBOUNDINGCOORDINATE = 19.9999999982039
> > SOUTHBOUNDINGCOORDINATE = 9.99999999910197
> > EASTBOUNDINGCOORDINATE = 10.6506458717851
> > WESTBOUNDINGCOORDINATE = 4.3188348375893e-15
>
> > so far I have only managed to extract the numbers by doing re.findall
> > ("[\d.]*\d", s), which returns
>
> > ['1',
> > '19.9999999982039',
> > '1',
> > '9.99999999910197',
> > '1',
> > '10.6506458717851',
> > '1',
> > '4.3188348375893',
> > '15',
> > etc.
>
> > Now the first problem that I can see is that my string match chops off
> > the "e-15" part and I am not sure how to incorporate the potential for
> > that in my pattern match. Does anyone have any suggestions as to how I
> > could also match this? Ideally I would have a statement which printed
> > the number between the two bounding coordinate strings for example
>
> > NORTHBOUNDINGCOORDINATE\n NUM_VAL = 1\n
> > VALUE = 19.9999999982039\n END_OBJECT =
> > NORTHBOUNDINGCOORDINATE\n\n
>
> > Something that matched "NORTHBOUNDINGCOORDINATE" and printed the
> > decimal number before it hit the next string
> > "NORTHBOUNDINGCOORDINATE". But I am not sure how to do this. any
> > suggestions would be appreciated.
>
> > Many thanks
>
> > Martin
>
> Hey Martin,
>
> here's a regex I've just tested:
> (\w+COORDINATE).*\s+VALUE\s+=\s([\d\.\w-]+)
>
> the first match corresponds to the whateverBOUNDINGCOORDINATE and the
> second match is the value.
>
> please provide some more entries if you'd like me to test my regex
> some more :)
>
> cheers
>
> Bernard
Thanks Bernard, but it doesn't seem to be working for me. I tried
re.findall((\w+COORDINATE).*\s+VALUE\s+=\s([\d\.\w-]+),s)
Is that what you meant? Apologies if not. That results in a syntax
error:
In [557]: re.findall((\w+COORDINATE).*\s+VALUE\s+=\s([\d\.\w-]+),s)
------------------------------------------------------------
File "<ipython console>", line 1
re.findall((\w+COORDINATE).*\s+VALUE\s+=\s([\d\.\w-]+),s)
^
SyntaxError: unexpected character after line continuation character
Thanks
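
The SyntaxError above most likely comes from passing the pattern to
re.findall without quotes: outside a string literal, Python reads a
backslash as a line-continuation character. Wrapping the pattern in a raw
string (r"...") avoids that. What follows is only a sketch of one way to
pull out the four values, not Bernard's exact regex. The whitespace in s is
assumed (the archive collapsed the original indentation), the names pattern
and coords are made up for illustration, and the number sub-pattern is one
common way to allow an optional exponent such as e-15.

```python
import re

# Sample metadata rebuilt from the post; the exact indentation is an assumption.
s = (
    "\nGROUP = ARCHIVEDMETADATA\n"
    "  GROUPTYPE = MASTERGROUP\n\n"
    "  GROUP = BOUNDINGRECTANGLE\n\n"
    "    OBJECT = NORTHBOUNDINGCOORDINATE\n"
    "      NUM_VAL = 1\n"
    "      VALUE = 19.9999999982039\n"
    "    END_OBJECT = NORTHBOUNDINGCOORDINATE\n\n"
    "    OBJECT = SOUTHBOUNDINGCOORDINATE\n"
    "      NUM_VAL = 1\n"
    "      VALUE = 9.99999999910197\n"
    "    END_OBJECT = SOUTHBOUNDINGCOORDINATE\n\n"
    "    OBJECT = EASTBOUNDINGCOORDINATE\n"
    "      NUM_VAL = 1\n"
    "      VALUE = 10.6506458717851\n"
    "    END_OBJECT = EASTBOUNDINGCOORDINATE\n\n"
    "    OBJECT = WESTBOUNDINGCOORDINATE\n"
    "      NUM_VAL = 1\n"
    "      VALUE = 4.3188348375893e-15\n"
    "    END_OBJECT = WESTBOUNDINGCOORDINATE\n\n"
    "  END_GROUP\n"
)

# Hypothetical pattern: raw strings keep the backslashes intact.
pattern = re.compile(
    r"\bOBJECT\s*=\s*(\w+BOUNDINGCOORDINATE)"  # group 1: coordinate name
    r".*?"                                     # lazily skip ahead to VALUE
    r"VALUE\s*=\s*"
    r"([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)",      # group 2: number, optional exponent
    re.DOTALL,                                 # let "." match newlines too
)

# Map each coordinate name to its value as a float.
coords = {name: float(value) for name, value in pattern.findall(s)}

for name, value in coords.items():
    print(name, "=", value)
```

The \b before OBJECT keeps the pattern from also matching the END_OBJECT
lines, and re.DOTALL lets .*? run across the line breaks between a name and
its VALUE line.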