String manipulation advice needed.
Bengt Richter
bokr at oz.net
Wed Oct 13 19:59:01 EDT 2004
On Wed, 13 Oct 2004 22:26:23 +0200, "Fredrik Lundh" <fredrik at pythonware.com> wrote:
>"Raaijmakers, Vincent (GE Infrastructure)" wrote:
>
>> What is the easiest way of getting this information out of a string:
>
>making some reasonable assumptions, and generalising to any number
>of information instances:
>
>> foo = "My number 70 is what I want to parse" => 70
>
>print re.findall("\d+", foo)
>
>> foo = "My info {info} between the curly brackets is what I want to parse" => {info}
>
>print re.findall("{[^}]+}", foo)
>
>> foo = "My info between [hello world] is what I want to parse" => [hello world]
>
>print re.findall("\[[^]]+\]", foo)
>
>or, as a one-size-fits-all pattern:
>
>print re.findall("\d+|{[^}]+}|\[[^]]+\]", foo)
>
You can also tag the alternatives with names, so you can know what was found, e.g.,
>>> foo = 'grabage {curly stuff} 123 [bracket stuff] 456'
>>> import re
>>> for m in re.finditer("(?P<dec>\d+)|(?P<curl>{[^}]+})|(?P<sqbk>\[[^]]+\])", foo):
...     for k,v in m.groupdict().items():
...         if v is None: continue
...         print '%6s: %r' % (k,v)
...
  curl: '{curly stuff}'
   dec: '123'
  sqbk: '[bracket stuff]'
   dec: '456'
Not for speed, I suppose ;-) (I think I once did this a better way, but I can't recall it ;-)
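One possible shortcut, not necessarily the "better way" alluded to above: a match object's lastgroup attribute gives the name of the last named group that matched. Since each alternative here is a single top-level named group, that should be the name of the alternative that was found, so the loop over groupdict() can be dropped. A minimal sketch, assuming the same pattern as above (the braces are escaped and raw strings are used only for safety; the names pattern and found are just for illustration):

```python
import re

foo = 'grabage {curly stuff} 123 [bracket stuff] 456'

# Same three alternatives as above, each wrapped in one named group.
pattern = re.compile(r"(?P<dec>\d+)|(?P<curl>\{[^}]+\})|(?P<sqbk>\[[^]]+\])")

# m.lastgroup is the name of the alternative that matched,
# m.group() is the matched text itself.
found = [(m.lastgroup, m.group()) for m in pattern.finditer(foo)]

for name, text in found:
    print('%6s: %r' % (name, text))
```

This visits each match once and does no per-match dictionary scan, though whether it is measurably faster has not been timed here.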
BTW, also note that nested {}'s or []'s would cause problems.
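To illustrate the nesting problem, here is a small sketch with a made-up test string (nested and result are hypothetical names). The character classes [^}] and [^]] stop at the first closing delimiter, so each match ends at the inner close rather than the matching outer one:

```python
import re

# The curly and square-bracket alternatives from the pattern above.
pattern = re.compile(r"\{[^}]+\}|\[[^]]+\]")

# Hypothetical input with one level of nesting in each kind of bracket.
nested = 'a {outer {inner} tail} b [x [y] z]'

result = pattern.findall(nested)
print(result)  # ['{outer {inner}', '[x [y]']
```

Both matches are cut short at the first closing delimiter, and the remainder of each outer group (" tail}" and " z]") is silently skipped.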
Regards,
Bengt Richter