simple string parsing ?

Marc Boeren m.boeren at guidance.nl
Thu Sep 9 10:20:40 EDT 2004


> =+GC142*(GC94+0.5*sum(GC96:GC101))
> 
> and I want to get :
> 
> ['=', '+', 'GC142', '*', '(', 'GC94', '+', '0.5', '*', 'sum', '(',
> 'GC96', ':', 'GC101', ')', ')']
> 
> how can I get this ??????

The quick and dirty way: your formula is a string of delimiters with
other text in between. Each delimiter becomes a token of its own, and
every run of consecutive non-delimiter characters is grouped into one
substring. So:

>>> formula = '=+GC142*(GC94+0.5*sum(GC96:GC101))'
>>> delimiters = '=+*():'
>>> parts = []
>>> appending = False  # True while inside a run of non-delimiters
>>> for char in formula:
...   if char in delimiters:
...     parts.append(char)  # a delimiter is always its own token
...     appending = False
...   else:
...     if appending:
...       parts[-1] += char  # extend the token currently being built
...     else:
...       parts.append(char)  # start a new token
...       appending = True
...
>>> parts
['=', '+', 'GC142', '*', '(', 'GC94', '+', '0.5', '*', 'sum', '(',
'GC96', ':', 'GC101', ')', ')']
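
For comparison, the same grouping can probably be done with a regular
expression as well. This is only a sketch: the function name tokenize
and its delimiters parameter are made up for illustration, and it
assumes the delimiter string is not empty.

```python
import re

def tokenize(formula, delimiters='=+*():'):
    # re.escape protects characters such as '*' or ')' that would
    # otherwise have a special meaning in the pattern.
    escaped = re.escape(delimiters)
    # Match either a single delimiter or a maximal run of characters
    # that are not delimiters.
    pattern = '[' + escaped + ']|[^' + escaped + ']+'
    return re.findall(pattern, formula)

parts = tokenize('=+GC142*(GC94+0.5*sum(GC96:GC101))')
```

Like the loop, this treats anything that is not a delimiter (spaces
included) as part of a token, so strip whitespace first if your
formulas contain any.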


This simply gets you the list you asked for. If you want to use the
formula to actually compute something, it may be wise to dive into the
various parser packages. I found TPG (Toy Parser Generator) easy to use
for simple things.

Cheerio, Marc.


