advanced number recognition in strings?

skip at pobox.com
Mon May 8 21:43:31 EDT 2006


    Sebastian> we want to extract numbers from strings and wonder if there is
    Sebastian> already a module around for doing this. An example in our
    Sebastian> case would look like this:

    Sebastian> 0.032 +/- 0.5 x 10(-4)

    Sebastian> it would even be helpful to have a routine which does not
    Sebastian> recognise the +/- , but at least the 10(-4).

How about the tokenize module?

>>> import tokenize
>>> lines = ["0.032 +/- 0.5 x 10(-4)\n"]
>>> def readline():
...   if lines:
...     line = lines.pop(0)
...     return line
...   return ""
... 
>>> tokenize.tokenize(readline)
1,0-1,5:        NUMBER  '0.032'
1,6-1,7:        OP      '+'
1,7-1,8:        OP      '/'
1,8-1,9:        OP      '-'
1,10-1,13:      NUMBER  '0.5'
1,14-1,15:      NAME    'x'
1,16-1,18:      NUMBER  '10'
1,18-1,19:      OP      '('
1,19-1,20:      OP      '-'
1,20-1,21:      NUMBER  '4'
1,21-1,22:      OP      ')'
1,22-1,23:      NEWLINE '\n'
2,0-2,0:        ENDMARKER       ''
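
The tokenizer only splits the string into pieces; putting "0.5 x 10(-4)" back
together as one value is still up to you. Here is a rough sketch of how that
might look on Python 3, where tokenize.tokenize() wants a readline that returns
bytes and hands back a generator instead of printing, so generate_tokens() with
a str-based readline is used instead. The function name extract_numbers and the
rule "NUMBER x 10 ( [-] NUMBER ) means NUMBER * 10**n" are my own assumptions
about the notation, not anything the module provides, and the "+/-" is simply
skipped.

```python
import io
import tokenize


def extract_numbers(text):
    """Return the floats found in text, folding 'a x 10(n)' into a * 10**n.

    Hypothetical helper: the 'x 10(n)' convention is an assumption about
    the input notation, not part of the tokenize module.
    """
    readline = io.StringIO(text + "\n").readline
    # Keep only NUMBER, NAME and OP tokens as (type, string) pairs;
    # NEWLINE and ENDMARKER are bookkeeping we do not need here.
    toks = [(t.type, t.string) for t in tokenize.generate_tokens(readline)
            if t.type in (tokenize.NUMBER, tokenize.NAME, tokenize.OP)]

    numbers = []
    i = 0
    while i < len(toks):
        kind, string = toks[i]
        if kind != tokenize.NUMBER:
            i += 1
            continue
        value = float(string)
        i += 1
        # Look ahead for the pattern: x 10 ( [-] NUMBER )
        j = i
        if (j + 2 < len(toks)
                and toks[j] == (tokenize.NAME, "x")
                and toks[j + 1] == (tokenize.NUMBER, "10")
                and toks[j + 2] == (tokenize.OP, "(")):
            j += 3
            sign = 1
            if j < len(toks) and toks[j] == (tokenize.OP, "-"):
                sign = -1
                j += 1
            if (j + 1 < len(toks)
                    and toks[j][0] == tokenize.NUMBER
                    and toks[j + 1] == (tokenize.OP, ")")):
                # Scale the mantissa and skip past the closing parenthesis.
                value *= 10.0 ** (sign * float(toks[j][1]))
                i = j + 2
        numbers.append(value)
    return numbers


if __name__ == "__main__":
    print(extract_numbers("0.032 +/- 0.5 x 10(-4)"))
```

Since tokenize follows Python's own lexical rules, input that does not look
enough like Python source may make it raise an error, so for messier data a
hand-written regular expression might be the safer choice.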

Skip


