Looking for very simple general purpose tokenizer

Maarten van Reeuwijk maarten at remove_this_ws.tn.tudelft.nl
Tue Jan 20 07:43:02 EST 2004


I have run into a complication with the shlex module. When I execute the following
fragment, you'll notice that the doubles are split up into separate tokens. Is there
any way to avoid this? (A possible workaround is sketched below the output.)


source = """
 $NAMRUN
     Lz      =  0.15
     nu      =  1.08E-6
"""

import shlex
import StringIO

buf = StringIO.StringIO(source)
toker = shlex.shlex(buf)
toker.commenters = ""          # disable comment handling ('#' by default)
toker.whitespace = " \t\r"     # leave '\n' out so newlines are returned as tokens
print [tok for tok in toker]

Output:
['\n', '$', 'NAMRUN', '\n', 'Lz', '=', '0', '.', '15', '\n', 'nu', '=', '1',
'.', '08E', '-', '6', '\n']
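
One possible workaround (just a sketch, assuming the only goal is to keep
floating-point literals together): extend the shlex instance's wordchars so
that '.', '+' and '-' are treated as part of a word. Reusing the same source
string as above, the numbers should then come through as single tokens such
as '0.15' and '1.08E-6':

import shlex
import StringIO

buf = StringIO.StringIO(source)
toker = shlex.shlex(buf)
toker.commenters = ""
toker.whitespace = " \t\r"
# treat '.', '+' and '-' as word characters so numbers are not split
toker.wordchars = toker.wordchars + ".+-"
print [tok for tok in toker]

Note that with '-' in wordchars a minus sign will also stick to any adjacent
word, so this is only reasonable if the input never uses '-' as an operator.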


-- 
===================================================================
Maarten van Reeuwijk                        Heat and Fluid Sciences
PhD student                             dept. of Multiscale Physics
www.ws.tn.tudelft.nl                 Delft University of Technology


