Looking for very simple general purpose tokenizer

Maarten van Reeuwijk maarten at remove_this_ws.tn.tudelft.nl
Mon Jan 19 04:55:31 EST 2004


Hi group,

I need to parse various text files in Python. I was wondering if there is a
general-purpose tokenizer available. I know about split(), but this
(otherwise very handy) method does not allow me to specify a list of
splitting characters, only one at a time, and it removes my splitting
operators (OK for spaces and \n's, but not for =, /, etc.). Furthermore, I
tried tokenize, but this is specifically for Python and is way too heavy
for me. I am looking for something like this:


splitchars = [' ', '\n', '=', '/', ....]
tokenlist = tokenize(rawfile, splitchars)
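
To make the intended behaviour concrete, here is a minimal sketch of such a
function built on re.split() from the standard library. It relies on the
fact that re.split() also returns the separators when the pattern contains a
capturing group. The "drop" parameter is a hypothetical addition (not part
of the interface above) for separators that should be discarded, such as
whitespace; treat the whole thing as an illustration, not a finished tool.

```python
import re


def tokenize(text, splitchars, drop=(' ', '\n')):
    """Split text at every character in splitchars.

    Separators are kept as tokens of their own, except those listed in
    drop (a hypothetical parameter, assumed here for whitespace).
    """
    if not splitchars:
        # An empty character class is not a valid pattern, so return early.
        return [text] if text else []
    # Build a character class; re.escape protects ']', '^', '-' and '\'.
    pattern = '([' + ''.join(re.escape(c) for c in splitchars) + '])'
    # The capturing group makes re.split return the separators too.
    pieces = re.split(pattern, text)
    # Discard empty strings (adjacent separators) and unwanted separators.
    return [p for p in pieces if p and p not in drop]


tokens = tokenize("a = b/c\nd=1", [' ', '\n', '=', '/'])
print(tokens)  # ['a', '=', 'b', '/', 'c', 'd', '=', '1']
```

Each splitting character is a single character here; multi-character
operators would need an alternation pattern instead of a character class.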

Is there something like this available in Python, or has anyone already
written one? Thank you in advance.

Maarten
-- 
===================================================================
Maarten van Reeuwijk                        Heat and Fluid Sciences
PhD student                             dept. of Multiscale Physics
www.ws.tn.tudelft.nl                 Delft University of Technology


