Good String Tokenizer

JamesHoward James.w.Howard at gmail.com
Tue Jul 24 15:21:54 EDT 2007


I have searched the board and noticed that there doesn't seem to be a
good implementation of a string tokenizer that will tokenize based on a
custom set of delimiters and return both the delimiters and the parts
between them.

For example, if I have the string:

"Hello, World!  How are you?"

And my splitting points are comma and exclamation point, then I would
expect to get back:

["Hello", ",", " World", "!", "  How are you?"]

Does anyone know of a tokenizer that will allow for this sort of use?
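
For what it's worth, the closest I have come up with is re.split with the
delimiters wrapped in a capturing group, which seems to keep the separators
in the result. This is only a rough sketch for the example above (the
tokenize name and the single-character delimiters are just my own
illustration), so I may well be missing something:

>>> import re
>>> re.split(r'([,!])', "Hello, World!  How are you?")
['Hello', ',', ' World', '!', '  How are you?']
>>> def tokenize(text, delimiters):
...     # Escape each delimiter and wrap the alternation in a group so
...     # re.split returns the delimiters along with the pieces between them.
...     pattern = "(%s)" % "|".join(re.escape(d) for d in delimiters)
...     return re.split(pattern, text)
...
>>> tokenize("Hello, World!  How are you?", [",", "!"])
['Hello', ',', ' World', '!', '  How are you?']

It does what I want for this case, but I don't know how it behaves with
multi-character or overlapping delimiters, so any pointers to something
more purpose-built are still appreciated.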

Thanks in advance,
Jim Howard
