Tokenize a string or split on steroids

Fernando Rodríguez frr at wanadoo.es
Sat Mar 9 12:10:14 EST 2002


On Sat, 09 Mar 2002 11:30:40 GMT, Bob Follek <b.follek at verizon.net> wrote:


>If you're unfamiliar with regular expressions, here's a good starting
>point: http://py-howto.sourceforge.net/regex/regex.html

Thanks. :-) BTW, the strings that must be tokenized contain other non-alphanumeric
characters (parentheses, for example), so I tried another regex:
[{}].

The result, although usable, is sort of weird:

>>> import re
>>> s = "{one}{two}"
>>> x1 = re.compile('[{}]')
>>> x1.split(s)
['', 'one', '', 'two', '']

Where are those empty strings coming from??? :-?
I can filter() them out, but I wonder where they come from.... O:-)
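
For now I just drop them with a list comprehension, using the same s and x1 as
above (filter(None, ...) would work just as well):

>>> [tok for tok in x1.split(s) if tok]
['one', 'two']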

TIA



| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| Fernando Rodríguez   frr at EasyJob.NET 
| http://www.EasyJob.NET/
| Expert resume and cover letter creation system. 
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


