Behavior of re.split on empty strings is unexpected

Thomas Jollans thomas at jollans.com
Mon Aug 2 18:07:58 EDT 2010


On 08/02/2010 11:22 PM, John Nagle wrote:
>> [ s in rexp.split(long_s) if s ]
> 
>    Of course I can discard the blank strings afterward, but
> is there some way to do it in the "split" operation?  If
> not, then the default case for "split()" is too non-standard.
> 
>    (Also, "if s" won't work;   if s != ''   might)

Of course it will work. Empty sequences, including the empty string '',
are considered false in Python.

Python 3.1.2 (release31-maint, Jul  8 2010, 09:18:08)
[GCC 4.4.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> sprexp = re.compile(r'\s+')
>>> [s for s in sprexp.split('   spaces   every where !  ') if s]
['spaces', 'every', 'where', '!']
>>> list(filter(bool, sprexp.split('   more  spaces \r\n\t\t  ')))
['more', 'spaces']
>>>
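
The same check as a small script, as a sketch only. It shows where the
empty strings come from (the separator matches at the very start and
end of the input) and that "if s" and "if s != ''" keep the same
elements when every element is a str. The variable names parts,
by_truth, by_compare and plain are made up for this illustration.

```python
import re

sprexp = re.compile(r'\s+')
text = '   spaces   every where !  '

# Leading and trailing whitespace each produce one empty string.
parts = sprexp.split(text)
print(parts)        # ['', 'spaces', 'every', 'where', '!', '']

# An empty str is false, so both filters keep the same elements.
by_truth = [s for s in parts if s]
by_compare = [s for s in parts if s != '']
print(by_truth == by_compare)   # True

# str.split() with no argument splits on runs of whitespace and
# drops the empty strings itself.
plain = text.split()
print(plain)        # ['spaces', 'every', 'where', '!']
```

For plain whitespace, str.split() with no argument may already be
enough; the filter is only needed when the separator really has to be
a regular expression.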

(Of course, the list comprehension I posted earlier was missing a couple
of words, the leading "s for", which was very careless of me.)
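
With the missing words put back, the earlier comprehension would
presumably read as below. rexp and long_s are the placeholder names
from that message; the pattern and the input string are assumptions,
chosen here only so that the snippet runs on its own.

```python
import re

rexp = re.compile(r'\s+')           # assumed pattern, for illustration
long_s = '  some  long string  '    # assumed input, for illustration

# Keep only the non-empty pieces of the split.
result = [s for s in rexp.split(long_s) if s]
print(result)   # ['some', 'long', 'string']
```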


