split on NO-BREAK SPACE

Carsten Haese carsten at uniqsys.com
Sun Jul 22 11:36:15 EDT 2007


On Sun, 2007-07-22 at 17:15 +0200, Peter Kleiweg wrote:
> Is this a bug or a feature?
> 
> 
>     Python 2.4.4 (#1, Oct 19 2006, 11:55:22) 
>     [GCC 2.95.3 20010315 (SuSE)] on linux2
> 
>     >>> a = 'a b c\240d e'
>     >>> a
>     'a b c\xa0d e'
>     >>> a.split()
>     ['a', 'b', 'c\xa0d', 'e']
>     >>> a = a.decode('latin-1')
>     >>> a
>     u'a b c\xa0d e'
>     >>> a.split()
>     [u'a', u'b', u'c', u'd', u'e']

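It's a feature. See help(str.split): "If sep is not specified or is
None, any whitespace string is a separator." What counts as whitespace
depends on the string type: in a unicode string, U+00A0 NO-BREAK SPACE
is a whitespace character, whereas in a byte string '\xa0' is just a
byte that the default C locale does not classify as whitespace. To
split on ordinary spaces only, pass the separator explicitly, e.g.
a.split(' ').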
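
For anyone reading this with a newer interpreter, here is a rough
sketch of the same behaviour in Python 3, where str is the Unicode
type and bytes plays the role of the old 8-bit string. The variable
names are made up for illustration; only the built-in split() and
encode() methods are used.

```python
# Python 3 sketch: str is Unicode text, bytes is the 8-bit string type.
text = 'a b c\xa0d e'            # \xa0 is U+00A0 NO-BREAK SPACE
raw = text.encode('latin-1')     # same characters, one byte each

# With no separator, str.split() splits on Unicode whitespace,
# and U+00A0 counts as whitespace.
text_parts = text.split()

# bytes.split() with no separator only recognises ASCII whitespace,
# so the 0xA0 byte stays inside its word.
raw_parts = raw.split()

# An explicit ' ' separator leaves the no-break space alone.
explicit_parts = text.split(' ')

print(text_parts)      # ['a', 'b', 'c', 'd', 'e']
print(raw_parts)       # [b'a', b'b', b'c\xa0d', b'e']
print(explicit_parts)  # ['a', 'b', 'c\xa0d', 'e']
```

Note that split(' ') is not a drop-in replacement for split(): runs of
consecutive spaces produce empty strings in the result.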

-- 
Carsten Haese
http://informixdb.sourceforge.net




