[issue8859] split() splits on non-whitespace char when there is no separator given.

Ezio Melotti report at bugs.python.org
Sun May 30 21:12:36 CEST 2010


Ezio Melotti <ezio.melotti at gmail.com> added the comment:

On both Linux and Windows I get:
>>> '\xa0'.isspace()
False
>>> u'\xa0'.isspace()
True

The Unicode char u'\xa0' is U+00A0 NO-BREAK SPACE, so unicode.split correctly treats it as whitespace.
However, the byte string '\xa0' is not whitespace, so str.split does not split on it.
The correct solution is to convert your string to Unicode and then split.
I'd close this as invalid, but before doing so I'd like you to confirm that the example I posted and 'split' return the same result on both Linux and Windows (the fact that it works on Linux is probably caused by something else, e.g. the label is already Unicode).
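
As a rough illustration of the "convert to Unicode, then split" advice, here is a minimal sketch written for Python 3, where bytes plays the role of the Python 2 str discussed above. The sample data and the Latin-1 encoding are assumptions made for the example only; the actual encoding of the reporter's string is not known from this thread.

```python
# Python 3 sketch: bytes stands in for the Python 2 byte string (str).
# Assumption: the data is Latin-1 encoded; substitute the real encoding.
raw = b'foo\xa0bar baz'

# Byte strings only treat ASCII whitespace as a separator, so 0xA0 is kept.
bytes_parts = raw.split()

# Decoding first gives text, where U+00A0 NO-BREAK SPACE counts as whitespace.
text_parts = raw.decode('latin-1').split()

print(bytes_parts)  # [b'foo\xa0bar', b'baz']
print(text_parts)   # ['foo', 'bar', 'baz']
```

If the non-breaking space should not act as a separator, passing an explicit separator (e.g. split(' ')) avoids the Unicode whitespace rule altogether.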

----------
nosy: +ezio.melotti
resolution:  -> invalid
status: open -> pending

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue8859>
_______________________________________

