split on NO-BREAK SPACE
Wildemar Wildenburger
wildemar at freakmail.de
Sun Jul 22 15:27:23 EDT 2007
Peter Kleiweg wrote:
>
> Define white space to isspace()
>
>
Explain that phrase.
>
> Here is another "space":
>
> >>> u'\uFEFF'.isspace()
> False
>
> isspace() is inconsistent
>
I don't really know much about Unicode, but Google tells me that \uFEFF
is a byte order mark. I thought we were implicitly in agreement that
"whitespace" (whatever the formal definition) means "the stuff we put
into text to visually separate words".
So what is *your* definition of whitespace?
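For anyone who wants to check this themselves, here is a small sketch. It assumes Python 3 syntax rather than the 2.x of the session quoted above, and the labels and the `report` dict are names I made up for illustration. It asks the unicodedata module for each character's general category. As far as I can tell, isspace() goes by that category (plus the bidirectional class), so a format character like U+FEFF would not count as whitespace while U+00A0 would.

```python
import unicodedata

# The two characters under discussion (labels are illustrative only).
candidates = {
    "NO-BREAK SPACE": "\u00a0",
    "BYTE ORDER MARK": "\ufeff",
}

report = {}
for label, ch in candidates.items():
    # General category: "Zs" is a space separator, "Cf" is a format character.
    report[label] = (unicodedata.category(ch), ch.isspace())
    print(label, "|", unicodedata.name(ch), "|", report[label])
```

If that reading is right, the two results are not inconsistent: one character is classified as a space, the other as a formatting mark.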
>>> Why does split() split when it says NO-BREAK?
>>>
>> Precisely. It says NO-BREAK. It doesn't say NO-SPLIT.
>>
>
> That is a stupid answer.
>
>
I fail to see why you deem it a good idea to become insulting at this point.
It is a perfectly valid answer: NO-BREAK means "when wrapping text into
lines, do not break at this space".
split(), however, does not wrap text, it /splits/ it (at whitespace
characters, as it happens). The NO-BREAK semantics have no bearing here.
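To make the distinction concrete, here is a minimal sketch. It assumes a recent Python 3, and the sample string and variable names are mine. It runs the same text through split() and through the standard textwrap module, which is one place where line breaking actually happens.

```python
import textwrap

# NO-BREAK SPACE (U+00A0) sits between "bbb" and "ccc"; the other gaps are plain spaces.
text = "aaa bbb\u00a0ccc ddd"

# split() with no argument splits at every whitespace character,
# and NO-BREAK SPACE is one of them.
pieces = text.split()

# textwrap chooses where a line may be broken. In current Python 3 it only
# considers ASCII whitespace as a break point, so "bbb" and "ccc" stay together.
lines = textwrap.wrap(text, width=8)

print(pieces)
print(lines)
```

The wrapper keeps the two words on one line while split() separates them, which is the difference between "break" and "split" described above.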
/W