split on NO-BREAK SPACE

Wildemar Wildenburger wildemar at freakmail.de
Sun Jul 22 15:27:23 EDT 2007


Peter Kleiweg wrote:
>
> Define white space to isspace()
>  
>   
Explain that phrase.

>
> Here is another "space":
>
>   >>> u'\uFEFF'.isspace()
>   False
>
> isspace() is inconsistent
>   
I don't really know much about Unicode, but Google tells me that \uFEFF 
is a byte order mark. I thought we were implicitly in agreement that 
"whitespace" (whatever the formal definition) means "the stuff we put 
into text to visually separate words".
So what is *your* definition of whitespace?
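
For what it's worth, here is a small sketch for checking how Python itself
classifies the characters in question. It assumes Python 3 (where every
str is Unicode) rather than the u'...' literals quoted above, and the
helper name describe() is just something made up for illustration.

```python
import unicodedata

# Characters discussed in this thread: SPACE, NO-BREAK SPACE, and U+FEFF.
CANDIDATES = ["\u0020", "\u00a0", "\ufeff"]

def describe(ch):
    """Return (Unicode name, general category, isspace()) for one character."""
    return unicodedata.name(ch), unicodedata.category(ch), ch.isspace()

for ch in CANDIDATES:
    print("U+%04X" % ord(ch), describe(ch))
```

U+00A0 is reported with category Zs (space separator), whereas U+FEFF is
reported as Cf (a format character), which would explain why isspace()
treats the two differently.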


>>> Why does split() split when it says NO-BREAK?
>>>       
>> Precisely. It says NO-BREAK. It doesn't say NO-SPLIT.
>>     
>
> That is a stupid answer.
>
>   
I fail to see why you deem it a good idea to become insulting at this point.
It is a perfectly valid answer: NO-BREAK means "when wrapping text into 
paragraphs, do not break the line at this space".
split(), however, does not wrap text, it /splits/ it (at whitespace 
characters, as it happens). The NO-BREAK semantics have no bearing here.
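
If the goal is to keep words joined by a NO-BREAK SPACE together, one
possible workaround is to split on an explicit set of characters instead
of relying on split() with no arguments. The following is only a sketch,
assuming Python 3; split_keep_nbsp() and ASCII_WS are hypothetical names,
and the choice of "ASCII whitespace only" is my assumption about what is
wanted.

```python
import re

# Split only on runs of ASCII whitespace; U+00A0 is deliberately left out.
ASCII_WS = re.compile(r"[ \t\n\r\f\v]+")

def split_keep_nbsp(text):
    """Split text on ASCII whitespace, leaving NO-BREAK SPACE inside words."""
    return [part for part in ASCII_WS.split(text) if part]

sample = "10\u00a0km  from here"

# Default split() treats NO-BREAK SPACE as whitespace and splits there.
print(sample.split())

# The explicit pattern keeps "10" and "km" in one piece.
print(split_keep_nbsp(sample))
```

The first call breaks "10" and "km" apart; the second leaves them as one
item that still contains the NO-BREAK SPACE.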


/W


