[Python-Dev] Re: [Python-checkins] python/dist/src/Lib textwrap.py,1.18,1.19

Guido van Rossum guido@python.org
Wed, 11 Dec 2002 10:39:56 -0500


> [/F proves beyond a shadow of a doubt that string.whitespace is
>  locale-sensitive]
> 
> Thanks, Fredrik!  That clarifies the behaviour Just is seeing.
> 
> Hey: I just realized that making textwrap trust string.whitespace is
> wrong in at least one case: 0xa0 is *non-breaking* space in ISO-8859-1,
> and converting it to 0x20 (regular ol' space) is clearly wrong -- the
> "non-break" request will be ignored.  So Unicode or not, textwrap should
> probably just hard-code the US-ASCII whitespace chars.

+1
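
To illustrate the hard-coding idea, here is a minimal sketch in
present-day Python.  The names ASCII_WHITESPACE and normalize_whitespace
are made up for illustration and are not necessarily what textwrap uses
internally.  The point is only that a fixed US-ASCII set leaves 0xa0
untouched, whatever the locale is.

```python
# Hypothetical constant: the six US-ASCII whitespace characters,
# spelled out by hand instead of taken from string.whitespace.
ASCII_WHITESPACE = "\t\n\x0b\x0c\r "

# Map each ASCII whitespace character to a plain space (0x20).
_TO_SPACE = {ord(ch): " " for ch in ASCII_WHITESPACE}


def normalize_whitespace(text):
    """Replace ASCII whitespace with spaces; leave everything else alone."""
    return text.translate(_TO_SPACE)


# 0xa0 (non-breaking space) survives; tab and newline become spaces.
sample = "10\xa0km\tper\nhour"
result = normalize_whitespace(sample)
```

With this input, result keeps the 0xa0 between "10" and "km", so the
"non-break" request is still there for whatever does the wrapping.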

> My attitude is that textwrap should work on European languages, whether
> they are encoded in 8-bit "ASCII" or Unicode.  I suspect that passing an
> arbitrary Unicode string to it is meaningless -- what the heck does it
> even mean to wrap a string of Chinese or Hebrew or Devanagari characters?
> Beats me, and I think they're out of scope for textwrap.

Correct -- you can't trust the width of characters to be all the
same.  (I'm not even sure if that's true for Latin-1, Cyrillic or
Greek, but it seems likely.)
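
A rough sketch of why character count is not the same as display
width, using the standard unicodedata module.  The function name
display_width and the "wide counts as 2 columns" rule are assumptions
for a monospaced terminal, not anything textwrap does.

```python
import unicodedata


def display_width(text):
    """Estimate terminal columns for text (heuristic, assumption only)."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on top of the previous character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide and fullwidth characters usually take two columns.
            width += 2
        else:
            width += 1
    return width


latin = display_width("abc")
cjk = display_width("\u4e2d\u6587")       # two CJK ideographs
accented = display_width("e\u0301")       # "e" plus combining acute accent
```

Here len() gives 3, 2 and 2 for the three strings, while the estimated
column counts are 3, 4 and 1, so wrapping by len() alone would come out
wrong for the last two.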

> So: do I even need to worry about the cornucopia of Unicode whitespace
> characters at all?  Or can I sweep that can of worms under the rug?
> (Pardon the horribly mixed metaphor.)

Please shove them under the garage.

--Guido van Rossum (home page: http://www.python.org/~guido/)