Should stdlib files contain 'narrow non breaking space' U+202F?

Mark Lawrence breamoreboy at yahoo.co.uk
Thu Dec 17 19:02:25 EST 2015


On 17/12/2015 23:18, Chris Angelico wrote:
> On Fri, Dec 18, 2015 at 10:05 AM, Mark Lawrence <breamoreboy at yahoo.co.uk> wrote:
>> The culprit character is hidden between "Issue #" and "20540" at line 400 of
>> C:\Python35\Lib\multiprocessing\connection.py.
>> https://bugs.python.org/issue20540 and
>> https://hg.python.org/cpython/rev/125c24f47f3c refers.
>>
>> I'm asking as I've just spent 30 minutes tracking down why my debug code
>> would bomb when running on 3.5, but not 2.7 or 3.2 through 3.4.
>
> I'm curious as to why this character should bomb your code at all -
> it's in a comment. Is it that your program was expecting ASCII, or is
> it something about that particular character?
>

I'm playing with ASTs and using the stdlib as test data.  I was trying 
to avoid going down this particular route, but...

A lot of it is down to Windows, as the actual complaint is:-

     six.print_(source)
   File "C:\Python35\lib\encodings\cp1252.py", line 19, in encode
     return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u202f' in 
position 407: character maps to <undefined>
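
For anyone wanting to reproduce this without the stdlib file to hand, here is a minimal sketch. The `sample` string and the `first_unencodable` helper are my own stand-ins for illustration, not anything taken from connection.py or six:

```python
def first_unencodable(text, encoding):
    """Return (index, character) for the first character that `encoding`
    cannot represent, or None if the whole string encodes cleanly."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError as exc:
        # exc.start is the index of the first offending character.
        return exc.start, text[exc.start]
    return None


# Hypothetical stand-in for the comment in question: U+202F sits
# between "Issue #" and "20540".
sample = "# Issue #\u202f20540"

print(first_unencodable(sample, "cp1252"))  # cp1252 has no mapping for U+202F
print(first_unencodable(sample, "utf-8"))   # UTF-8 can encode it
```

The same check could be run over a whole file's text to find where the offending character is hiding.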

And as usual I've answered my own question.  cp1252 is used even though my 
console is set to code page 65001, *because* I'm piping the output to a 
file, as that's so much faster.  Run without the pipe, the code takes five 
minutes but everything runs to completion.
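
If the piped run is still wanted, one possible workaround is to stop relying on the default encoding that Python picks for redirected output.  The following is only a sketch under my own assumptions: `make_safe_writer` is a hypothetical helper, and an in-memory `io.BytesIO` stands in for the redirected file so the result can be inspected.

```python
import io


def make_safe_writer(binary_stream, encoding="utf-8"):
    """Wrap a binary stream in a text layer with an explicit encoding.

    errors="backslashreplace" turns anything the encoding cannot represent
    into an escape such as \\u202f instead of raising UnicodeEncodeError.
    """
    return io.TextIOWrapper(binary_stream, encoding=encoding,
                            errors="backslashreplace", newline="")


# Hypothetical stand-in for the offending source line.
sample = "# Issue #\u202f20540\n"

# With UTF-8 the character is written as-is.
utf8_buffer = io.BytesIO()
utf8_writer = make_safe_writer(utf8_buffer)
utf8_writer.write(sample)
utf8_writer.flush()
utf8_bytes = utf8_buffer.getvalue()

# With cp1252 the character is escaped rather than causing a crash.
cp1252_buffer = io.BytesIO()
cp1252_writer = make_safe_writer(cp1252_buffer, encoding="cp1252")
cp1252_writer.write(sample)
cp1252_writer.flush()
cp1252_bytes = cp1252_buffer.getvalue()

print(utf8_bytes)
print(cp1252_bytes)
```

In a real script the wrapper would presumably go around `sys.stdout.buffer` rather than a `BytesIO`.  Setting the `PYTHONIOENCODING` environment variable (for example to `utf-8`) before running the script is another way to get a similar effect without touching the code.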

I suppose the original question still holds, but I for one certainly 
won't be losing any sleep over it.  Talking of which, good night all :)

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



