doctests compatibility for python 2 & python 3
Robin Becker
robin at reportlab.com
Fri Jan 17 11:17:27 EST 2014
On 17/01/2014 15:27, Steven D'Aprano wrote:
..........
>>
>> # -*- coding: utf-8 -*-
>> def func(a):
>>     """
>>     >>> print(func(u'aaa\u020b'))
>>     aaaȋ
>>     """
>>     return a
>
> There seems to be some mojibake in your post, which confuses issues.
>
> You refer to \u020b, which is LATIN SMALL LETTER I WITH INVERTED BREVE.
> At least, that's what it ought to be. But in your post, it shows up as
> the two character mojibake, ╚ followed by ï (BOX DRAWINGS DOUBLE UP AND
> RIGHT followed by LATIN SMALL LETTER I WITH DIAERESIS). It appears that
> your posting software somehow got confused and inserted the two
> characters which you would have got using cp-437 while claiming that they
> are UTF-8. (Your post is correctly labelled as UTF-8.)
>
> I'm confident that the problem isn't with my newsreader, Pan, because it
> is pretty damn good at getting encodings right, but also because your
> post shows the same mojibake in the email archive:
>
> https://mail.python.org/pipermail/python-list/2014-January/664771.html
>
> To clarify: you tried to show \u020B as a literal. As a literal, it ought
> to be the single character ȋ which is a lower case I with curved accent on
> top. The UTF-8 of that character is b'\xc8\x8b', which in the cp-437 code
> page is two characters ╚ ï.
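
The cp-437 round trip described in the quoted text can be reproduced in a few lines. This is only a sketch to make the byte-level explanation concrete; the variable names are mine and not from the thread.

```python
# -*- coding: utf-8 -*-
# Sketch: reproduce the mojibake by decoding UTF-8 bytes as cp437.
# Variable names are illustrative assumptions.
ch = u'\u020b'                          # LATIN SMALL LETTER I WITH INVERTED BREVE
utf8_bytes = ch.encode('utf-8')         # the two bytes 0xC8 0x8B
mojibake = utf8_bytes.decode('cp437')   # each byte read as one cp437 character

print(repr(utf8_bytes))
print(repr(mojibake))
```

The repr output differs between Python 2 and Python 3, but in both the decoded result is two characters, U+255A followed by U+00EF.
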
When I edit the file in vim with utf8 encoding I do see your ȋ as the literal.
However, as you note, I'm on Windows, and no amount of cajoling will get it to
work reasonably, so my printouts are broken. So on Windows:
(py27) C:\code\hg-repos>python -c"print(u'aaa\u020b')"
aaaȋ
and on my Linux machine:
$ python2 -c"print(u'aaa\u020b')"
aaaȋ
$ python2 tdt1.py
/usr/lib/python2.7/doctest.py:1531: UnicodeWarning: Unicode equal comparison
failed to convert both arguments to Unicode - interpreting them as being unequal
if got == want:
/usr/lib/python2.7/doctest.py:1551: UnicodeWarning: Unicode equal comparison
failed to convert both arguments to Unicode - interpreting them as being unequal
if got == want:
**********************************************************************
File "tdt1.py", line 4, in __main__.func
Failed example:
    print(func(u'aaa\u020b'))
Expected:
    aaaȋ
Got:
    aaaȋ
**********************************************************************
1 items had failures:
1 of 1 in __main__.func
***Test Failed*** 1 failures.
robin at everest ~/tmp:
$ cat tdt1.py
# -*- coding: utf-8 -*-
def func(a):
    """
    >>> print(func(u'aaa\u020b'))
    aaaȋ
    """
    return a
def _doctest():
    import doctest
    doctest.testmod()
if __name__ == "__main__":
    _doctest()
robin at everest ~/tmp:
So the error persists with or without copying errors.
Note that on my PuTTY terminal I don't see the character properly (I see an
unknown-glyph square box), but it copies OK.
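
One possible workaround, offered as a sketch rather than a definitive fix: keep the expected output pure ASCII by comparing values inside the example instead of printing them. My assumption is that the Python 2 failure comes from doctest comparing the captured unicode output against a byte-string docstring, which is what the UnicodeWarning above suggests. The helper name run_example is hypothetical, and it feeds the docstring to doctest directly rather than going through testmod().

```python
# -*- coding: utf-8 -*-
import doctest


def func(a):
    """
    >>> func(u'aaa\u020b') == u'aaa\u020b'
    True
    """
    return a


def run_example():
    # Hypothetical helper: parse func's docstring and run its single example.
    parser = doctest.DocTestParser()
    test = parser.get_doctest(func.__doc__, {'func': func}, 'func', None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # run() returns a TestResults tuple with .failed and .attempted
    return runner.run(test)


if __name__ == "__main__":
    print(run_example())
```

Because the expected output is just True, nothing non-ASCII has to survive the docstring, the terminal, or the comparison. The cost is that a failure no longer shows the offending string.
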
--
Robin Becker