[Python-ideas] Type hints for text/binary data in Python 2+3 code

Chris Angelico rosuav at gmail.com
Wed Mar 23 03:48:29 EDT 2016


On Wed, Mar 23, 2016 at 6:45 PM, Andrey Vlasovskikh
<andrey.vlasovskikh at gmail.com> wrote:
>
>> 2016-03-23, at 7:37, Chris Angelico <rosuav at gmail.com> wrote:
>>
>> On Wed, Mar 23, 2016 at 2:39 PM, Guido van Rossum <guido at python.org> wrote:
>>>> I was concerned with UnicodeEncodeErrors in Python 2 during implicit conversions from unicode to bytes:
>>>>
>>>>    getattr(obj, u'Non-ASCII-name')
>>>>
>>>> There are several places in the Python 2 API where these ASCII-based unicode->bytes conversions take place, so the _AsciiUnicode type comes to mind.
>>>
>>> OK, so you want the type of u'hello' to be _AsciiUnicode but the type
>>> of u'Здравствуйте' to be just unicode, right? And getattr()'s second
>>> argument would be typed as... What?
>>
>> AIUI, getattr's second argument is simply 'str'; but in Python 2,
>> _AsciiUnicode (presumably itself a subclass of unicode) can be
>> implicitly promoted to str. A non-ASCII attribute name works fine, but
>> getattr converts unicode to str using the 'ascii' codec.
>
> Right. I'm not sure that a non-ASCII attribute name is fine in Python 2 though.

It's legal, at least when the name is a byte string. I don't know that
it's a good idea, but it is legal.

rosuav at sikorsky:~$ python
Python 2.7.11+ (default, Feb 22 2016, 16:38:42)
[GCC 5.3.1 20160220] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class NS(object): pass
...
>>> ns = NS()
>>> setattr(ns, "\x00", "null")
>>> getattr(ns, "\x00")
'null'
>>> setattr(ns, "\xA9", "copyright")
>>> getattr(ns, "\xA9")
'copyright'
>>> dir(ns)
['\x00', '__class__', '__delattr__', '__dict__', '__doc__',
'__format__', '__getattribute__', '__hash__', '__init__',
'__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
'__setattr__', '__sizeof__', '__str__', '__subclasshook__',
'__weakref__', '\xa9']
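
The session above only uses byte-string names. The case Andrey was
worried about is the unicode one, where Python 2 converts the name with
the 'ascii' codec. Here is a rough sketch of that check; survives_ascii()
is a made-up helper for illustration, not part of any proposed typing
API, and the final lookup behaves differently on Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
# Sketch only: survives_ascii() is a hypothetical helper, not a stdlib or typing API.


def survives_ascii(text):
    """Return True if the unicode string `text` encodes cleanly with 'ascii'."""
    try:
        text.encode('ascii')
    except UnicodeEncodeError:
        return False
    return True


class NS(object):
    pass


ns = NS()

# Pure-ASCII text passes the conversion that Python 2's getattr() applies.
ascii_ok = survives_ascii(u'hello')

# Non-ASCII text (the greeting from Guido's example) fails that conversion.
non_ascii_ok = survives_ascii(
    u'\u0417\u0434\u0440\u0430\u0432\u0441\u0442\u0432\u0443\u0439\u0442\u0435')

# Python 2 raises UnicodeEncodeError here, because the default only covers
# AttributeError; Python 3 accepts the name and returns the default.
try:
    getattr(ns, u'\xa9', None)
    lookup = 'accepted'
except UnicodeEncodeError:
    lookup = 'UnicodeEncodeError'
```

So a checker that wanted something like _AsciiUnicode would in effect be
asking whether survives_ascii() holds for a given literal.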

ChrisA

