[Python-ideas] Add encoding attribute to bytes

Georg Brandl g.brandl at gmx.net
Sat Nov 7 00:17:36 CET 2009


Terry Reedy wrote:
> A Python interpreter has one encoding for floats, ints, and strings.
> sys.float_info and sys.int_info give details about the first two,
> although they are mostly invisible to user code. (I presume they are
> attached to sys rather than float and int precisely because of this.) A
> couple of recent posts have discussed making the unicode encoding (UCS2
> vs. UCS4) both less visible and more discoverable to extensions.
> 
> Bytes are nearly always an encoding of *something*, but the particular 
> encoding used is instance-specific. As Guido has said, the programmer 
> must keep track. But how? In an OO language, one obvious way is as an 
> attribute of the instance. That would be carried with the instance and 
> make it self-identifying.
> 
> What I do not know is whether it is feasible to give an immutable
> instance of a builtin class a mutable attribute slot.

As soon as you can mutate an instance, it is not an immutable type anymore.
Calling it "immutable" regardless will cause trouble.  (The same bytes instance
could be used somewhere else transparently, e.g. as a function default
argument, or cached as a constant local.)
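
To make the aliasing problem concrete, here is a minimal sketch. Builtin
bytes objects accept no attributes, so the hypothetical subclass TaggedBytes
stands in for a bytes type with a mutable "encoding" slot; the function names
are likewise invented for illustration.

```python
class TaggedBytes(bytes):
    """Hypothetical stand-in for a bytes type with a mutable 'encoding' slot."""
    encoding = None  # class-level default; instances may override it


# One instance, shared as a function default argument.
_DEFAULT_PAYLOAD = TaggedBytes(b"caf\xe9")


def describe(payload=_DEFAULT_PAYLOAD):
    """Return the encoding label attached to the payload."""
    return payload.encoding


def tag_as_latin1(payload):
    """Unrelated code that labels whatever bytes object it receives."""
    payload.encoding = "latin-1"


before = describe()        # label seen through the default argument
alias = _DEFAULT_PAYLOAD   # some other part of the program holds the same object
tag_as_latin1(alias)
after = describe()         # the default argument has silently changed

print(before, after)
```

The value and hash of the object are unchanged, yet describe() returns a
different result after the call, which is exactly the kind of action at a
distance that immutability is supposed to rule out.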

As for the usefulness, I often have to work with proprietary communication
protocols between computers and devices, and there the bytes have no encoding
whatsoever (though I agree that most bytes do have a meaningful encoding).
However, a class as fundamental as "bytes" should not be burdened with an
attribute that may not even apply -- it's easy to make a custom class to
represent a (bytes, encoding) pair.
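
One possible shape for such a pair, as a sketch only: the class name
EncodedBytes, its field names, and the text() helper are assumptions made for
illustration, not an existing API. A NamedTuple keeps the pair immutable, and
an encoding of None covers raw protocol data that has no text encoding.

```python
from typing import NamedTuple, Optional


class EncodedBytes(NamedTuple):
    """Hypothetical immutable (bytes, encoding) pair."""
    data: bytes
    encoding: Optional[str] = None  # None means "no text encoding applies"

    def text(self) -> str:
        """Decode the payload using the recorded encoding."""
        if self.encoding is None:
            raise ValueError("no encoding recorded for this payload")
        return self.data.decode(self.encoding)


# Text that happens to be stored as bytes: the encoding travels with it.
greeting = EncodedBytes("grüß".encode("utf-8"), "utf-8")

# A raw device frame: plain binary data, so no encoding is attached.
frame = EncodedBytes(b"\x02\x10\xff\x03")

print(greeting.text())
print(frame.encoding)
```

Code that only ever handles raw frames can keep using plain bytes and never
pays for the extra field.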

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



