[issue8838] Remove codecs.readbuffer_encode() and codecs.charbuffer_encode()

Fri May 28 15:23:31 CEST 2010

Marc-Andre Lemburg <mal at egenix.com> added the comment:

Antoine Pitrou wrote:
> 
> Antoine Pitrou <pitrou at free.fr> added the comment:
> 
>> class BinaryDataCodec(codecs.Codec):
>>
>>     # Note: Binding these as C functions will result in the class not
>>     # converting them to methods. This is intended.
>>     encode = codecs.readbuffer_encode
>>     decode = codecs.latin_1_decode
> 
> What's the point, though? Creating a non-symmetrical codec doesn't sound
> like a very useful or recommandable thing to do. 

Why not ? If you're only interested in the binary data and
don't care about the original input object type, that's a
very natural thing to do.

E.g. you could use a memory mapped file as input to the encoder.
Would you really expect the codec to recreate such a file object when
decoding the binary data ?

> Especially in the py3k
> codec model where encode() only works on unicode objects.

That's a common misunderstanding. The codec system does not
mandate a specific type combination. Only the helper methods
.encode() and .decode() on bytes and str objects in Python3 do.

>> While it's possible to emulate the functions via other methods,
>> these methods always introduce intermediate objects, which isn't
>> necessary and only costs performance.
> 
> The bytes() constructor doesn't (shouldn't) create any more intermediate
> objects than read/charbuffer_encode() do.

Looking at the code, the data takes quite a long path through
the whole machinery. For non-Unicode objects, it always tries to create
an integer and only if that fails reverts back to the buffer
interface after a few more function calls.

Furthermore, the bytes() constructor accepts a lot more
objects than the "s#" parser marker, e.g. lists of integers,
plain integers, arbitrary iterators, which a codec
just interested in the binary representation of an
object via the buffer interface most likely doesn't
want to accept.

> And all this doesn't address the fact that these functions have never
> been documented, and don't seem used in the outside world
> (understandably so, since there's no way to know about their existence,
> and their intended use).

That's a documentation bug and probably the result of the fact
that none of the exposed encoder/decoder APIs are documented.

----------
title: Remove codecs.readbuffer_encode() and codecs.charbuffer_encode() -> Remove codecs.readbuffer_encode()	and	codecs.charbuffer_encode()

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue8838>
_______________________________________