[Python-Dev] Reintroduce or drop completely hex, bz2, rot13, ... codecs
M.-A. Lemburg
mal at egenix.com
Wed Jun 9 10:41:29 CEST 2010
Victor Stinner wrote:
> There are two opposite issues in the bug tracker:
>
> #7475: codecs missing: base64 bz2 hex zlib ...
> -> reintroduce the codecs removed from Python3
>
> #8838: Remove codecs.readbuffer_encode()
> -> remove the last part of the removed codecs
>
> If I understood correctly, the question is: should the codecs module contain
> only encoding codecs, or also other kinds of codecs?
Sorry, but I can only repeat what I've already mentioned
a few times on the tracker items: this is a misunderstanding.
The codec system does not mandate a specific type combination
(and that's by design). Only the helper methods .encode() and
.decode() on bytes and str objects in Python3 do so, in order to
provide type safety.
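
A minimal sketch of that distinction, assuming a current Python 3
release (3.4 or later) in which the hex_codec and rot_13 codecs are
available through the generic codecs.encode() and codecs.decode()
functions. The variable names are purely illustrative:

```python
import codecs

# The generic codec functions place no restriction on input/output types.
hex_bytes = codecs.encode(b"abc", "hex_codec")      # bytes -> bytes
round_trip = codecs.decode(hex_bytes, "hex_codec")  # bytes -> bytes
rotated = codecs.encode("abc", "rot_13")            # str -> str

# The str.encode() helper only accepts text encodings (str -> bytes),
# so asking it for a non-text codec is rejected.
try:
    "abc".encode("hex_codec")
    helper_rejected = False
except (LookupError, TypeError):
    helper_rejected = True

print(hex_bytes, round_trip, rotated, helper_rejected)
```

The restriction lives in the helper method, not in the codec registry
itself.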
> Encoding codec API is now strict (encode: str->bytes, decode: bytes->str),
> it's not possible to reuse str.encode() or bytes.decode() for the other
> codecs. Marc-Andre Lemburg proposed to add .transform() and .untransform()
> methods to str, bytes and bytearray types. If I understood correctly, it would
> look like:
>
> >>> b'abc'.transform("hex")
> '616263'
> >>> '616263'.untransform("hex")
> b'abc'
No, .transform() and .untransform() will be the interface to same-type
codecs, i.e. ones that convert bytes to bytes or str to str. As with
.encode()/.decode(), these helper methods also enforce type safety
of the return type.
The above example will read:
>>> b'abc'.transform("hex")
b'616263'
>>> b'616263'.untransform("hex")
b'abc'
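
The same-type rule can be sketched with two free functions. This is
only an illustration under stated assumptions: transform() and
untransform() here are hypothetical helpers built on codecs.encode()
and codecs.decode(), not methods that exist on bytes or str, and the
codec names are the ones registered in current Python 3 releases:

```python
import codecs


def transform(data, name):
    """Hypothetical helper: apply a codec in the encode direction,
    requiring the result to have the same type as the input."""
    result = codecs.encode(data, name)
    if type(result) is not type(data):
        raise TypeError(
            f"codec {name!r} returned {type(result).__name__}, "
            f"expected {type(data).__name__}"
        )
    return result


def untransform(data, name):
    """Hypothetical helper: apply a codec in the decode direction,
    requiring the result to have the same type as the input."""
    result = codecs.decode(data, name)
    if type(result) is not type(data):
        raise TypeError(
            f"codec {name!r} returned {type(result).__name__}, "
            f"expected {type(data).__name__}"
        )
    return result


hex_data = transform(b"abc", "hex_codec")       # bytes -> bytes
original = untransform(hex_data, "hex_codec")   # bytes -> bytes
rotated = transform("abc", "rot_13")            # str -> str
```

A text encoding such as "utf-8" fails the check, since it turns str
into bytes; that case stays with .encode()/.decode().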
> I suppose that each codec will have a different list of accepted input and
> output types. Example:
>
> bz2: encode:bytes->bytes, decode:bytes->bytes
> rot13: encode:str->str, decode:str->str
> hex: encode:bytes->str, decode: str->bytes
hex will do bytes->bytes in both directions, just like it does
in Python2.
The methods to be used will be .transform() for the encode direction
and .untransform() for the decode direction.
> And so "abc".encode("bz2") would raise a TypeError.
Yes.
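
For comparison, a small sketch of how a bytes-to-bytes compression
codec behaves through the generic codecs functions, assuming a current
Python 3 release where the codec is registered as bz2_codec and the
bz2 module is available. The variable names are illustrative:

```python
import codecs

payload = b"abc" * 100

# bytes -> bytes in both directions
compressed = codecs.encode(payload, "bz2_codec")
restored = codecs.decode(compressed, "bz2_codec")

# Passing str to a bytes-to-bytes codec is rejected with TypeError.
try:
    codecs.encode("abc", "bz2_codec")
    str_rejected = False
except TypeError:
    str_rejected = True

print(restored == payload, str_rejected)
```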
> --
>
> In my opinion, we should not mix codecs of different kinds (compression,
> cipher, etc.) because the input and output types are different. It would make
> more sense to create a standard API for each kind of codec. Existing examples
> of standard APIs in Python: hashlib, shutil.make_archive(), database API, etc.
If you want, you can have those as well, but then you'd
have to introduce new APIs or modules, whereas the codec
interfaces have existed for quite a while in Python2 and
are in regular use.
For most applications, the very simple-to-use codec interface
to these codecs is all that is needed, so I don't see a strong
case for adding new interfaces, e.g.
hex_data = data.transform('hex')
looks clean and neat.
--
Marc-Andre Lemburg
eGenix.com