[Python-Dev] Add transform() and untransform() methods

Steven D'Aprano steve at pearwood.info
Fri Nov 15 11:02:41 CET 2013


On Fri, Nov 15, 2013 at 05:13:34PM +1000, Nick Coghlan wrote:

> A few things I noticed while implementing the recent updates:
> 
> - as you noted in your other email, while MAL is on record as saying
> the codecs module is intended for arbitrary codecs, not just Unicode
> encodings, readers of the current docs can definitely be forgiven for
> not realising that. We really need to better separate the codecs
> module docs from the text model docs (two new sections in the language
> reference, one for the codecs machinery and one for the text model
> would likely be appropriate. The io module docs and those for the
> builtin open function may also be affected)
> - a mechanism for annotating frames would help avoid the need for
> nasty hacks like the exception wrapping that aims to make codec
> failures easier to debug
> - if codecs exposed a way to separate the input type check from the
> invocation of the codec, we could redirect users to the module API for
> bad input types as well (e.g. calling "input str".encode("bz2"))

> - if we want something that doesn't need to be imported, then encode()
> and decode() builtins make more sense than new methods on str, bytes
> and bytearray objects (since builtins would support memoryview and
> array.array as well, and it avoids ambiguity regarding the direction
> of the operation)

Sounds good to me.
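
For reference, the codecs module already has module-level encode() and
decode() functions that behave much like the proposed builtins, apart
from needing an import. A minimal sketch, assuming a Python 3 build
where the "hex_codec" and "rot_13" codecs are available (the variable
names are only illustrative):

```python
import codecs

data = b"hello"

# Bytes-to-bytes transform; the direction is explicit in the function name.
hex_bytes = codecs.encode(data, "hex_codec")
round_trip = codecs.decode(hex_bytes, "hex_codec")

# The same call accepts other bytes-like objects, e.g. a memoryview.
view_hex = codecs.encode(memoryview(data), "hex_codec")

# Str-to-str transform, which str.encode() itself would not allow.
rot = codecs.encode("hello", "rot_13")

print(hex_bytes, round_trip, view_hex, rot)
```

Because these are plain functions rather than methods, they are not
tied to the str, bytes or bytearray types.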

> - the codecs module should offer a way to register a new alias for an
> existing codec
> - the codecs module should offer a way to map a name to a CodecInfo
> object without registering a new search function

It would be really good to be able to query the available codecs. For 
example, many applications offer an "Encoding" menu, where you can 
specify the codec used for text. That's hard in Python, since you 
can't retrieve a list of known codecs.
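
The closest approximation I know of is to walk the modules in the
standard encodings package and ask codecs.lookup() about each one. This
is only a sketch: known_codec_names() is a made-up helper, it imports
every codec module as a side effect, and it cannot see codecs that are
provided by third-party search functions.

```python
import codecs
import encodings
import pkgutil


def known_codec_names():
    """Return sorted canonical names of codecs shipped in the encodings package."""
    names = set()
    for _finder, module_name, _is_pkg in pkgutil.iter_modules(encodings.__path__):
        try:
            info = codecs.lookup(module_name)
        except LookupError:
            # Not a codec module (e.g. "aliases") or unavailable on this platform.
            continue
        names.add(info.name)
    return sorted(names)


names = known_codec_names()
print(len(names), "codecs found")
```

An "Encoding" menu would probably also want to filter this list down to
text encodings, since it includes bytes-to-bytes codecs as well.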


-- 
Steven

