[Python-Dev] Which direction is UnTransform? / Unicode is different

Wed Nov 20 14:38:46 CET 2013

On 20.11.13 02:28, Jim J. Jewett wrote:

> [...]
> Instead of relying on introspection of .decodes_to and .encodes_to, it
> would be useful to have charsetcodecs and transformcodecs as entirely
> different modules, with their own separate registries.  I will even
> note that the existing help(codecs) seems more appropriate for
> charsetcodecs than it does for the current conjoined module.

I don't understand how a registry of transformation functions would 
simplify code. Without the transform() method I would write:

    >>> import binascii
    >>> binascii.hexlify(b'foo')
    b'666f6f'

With the transform() method I should be able to write:

    >>> b'foo'.transform("hex")

However, how does the hex transformer get into the registry? If the hex
transformer is not part of the stdlib, some code must do the
registration, and to get that code to execute I'd have to import a
module. So we're back to square one, as I'd have to write:

    >>> import hex_transformer
    >>> b'foo'.transform("hex")

A way around this would be some kind of import magic, but is that really
necessary just to avoid one import statement?
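
The registration problem can be sketched with a toy registry. This is
only an illustration: _transformers, register_transformer() and
transform() are hypothetical names, not an existing or proposed stdlib
API, and the register_transformer() call stands in for what an imported
hex_transformer module would have to do.

```python
import binascii

# Hypothetical registry, for illustration only (not a stdlib API).
_transformers = {}

def register_transformer(name, func):
    """Make func available under name."""
    _transformers[name] = func

def transform(data, name):
    """Look up a transformer by name and apply it to data."""
    try:
        func = _transformers[name]
    except KeyError:
        raise LookupError("unknown transformer: %r" % name) from None
    return func(data)

# Nothing has run the registration code yet, so the lookup fails.
try:
    transform(b'foo', "hex")
except LookupError as exc:
    error = str(exc)

# Stand-in for "import hex_transformer": somebody has to execute this.
register_transformer("hex", binascii.hexlify)
result = transform(b'foo', "hex")
```

The name-based lookup only works after the registering code has been
executed, which for a non-stdlib transformer means an import anyway.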

Furthermore, different transformation functions might have different
additional options. Supporting those is simple when we have plain
transformation functions: the function has arguments, and those are
documented where the function is documented. If we want to support
custom options for the .transform() method, transform() would have to
pass along *args, **kwargs to the underlying transformer. However, this
is difficult to document in a way that makes it easy to find which
options exist for a particular transformer.
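
A minimal sketch of the pass-through approach, again with a hypothetical
_transformers dict and transform() function standing in for the
method. It uses zlib.compress() as the example because its optional
compression level is a real, documented argument.

```python
import binascii
import inspect
import zlib

# Hypothetical name -> function table, for illustration only.
_transformers = {
    "hex": binascii.hexlify,
    "zlib": zlib.compress,
}

def transform(data, name, *args, **kwargs):
    """Apply the named transformer, forwarding any extra options."""
    return _transformers[name](data, *args, **kwargs)

payload = b'foo' * 10

# Calling the function directly: the level argument is documented
# together with zlib.compress() itself.
direct = zlib.compress(payload, 9)

# Going through the generic entry point: the same option is forwarded.
via_transform = transform(payload, "zlib", 9)

# The generic signature says nothing about which options "zlib" accepts.
generic_signature = str(inspect.signature(transform))
```

Both calls produce the same bytes, but the signature of the generic
entry point is just (data, name, *args, **kwargs), so the reader still
has to find the documentation of the underlying function to learn that
a compression level exists.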

Servus,
    Walter