[Python-Dev] Codecs and StreamCodecs

M.-A. Lemburg mal@lemburg.com
Thu, 18 Nov 1999 09:43:31 +0100


Guido van Rossum wrote:
> 
> > Why a factory? I've got a simple encode() function. I don't need a
> > factory. "flexibility" at the cost of complexity (IMO).
> 
> Unless there are certain cases where factories are useful.  But let's
> read on...
>
> > >     action - a string stating the supported action:
> > >                     'encode'
> > >                     'decode'
> > >                     'stream write'
> > >                     'stream read'
> >
> > This action thing is subject to error. *if* you're wanting to go this
> > route, then have:
> >
> > unicodec.register_encode(...)
> > unicodec.register_decode(...)
> > unicodec.register_stream_write(...)
> > unicodec.register_stream_read(...)
> >
> > They are equivalent. Guido has also told me in the past that he dislikes
> > parameters that alter semantics -- preferring different functions instead.
> 
> Yes, indeed!

Ok.

> (But weren't we going to do away with the whole registry
> idea in favor of an encodings package?)

One way or another, the Unicode implementation will have to
access a dictionary containing references to the codecs for
a particular encoding. You won't get around registering these
at some point... be it in a lazy way, on-the-fly or by some
other means.

What we could do is implement the lookup like this:

1. call encodings.lookup_<action>(encoding) and use the
   return value for the conversion
2. if that fails, cop out with an error

Step 1 would do all the import magic and then register
the found codecs in some dictionary for faster access
(perhaps this could be done in a way that is directly
available to the Unicode implementation, e.g. in a
global internal dictionary -- the one I originally had in
mind for the unicodec registry).

> > Not that I'm advocating it, but register() could also take a single
> > parameter: if a class, then instantiate it and call methods for each
> > action; if an instance, then just call methods for each action.
> 
> Nah, that's bad -- a class is just a factory, and once you are
> allowing classes it's really good to also allow factory functions.
> 
> > [ and the third/original variety: a function object as the first param is
> >   the actual hook, and params 2 thru 4 (each are optional, or just the
> >   stream funcs?) are the other hook functions ]
> 
> Fine too.  They should all be optional.

Ok.
 
> > > obj = factory_function_for_<action>(errors='strict')
> >
> > Where does this "errors" value come from? How does a user alter that
> > value? Without an ability to change this, I see no reason for a factory.
> > [ and no: don't tell me it is a thread-state value :-) ]
> >
> > On the other hand: presuming the "errors" thing is valid, *then* I see a
> > need for a factory.
> 
> The idea is that various places that take an encoding name can also
> take a codec instance.  So the user can call the factory function /
> class constructor.

Right. The argument is reachable via:

Codec = encodings.lookup_encode('ascii')
codec = Codec(errors='?')
s = codec(u"abcäöäü")

s would then equal 'abc????'.
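
A runnable approximation of the factory idea, as a sketch only. It
assumes present-day Python 3, uses the built-in 'replace' error
handler in place of the '?' value shown above, and picks ASCII as the
target so that the non-ASCII characters really cannot be encoded. The
lookup_encode() here is a local stand-in, not an existing API:

```python
def lookup_encode(encoding):
    """Return a factory (here: a class) for encoders of `encoding`."""

    class Encoder:
        def __init__(self, errors="strict"):
            # The error policy is fixed when the codec object is made.
            self.errors = errors

        def __call__(self, text):
            # Delegate to the built-in machinery for the conversion.
            return text.encode(encoding, self.errors)

    return Encoder


Codec = lookup_encode("ascii")
codec = Codec(errors="replace")
s = codec("abc\xe4\xf6\xe4\xfc")  # 'abc' followed by a-, o-, a-, u-umlaut
```

Each of the four umlauts is replaced by one '?', so s ends up as
b'abc????'. With the default errors='strict' the same call would
raise UnicodeEncodeError instead.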

--

Should I go ahead then and change the registry business to
the new strategy (via the encodings package in the above
sense)?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    43 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/