[Python-Dev] Codecs and StreamCodecs

M.-A. Lemburg mal@lemburg.com
Thu, 18 Nov 1999 12:41:32 +0100


Mark Hammond wrote:
> 
> [Guido]
> > > (But weren't we going to do away with the whole registry
> > > idea in favor of an encodings package?)
> >
> [MAL]
> > One way or another, the Unicode implementation will have to
> > access a dictionary containing references to the codecs for
> > a particular encoding. You won't get around registering these
> > at some point... be it in a lazy way, on-the-fly or by some
> > other means.
> 
> What is wrong with my idea of using well-known-names from the encoding
> module?  The dict then is "encodings.<encoding-name>.__dict__".  All
> encodings "just work" because of the leverage from the Python module
> system.  Unless I'm missing something, there is no need for any extra
> registry at all.  I guess it would actually resolve to 2 dict lookups,
> but that's OK surely?

The problem is that encoding names are not necessarily valid Python
identifiers, e.g. iso-8859-1 is allowed as an encoding name, but it
contains hyphens and so cannot be used as a module name. This and
the fact that applications may want to ship their own codecs (which
do not get installed under the system wide encodings package)
make the registry necessary.
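
To make the point concrete, here is a small sketch of a registry keyed
by plain strings. The names register(), lookup() and the
x-myapp-reverse codec are made up for illustration and are not a
proposed API. The registry accepts names that could never be module
names and lets an application add a codec of its own:

```python
import keyword

# Hypothetical registry: encoding name -> (encode, decode) pair.
_registry = {}

def register(name, encode, decode):
    """Register a codec under a case-insensitive encoding name."""
    _registry[name.lower()] = (encode, decode)

def lookup(name):
    """Return the (encode, decode) pair or raise LookupError."""
    try:
        return _registry[name.lower()]
    except KeyError:
        raise LookupError("unknown encoding: %r" % name) from None

def is_module_name(name):
    """True if name could be spelled as encodings.<name>."""
    return name.isidentifier() and not keyword.iskeyword(name)

# A standard encoding whose name is not a valid identifier.
register("iso-8859-1",
         lambda s: s.encode("latin-1"),
         lambda b: b.decode("latin-1"))

# An application-supplied codec (purely illustrative): reversed ASCII.
register("x-myapp-reverse",
         lambda s: s[::-1].encode("ascii"),
         lambda b: b.decode("ascii")[::-1])
```

Here is_module_name("iso-8859-1") is false, yet lookup("ISO-8859-1")
still finds the codec, and the application codec never has to live
inside the encodings package.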

I don't see a problem with the registry though -- the encodings
package can take care of the registration process without any
user interaction. The encodings package would only have to
publish an API for looking up an encoding, which the Unicode
implementation can then use. The magic behind that API
is left to the encodings package...

BTW, nothing's wrong with your idea :-) In fact, I like it
a lot because it keeps the encoding modules out of the
top-level scope which is good.

PS: we could probably even take the whole codec idea one step
further and also allow other input/output formats to be registered,
e.g. stream ciphers or pickle mechanisms. The step in that
direction is not a big one: we'd only have to drop the specification
of the Unicode object in the spec and replace it with an arbitrary
object. Of course, this will still have to be a Unicode object
for use by the Unicode implementation.
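
A minimal sketch of that generalization, assuming a registry of
(encode, decode) pairs where encode takes an arbitrary object. The
helper names and the "pickle" entry are hypothetical, chosen only to
show that the same lookup can serve both cases:

```python
import pickle

# Hypothetical registry: name -> (encode, decode). The encode side
# takes an arbitrary object; for text codecs that object is a string.
_registry = {
    "utf-8": (lambda u: u.encode("utf-8"), lambda b: b.decode("utf-8")),
    "pickle": (pickle.dumps, pickle.loads),
}

def encode(obj, name):
    """Encode obj with the codec registered under name."""
    return _registry[name][0](obj)

def decode(data, name):
    """Decode data with the codec registered under name."""
    return _registry[name][1](data)
```

With this, decode(encode({"a": [1, 2]}, "pickle"), "pickle") gives
back an equal dictionary, while the "utf-8" entry keeps working on
text exactly as before.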

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    43 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/