[Python-Dev] Some thoughts on the codecs...
M.-A. Lemburg
mal@lemburg.com
Wed, 17 Nov 1999 11:11:05 +0100
Mark Hammond wrote:
>
> This is leading me to conclude that our "codec registry" should be the
> file system, and Python modules.
>
> Would it be possible to define a "standard package" called
> "encodings", and when we need an encoding, we simply attempt to load a
> module from that package? The key benefits I see are:
>
> * No need to load modules simply to register a codec (which would make
> the number of open calls even higher, and the startup time even
> slower.) This makes it truly demand-loading of the codecs, rather
> than explicit load-and-register.
>
> * Making language specific distributions becomes simple - simply
> select a different set of modules from the "encodings" directory. The
> Python source distribution has them all, but (say) the Windows binary
> installer selects only a few. The Japanese binary installer for
> Windows installs a few more.
>
> * Installing new codecs becomes trivial - no need to hack site.py
> etc - simply copy the new "codec module" to the encodings directory
> and you are done.
>
> * No serious problem for GMcM's installer nor for freeze
>
> We would probably need to assume that certain codecs exist for _all_
> platforms and languages - but this is no different from assuming that
> "exceptions.py" also exists for all platforms.
>
> Is this worthy of consideration?
Why not... Using the new registry scheme I proposed in the
thread "Codecs and StreamCodecs", you could implement this
via factory_functions and lazy imports (with the encoding
name folded to make up a proper Python identifier, e.g.
hyphens get converted to '' and spaces to '_').
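A rough sketch of the name folding and lazy import idea, for illustration only. The helper names fold_encoding_name and make_lazy_factory are hypothetical and not part of any proposal, and lowercasing the name is an added assumption (the text above only mentions hyphens and spaces). The package name is passed in explicitly because the layout of the codec package is still open.

```python
import importlib


def fold_encoding_name(name):
    """Fold an encoding name into a Python identifier.

    Hyphens are dropped and spaces become underscores, as described above.
    Lowercasing is an extra assumption made for this sketch.
    """
    return name.lower().replace("-", "").replace(" ", "_")


def make_lazy_factory(encoding, package):
    """Return a zero-argument callable that imports the codec module on first use.

    Nothing is imported when the factory is created; the import happens
    only when the factory is called, and the module is cached afterwards.
    """
    cache = []

    def factory():
        if not cache:
            module_name = package + "." + fold_encoding_name(encoding)
            cache.append(importlib.import_module(module_name))
        return cache[0]

    return factory
```

With this, fold_encoding_name("ISO-8859-1") gives "iso88591", which matches the module names used in the grouping suggested here.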
I'd suggest grouping encodings:
[encodings]
    [iso]
        [iso88591]
        [iso88592]
    [jis]
        ...
    [cyrillic]
        ...
    [misc]
The unicodec registry could then query encodings.get(encoding,action)
and the package would take care of the rest.
Note that the "walk-me-up-scotty" import patch would probably
be nice in this situation too, e.g. to reach the modules in
[misc] or in higher levels such as the ones in [iso] from
[iso88591].
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 44 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/