[Python-Dev] Some thoughts on the codecs...

Mark Hammond mhammond@skippinet.com.au
Wed, 17 Nov 1999 08:54:15 +1100


[Andy writes:]
> Leave JISXXX and the CJK stuff out.  If you get into Japanese, you
> really need to cover ShiftJIS, EUC-JP and JIS, they are big, and
> there

[Then Marc replies:]
> 2. give more information to the unicodec registry:
>    one could register classes instead of instances which the Unicode

[Jack chimes in with:]
> I would suggest adding the Dos, Windows and Macintosh standard 8-bit
> charsets (their equivalents of latin-1) too, as documents in these
> encoding are pretty ubiquitous. But maybe these should only be added
> on the respective platforms.

[And the conversation twisted around to Greg noting:]
> Next, the number of "open" calls:
>
>               Solaris     Linux    IRIX
>  Perl             16         10       9
>  Python          107         71      48

This is leading me to conclude that our "codec registry" should be the
file system, and Python modules.

Would it be possible to define a "standard package" called
"encodings", and when we need an encoding, we simply attempt to load a
module from that package?  The key benefits I see are:

* No need to load modules simply to register a codec (which would make
the number of open calls even higher, and the startup time even
slower.)  The codecs are then truly demand-loaded, rather than
explicitly loaded and registered.

* Making language-specific distributions becomes simple - just
select a different set of modules from the "encodings" directory.  The
Python source distribution has them all, but (say) the Windows binary
installer selects only a few.  The Japanese binary installer for
Windows installs a few more.

* Installing new codecs becomes trivial - no need to hack site.py
etc - simply copy the new "codec module" to the encodings directory
and you are done.

* No serious problem for GMcM's installer nor for freeze

We would probably need to assume that certain codecs exist for _all_
platforms and languages - but this is no different to assuming that
"exceptions.py" also exists for all platforms.

Is this worthy of consideration?

Mark.