pythonXX.dll size: please split CJK codecs out

"Martin v. Löwis" martin at v.loewis.de
Sun Aug 21 10:28:36 EDT 2005


Giovanni Bajo wrote:
> FWIW, this just highlights how inefficient your build system is. Everything you
> currently do by hand could be automated, including MSI generation. Also, you
> describe the Windows procedure, which I suppose does not take into account
> what needs to be done for other OSes. But I'm sure that revamping the Python
> build system is not a piece of cake.

You are wrong. It is not true that everything I do by hand could be
automated. At least after automation, I still would have to do things
by hand, namely invoke the automation.

You probably haven't looked at the MSI generation at all: it *is*
automatic. However, every time something changes in the structure,
the code generating the MSI must be adjusted to the new structure.

> I'll take the point though: it's easier to maintain for developers, and most
> Python users don't care.

See, this I find surprising. If there really is such a big need for
python24.dll being split into many more modules - why doesn't anybody
just do this, and offer it as a separate installation for use
with py2exe?

The fact that this hasn't happened indicates that users don't need
it badly enough. I personally rarely need to create a standalone
Python application, but when I did, I just used freeze, and static
linking. That way, I got a single binary, with no magic packaging,
and a minimal one, too.

>>In addition, having everything in a single DLL speeds up Python
>>startup a little, since less file searching is necessary.
> 
> I highly doubt this can be noticed in an actual benchmark, but I could be
> wrong. I can produce numbers though, if this can help people decide.

No, this is a minor issue. If you do write a PEP, and you find it
relatively easy to compare the maximum modularization to the minimal
one, it would be useful to underline your point, of course.
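
A rough sketch of how such a comparison could be measured, assuming two
builds of the interpreter are available side by side. The function name
mean_startup_seconds is made up for illustration, and the code uses
current standard-library calls (subprocess.run, time.perf_counter)
rather than anything that shipped with Python 2.4:

```python
import subprocess
import sys
import time


def mean_startup_seconds(executable=sys.executable, runs=10):
    """Average wall-clock time to start an interpreter and exit at once."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        # "-c pass" does nothing, so the time is dominated by startup.
        subprocess.run([executable, "-c", "pass"], check=True)
        total += time.perf_counter() - start
    return total / runs


if __name__ == "__main__":
    # Pass the path of each build in turn to compare them.
    print("mean startup: %.4f s" % mean_startup_seconds())
```

The numbers are only meaningful when both builds are timed on the same
machine, with a warm file cache and enough runs to average out noise.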

> I'm willing to write up such a PEP, but it's hard to devise a universal
> policy. 

Indeed. For Python 2.4, I made up a policy for myself: everything that
does not depend on a separate (non-system) library goes into
pythonxy.dll. That way, everybody will be able to compile Python
from sources without downloading anything else, yet it causes minimum
maintenance overhead. That's how the current python24.dll came about.
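
As a minimal sketch of that rule (not the actual build logic): given a
table of the non-system libraries each extension module needs, modules
with no such dependency go into the DLL and the rest stay separate. The
function name and the dependency table below are illustrative
assumptions, not a record of what python24.dll contains.

```python
def partition_by_policy(external_deps):
    """Apply the 'no separate library' rule.

    external_deps maps an extension module name to the list of
    non-system libraries it needs. Modules with an empty list go into
    the main DLL; all others remain separate extension modules.
    """
    in_dll = sorted(name for name, libs in external_deps.items() if not libs)
    separate = sorted(name for name, libs in external_deps.items() if libs)
    return in_dll, separate


# Illustrative table only, not the real Windows build list.
EXAMPLE_DEPS = {
    "math": [],
    "_codecs_jp": [],
    "_tkinter": ["Tcl/Tk"],
    "bz2": ["bzip2"],
    "_bsddb": ["Berkeley DB"],
}

if __name__ == "__main__":
    in_dll, separate = partition_by_policy(EXAMPLE_DEPS)
    print("in DLL:  ", in_dll)
    print("separate:", separate)
```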

> Basically, the only element we can play with is the size of the
> resulting binary for the module. Would you like a policy like "split out every
> module whose binary on Windows is > X kbytes?".

It's less important what I like - I think I would ask for a poll on
the proposed PEP, and I would be -1 on anything that means more work
for contributors. But that would be only one voice, and, if a majority
of the Windows Python users preferred your policy, it would be
implemented (of course, somebody contributing the resulting project
files or some automation for them would also help).

> My personal preference would go to something like "make python2x.dll include only
> the modules which are really core, like sys and os". This would also provide
> guidance to future modules, as they would simply go in external modules (I
> don't think really core stuff is being added right now).

Ok, then write that into the PEP. You would have to provide a definition
for "core", e.g. "everything that is needed for startup".

As a guideline, the Unix build process currently includes only the
following modules by default:

- marshal, imp, __main__, __builtin__, sys, exceptions: Modules
  living in Python/*.c
- gc, signal: invoked directly from the interpreter
- thread: not sure
- posix, errno, _sre, _codecs, so that setup.py can run
- zipimport, to avoid bootstrapping problems for importing python24.zip
- _symtable, because setup.py cannot get the dependencies right
- xxsubtype, for an undocumented reason I forgot
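
To check which of these are compiled into a given interpreter, one can
consult sys.builtin_module_names. A small sketch follows; the helper
name split_builtin is made up for illustration, and the result depends
on platform and Python version (several of the names above were renamed
or removed in later releases).

```python
import sys


def split_builtin(names):
    """Partition names into (built in, not built in) for this interpreter."""
    builtin = set(sys.builtin_module_names)
    inside = sorted(n for n in names if n in builtin)
    outside = sorted(n for n in names if n not in builtin)
    return inside, outside


if __name__ == "__main__":
    # Names taken from the list above.
    candidates = [
        "marshal", "imp", "__main__", "__builtin__", "sys", "exceptions",
        "gc", "signal", "thread", "posix", "errno", "_sre", "_codecs",
        "zipimport", "_symtable", "xxsubtype",
    ]
    inside, outside = split_builtin(candidates)
    print("built in:    ", ", ".join(inside))
    print("not built in:", ", ".join(outside))
```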

Regards,
Martin


