[Distutils] short circuiting module lookups

Noah Gift noah.gift at gmail.com
Tue Apr 7 13:54:46 CEST 2009


I work off of a rather large NFS infrastructure where thousands of
machines are constantly doing things, and recently I discovered a few
things about both setuptools and standard Python lookup that are
causing problems.

I use nosetests, and noticed that a run can take anywhere from 10 to
60 seconds.  A lot of this has to do with the load on the file system
I am on, which is shared by thousands of machines, but I was a bit
troubled to notice the following behavior with strace:

1.  In the case of entry points for setuptools, it actually recurses
into EVERY egg directory in your path, not just the egg you requested,
adds them to your sys.path, and additionally looks for four files
inside of every egg.  On a laptop with local storage, this doesn't
matter, but when thousands of machines hit the same filer, with many
Python processes, bad things happen...

2.  Python itself also looks at quite a few locations in its search
for modules.
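
To get a rough sense of how much of the search path comes from eggs,
something like the sketch below could be run on an affected machine.
The helper name summarize_search_path is hypothetical, and it only
inspects sys.path; it does not reproduce the file-level detail that
strace shows.

```python
import sys


def summarize_search_path(paths):
    """Count how many search-path entries look like eggs.

    An entry is treated as an egg if its name ends in ".egg"
    (ignoring a trailing path separator). This is a naming
    heuristic, not a check of the entry's contents.
    """
    eggs = [p for p in paths if p.rstrip("/\\").endswith(".egg")]
    return {
        "total": len(paths),
        "eggs": len(eggs),
        "other": len(paths) - len(eggs),
    }


if __name__ == "__main__":
    # Report on the interpreter's current module search path.
    print(summarize_search_path(sys.path))
```

Every entry counted here is a location the interpreter may have to
consult when resolving an import, so a large "eggs" number is a hint
that the path itself is part of the problem.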

It looks like this behavior with eggs and setuptools makes them
virtually unusable in large installations.  Is there any advice for
people in my situation, where flexibility (i.e. path lookups) is not
that important, and what matters is the fewest possible lookups to
find a module?  In the best case, I could tell a command line tool or
wrapper script to look only at a specific path and never at any other
location.
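
One way a wrapper script might approximate "look only here" is to
trim sys.path before importing anything else.  This is a minimal
sketch, assuming the tool needs only the standard library plus one
known directory; trim_search_path and /opt/tools/lib are hypothetical
names used for illustration, and the sketch does nothing about the
setuptools entry-point scanning described above.

```python
import sys


def trim_search_path(paths, keep_dirs):
    """Return a shorter search path with keep_dirs listed first.

    Entries that look like eggs (name ends in ".egg") or that live
    under a site-packages directory are dropped; everything else,
    such as the standard library directories, is kept in order.
    """
    trimmed = list(keep_dirs)
    for entry in paths:
        normalized = entry.rstrip("/\\")
        if normalized.endswith(".egg"):
            continue  # skip egg directories and zipped eggs
        if "site-packages" in normalized:
            continue  # skip third-party install locations
        if entry not in trimmed:
            trimmed.append(entry)
    return trimmed


if __name__ == "__main__":
    # Hypothetical directory holding the one package the tool needs.
    sys.path[:] = trim_search_path(sys.path, ["/opt/tools/lib"])
    print(sys.path)
```

Starting the interpreter with the -S option is a related knob: it
skips the site module, which is what processes .pth files, so eggs
listed in a .pth file are not added to sys.path in the first place.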

I haven't done much research yet, but thought I would field the
question before I went down a rat hole...

--
Cheers,

Noah

