[Distutils] __init__.py files missing from my eggs

Phillip J. Eby pje at telecommunity.com
Fri Nov 11 21:00:46 CET 2005


At 06:31 PM 11/11/2005 +0000, Richard Cooper wrote:
>From: Phillip J. Eby [mailto:pje at telecommunity.com]
> > What you probably want is a layout like this:
> >
> > app/
> >     setup.py
> >     My/
> >        Project/
> >                app/
> > Plugin1/
> >     setup.py
> >     My/
> >        Project/
> >                plugin1/
> >
> > etc.
>
>My knee-jerk reaction is 'Ewww!' Is this what PEAK (or any other large
>setuptools based project) does?

Yes, actually.  More precisely, it's what PEAK *will* do, since I haven't 
actually started breaking out individual projects yet.


>How do you handle My.Project.utils (for
>example) and the half dozen or so other packages at different levels of
>the hierarchy which are used in multiple "products"?

You make them individual projects.  For example, I have a module, 
peak.util.imports, that will be spun off as its own project.  Its layout 
will be:

    setup.py
    peak/
         __init__.py
         util/
              __init__.py
              imports.py

The __init__.py files will be empty, of course, since as namespace packages 
they don't need any contents.  Any project (including PEAK itself) that 
needs to use this module will simply depend on this project as a whole.
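
As a rough illustration of the layout above, here is a minimal sketch of what 
that project's setup.py might contain.  The distribution name and version are 
placeholders, parent_packages() is a hypothetical helper, and 
namespace_packages is the setuptools keyword of this era (deprecated in recent 
releases), so treat all of it as an assumption rather than PEAK's actual script:

```python
# Hypothetical setup.py contents for a spun-off peak.util.imports project.
# The name and version are placeholders, not taken from PEAK.

def parent_packages(dotted_module):
    """Return every package enclosing a dotted module name, outermost first."""
    parts = dotted_module.split(".")[:-1]   # drop the module itself
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]

# Both 'peak' and 'peak.util' are packages shared with other projects.
PACKAGES = parent_packages("peak.util.imports")

SETUP_KWARGS = dict(
    name="peak-util-imports",      # placeholder distribution name
    version="0.1",                 # placeholder version
    packages=PACKAGES,
    namespace_packages=PACKAGES,   # assumption: every level is a namespace
)

def main():
    # Imported lazily so the data above can be inspected without setuptools.
    from setuptools import setup
    setup(**SETUP_KWARGS)
```

A real setup.py would end with a call to main(); it is left out here so the 
snippet can be read or imported without triggering a build.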

My original plan for setuptools was to use an approach that could section 
out parts of a large distribution like this.  However, I quickly realized 
that this was counterproductive to the Python "ecosystem"; it would be more 
beneficial to have lots of small projects that can be used by others 
(without buying into the entire PEAK system), than it would be for me to 
keep using the same "all-in-one" source tree layout that I'm currently 
using for PEAK.

I suspect that you may like Zope's "zpkg" system better for your purposes, 
since it is designed to extract subsets of a larger distribution and 
automatically generate setup scripts for them.  I don't know if you can use 
it to create eggs, but it might be possible to modify it to do so.


>My needs are a bit specialised, this is a commercial app so source
>distribution, easy_install, etc will never be on the cards. Distribution
>is strictly "frozen" products (py2exe and eggs) only.

easy_install is useful as an in-house tool for automatically building your 
projects.  For example, you could create a setup.py that lists all your 
plugin project names as requirements, and a setup.cfg that lists their 
subversion URLs, and you could run easy_install against that directory to 
check out and build all the latest plugin eggs, dropping them in a 
distribution directory.  Your users don't ever have to see easy_install, 
for it to be useful in your project.  Of course, you may not be using 
Subversion, etc., and this may be moot, but I'm not psychic and don't know 
anything about what you're doing besides what you tell me.  :)
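
To make the in-house build idea a little more concrete, here is a sketch that 
produces the two pieces described: a requirements list for the umbrella 
setup.py and the text of a setup.cfg.  The plugin names and URLs are invented, 
and the use of find_links in an [easy_install] section with "#egg=" fragments 
is my reading of easy_install's conventions, so check it against the 
easy_install documentation before relying on it:

```python
import configparser
import io

# Hypothetical plugin names and repository URLs; substitute your own.
PLUGIN_SOURCES = {
    "Plugin1": "svn://svn.example.com/plugins/Plugin1/trunk",
    "Plugin2": "svn://svn.example.com/plugins/Plugin2/trunk",
}

# setup.py side: the umbrella project simply requires every plugin.
INSTALL_REQUIRES = sorted(PLUGIN_SOURCES)

def render_setup_cfg(sources):
    """Render an [easy_install] section listing one checkout URL per plugin."""
    # Assumption: an '#egg=Name-dev' fragment tells easy_install which
    # project a checkout URL provides.
    links = ["%s#egg=%s-dev" % (url, name)
             for name, url in sorted(sources.items())]
    parser = configparser.ConfigParser()
    parser["easy_install"] = {"find_links": "\n".join(links)}
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()

SETUP_CFG_TEXT = render_setup_cfg(PLUGIN_SOURCES)
```

INSTALL_REQUIRES would be passed as install_requires to setup(), and 
SETUP_CFG_TEXT written next to it as setup.cfg.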


> > If you are working on these projects at the same
> > time, you need only run "setup.py develop" in each one to set
> > up a sys.path structure that merges all the packages into one.
>
>We would probably end up with 20 or so "projects" if we did things that
>way as opposed to the single code tree we have now. Given that,
>"setup.py develop"*20 seems like more work than our current approach
>which is "add the root of the code tree to PYTHONPATH"

Does that mean you have 19 plugins?  Perhaps I'm misunderstanding something 
about your intentions.  I assume that your projects would be one for the 
app, and one for each plugin, so that you can create a py2exe for the app 
and an egg for each plugin.  If that's the case, then you would only have 
20 projects if you had 19 plugins.

Also, if the plugins don't depend on each other, then to work on any one 
plugin you only run "develop" twice: once for the app and once for the 
plugin.  Also, this may not be especially relevant to your project at the 
moment, but "develop" also makes sure that your environment is synchronized 
with any external requirements.  For example, in a multi-developer scenario 
where developer Aaron changes a dependency on the FooBar package from 1.1 
to 1.2, then when developer Bob updates his source checkout and runs 
"develop", it will find/fetch/build/install FooBar 1.2, thereby helping to 
reduce the pain of adding or removing dependencies.

You could probably quite rightly say that this doesn't affect your project 
because you don't have external dependencies, but a significant part of 
*why* few projects have external dependencies is because they're a costly 
pain -- if you're not using setuptools.  I point this out because 
setuptools is intended primarily to enable a new way of developing 
projects, and to enable people to do things that were previously 
unthinkable.  If you aren't trying to do what was impossible before, then 
certainly setuptools will not appear to help your current situation much.  :)



> > Note, by the way, that you don't need namespace packages to
> > have plugins for an application.  Plugins can be in any
> > package structure you like, as eggs' "entry points" system
> > can be used by plugins to advertise the functions they
> > provide, and the importing can be done automatically for you.
>
>Yeah I know. The entry point system is the main reason I started playing
>with setuptools. However I still wanted the plugins we produce to all
>live in the My.project.plugins package.

Note that giving each plugin its own top-level package would completely 
eliminate any duplication of directory structure between plugins, if that's 
your concern.  But note that in any case, all that you'd be duplicating are 
empty directories and zero-length __init__.py files.
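
For readers who haven't used entry points, here is a small sketch of the idea.  
The group name "myapp.plugins" and the "Plugin1:main" target are hypothetical.  
The pkg_resources module is the usual way to do this with setuptools; the 
sketch swaps in the standard library's importlib.metadata so that it runs 
without setuptools installed:

```python
from importlib.metadata import EntryPoint, entry_points

PLUGIN_GROUP = "myapp.plugins"   # hypothetical group name

# What a plugin's setup() call might pass as entry_points (hypothetical names).
PLUGIN1_ENTRY_POINTS = {PLUGIN_GROUP: ["plugin1 = Plugin1:main"]}

def discover(group=PLUGIN_GROUP):
    """Yield (name, loaded object) for each installed entry point in a group."""
    eps = entry_points()
    # Newer Pythons expose select(); older ones return a plain dict of groups.
    if hasattr(eps, "select"):
        found = eps.select(group=group)
    else:
        found = eps.get(group, [])
    for ep in found:
        yield ep.name, ep.load()

# Stand-alone demonstration: a stdlib callable stands in for a real plugin.
demo = EntryPoint(name="join", value="os.path:join", group=PLUGIN_GROUP)
join = demo.load()   # imports os.path and returns its 'join' attribute
```

The application only ever calls discover(); it never needs to know which 
package a plugin lives in, which is why the package structure is up to you.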


>It looks like my problem is that I'm mingling separate products in the
>same code tree and setuptools/distutils doesn't like that. Fair enough.
>I don't think I'm quite ready to drink the "one tree per product"
>kool-aid just yet. Firstly, it seems weird to me to split up the source
>like that and secondly, my colleagues would probably kill me ;-)

Here's another possible layout, that will work nicely for eggs, but won't 
work for "setup.py develop":

TheProject/
            MyApp/
                __init__.py
                setup.py
                *all app code here*

            Plugin1/
                __init__.py
                setup.py
                *plugin code here*

The idea here is that you have a MyApp top-level package, and a top-level 
package for each plugin.  Each setup script is *inside* the corresponding 
package directory.  Adding TheProject to PYTHONPATH makes everything 
importable, just like you have now.  But each project corresponds to a top 
level package, and uses package_dir = {'MyApp':'.'}, or {'Plugin1':'.'} 
etc. so that each project directory is also a package directory.

This compromise layout preserves the single-checkout+PYTHONPATH approach, 
and should allow you to generate eggs without any special hacks, at the 
cost of flattening your hierarchy a bit, and making it impossible to use 
setuptools' "develop" or "test" commands (which need the package 
directories to be somewhere *under* the directory containing setup.py).
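
A sketch of what TheProject/MyApp/setup.py might pass to setup() under this 
layout, together with a simplified model of how a package_dir mapping resolves 
package names to directories.  The version and the "MyApp.ui" subpackage are 
hypothetical, and package_path() is an illustration, not the distutils 
implementation:

```python
import os

# Hypothetical arguments for TheProject/MyApp/setup.py.
SETUP_KWARGS = dict(
    name="MyApp",
    version="0.1",                 # placeholder version
    packages=["MyApp"],
    package_dir={"MyApp": "."},    # the project directory *is* the package
)

def package_path(package, package_dir):
    """Map a dotted package name to a directory relative to setup.py."""
    parts = package.split(".")
    tail = []
    # Try the longest prefix first, then progressively shorter ones.
    while parts:
        key = ".".join(parts)
        if key in package_dir:
            return os.path.normpath(os.path.join(package_dir[key], *tail))
        tail.insert(0, parts.pop())
    # No prefix matched: fall back to the root entry, if any.
    return os.path.normpath(os.path.join(package_dir.get("", ""), *tail))

app_dir = package_path("MyApp", SETUP_KWARGS["package_dir"])
ui_dir = package_path("MyApp.ui", SETUP_KWARGS["package_dir"])
```

Here app_dir is the directory holding setup.py itself, which is exactly the 
arrangement that "develop" and "test" cannot cope with.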

Personally, I find that deeply nested foo.bar.baz hierarchies are a 
Java-ism; Python doesn't need them and is usually better off without them; 
"flat is better than nested", and all that.  But I can also understand that 
reorganizing your packages just to support eggs might be considered too 
costly.  Just make sure you factor in the cost of *maintaining* a hack to 
do this, since the reorganization cost would be a one-time thing, but the 
hack would be something you'd have to maintain forever.


>So what I will probably end up doing is hacking up setuptools to insert
>the __init__.py files I need. Which begs the question - Is this:
>
>a) A dirty, dirty hack I should never speak of again OR
>b) Not a bad idea and potentially useful to other people.

Yes.  :)

It's both, really.  Consider Chandler, for example, which is currently very 
similar to your project.  Actually, it's worse, because it doesn't even 
have a setup script yet.  In the next release cycle, we'll be moving to 
eggs and we will have to do some source tree splitting, and there will be 
some pain, and questioning, and maybe even some griping about it.  *But*, 
we spent a lot of time during the current release cycle doing quite a lot 
of *flattening* of the source tree, turning deeply nested modules like 
"osaf.contentmodel.contacts.Contacts" into "osaf.pim.contacts", and 
consolidating APIs so that most code can now do:

     from osaf import pim
     aContact = pim.Contact()

instead of the former:

     import osaf.contentmodel.contacts.Contacts as Contacts
     aContact = Contacts.Contact()

Flat really *is* better than nested.  Much better.  And as a bonus, 
anything that we choose to split into separate eggs will not have as much 
duplication.  In the general case, the 'osaf' top-level package will be the 
only thing that gets duplicated, and since it's a namespace package, the 
__init__.py will be an empty file anyway.

So, I would encourage you to urge anyone you share the hack with to 
consider all of the issues I've raised here, and to use it only as a 
crutch to allow a project to get past a temporary issue with its layout, 
not as a cane to perpetually lean on.  Putting on my software development 
manager hat, I would tell you flatly that the time to manage the directory 
structure duplication and use of "develop" is nothing compared to the time 
saved by being able to intelligently manage external dependencies.  And if 
you don't have any external dependencies, it's only because they cost too 
much before.  Setuptools changes the cost equation, and therefore changes 
what the ideal development patterns are.  The reality may be that in your 
particular environment you don't have the authority or reputation to lead 
such a change, but we shouldn't let that stop other people from getting 
that awareness.

Okay, I'll get off the soapbox now.  :)  Hack away, just make sure to point 
people to this thread and especially this message.

One other idea, by the way...  Have you tried adding:

     py_modules = ['My.__init__', 'My.App.__init__', ...]

to the setup() arguments?  It's just a guess on my part, but I think it 
might actually work, without doing any hacking on distutils or 
setuptools.  Just a thought.  Happy hacking, either way.  :)
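
If the py_modules guess does pan out, the list could be generated rather than 
typed by hand.  A minimal sketch, where init_modules() is a hypothetical 
helper and "My.App" is the only package name assumed:

```python
def init_modules(packages):
    """Return 'pkg.__init__' entries for each package and all its parents."""
    names = []
    for package in packages:
        parts = package.split(".")
        for i in range(1, len(parts) + 1):
            entry = ".".join(parts[:i]) + ".__init__"
            if entry not in names:      # parents are shared between packages
                names.append(entry)
    return names

# Pass this as py_modules=... in the setup() call.
PY_MODULES = init_modules(["My.App"])
```

Whether distutils accepts "__init__" entries in py_modules is the untested 
part; the helper only saves typing.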


