[Pythonmac-SIG] py2app questions..

Chris Barker - NOAA Federal chris.barker at noaa.gov
Wed Apr 10 20:19:38 CEST 2013


First: Ronald, thanks for keeping py2app up to date!

The good news: I just ran a recent py2app on an older app of mine, and
it all worked out of the box.

However, the resulting bundle is HUGE. A lot of this is inevitable, as
I'm using a universal build and some big packages, but there seems to
be some room for improvement:

The app in question is using:

wxPython
numpy
a tiny bit of scipy
matplotlib

So it's going to be big!

Numpy, scipy and matplotlib all have a lot of intertwining modules,
many of which get imported whether you need them or not, so probably
the best one can do is include them all (more on that later)

However, I'm getting a couple of packages, and I have no idea why:

OpenGL
email
zmq

Hand-removing them from the bundle doesn't break anything.

I'm wondering if their recipes are getting triggered by accident
somehow -- how can I tell how/why a particular package got included?

I'm also getting a number of *.so files that I don't understand --
some of the larger ones are:

_bsddb.so
_sqlite3.so
_imaging.so  (is that PIL?)

There are a LOT more, but it's hard to know what they are from just the
names, and most are pretty small anyway (except the wx ones that I
need and expect to be big...)

I understand that it's a goal of py2app to work "out of the box"
without needing to hand-tweak a lot to get it to work. This is a
worthy goal, and the recipes approach works great. However, it would
be nice if there was an alternate approach that made it easier to
build a more optimized package. One idea:

Py2app (and all the other stand-alone builders I've looked at) figures
out what to include from source code analysis. However, as it's pretty
common for packages to "import" stuff that may not be used in a
particular app, you can get a lot of stuff you don't need. For
instance, I'm pretty sure that PIL imports Tkinter. These imports are
often wrapped in an if clause or inside a function that may never get
called, but source code analysis can't tell which of those will never
run, so the safe thing to do is include it all -- resulting in bloated
packages.
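
To make the over-inclusion concrete, here is a minimal sketch using the
standard library's modulefinder (Python 3). It is not py2app's scanner
(py2app uses the separate modulegraph package), and TracingFinder and
scan are names made up for this illustration. It shows that a static
scan picks up an import buried in a function that is never called, and
it records which module asked for each import, which is one way to see
why something got pulled in.

```python
import modulefinder
import sys
from collections import defaultdict


class TracingFinder(modulefinder.ModuleFinder):
    """ModuleFinder that also remembers which module asked for each import."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # imported name (as written in the source) -> names of importing modules
        self.importers = defaultdict(set)

    def import_hook(self, name, caller=None, fromlist=None, level=-1):
        # caller is the module whose code contains the import statement
        importer = caller.__name__ if caller is not None else "<unknown>"
        self.importers[name].add(importer)
        return super().import_hook(name, caller, fromlist, level)


def scan(script_path, extra_path=()):
    """Statically scan script_path; nothing in the script is executed."""
    finder = TracingFinder(path=list(extra_path) + sys.path)
    finder.run_script(script_path)
    return finder
```

For example, sorted(scan("MyApp.py").importers["email"]) would list the
modules whose source contains an import of email ("MyApp.py" is a
placeholder). Relative imports are recorded under the name as written,
so this is only a rough diagnostic.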

I propose an alternative -- analysis of the app at run-time: The user
would run the app (maybe a test suite), then we'd take a look at
sys.modules and see everything that was actually loaded. This would
miss anything that wasn't exercised by the code you ran, but for most
cases, a test suite would bring in everything (comprehensive test
suites are hard to come by, but all it would need to do is test enough
to import every package it might use -- that's not a heavy lift).
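
A rough sketch of the run-time idea, assuming Python 3 and that the app
or test suite can be started as a script. modules_loaded_by is a
made-up helper name, not part of py2app. It simply compares sys.modules
before and after the run.

```python
import runpy
import sys


def modules_loaded_by(script_path):
    """Run script_path and return the names of modules it caused to be imported."""
    before = set(sys.modules)
    try:
        runpy.run_path(script_path, run_name="__main__")
    except SystemExit:
        # Apps and test runners often end with sys.exit(); keep going.
        pass
    # Anything already imported before the run is not reported.
    return sorted(set(sys.modules) - before)
```

Note that modules the harness itself had already imported do not show
up in the difference, so in practice it would be best to run this from
a fresh interpreter that imports as little as possible.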

This seems so easy -- am I missing something???

To support this, py2app would need a way to bypass the source code
analysis, and instead load a data file with the list of modules that
need to be included. Actually, as it sometimes takes a while to scan
the code, it would be nice if py2app could optionally dump the results
of the source code analysis, and be able to re-load them later, rather
than needing to re-run it.
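
As a sketch of what the data file side could look like: the helpers
below (hypothetical names, one module name per line in a plain text
file) save a module list and turn it into py2app's existing "includes"
option. This only adds modules on top of whatever the scan finds; it
does not bypass the source analysis, which is the part that would need
a patch.

```python
def save_module_list(names, path):
    """Write module names to path, one per line, sorted and de-duplicated."""
    with open(path, "w") as fh:
        for name in sorted(set(names)):
            fh.write(name + "\n")


def load_module_list(path):
    """Read module names from path, skipping blank lines and # comments."""
    names = []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#"):
                names.append(line)
    return names


def py2app_options(module_list_path):
    """Build an options dict for setup() that force-includes the listed modules."""
    return {"py2app": {"includes": load_module_list(module_list_path)}}


# In a setup.py this might be used as follows ("MyApp.py" and
# "modules.txt" are placeholders):
#
#   from setuptools import setup
#   setup(app=["MyApp.py"], setup_requires=["py2app"],
#         options=py2app_options("modules.txt"))
```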

1) Am I missing something as to why the run-time analysis wouldn't work?

2) How hard would it be to patch py2app to load a module list, rather
than scan the source?

Thoughts??

-Chris






-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

