"stuff", a general request for help

Tue May 9 03:16:15 EDT 2000

I have been experimenting with a large assortment of stuff in Python
recently.  The particular reason I'm writing this message is Python
optimization: I don't think I understand how to do it, and after
reading Guido's optimization anecdote
(http://www.python.org/doc/essays/list2str.html) I gather that it's
rather important (the same algorithm expressed in different ways can
evidently make up to a 15x speed difference).

I'm doing a lot with operator overloading right now, and I know
Python's syntactic sugar adds some overhead, so I decided I'd try the
rather simple case of a multidimensional array and see if I could
figure out a way to make it go faster or take up less space.  I
thought this would be easy, since the for loop to create a list of
lists was bound to be slow, and doing everything as one big array
would be easier.
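
A minimal sketch of the "one big array" idea, for readers who don't
want to unpack the archive.  This is not the code from the tarball:
the names FlatGrid and make_nested are made up for illustration, and
it is written for a current Python 3 rather than the interpreter of
the time.

```python
class FlatGrid:
    """A 2-D grid stored in one flat list, in row-major order.

    Hypothetical illustration; not the class from the archive.
    No bounds checking is done, so an x past the row width silently
    lands in the next row.
    """

    def __init__(self, width, height, fill=0):
        self.width = width
        self.height = height
        # One list multiplication instead of a loop building a list of lists.
        self.cells = [fill] * (width * height)

    def __getitem__(self, pos):
        # grid[x, y] passes the tuple (x, y) as pos.
        x, y = pos
        return self.cells[y * self.width + x]

    def __setitem__(self, pos, value):
        x, y = pos
        self.cells[y * self.width + x] = value


def make_nested(width, height, fill=0):
    """The list-of-lists equivalent, for comparison."""
    return [[fill] * width for _ in range(height)]


grid = FlatGrid(4, 3)
grid[1, 2] = 7          # goes through __setitem__ (a Python-level call)

nested = make_nested(4, 3)
nested[2][1] = 7        # two built-in list index operations, no method call
```

Note the trade: the flat version is quick to initialize, but every
element access goes through a Python-level method, whereas the nested
lists are indexed entirely by built-in list operations.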

My attempt was a failure.  (Well, I DID get the initializer to go
faster, but at what seems to be a horrible cost.)  The code seemed a tad
too long for USENET, but a .tar.gz containing the code in question is
at http://www.twistedmatrix.com/~glyph/py/tr2/tr2-20000509.tar.gz

The thing that really struck me is the non-trivial difference that the
optimization

        self.__getattr__=self.__getattr__

in the initializer made.  I was also struck by the complete lack of any
performance gain from translation to C, and by the apparently crazy
overhead of Python function invocation...
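
For anyone unfamiliar with the trick, here is a rough sketch of the
same pattern on an ordinary method, with a way to time it.  The class
and method names (Plain, Cached, get) are hypothetical, and it assumes
a current Python 3.  One caveat: in current Python, implicit
special-method lookups (obj[i], attribute fallback via __getattr__)
go through the type and ignore instance attributes, so this only
illustrates the effect on explicit attribute access.

```python
import timeit


class Plain:
    def get(self, i):
        return i


class Cached:
    def __init__(self):
        # Store the bound method in the instance dictionary, so that
        # later lookups of obj.get find it there directly instead of
        # finding the function on the class and binding it each time.
        # (This creates a reference cycle between the instance and
        # the bound method.)
        self.get = self.get

    def get(self, i):
        return i


plain = Plain()
cached = Cached()

N = 100000
# Method call looked up through the class on every access.
t_plain = timeit.timeit("obj.get(1)", globals={"obj": plain}, number=N)
# Method call found in the instance dictionary.
t_cached = timeit.timeit("obj.get(1)", globals={"obj": cached}, number=N)
# The same trivial work with no call at all, as a baseline.
t_inline = timeit.timeit("x = 1", number=N)

print("plain  -> %.1f ms" % (t_plain * 1000))
print("cached -> %.1f ms" % (t_cached * 1000))
print("inline -> %.1f ms" % (t_inline * 1000))
```

The numbers will vary by machine and interpreter version; the point is
only to have the three cases side by side.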

Can someone with more Python experience than I give me some advice on
where I should be looking at this point?

I put a script in the archive called 'optest' which should run the
timings for this ... YMMV though, especially on the translation-to-C
stuff.  If it doesn't work it should be a simple change to a path in
py2c/py2ctree.py.

(no, I didn't write a Python-to-C translator, I used the already
available one)

An average set of output from the script for me is this:

glyph@helix:~/cvs/matrix/tr2-20000509% ./optest
notebook
c_notebook/
  notebook/dimensional.py -> .c -> .so
  notebook/dimentest.py -> .c -> .so
  notebook/__init__.py -> .c -> .so
array_init ->	397 ms
array_zero ->	20165 ms
opt_init ->	292 ms
opt_zero ->	18804 ms
carray_init ->	365 ms
carray_zero ->	18887 ms
list_init ->	340 ms
list_zero ->	3443 ms

While I'm encouraging people to download this thing: I am also
getting started with PyUnit and regression testing ... but my tests
seem really random and arbitrary.  I would much appreciate comments
on the style / structure of said tests (in the regr/ directory) and
pointers to better ones.  I've read about JUnit and PyUnit, but there
doesn't seem to be much in the way of a description of what makes a
good test.

Right now, writing tests takes 2x as long as writing the code
initially, and there have been more bugs in the tests than in the code
itself.  I know that tests can be a real resource for me, but I'm not
quite sure how.
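
To make the question concrete, here is a sketch of the kind of small,
single-purpose test I have in mind, written against the unittest
module (PyUnit as it ships in the standard library today).  It is not
one of the tests from the regr/ directory; flat_index is a
hypothetical helper invented for the example.

```python
import unittest


def flat_index(x, y, width):
    """Hypothetical helper: map (x, y) to an index in a row-major flat list."""
    if not 0 <= x < width or y < 0:
        raise IndexError("coordinate out of range")
    return y * width + x


class FlatIndexTest(unittest.TestCase):
    # Each test checks one behaviour and is named after that behaviour.

    def test_origin_maps_to_zero(self):
        self.assertEqual(flat_index(0, 0, width=4), 0)

    def test_row_start_is_multiple_of_width(self):
        self.assertEqual(flat_index(0, 2, width=4), 8)

    def test_x_past_row_end_is_rejected(self):
        with self.assertRaises(IndexError):
            flat_index(4, 0, width=4)


# Run the suite programmatically so the result object can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FlatIndexTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Whether this granularity is the right one is exactly the sort of thing
I'd like opinions on.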

Thanks!

(Sorry for covering so much in one email, but these are both areas
where I'm *REALLY* stuck and confused and could use some general
opining of pythonic luminaries ^_^)

-- 
                  __________________________________________
                 |    ______      __   __  _____  _     _   |
                 |   |  ____ |      \_/   |_____] |_____|   |
                 |   |_____| |_____  |    |       |     |   |
                 |   @ t w i s t e d m a t r i x  . c o m   |
                 |   http://www.twistedmatrix.com/~glyph/   |
                 `__________________________________________'