[Python-Dev] experiments with PYMALLOC (long)

Andrew MacIntyre andymac@bullseye.apana.org.au
Thu, 19 Jul 2001 23:40:55 +1000 (EST)


[this post is primarily for informational purposes, although I would
 welcome serious suggestions on possible options for dealing with
 either the longexp issue or the PYMALLOC performance issue - AIM]

In my port of Python to OS/2 using the EMX suite, I encountered the
situation of not being able to pass the longexp test in the test suite.

The test is simply:
>>> NUMREPS = 65580
>>> eval('[' + '2,' * NUMREPS + ']')
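For anyone wanting to try the same expression outside the test suite, here is
a minimal sketch.  The function names build_longexp and run_longexp are
hypothetical, chosen only for illustration; they are not part of the actual
test module.

```python
def build_longexp(numreps):
    """Return the source text of a list display with `numreps` elements."""
    return '[' + '2,' * numreps + ']'


def run_longexp(numreps=65580):
    """Parse and evaluate the expression; return the resulting list length."""
    return len(eval(build_longexp(numreps)))


if __name__ == "__main__":
    # A small count first, to confirm the shape of the generated source.
    print(build_longexp(3))
    # Then the full-size stress case from the test.
    print(run_longexp())
```

Lowering numreps makes it easy to find the point at which a memory-constrained
system starts to struggle.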

With the advent of PYMALLOC in 2.1 I hoped that this issue could be dealt
with; however, defining WITH_PYMALLOC achieved nothing other than to cause
Numeric to fail on import (I am led to believe that this is now fixed in
Numeric 20.1).

Revisiting my earlier diagnostic results reinforced the fact that the
longexp test is really a stress test of the parser.  In this test, the
parser ends up creating humongous numbers of nodes.  Each of these nodes
is only 20 bytes (+1 for insurance), for which the EMX malloc() returns a
chunk 64 bytes long.  There appears to be a minimum of 13 such nodes, plus
a handful of 2+1 byte allocations occupying 12 bytes each, for each
element in the list being parsed.

Not a happy situation, as it is sufficient to exhaust my dev system's
swap space, and OS/2 stops dead.
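To put rough numbers on the per-element cost described above, here is a
back-of-envelope sketch.  It assumes that the 64-byte chunk is the full cost
of each node and that exactly the minimum of 13 nodes is created per element.
The 12-byte allocations are left out because their count ("a handful") is not
given, so the result is only a lower bound.

```python
NUMREPS = 65580          # elements in the test expression
NODES_PER_ELEMENT = 13   # minimum observed per element (assumed exact here)
NODE_REQUEST = 20 + 1    # bytes requested per node
NODE_CHUNK = 64          # bytes EMX malloc() hands back per node


def node_bytes(numreps=NUMREPS):
    """Lower bound on bytes consumed by parser nodes alone."""
    return numreps * NODES_PER_ELEMENT * NODE_CHUNK


def requested_bytes(numreps=NUMREPS):
    """Bytes the parser actually asked for, for the same nodes."""
    return numreps * NODES_PER_ELEMENT * NODE_REQUEST


def wasted_fraction():
    """Share of each chunk that is allocator overhead and padding."""
    return (NODE_CHUNK - NODE_REQUEST) / NODE_CHUNK


if __name__ == "__main__":
    print(node_bytes(), "bytes allocated")
    print(requested_bytes(), "bytes requested")
    print(f"{wasted_fraction():.0%} of each chunk unused")
```

Under these assumptions the nodes alone account for roughly 52 MiB, of which
about two thirds is never used by the parser.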

I then thought of doctoring Python to use PYMALLOC for _all_ interpreter
memory management (the attached patch is all it took, against 2.1).

And with the exception of the socket test, which fails the first time with
a "no memory" error but succeeds the second time when the .pycs don't need
to be recompiled, the completely PYMALLOC managed interpreter passes the
regression test _including_ the longexp test.

I was starting to think in terms of releasing the (yet to be) 2.1.1 port
configured this way.  But then I decided to benchmark the two interpreter
configurations using the regression test as the benchmark.....

On my dev system, the average results (of 3 runs) are:
             no .pyc      w/.pyc
std malloc    3m 41s      3m 25s    (test_longexp skipped)
PYMALLOC      6m 12s      5m 25s    (test_socket fails in "no .pyc" case)

[the skipped longexp test, run standalone on the PYMALLOC interpreter,
takes <5s total, so it's not a significant factor in the times]

:-( :-(  I think the OS/2 port is going to have to continue to risk
failure on the longexp test on many systems as such a performance hit is
hard to justify.
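The slowdown implied by the timings above can be worked out with a short
sketch.  The dictionary layout and function names are illustrative only; the
figures are the 3-run averages quoted in the table.

```python
def seconds(minutes, secs):
    """Convert a 'Nm Ss' timing into seconds."""
    return minutes * 60 + secs


# Average regression-test times from the table (3 runs each).
TIMINGS = {
    "no .pyc": {"std malloc": seconds(3, 41), "PYMALLOC": seconds(6, 12)},
    "w/.pyc": {"std malloc": seconds(3, 25), "PYMALLOC": seconds(5, 25)},
}


def slowdown(case):
    """Ratio of PYMALLOC time to standard malloc time for one case."""
    t = TIMINGS[case]
    return t["PYMALLOC"] / t["std malloc"]


if __name__ == "__main__":
    for case in TIMINGS:
        print(f"{case}: {slowdown(case):.2f}x")
```

That is roughly a 1.6x to 1.7x slowdown.  The two configurations did not run
identical test sets (test_longexp was skipped for std malloc), but at under 5
seconds that difference barely moves the ratio.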

Environment:
System=  AMD K6/2-300, 64M RAM, DMA IDE drive (pre UDMA33)
         40MB preallocated swap space, that can expand to 140MB
S/ware=  OS/2 v4, FP12
         EMX 0.9d fix 03, gcc 2.8.1
         compile options "-O2 -fomit-frame-pointer"
         NDEBUG _not_ defined, so all assert()s still active

[PS: please cc any replies to me as I'm not subscribed to this list]

--
Andrew I MacIntyre                    "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au     | Snail: PO Box 370
        andymac@pcug.org.au               |        Belconnen ACT 2616
Web:    http://www.andymac.org/           |        Australia

Attachment: pymalloc_all.patch

*** Include\pymem.h.orig	Sat Sep  2 09:29:26 2000
--- Include\pymem.h	Sun Jul 15 17:44:38 2001
***************
*** 25,36 ****
--- 25,47 ----
     See the comment block at the end of this file for two scenarios
     showing how to use this to use a different allocator. */
  
+ #ifdef	PYMALLOC_ALL
+ #ifndef PyCore_MALLOC_FUNC
+ #undef PyCore_REALLOC_FUNC
+ #undef PyCore_FREE_FUNC
+ #define PyCore_MALLOC_FUNC      _PyCore_ObjectMalloc
+ #define PyCore_REALLOC_FUNC     _PyCore_ObjectRealloc
+ #define PyCore_FREE_FUNC        _PyCore_ObjectFree
+ #define NEED_TO_DECLARE_MALLOC_AND_FRIEND	1
+ #endif
+ #else
  #ifndef PyCore_MALLOC_FUNC
  #undef PyCore_REALLOC_FUNC
  #undef PyCore_FREE_FUNC
  #define PyCore_MALLOC_FUNC      malloc
  #define PyCore_REALLOC_FUNC     realloc
  #define PyCore_FREE_FUNC        free
+ #endif
  #endif
  
  #ifndef PyCore_MALLOC_PROTO
*** Objects\obmalloc.c.orig	Mon Mar 12 05:36:12 2001
--- Objects\obmalloc.c	Thu Jul 19 23:24:24 2001
***************
*** 73,82 ****
--- 73,89 ----
   * allocator which exports functions with names _other_ than the standard
   * malloc, calloc, realloc, free.
   */
+ #ifdef	PYMALLOC_ALL
+ #define _SYSTEM_MALLOC		malloc
+ #define _SYSTEM_CALLOC		/* unused */
+ #define _SYSTEM_REALLOC		realloc
+ #define _SYSTEM_FREE		free
+ #else
  #define _SYSTEM_MALLOC		PyCore_MALLOC_FUNC
  #define _SYSTEM_CALLOC		/* unused */
  #define _SYSTEM_REALLOC		PyCore_REALLOC_FUNC
  #define _SYSTEM_FREE		PyCore_FREE_FUNC
+ #endif
  
  /*
   * If malloc hooks are needed, names of the hooks' set & fetch