From fdrake@acm.org Mon Oct 1 15:17:06 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 1 Oct 2001 10:17:06 -0400 Subject: [Python-Dev] Integrating Expat In-Reply-To: <200109301453.QAA21436@paros.informatik.hu-berlin.de> References: <200109301453.QAA21436@paros.informatik.hu-berlin.de> <3BB74B99.3B230398@lemburg.com> <200109301708.TAA22863@paros.informatik.hu-berlin.de> Message-ID: <15288.31458.735886.325225@grendel.zope.com> Martin von Loewis writes: > [I know I've asked this before, but Fred wanted me to ask it again :-] Actually, I think I simply suggested the forum so that others could comment as well. ;-) > What do you think about an integration of Expat into Python, to be > always able to build pyexpat (and with the same version also)? > Which version of Expat would you use? Would you put the expat files > into a separate directory, or all into modules? I have mixed feelings. There are really two things that we could do: We could add Expat to our CVS repository, which means syncing a bunch of files everytime a new Expat release comes out, or we could bundle the Expat sources with the Python source distribution when the distribution is built, but not add them to CVS. This avoids the extra files in CVS, but complicates construction of the distribution and adds a new wrinkle to the configuration management. > Here is my proposal: Integrate Expat 2.95.2 for release together with > Python 2.2; into an expat subdirectory of Modules (taking only the lib > files of expat). > > This would affect build procedures on all targets; in particular, > pyexpat would not link to a shared expat DLL, but incorporate the > object files. For the "Parsed XML" Zope product, we included the sources for the Expat library in our CVS, but added our own configure.in and other build-control files, which are simpler than those included with Expat (since it only needs to build the static library). 
This seems to work reasonably well, and does not introduce new wrinkles to the configuration management. So I think we agree on the approach to take. M.-A. Lemburg writes: > Are you sure that we should choose expat as "native" XML parser ? > > There are other candidates which would fit this role just > as well (in particular, Fredrik's sgmlop looks like a nice > extension since it not only works with XML but also many > other meta languages). See Martin's comments about this. I think this precludes inclusion of sgmlop until the problems it has have been addressed in the implementation. I'm not sure what "meta languages" it handles; I thought it only dealt with XML/XHTML and HTML document markup. > If you want a very fast validating XML parser, RXP would also > be a good choice -- AFAIK, the RXP folks would allow us to > ship RXP under a different license than GPL which is then > bound to Python. Agreed. I think it would be really nice to have an interface for RXP that was easy to build and use. I haven't looked at PyLTXML in a long time, so I'm not sure what state it's in. > Given the many alternatives, I am not sure whether going with > expat is the right path... may be wrong though. As Martin said, RXP and Expat together don't really qualify as "many". sgmlop just isn't robust enough (yet), and it's not clear there are other alternatives. There is libxml (a.k.a. gnome-xml), which is licensed under the LGPL; Python bindings for that are described as being in the alpha stage, but I haven't had time to play with them myself. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From mal@lemburg.com Mon Oct 1 15:50:45 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Mon, 01 Oct 2001 16:50:45 +0200 Subject: [Python-Dev] Integrating Expat References: <200109301453.QAA21436@paros.informatik.hu-berlin.de> <3BB74B99.3B230398@lemburg.com> <200109301708.TAA22863@paros.informatik.hu-berlin.de> Message-ID: <3BB882C5.D1BDFE79@lemburg.com> Martin von Loewis wrote: > > > Are you sure that we should choose expat as "native" XML parser ? > > It wouldn't necessarily be the only parser. To process XML, different > applications have different needs. However, since the expatreader is > the only SAX reader included in the standard library at the moment, > guaranteeing presence of pyexpat is oft-requested. Notice that > pyexpat.c is also in the standard library already. Just wanted to make sure that we still have the option of including other parsers as well :-) > > There are other candidates which would fit this role just > > as well (in particular, Fredrik's sgmlop looks like a nice > > extension since it not only works with XML but also many > > other meta languages). > > Not that many candidates would work as well. For example, sgmlop has a > number of known bugs, and a few unknown ones. Guido once complained > that it is easy to crash sgmlop with ill-formed input, and rejected > inclusion of sgmlop when xmlrpclib was integrated. A known problem is > that entity references are not expanded in attributes. Well, let's put it this way: if someone finds a need to fix these bugs, it is more likely to happen in the Python core, e.g. xmlrpclib has already received a few tweaks (by yourself ;-) after it was checked into the core. I think that the sgmlop design is sufficiently simple and easy to extend to make it a good candidate for inclusion. Sure, we'll get bug reports, but why not add sgmlop marked as experimental to the core in order to get it stabilized and bug-fixed ?! I would very much like a sandbox like part in the Python standard dist to encourage stabilizing of proposed-to-be-included std lib extensions, e.g. 
how about a sandbox package in the std lib ?! > Beyond that, I'm not aware of many more pure-C parsers that could be > reasonably be integrated into the core. There are many XML parsers, > but many of the are written in C++ or Java. Me neither... except RXP which is written in plain C. > > If you want a very fast validating XML parser, RXP would also > > be a good choice -- AFAIK, the RXP folks would allow us to > > ship RXP under a different license than GPL which is then > > bound to Python. > > RXP would indeed be a choice. Of course, integrating it is much > harder; you'd have to write the C module first, plus documentation, > plus a SAX driver, plus test cases. I'm not sure how much code you can > inherit from PyLTXML. Sure; the question I wanted to raise was: given that we have such an interface, would RXP also be a candidate for inclusion ? > On performance: Please have a look at > > http://www.xml.com/lpt/a/Benchmark/exec.html > > which suggests that expat still has a speed advantage over rxp > (assuming that the measurements where done carefully, i.e. disabling > validation in RXP). Hmm, I know that at least one company has been having great success in using RXP with Python; from their experience RXP is faster on average XML than any of the other available (validating) parsers. May be due to their application, though, so YMMV. > > Given the many alternatives, I am not sure whether going with > > expat is the right path... may be wrong though. > > It shouldn't be the only path. pyexpat is already integrated into the > Python library, all I'm suggesting to give the promise that it will be > available on every 2.2 Python installation. > > Any volunteers working on RXP integration are certainly welcome to do > so; code contributions to PyXML will be welcome (provided the GPL > issue gets resolved). 
Code contributions to the Python core would > require some review, of course - it took quite some time to get > pyexpat stable, and I guess any other C-integrated parser won't work > from scratch, either. True. Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Mon Oct 1 16:02:22 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 01 Oct 2001 17:02:22 +0200 Subject: [Python-Dev] Integrating Expat References: <200109301453.QAA21436@paros.informatik.hu-berlin.de> <3BB74B99.3B230398@lemburg.com> <200109301708.TAA22863@paros.informatik.hu-berlin.de> Message-ID: <3BB8857E.4C370439@lemburg.com> Martin von Loewis wrote: > > > If you want a very fast validating XML parser, RXP would also > > be a good choice -- AFAIK, the RXP folks would allow us to > > ship RXP under a different license than GPL which is then > > bound to Python. > > RXP would indeed be a choice. Of course, integrating it is much > harder; you'd have to write the C module first, plus documentation, > plus a SAX driver, plus test cases. I'm not sure how much code you can > inherit from PyLTXML. > > On performance: Please have a look at > > http://www.xml.com/lpt/a/Benchmark/exec.html > > which suggests that expat still has a speed advantage over rxp > (assuming that the measurements where done carefully, i.e. disabling > validation in RXP). How would libxml fit into this picture ? http://xmlsoft.org/ libxml is written in C as well and under the LGPL. 
There's also Apache's Xerces which is written in a portable subset of C++ (is probably to big though to be intergated into Python): http://xml.apache.org/xerces-c/ -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From fredrik@pythonware.com Mon Oct 1 16:23:56 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 1 Oct 2001 17:23:56 +0200 Subject: [Python-Dev] Integrating Expat References: <200109301453.QAA21436@paros.informatik.hu-berlin.de><3BB74B99.3B230398@lemburg.com><200109301708.TAA22863@paros.informatik.hu-berlin.de> <15288.31458.735886.325225@grendel.zope.com> Message-ID: <019a01c14a8d$178d39a0$b3fa42d5@hagrid> > I have mixed feelings. There are really two things that we could > do: We could add Expat to our CVS repository, which means syncing a > bunch of files everytime a new Expat release comes out I thought MvL had already volunteered to do this? > > There are other candidates which would fit this role just > > as well (in particular, Fredrik's sgmlop looks like a nice > > extension since it not only works with XML but also many > > other meta languages). > > See Martin's comments about this. I think this precludes inclusion > of sgmlop until the problems it has have been addressed in the > implementation. cannot fix bugs if nobody bothers to report them ;-) (the crash issue appears to be a rumour; there was a bug when running in SGML mode, but that was fixed long ago. people using the current release in real-life applications haven't reported any stability problems...) on the other hand, sgmlop itself will never be anything but a "fast but sloppy" XML tokenizer. if you risk running into xml compliance nazis <0.1 wink>, you shouldn't use it. From fdrake@acm.org Mon Oct 1 16:23:59 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Mon, 1 Oct 2001 11:23:59 -0400 Subject: [Python-Dev] Integrating Expat In-Reply-To: <019a01c14a8d$178d39a0$b3fa42d5@hagrid> References: <200109301453.QAA21436@paros.informatik.hu-berlin.de> <3BB74B99.3B230398@lemburg.com> <200109301708.TAA22863@paros.informatik.hu-berlin.de> <15288.31458.735886.325225@grendel.zope.com> <019a01c14a8d$178d39a0$b3fa42d5@hagrid> Message-ID: <15288.35471.300115.993620@grendel.zope.com> Fredrik Lundh writes: > I thought MvL had already volunteered to do this? I didn't state this was a huge issue or that it didn't have a nice solution. ;-) It also isn't something that happens all that often, given that I don't have a lot of time to make Expat releases. > cannot fix bugs if nobody bothers to report them ;-) > > (the crash issue appears to be a rumour; there was a bug when > running in SGML mode, but that was fixed long ago. people using > the current release in real-life applications haven't reported any > stability problems...) Glad to hear this! Perhaps someone (not implying you) should start writing a substantial test suite for it to ferret out any remaining bugs? I don't see a test_sgmlop.py in the PyXML package; if you already have something perhaps you could contribute it? It might help you unload maintenance if anyone does manage to find a bug. > on the other hand, sgmlop itself will never be anything but a "fast > but sloppy" XML tokenizer. if you risk running into xml compliance > nazis <0.1 wink>, you shouldn't use it. "Nazi" would not have been my word for it, but ... Wham! ;-) -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From jepler@inetnebr.com Mon Oct 1 15:00:29 2001 From: jepler@inetnebr.com (Jeff Epler) Date: Mon, 1 Oct 2001 09:00:29 -0500 Subject: [Python-Dev] Python 2.2a* getattr suggestion and question In-Reply-To: <200109301837.OAA32659@cj20424-a.reston1.va.home.com> References: <200109301837.OAA32659@cj20424-a.reston1.va.home.com> Message-ID: <20011001090026.B1755@inetnebr.com> On Sun, Sep 30, 2001 at 02:37:00PM -0400, Guido van Rossum wrote: > > Sure. Still, I think interpreter diagnostics should be pointing to the > > exact place of trouble. At least, __getattribute__ must appear somewhere > > in the traceback to give a hint where from __repr__ was attempted to be > > called. > > When you write a faulty __getattribute__ that returns None instead of > raising AttributeError, it's not realistic to expect __getattribute__ > to be in the stack trace. PyChecker might want to give an error if flow in __getattr[ibute]__ doesn't either pass through a return or a 'raise AttributeError'. Even if the intent is to have attributes default to None, an explicit 'return None' could be added. Jeff From skip@pobox.com (Skip Montanaro) Mon Oct 1 18:08:00 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 1 Oct 2001 12:08:00 -0500 Subject: [Python-Dev] Integrating Expat In-Reply-To: <3BB882C5.D1BDFE79@lemburg.com> References: <200109301453.QAA21436@paros.informatik.hu-berlin.de> <3BB74B99.3B230398@lemburg.com> <200109301708.TAA22863@paros.informatik.hu-berlin.de> <3BB882C5.D1BDFE79@lemburg.com> Message-ID: <15288.41712.751198.83416@beluga.mojam.com> mal> I think that the sgmlop design is sufficiently simple and easy to mal> extend to make it a good candidate for inclusion. Sure, we'll get mal> bug reports, but why not add sgmlop marked as experimental to the mal> core in order to get it stabilized and bug-fixed ?! I would be happy to sgmlop added to the core. 
The xmlrpclib encoding and
decoding do need some sort of C-based acceleration to be usable:

    % python testxmlrpc.py
    testing with xmlrpclib 0.9.8
    using FastParser
    415 dumps per second
    106 loads per second
    disabling fast parsers in xmlrpclib
    using SlowParser
    412 dumps per second
    16.1 loads per second

FWIW, the xmlrpclib delivered with Python is substantially slower dumping
data than the 0.9.x versions that have been around awhile, though its
decoding performance degrades less without sgmlop. Compare the above with
this:

    % PYTHONPATH=~/misc/python/python2 python testxmlrpc.py
    testing with xmlrpclib 1.0b3
    using SgmlopParser
    229 dumps per second
    94.3 loads per second
    disabling fast parsers in xmlrpclib
    using ExpatParser
    231 dumps per second
    76.8 loads per second

I haven't had or taken the time to investigate the difference yet.

Skip

From loewis@informatik.hu-berlin.de Mon Oct 1 18:09:25 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 1 Oct 2001 19:09:25 +0200 (MEST)
Subject: [Python-Dev] Integrating Expat
In-Reply-To: <15288.31458.735886.325225@grendel.zope.com> (fdrake@acm.org)
References: <200109301453.QAA21436@paros.informatik.hu-berlin.de>
 <3BB74B99.3B230398@lemburg.com>
 <200109301708.TAA22863@paros.informatik.hu-berlin.de>
 <15288.31458.735886.325225@grendel.zope.com>
Message-ID: <200110011709.TAA13228@paros.informatik.hu-berlin.de>

> I have mixed feelings. There are really two things that we could
> do: We could add Expat to our CVS repository, which means syncing a
> bunch of files everytime a new Expat release comes out, or we could
> bundle the Expat sources with the Python source distribution when the
> distribution is built, but not add them to CVS. This avoids the extra
> files in CVS, but complicates construction of the distribution and
> adds a new wrinkle to the configuration management.

I'm in favour of putting it into the CVS.
This will allow CVS users to build it, and it will allow proper integration and testing of required changes into setup.py (and Setup.dist; the current text has caused a lot of confusion in the past). Regards, Martin From mal@lemburg.com Mon Oct 1 18:50:31 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 01 Oct 2001 19:50:31 +0200 Subject: [Python-Dev] Integrating Expat References: <200109301453.QAA21436@paros.informatik.hu-berlin.de> <3BB74B99.3B230398@lemburg.com> <200109301708.TAA22863@paros.informatik.hu-berlin.de> <3BB882C5.D1BDFE79@lemburg.com> <15288.41712.751198.83416@beluga.mojam.com> Message-ID: <3BB8ACE7.B61A4881@lemburg.com> Skip Montanaro wrote: > > mal> I think that the sgmlop design is sufficiently simple and easy to > mal> extend to make it a good candidate for inclusion. Sure, we'll get > mal> bug reports, but why not add sgmlop marked as experimental to the > mal> core in order to get it stabilized and bug-fixed ?! > > I would be happy to sgmlop added to the core. The xmlrpclib encoding and > decoding do need some sort of C-based acceleration to be usable: > > % python testxmlrpc.py > testing with xmlrpclib 0.9.8 > using FastParser > 415 dumps per second > 106 loads per second > disabling fast parsers in xmlrpclib > using SlowParser > 412 dumps per second > 16.1 loads per second > > FWIW, the xmlrpclib delivered with Python is substantially slower dumping > data than the 0.9.x versions that have been around awhile, though its > decoding performance degrades less without sgmlop. Compare the above with > this: > > % PYTHONPATH=~/misc/python/python2 python testxmlrpc.py > testing with xmlrpclib 1.0b3 > using SgmlopParser > 229 dumps per second > 94.3 loads per second > disabling fast parsers in xmlrpclib > using ExpatParser > 231 dumps per second > 76.8 loads per second > > I haven't had or taken the time to investigate the difference yet. Hmm, you cannot really compare these numbers though, since the two runs use two different sets of parsers. 
Have you checked using SgmlopParser with the 0.9.8 version of xmlrpclib ? There's also a project on SF called py-xmlrpc which uses a C implementation as basis and is said to be much faster than xmlrpclib (at least that's what they quote on their web-page). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From skip@pobox.com (Skip Montanaro) Mon Oct 1 19:05:35 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 1 Oct 2001 13:05:35 -0500 Subject: [Python-Dev] Integrating Expat In-Reply-To: <3BB8ACE7.B61A4881@lemburg.com> References: <200109301453.QAA21436@paros.informatik.hu-berlin.de> <3BB74B99.3B230398@lemburg.com> <200109301708.TAA22863@paros.informatik.hu-berlin.de> <3BB882C5.D1BDFE79@lemburg.com> <15288.41712.751198.83416@beluga.mojam.com> <3BB8ACE7.B61A4881@lemburg.com> Message-ID: <15288.45167.465025.917304@beluga.mojam.com> mal> Hmm, you cannot really compare these numbers though, since the two mal> runs use two different sets of parsers. Have you checked using mal> SgmlopParser with the 0.9.8 version of xmlrpclib ? They are the same parser. I forgot to mention that. What is called "FastParser" in 0.9.8 is called SgmlopParser in the CVS version. That has a different thing called "FastParser". I believe it is the thing you can get by contacting Pythonware. mal> There's also a project on SF called py-xmlrpc which uses a C mal> implementation as basis and is said to be much faster than mal> xmlrpclib (at least that's what they quote on their web-page). Yes, it is. Amazingly enough, the guy who wrote it (Shilad Sen at Sourcelight Technologies) works in the same building I do (and it's a pretty small building). We had lunch last week and talked a bit about it. It doesn't yet do Unicode. I sent Shilad my little test script. He modified it to use his parser. 
His results suggest that py-xmlrpc is about as fast as cPickle. Skip From Samuele Pedroni Mon Oct 1 10:20:10 2001 From: Samuele Pedroni (Samuele Pedroni) Date: Mon, 1 Oct 2001 11:20:10 +0200 (MET DST) Subject: [Python-Dev] Re: descr PLAN.txt Message-ID: <200110010920.LAA07635@core.inf.ethz.ch> > > I think and it seems it contains some useful information > > for when we try to port the descr changes to jython, > > or will all this stuff be completely documented somewhere else? > > Hopefully the PEPs will be much more complete records. PLAN.txt was > just my own to-do list, indicating how far I was along the realization > of the PEPs. I realize that the PEPs are currently behind, but it is > in the PLAN.txt to remedy this. :-) I see. > I'm glad you are planning to copy this effort in Jython. Yes I'm planning to do try that. I don't think that without keeping up with Python Jython can survive. Not that we are very fast at keeping up OTOH. (yes for very practical reasons I'm among those who would prefer a slower development for Python :)) > (It should > actually be simpler in Jython as it doesn't have so much of a > dichotomy between types and classes as C-Python does.) I take this as an encouragement but actually the good point is that Java is bit more productive than multi-platform C, that's all. You should consider that we have to integrate the changes with the whole Java interoperability architecture and that we should deal with the limitations of java reflection and probably invent some new kind of inheriting from java, different from what is already there, that means new code that should produce dynamically some java bytecode ... > If there's > anything I can do to help (short of coding in Java :-) please let me > know. In particular, if there's anything unclear in the PEPs or > PLAN.txt or in python.org/2.2/descrintro.html or anywhere else > (including the C code), I'd be happy to clarify it at your (or anybody > else's) request. 
Thanks, I imagine we will have some questions :)

> (Is Finn Bock still around in Jython?)

AFAIK yes. For 2.1 (yet not out) he has done:
- rich comparison
- new coercion model
- weak ref support
- importing from jars on sys.path
...

I have done:
- static nested scopes
- fixed some old nasty bugs
- improved the compatibility of importing wrt CPython

And in any case I hope so: given that for 2.2 we have to do:
- iterators
- generators
- descr
both in the runtime and jythonc
...
and our user base so far has been quite passive.

Sorry for the rants.

regards.

From skip@pobox.com (Skip Montanaro) Mon Oct 1 18:55:37 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Mon, 1 Oct 2001 12:55:37 -0500
Subject: [Python-Dev] xmlrpclib speed boost
Message-ID: <15288.44569.922160.604530@beluga.mojam.com>

Regarding slowdown in xmlrpclib dump performance, I wrote:

    Skip> I haven't had or taken the time to investigate the difference yet.

Well, it turned out to be pretty trivial. Somewhere along the way, the
import of the escape function from the cgi module got moved out of the
top level into the various functions that call it. Moving it back out
to top level fixed that. I just checked in this change and a fairly
trivial test case for xmlrpclib.

This change may be deemed not to be the correct fix as far backwards
compatibility is concerned (it uses the "from m import x as y" feature
which was new with 2.0 I think). If someone alters this fix, please
don't put the import back into the functions that call cgi.escape.
Skip From thomas@xs4all.net Mon Oct 1 19:45:49 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 1 Oct 2001 20:45:49 +0200 Subject: [Python-Dev] xmlrpclib speed boost In-Reply-To: <15288.44569.922160.604530@beluga.mojam.com> References: <15288.44569.922160.604530@beluga.mojam.com> Message-ID: <20011001204548.B846@xs4all.nl> On Mon, Oct 01, 2001 at 12:55:37PM -0500, Skip Montanaro wrote: > This change may be deemed not to be the correct fix as far backwards > compatibility is concerned (it uses the "from m import x as y" feature which > was new with 2.0 I think). If someone alters this fix, please don't put the > import back into the functions that call cgi.escape. Why not ? Moving the import to the top level just causes the slowdown to occur at a different moment. If this is really the problem, the slowdown should occur only the first time you use a particular function (unless you explicitly un-import cgi somehow ?) Importing cgi only in the functions that actually use it, in order to avoid the slowdown unless it's really necessary, sure seems like a sensible solution to me :-) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mal@lemburg.com Mon Oct 1 19:44:37 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 01 Oct 2001 20:44:37 +0200 Subject: [Python-Dev] Integrating Expat References: <200109301453.QAA21436@paros.informatik.hu-berlin.de> <3BB74B99.3B230398@lemburg.com> <200109301708.TAA22863@paros.informatik.hu-berlin.de> <3BB882C5.D1BDFE79@lemburg.com> <15288.41712.751198.83416@beluga.mojam.com> <3BB8ACE7.B61A4881@lemburg.com> <15288.45167.465025.917304@beluga.mojam.com> Message-ID: <3BB8B995.3EA05D66@lemburg.com> Skip Montanaro wrote: > > mal> Hmm, you cannot really compare these numbers though, since the two > mal> runs use two different sets of parsers. Have you checked using > mal> SgmlopParser with the 0.9.8 version of xmlrpclib ? > > They are the same parser. I forgot to mention that. 
What is called > "FastParser" in 0.9.8 is called SgmlopParser in the CVS version. That has a > different thing called "FastParser". I believe it is the thing you can get > by contacting Pythonware. I see. Rereading your numbers suggests that only dumps got slower. Now that you've fixed this in CVS the reason is obvious... from xyz import abc is slow. OTOH, Fredrik mentions that he put in this change in order to decrease startup time for the lib. I guess you can't win 'em all :-) > mal> There's also a project on SF called py-xmlrpc which uses a C > mal> implementation as basis and is said to be much faster than > mal> xmlrpclib (at least that's what they quote on their web-page). > > Yes, it is. Amazingly enough, the guy who wrote it (Shilad Sen at > Sourcelight Technologies) works in the same building I do (and it's a pretty > small building). We had lunch last week and talked a bit about it. It > doesn't yet do Unicode. I sent Shilad my little test script. He modified > it to use his parser. His results suggest that py-xmlrpc is about as fast > as cPickle. Cool ! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From tim.one@home.com Mon Oct 1 19:47:04 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 1 Oct 2001 14:47:04 -0400 Subject: [Python-Dev] xmlrpclib speed boost In-Reply-To: <15288.44569.922160.604530@beluga.mojam.com> Message-ID: [Skip Montanaro] > ... > Somewhere along the way, the import of the escape function from the cgi > module got moved out of the top level into the various functions that > call it. Moving it back out to top level fixed that. I just checked in > this change and a fairly trivial test case for xmlrpclib. 
> > This change may be deemed not to be the correct fix as far backwards > compatibility is concerned (it uses the "from m import x as y" > feature which was new with 2.0 I think). I pay no attention to "backwards compatibility" issues in the libraries we ship, and have Guido's blessing for that ornery attitude: the libs we ship should do the best possible job in the context of the release they're shipped in. It's explicitly not a PythonLabs goal that people be able to pick modules out of a release and use them in an earlier release (although it may be a goal of some developers for some modules -- but then the burden is on them to ensure it works for the module and release combinations they care about). From skip@pobox.com (Skip Montanaro) Mon Oct 1 19:56:58 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 1 Oct 2001 13:56:58 -0500 Subject: [Python-Dev] xmlrpclib speed boost In-Reply-To: <20011001204548.B846@xs4all.nl> References: <15288.44569.922160.604530@beluga.mojam.com> <20011001204548.B846@xs4all.nl> Message-ID: <15288.48250.206337.702825@beluga.mojam.com> >> This change may be deemed not to be the correct fix as far backwards >> compatibility is concerned (it uses the "from m import x as y" >> feature which was new with 2.0 I think). If someone alters this fix, >> please don't put the import back into the functions that call >> cgi.escape. Thomas> Why not ? Moving the import to the top level just causes the Thomas> slowdown to occur at a different moment. If this is really the Thomas> problem, the slowdown should occur only the first time you use a Thomas> particular function (unless you explicitly un-import cgi somehow Thomas> ?) Trust me, this really *is* the problem. Instead of getting imported once, it gets imported once for every string and once for every key in every dictionary. I timed it before and after. 
Compare Marshaller.dump_string before

    def dump_string(self, value):
        from cgi import escape
        self.write("%s\n" % escape(value))

and after

    def dump_string(self, value):
        self.write("%s\n" % _escape(value))

I don't care how little work it is to import a module a second time, it
is probably going to be on the same order of magnitude as that write
call.

In Marshaller.dump_struct the import was *inside* the for loop. I
originally pulled it up out of the for loop but left it inside the
method. That got me about a 25% boost in my simple test. I was ready to
check in that one change and thought, "aw hell, might as well see what
happens if I pull the import all the way out to the top level". Dump
performance went all the way back up to where 0.9.8 is.

    Thomas> Importing cgi only in the functions that actually use it, in
    Thomas> order to avoid the slowdown unless it's really necessary, sure
    Thomas> seems like a sensible solution to me :-)

Except dumping structs (dictionaries) and strings happens a lot. Even if
all we ever did was dump ints, floats and lists, the total cost of my
change would be one import.

Skip

From guido@python.org Mon Oct 1 19:54:34 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 01 Oct 2001 14:54:34 -0400
Subject: [Python-Dev] xmlrpclib speed boost
In-Reply-To: Your message of "Mon, 01 Oct 2001 20:45:49 +0200."
 <20011001204548.B846@xs4all.nl>
References: <15288.44569.922160.604530@beluga.mojam.com>
 <20011001204548.B846@xs4all.nl>
Message-ID: <200110011854.OAA07942@cj20424-a.reston1.va.home.com>

> > This change may be deemed not to be the correct fix as far
> > backwards compatibility is concerned (it uses the "from m import x
> > as y" feature which was new with 2.0 I think). If someone alters
> > this fix, please don't put the import back into the functions that
> > call cgi.escape.
>
> Why not ? Moving the import to the top level just causes the slowdown to
> occur at a different moment. If this is really the problem, the slowdown
> should occur only the first time you use a particular function (unless you
> explicitly un-import cgi somehow ?)
>
> Importing cgi only in the functions that actually use it, in order to avoid
> the slowdown unless it's really necessary, sure seems like a sensible
> solution to me :-)

There's a cost to an import inside a function even if the module is
already imported. If the function is very small and heavily used (as
it apparently was in this case) the function can slow down
considerably.

It's kind of sad that the escape() function, which is only a few
lines, comes from the cgi module which is thousands of lines that are
not normally used except by actual CGI scripts.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From fdrake@acm.org Mon Oct 1 20:20:37 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 1 Oct 2001 15:20:37 -0400
Subject: [Python-Dev] xmlrpclib speed boost
In-Reply-To: <200110011854.OAA07942@cj20424-a.reston1.va.home.com>
References: <15288.44569.922160.604530@beluga.mojam.com>
 <20011001204548.B846@xs4all.nl>
 <200110011854.OAA07942@cj20424-a.reston1.va.home.com>
Message-ID: <15288.49669.875132.906282@grendel.zope.com>

Guido van Rossum writes:
 > It's kind of sad that the escape() function, which is only a few
 > lines, comes from the cgi module which is thousands of lines that are
 > not normally used except by actual CGI scripts.

Yep. In this case, it probably makes more sense to call the
escape() function from the xml.sax.saxutils module, though. The cgi
version is for HTML and the xml.sax.saxutils version is for XML (big
surprise). With only one argument, though, there's no difference.

If we really want to deal with the startup time problem (regardless
of which module we get the escape function from), we could do this:

    def _escape(s):
        global _escape
        import xml.sax.saxutils
        _escape = xml.sax.saxutils.escape
        return _escape(s)

  -Fred

-- 
Fred L. Drake, Jr.
PythonLabs at Zope Corporation From fredrik@pythonware.com Mon Oct 1 21:01:31 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 1 Oct 2001 22:01:31 +0200 Subject: [Python-Dev] xmlrpclib speed boost References: <15288.44569.922160.604530@beluga.mojam.com> Message-ID: <02cc01c14ab3$e0124660$b3fa42d5@hagrid> skip wrote: > This change may be deemed not to be the correct fix as far backwards > compatibility is concerned (it uses the "from m import x as y" feature which > was new with 2.0 I think). If someone alters this fix, please don't put the > import back into the functions that call cgi.escape. you forgot to remove the comment at the top of the file that says it works with 1.5.2 and later. I've restored backwards compatibility, and tweaked the marshaller a little bit more. the new code is about 80% faster than 1.0b3 on my machine. ymmv, as usual. From skip@pobox.com (Skip Montanaro) Mon Oct 1 21:04:11 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 1 Oct 2001 15:04:11 -0500 Subject: [Python-Dev] xmlrpclib speed boost In-Reply-To: <02cc01c14ab3$e0124660$b3fa42d5@hagrid> References: <15288.44569.922160.604530@beluga.mojam.com> <02cc01c14ab3$e0124660$b3fa42d5@hagrid> Message-ID: <15288.52283.381456.704181@beluga.mojam.com> Fredrik> skip wrote: >> This change may be deemed not to be the correct fix as far backwards >> compatibility is concerned (it uses the "from m import x as y" >> feature which was new with 2.0 I think). If someone alters this fix, >> please don't put the import back into the functions that call >> cgi.escape. Fredrik> you forgot to remove the comment at the top of the file that Fredrik> says it works with 1.5.2 and later. Thanks, this is what I was referring to (of course, I didn't see the comment). Fredrik> I've restored backwards compatibility, and tweaked the Fredrik> marshaller a little bit more. the new code is about 80% faster Fredrik> than 1.0b3 on my machine. ymmv, as usual. Cool. 
Skip From gward@python.net Tue Oct 2 01:54:17 2001 From: gward@python.net (Greg Ward) Date: Mon, 1 Oct 2001 20:54:17 -0400 Subject: [Python-Dev] Demo/dns Message-ID: <20011001205417.A6563@cthulhu.mems-exchange.org> Demo/dns is pretty old, and has been superseded by Anthony Baxter's pydns project on SourceForge. At the very least, Demo/dns/README should point to http://sourceforge.net/projects/pydns instead of Anthony's old page for this project (dead URL). More radically, we could delete all the code in Demo/dns, and just leave behind a little README pointing at Anthony's project. I've *just* checked out his code, haven't dug in seriously yet, so I don't know if this is a good idea yet. Opinions? If I hear nothing, I'll just fix the URL in the README. Greg -- Greg Ward - Unix nerd gward@python.net http://starship.python.net/~gward/ A day for firm decisions!!!!! Or is it? From skip@pobox.com (Skip Montanaro) Tue Oct 2 02:45:10 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 1 Oct 2001 20:45:10 -0500 Subject: [Python-Dev] Performance of various marshallers Message-ID: <15289.7206.266140.820041@beluga.mojam.com> I just tested each of the marshallers readily available to me. I dumped and loaded this object:

    ['MusicEntry',
     {'email': 'foo@bar.baz.spam', 'time': '7:30pm', 'tickets': '',
      'program': '', 'state': 'MA', 'start': '2002-01-26', 'venueurl': '',
      'country': '', 'performers': ['An Evening with Karen Savoca'],
      'addressid': 7283, 'name': '', 'zip': '', 'city': 'Sudbury',
      'info': 'Reservations required. Please call (978)443-3253 or e-mail Laurie at lalcorn@ultranet.com.',
      'merchandise': [], 'event': '', 'keywords': ['.zyx.41'],
      'submit_time': '2001-08-28', 'key': 325629, 'active': 1,
      'end': '2002-01-26', 'address1': '', 'venue': 'Fox Run House Concerts',
      'price': '$17', 'address3': '', 'address2': '',
      'update_time': '2001-09-22:19:28:44'}]

I don't claim this is typical data, but it is typical of the type of data I push through XML-RPC, so it's important to me. You can see why moving imports out of dump_string was so worthwhile. I would be happy to change the object being marshalled to better reflect what people think is "typical". All numbers in the following table are in encodings or decodings per second. All times were measured using time.clock. The number of times the encoding/decoding operation was performed was varied to give a reasonable total test time (approximately 5 seconds). Each test was run 3 times. The largest number is recorded below, rounded to three significant digits.

                       encode  decode
                       ------  ------
    marshal             25900    7830
    cPickle              1230     149
    xmlrpclib 0.9.8
      w/ sgmlop           416     107
      w/o sgmlop          415    16.3
    xmlrpclib 1.0b4
      w/ sgmlop           365    92.0
      w/o sgmlop          363    74.9
    py-xmlrpc            2780    2260

Skip From fdrake@acm.org Tue Oct 2 03:42:38 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 1 Oct 2001 22:42:38 -0400 Subject: [Python-Dev] Performance of various marshallers In-Reply-To: <15289.7206.266140.820041@beluga.mojam.com> References: <15289.7206.266140.820041@beluga.mojam.com> Message-ID: <15289.10654.162126.780312@grendel.zope.com> Skip Montanaro writes:

> total test time (approximately 5 seconds). Each test was run 3 times. The
> largest number is recorded below, rounded to three significant digits.
>
>                     encode  decode
>                     ------  ------
> marshal              25900    7830
> cPickle               1230     149

Were the cPickle tests run in binary or original flavor?
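The "binary or original flavor" distinction asked about here can be illustrated with the present-day pickle module. This is only a sketch in modern Python 3, not the 2001 cPickle call that was benchmarked: protocol 0 is the original text format, and protocol 2 stands in for "binary". The sample record is a shortened, made-up stand-in for the object shown earlier in the thread.

```python
import pickle

# Hypothetical, shortened stand-in for the record dumped in the benchmark.
record = ['MusicEntry', {'city': 'Sudbury', 'state': 'MA', 'addressid': 7283,
                         'performers': ['An Evening with Karen Savoca']}]

# Protocol 0 is the original text-based format.
text_form = pickle.dumps(record, protocol=0)

# Protocol 2 is one of the binary formats; its output starts with a
# two-byte header giving the protocol version.
binary_form = pickle.dumps(record, protocol=2)

# Both flavors round-trip to an equal object; only the encoding differs.
restored_from_text = pickle.loads(text_form)
restored_from_binary = pickle.loads(binary_form)
```

Which flavor is faster depends on the data and the interpreter, so any comparison should be re-measured rather than assumed from this sketch.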
> xmlrpclib 0.9.8
>   w/ sgmlop           416     107
>   w/o sgmlop          415    16.3  <--+
> xmlrpclib 1.0b4                       |
>   w/ sgmlop           365    92.0     |
>   w/o sgmlop          363    74.9  <--+
> py-xmlrpc            2780    2260     |
                                        |
+---------------------------------------+
|
+----> I presume that Expat was available for the second run and not
       for the first? These should probably be broken into three
       categories: sgmlop, expat, and xmllib.

I also presume that py-xmlrpc never calls from C->Python during the parse phase, but I've not yet had a chance to look at this code. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From skip@pobox.com (Skip Montanaro) Tue Oct 2 03:52:58 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 1 Oct 2001 21:52:58 -0500 Subject: [Python-Dev] is htmllib broken in 2.2a4? Message-ID: <15289.11274.567758.60276@beluga.mojam.com> Responding to a question in python-help about extracting links from web pages, I wrote a simple href printer:

    import htmllib, formatter

    class MyParser(htmllib.HTMLParser):
        def anchor_bgn(self, href, name, type):
            print href

    fmt = formatter.NullFormatter()
    parser = MyParser(fmt, verbose=1)
    parser.feed(open("tour01.html").read())
    parser.close()

When run using 2.2a4, it never prints anything. It outputs a list of hrefs when run with 2.1 or 1.6. Either there's a bug somewhere (in my code possibly, though it's pretty simple) or some semantics changed that I missed. I thought maybe the method resolution order change affected things, but htmllib.HTMLParser only uses single inheritance. When displaying help about htmllib.HTMLParser, pydoc.help does emit the method resolution order, which it doesn't generally seem to do:

    class HTMLParser(sgmllib.SGMLParser)
     |  Method resolution order:
     |      HTMLParser
     |      sgmllib.SGMLParser
     |      markupbase.ParserBase
    ...

Skip From fdrake@acm.org Tue Oct 2 03:55:54 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 1 Oct 2001 22:55:54 -0400 Subject: [Python-Dev] is htmllib broken in 2.2a4?
In-Reply-To: <15289.11274.567758.60276@beluga.mojam.com> References: <15289.11274.567758.60276@beluga.mojam.com> Message-ID: <15289.11450.952344.374616@grendel.zope.com> Skip Montanaro writes: > When run using 2.2a4, it never prints anything. It outputs a list of hrefs > when run with 2.1 or 1.6. Either there's a bug somewhere (in my code > possibly, though it's pretty simple) or some semantics changed that I Sounds like a bug to me. Please file a bug report on SF including your code, and assign to me. Thanks! -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From tim.one@home.com Tue Oct 2 04:05:11 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 1 Oct 2001 23:05:11 -0400 Subject: [Python-Dev] is htmllib broken in 2.2a4? In-Reply-To: <15289.11274.567758.60276@beluga.mojam.com> Message-ID: [Skip Montanaro] > Responding to a question in python-help about extracting links from web > pages, I wrote a simple href printer: > > import htmllib, formatter > > class MyParser(htmllib.HTMLParser): > def anchor_bgn(self, href, name, type): > print href > > fmt = formatter.NullFormatter() > parser = MyParser(fmt, verbose=1) > parser.feed(open("tour01.html").read()) > parser.close() > > When run using 2.2a4, it never prints anything. It outputs a > list of hrefs when run with 2.1 or 1.6. Either there's a bug somewhere > (in my code possibly, though it's pretty simple) or some semantics > changed that I missed. Sorry, I don't know anything about that, and I don't know that code. Open a bug report! Sure doesn't sound right to me. > I thought maybe the method resolution order change affected things, The MRO hasn't changed for classic classes. Only for new classes (so if you don't derive from object, nothing about MRO changed). > but htmllib.HTMLParser only uses single inheritance. 
When displaying
> help about htmllib.HTMLParser, pydoc.help does emit the method
> resolution order, which it doesn't generally seem to do:

I recently changed pydoc to display MRO if and only if there are more than two classes an attribute *could* come from (if there are no more than two classes involved, there's no possibility of confusion; but if there are more than two, confusion is possible).

> class HTMLParser(sgmllib.SGMLParser)
>  |  Method resolution order:
>  |      HTMLParser
>  |      sgmllib.SGMLParser
>  |      markupbase.ParserBase
> ...

It listed MRO simply because more than 2 classes are possible attribute sources. From skip@pobox.com (Skip Montanaro) Tue Oct 2 04:06:04 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 1 Oct 2001 22:06:04 -0500 Subject: [Python-Dev] Performance of various marshallers In-Reply-To: <15289.10654.162126.780312@grendel.zope.com> References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> Message-ID: <15289.12060.493061.267430@beluga.mojam.com>

    > total test time (approximately 5 seconds). Each test was run 3 times. The
    > largest number is recorded below, rounded to three significant digits.
    >
    >                     encode  decode
    >                     ------  ------
    > marshal              25900    7830
    > cPickle               1230     149

    Were the cPickle tests run in binary or original flavor?

I wasn't aware of a "binary flavor". It's not mentioned in the online docs. I just called cPickle.dumps or cPickle.loads as appropriate. It looks like I should call them with a second binary flag.

    > xmlrpclib 0.9.8
    >   w/ sgmlop           416     107
    >   w/o sgmlop          415    16.3  <--+
    > xmlrpclib 1.0b4                       |
    >   w/ sgmlop           365    92.0     |
    >   w/o sgmlop          363    74.9  <--+
    > py-xmlrpc            2780    2260     |
                                            |
    +---------------------------------------+
    |
    +----> I presume that Expat was available for the second run and not
           for the first? These should probably be broken into three
           categories: sgmlop, expat, and xmllib.

In 0.9.8 there are two parsers, fast (with sgmlop) and slow (without).
I believe the ExpatParser was used in the second version. It doesn't really matter to me because they all perform so abysmally. I also presume that py-xmlrpc never calls from C->Python during the parse phase, but I've not yet had a chance to look at this code. I don't know. I've not looked at the code, only the output. I have cc'd Shilad Sen on this thread. He should be able to tell us how py-xmlrpc gets such good performance. Skip From skip@pobox.com (Skip Montanaro) Tue Oct 2 04:21:16 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 1 Oct 2001 22:21:16 -0500 Subject: [Python-Dev] Performance of various marshallers In-Reply-To: <15289.10654.162126.780312@grendel.zope.com> References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> Message-ID: <15289.12972.375469.537213@beluga.mojam.com> > encode decode > ------ ------ > cPickle 1230 149 Were the cPickle tests run in binary or original flavor? Setting the binary flag when dumping using cPickle boosts the encode rate to 4450 and the decode rate to 3940. Skip From andy@reportlab.com Tue Oct 2 06:53:40 2001 From: andy@reportlab.com (Andy Robinson) Date: Tue, 2 Oct 2001 06:53:40 +0100 Subject: [Python-Dev] RE: Python-Dev digest, Vol 1 #1637 - 11 msgs In-Reply-To: Message-ID: > If you want a very fast validating XML parser, RXP would also > be a good choice -- AFAIK, the RXP folks would allow us to > ship RXP under a different license than GPL which is then > bound to Python. > > Given the many alternatives, I am not sure whether going with > expat is the right path... may be wrong though. > Lucky I tuned in. Reportlab has had great success with RXP. We have a Python wrapper, pyRXP, with binaries available for several platforms. It is GPLed at present. They wish to keep GPL just in case someone big comes along and wants their code for ten million set-top boxes or something.
However, I persuaded them to grant a license to let it be used through the Python binding under Python-like terms, as long as we invent the words and save them having to waste time on lawyers. They would even be happy for it to go into the Python distribution. And we're happy to maintain the wrapper and binaries for several platforms, which we have to do for our customers anyway. If one of the core Python team, who I know have long and painful experience of this stuff, would like to drop me a line, we can probably sort this out in a night. The other thing we found very useful was our representation. We make reports, and ML is a common data source; so our goal is typically to slurp XML into memory as fast as possible, with validation. We eventually hit on a 'tuple tree': each tag is represented as (tagname, attrs, list-of-children, spare) We get there about 6x faster than the fastest alternative parser we know, because all the work is done in C; with typical use of other parsers you call back into Python on every tag. The tree structure is a fraction of the size in memory of what gets created by models using objects for every node. It would be very easy to add this as an alternative interface to expat as well. So then Python users could have a choice of tree or events, and validating or non-validating, all done in C and in the standard distribution. Andy Robinson CEO/Chief Architect, Reportlab Inc. 
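The "tuple tree" shape described above, (tagname, attrs, list-of-children, spare), can be sketched in a few lines. This is a pure-Python illustration built on the standard library's xml.parsers.expat, not pyRXP's C implementation or its API. The function name parse_to_tuple_tree, the sample document, the choice to store text as plain strings among the children, and leaving the spare slot as None are all assumptions made for the example.

```python
import xml.parsers.expat


def parse_to_tuple_tree(xml_text):
    """Return the root element as (tagname, attrs, children, spare)."""
    # Each open element is held as a mutable [tag, attrs, children] list
    # until it closes. The first entry is a sentinel that collects the root.
    stack = [[None, None, []]]

    def start(name, attrs):
        stack.append([name, attrs, []])

    def end(name):
        tag, attrs, children = stack.pop()
        # The 'spare' slot is unused in this sketch, so it is always None.
        stack[-1][2].append((tag, attrs, children, None))

    def chars(data):
        children = stack[-1][2]
        # expat may deliver one run of text in several pieces; merge them.
        if children and isinstance(children[-1], str):
            children[-1] += data
        else:
            children.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)
    return stack[0][2][0]


tree = parse_to_tuple_tree(
    '<event city="Sudbury"><price>$17</price><notes/></event>')
```

Because every node is a plain tuple, walking the result needs no node classes: tree[0] is the tag name, tree[1] the attribute dictionary, and tree[2] the list of child tuples and text strings. Unlike the approach described in the message, this sketch still calls back into Python for every tag, so it shows the data shape only, not the speed.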
From effbot@telia.com Tue Oct 2 08:43:04 2001 From: effbot@telia.com (Fredrik Lundh) Date: Tue, 2 Oct 2001 09:43:04 +0200 Subject: [Python-Dev] Performance of various marshallers References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> Message-ID: <03aa01c14b15$e329ac00$b3fa42d5@hagrid> fred wrote: > > xmlrpclib 0.9.8 > > w/ sgmlop 416 107 > > w/o sgmlop 415 16.3 <--+ > > xmlrpclib 1.0b4 | > > w/ sgmlop 365 92.0 | > > w/o sgmlop 363 74.9 <--+ > > py-xmlrpc 2780 2260 | > | > +---------------------------------------------------+ > | > +----> I presume that Expat was available for the second run and not > for the first? These should probably be broken into three > categories: sgmlop, expat, and xmllib. footnote: 0.9.8 didn't support pyexpat. > I also presume that py-xmlrpc never calls from C->Python > during the parse phase, but I've not yet had a chance to look > at this code. does py-xmlrpc use a real XML parser? From skip@pobox.com (Skip Montanaro) Tue Oct 2 13:29:45 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 2 Oct 2001 07:29:45 -0500 Subject: [Python-Dev] Performance of various marshallers In-Reply-To: <03aa01c14b15$e329ac00$b3fa42d5@hagrid> References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> <03aa01c14b15$e329ac00$b3fa42d5@hagrid> Message-ID: <15289.45881.920165.427067@beluga.mojam.com> >> I also presume that py-xmlrpc never calls from C->Python during the >> parse phase, but I've not yet had a chance to look at this code. Fredrik> does py-xmlrpc use a real XML parser? I suspect not. It's special purpose is to parse or generate XML-RPC, so you know ahead of time that the end result is the only thing you need. 
Skip From akuchlin@mems-exchange.org Tue Oct 2 18:57:24 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Tue, 02 Oct 2001 13:57:24 -0400 Subject: [Python-Dev] Bookstore proceeds Message-ID: I've received another cheque from Amazon with the last quarter's proceeds for the Python bookstore. The amount is just a bit less than $500; I'll contribute the few dollars to bring it up to $500 flat. Question: what do with the money? Right now the best candidate is to pay the Python10 conference fees for some of Jeffrey Elkner's students. I can't think of anything else to do with the money; anyone have any brilliant suggestions? --amk From guido@python.org Tue Oct 2 19:28:33 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 02 Oct 2001 14:28:33 -0400 Subject: [Python-Dev] Bookstore proceeds In-Reply-To: Your message of "Tue, 02 Oct 2001 13:57:24 EDT." References: Message-ID: <200110021828.f92ISXk25410@odiug.digicool.com> > I've received another cheque from Amazon with the last quarter's > proceeds for the Python bookstore. The amount is just a bit less than > $500; I'll contribute the few dollars to bring it up to $500 flat. > > Question: what do with the money? Right now the best candidate is to > pay the Python10 conference fees for some of Jeffrey Elkner's > students. I can't think of anything else to do with the money; anyone > have any brilliant suggestions? Subsidize Finn Bock and Samuele Pedroni coming to Python10? --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@ActiveState.com Tue Oct 2 19:34:45 2001 From: paul@ActiveState.com (Paul Prescod) Date: Tue, 02 Oct 2001 11:34:45 -0700 Subject: [Python-Dev] Performance of various marshallers References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> <03aa01c14b15$e329ac00$b3fa42d5@hagrid> <15289.45881.920165.427067@beluga.mojam.com> Message-ID: <3BBA08C5.240EB4BC@ActiveState.com> Skip Montanaro wrote: > >... > > I suspect not. 
Its special purpose is to parse or generate XML-RPC, so you > know ahead of time that the end result is the only thing you need. One reason to use a full XML parser is you get Unicode cheaply. I don't see Unicode as a feature that you add in a weekend at the end... Paul Prescod From akuchlin@mems-exchange.org Tue Oct 2 19:37:34 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Tue, 2 Oct 2001 14:37:34 -0400 Subject: [Python-Dev] Bookstore proceeds In-Reply-To: <200110021828.f92ISXk25410@odiug.digicool.com>; from guido@python.org on Tue, Oct 02, 2001 at 02:28:33PM -0400 References: <200110021828.f92ISXk25410@odiug.digicool.com> Message-ID: <20011002143734.J21687@ute.mems-exchange.org> On Tue, Oct 02, 2001 at 02:28:33PM -0400, Guido van Rossum wrote: >Subsidize Finn Bock and Samuele Pedroni coming to Python10? Will $500 be of significant assistance? Note on cost: student fees seem to be around $250, so that's two students. Developer's Day registration is $295, so it could almost pay for their admission for that day; it's not enough to help with their hotel or flight. --amk From guido@python.org Tue Oct 2 19:41:09 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 02 Oct 2001 14:41:09 -0400 Subject: [Python-Dev] Bookstore proceeds In-Reply-To: Your message of "Tue, 02 Oct 2001 14:37:34 EDT." <20011002143734.J21687@ute.mems-exchange.org> References: <200110021828.f92ISXk25410@odiug.digicool.com> <20011002143734.J21687@ute.mems-exchange.org> Message-ID: <200110021841.f92If9T26208@odiug.digicool.com> > Will $500 be of significant assistance? Ask them. > Note on cost: student fees seem to be around $250, so that's two > students. Developer's Day registration is $295, so it could almost > pay for their admission for that day; it's not enough to help with > their hotel or flight. Flights are cheap. Someone locally might be able to put them up.
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com (Skip Montanaro) Tue Oct 2 19:57:25 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 2 Oct 2001 13:57:25 -0500 Subject: [Python-Dev] Performance of various marshallers In-Reply-To: <3BBA08C5.240EB4BC@ActiveState.com> References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> <03aa01c14b15$e329ac00$b3fa42d5@hagrid> <15289.45881.920165.427067@beluga.mojam.com> <3BBA08C5.240EB4BC@ActiveState.com> Message-ID: <15290.3605.641031.46951@beluga.mojam.com> >> I suspect not. It's special purpose is to parse or generate XML-RPC, >> so you know ahead of time that the end result is the only thing you >> need. Paul> One reason to use a full XML parser is you get Unicode cheaply. I Paul> don't see Unicode as a feature that you add in a weekend at the Paul> end... XML-RPC's relationship to Unicode is ill-defined. The spec that Dave Winer wrote requires all data to be US-ASCII, so XML-RPC isn't really XML-compliant. (You'll have to take up issues of standards compliance with Dave.) Still, Unicode or not, the notion that XML-RPC is a data serialization mechanism instead of a compound data markup language means you don't need to provide hooks for processing each element, so full-blown XML parsers tend to be overkill as py-xmlrpc demonstrates. No matter how hard Shilad finds it to add Unicode support to his package, it's still likely to be miles ahead of other XML parsers. Skip From bckfnn@worldonline.dk Tue Oct 2 20:12:53 2001 From: bckfnn@worldonline.dk (Finn Bock) Date: Tue, 02 Oct 2001 19:12:53 GMT Subject: [Python-Dev] Bookstore proceeds In-Reply-To: <200110021828.f92ISXk25410@odiug.digicool.com> References: <200110021828.f92ISXk25410@odiug.digicool.com> Message-ID: <3bba10cd.14259694@mail.wanadoo.dk> [Andrew Kuchling] > Question: what do with the money? 
Right now the best candidate is to > pay the Python10 conference fees for some of Jeffrey Elkner's > students. I can't think of anything else to do with the money; anyone > have any brilliant suggestions? [Guido] >Subsidize Finn Bock and Samuele Pedroni coming to Python10? "Thanks, but No Thanks" from me. While I would dearly like to meet you all, the travel is too much of a pain for my taste. regards, finn From martin@loewis.home.cs.tu-berlin.de Tue Oct 2 21:09:51 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 2 Oct 2001 22:09:51 +0200 Subject: [Python-Dev] Demo/dns Message-ID: <200110022009.f92K9p301782@mira.informatik.hu-berlin.de> > More radically, we could delete all the code in Demo/dns, and just > leave behind a little README pointing at Anthony's project. +1. Martin From guido@python.org Tue Oct 2 21:18:59 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 02 Oct 2001 16:18:59 -0400 Subject: [Python-Dev] Demo/dns In-Reply-To: Your message of "Tue, 02 Oct 2001 22:09:51 +0200." <200110022009.f92K9p301782@mira.informatik.hu-berlin.de> References: <200110022009.f92K9p301782@mira.informatik.hu-berlin.de> Message-ID: <200110022018.f92KIxV29996@odiug.digicool.com> > > More radically, we could delete all the code in Demo/dns, and just > > leave behind a little README pointing at Anthony's project. [MvL] > +1. If nobody disagrees, I'll do this. It's unclear where to leave a pointer -- a Demo/dns directory without code feels a little strange. Maybe we don't need to leave a pointer? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From paul@ActiveState.com Tue Oct 2 21:44:11 2001 From: paul@ActiveState.com (Paul Prescod) Date: Tue, 02 Oct 2001 13:44:11 -0700 Subject: [Python-Dev] Performance of various marshallers References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> <03aa01c14b15$e329ac00$b3fa42d5@hagrid> <15289.45881.920165.427067@beluga.mojam.com> <3BBA08C5.240EB4BC@ActiveState.com> <15290.3605.641031.46951@beluga.mojam.com> Message-ID: <3BBA271B.63B07CEF@ActiveState.com> Skip Montanaro wrote: > >... > > XML-RPC's relationship to Unicode is ill-defined. The spec that Dave Winer > wrote requires all data to be US-ASCII, so XML-RPC isn't really > XML-compliant. (You'll have to take up issues of standards compliance with > Dave.) Most XML-RPC implementations support Unicode, Dave Winer notwithstanding. Plus, the XML-RPC spec says nothing to indicate that XML-RPC documents may not be encoded in either of XML's two built-in encodings (even if the data is restricted to ASCII values). > Still, Unicode or not, the notion that XML-RPC is a data serialization > mechanism instead of a compound data markup language means you don't need to > provide hooks for processing each element, so full-blown XML parsers tend to > be overkill as py-xmlrpc demonstrates. I don't see how that follows. py-xmlrpc needs to handle different than so it needs to have a "hook" for each of those element types. Having a fixed list of hooks or an extensible array of them should not be much different from a performance point of view. Yes, an incomplete XML parser could be faster if it ignores Unicode, ignores character references, and does not do all of the error checking required by the spec. I'm not sure if this would really improve performance anyhow. py-xmlrpc is probably faster because it doesn't call out to Python code until the entire message has been parsed. 
xmlrpclib on the other hand, is entirely written in Python. Is there a Python XML-RPC implementation that uses no Python code but does use a true XML parser? > ... No matter how hard Shilad finds it > to add Unicode support to his package, it's still likely to be miles ahead > of other XML parsers. I think you are exaggerating the benefit of having a fixed vocabulary. There is hardly any performance boost possible based on that one detail. Paul Prescod From martin@loewis.home.cs.tu-berlin.de Tue Oct 2 22:30:13 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 2 Oct 2001 23:30:13 +0200 Subject: [Python-Dev] Integrating Expat Message-ID: <200110022130.f92LUDt02414@mira.informatik.hu-berlin.de> > How would libxml fit into this picture ? Uncertain. I could find out the following facts: Python Wrappers: Dave Kuhlmann has written some, see http://www.rexx.com/~dkuhlman/ Licensing: Available through either LGPL or W3C IPR (Daniel Veillard from Redhat is the maintainer, he used to work for the W3C). Portability: Apparently tested on Unix and Win32. Uses zlib and iconv when available. Size: 85kLOC C source code (compare to 11kLOC in Expat) Performance: I could not find any results on this subject, comparing libxml with other parsers. Regards, Martin From nas@python.ca Tue Oct 2 22:51:14 2001 From: nas@python.ca (Neil Schemenauer) Date: Tue, 2 Oct 2001 14:51:14 -0700 Subject: [Python-Dev] Demo/dns In-Reply-To: <200110022018.f92KIxV29996@odiug.digicool.com>; from guido@python.org on Tue, Oct 02, 2001 at 04:18:59PM -0400 References: <200110022009.f92K9p301782@mira.informatik.hu-berlin.de> <200110022018.f92KIxV29996@odiug.digicool.com> Message-ID: <20011002145114.A20139@glacier.arctrix.com> Guido van Rossum wrote: > It's unclear where to leave a pointer -- a Demo/dns directory without > code feels a little strange. Misc/NEWS? 
Neil From skip@pobox.com (Skip Montanaro) Tue Oct 2 22:53:15 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 2 Oct 2001 16:53:15 -0500 Subject: [Python-Dev] Performance of various marshallers In-Reply-To: <3BBA271B.63B07CEF@ActiveState.com> References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> <03aa01c14b15$e329ac00$b3fa42d5@hagrid> <15289.45881.920165.427067@beluga.mojam.com> <3BBA08C5.240EB4BC@ActiveState.com> <15290.3605.641031.46951@beluga.mojam.com> <3BBA271B.63B07CEF@ActiveState.com> Message-ID: <15290.14155.432881.855383@beluga.mojam.com> >> Still, Unicode or not, the notion that XML-RPC is a data >> serialization mechanism instead of a compound data markup language >> means you don't need to provide hooks for processing each element, so >> full-blown XML parsers tend to be overkill as py-xmlrpc demonstrates. Paul> I don't see how that follows. py-xmlrpc needs to handle Paul> different than so it needs to have a "hook" for each of Paul> those element types. Having a fixed list of hooks or an extensible Paul> array of them should not be much different from a performance Paul> point of view. Sure, and mean different things, but will always mean the same thing in an XML-RPC context. There's no need to provide any hooks. Once you've successfully parsed a <struct> you get a Python dictionary. As far as I can tell sgmlop is always going to be slower than py-xmlrpc because it must call back to an Unmarshaller instance for each tag. The only option currently available is the Unmarshaller class written in Python. Pythonware has a FastParser/FastUnmarshaller pair available now which I don't have access to. Perhaps it exhibits encode/decode speeds similar to py-xmlrpc. You'll have to ask Fredrik. Py-xmlrpc was written with the knowledge that intermediate results aren't useful and that, as you put it, it has a fixed vocabulary. Why structure a parser to accommodate situations that aren't needed?
Paul> Yes, an incomplete XML parser could be faster if it ignores Paul> Unicode, ignores character references, and does not do all of the Paul> error checking required by the spec. I'm not sure if this would Paul> really improve performance anyhow. Does py-xmlrpc have a ways to go? Sure. It's still pretty new software, so give it time. You seem to be dismissing it completely because it's not as mature as, say, Expat. I doubt it will lose a factor of 8 in encoding speed or a factor of 24 in decoding speed (the current speed advantages I measure over xmlrpclib 1.0b4 using sgmlop) when those things are all added. I'm not sure all those things will ever be needed, but you're welcome to think they will. Paul> py-xmlrpc is probably faster because it doesn't call out to Python Paul> code until the entire message has been parsed. xmlrpclib on the Paul> other hand, is entirely written in Python. Is there a Python Paul> XML-RPC implementation that uses no Python code but does use a Paul> true XML parser? That's precisely why py-xmlrpc is faster. Should it behave some other way? I don't think there is another XML-RPC parser out there that is available from Python but that doesn't use Python. >> ... No matter how hard Shilad finds it to add Unicode support to his >> package, it's still likely to be miles ahead of other XML parsers. Paul> I think you are exaggerating the benefit of having a fixed Paul> vocabulary. There is hardly any performance boost possible based Paul> on that one detail. I don't see how you can't make that connection. XML-RPC has a fixed vocabulary and never needs to look at intermediate results. It sounds to me like all you have is a hammer so everything looks like a nail. There are places for general-purpose XML parsers and places for special-purpose XML parsers. In this particular context I only care about how fast I can push objects between a client and server using XML-RPC. I apologize if the subject seems more general than I intended.
My only intention was to compare the data serialization performance of various tools. I didn't include "XML-RPC" in the subject of this thread because I tossed in marshal and cPickle results as well, simply for comparison. Skip From pedroni@inf.ethz.ch Tue Oct 2 22:57:35 2001 From: pedroni@inf.ethz.ch (Samuele Pedroni) Date: Tue, 2 Oct 2001 23:57:35 +0200 Subject: [Python-Dev] Bookstore proceeds References: <200110021828.f92ISXk25410@odiug.digicool.com> Message-ID: <00d601c14b8d$40049a80$8a73fea9@newmexico> [Guido] > Subsidize Finn Bock and Samuele Pedroni coming to Python10? Thanks for the proposal. Surely I would like to meet you all but - financial issues aside - while nice for me, I don't think my presence would be all that constructive, and Finn - who's the main maintainer and has done a whole lot - deserves more than me to eventually meet you and the users and receive their thanks. I'm not your guy and Finn doesn't want to fly :( Samuele. From fredrik@pythonware.com Tue Oct 2 23:18:02 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 3 Oct 2001 00:18:02 +0200 Subject: [Python-Dev] Performance of various marshallers References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> <03aa01c14b15$e329ac00$b3fa42d5@hagrid> <15289.45881.920165.427067@beluga.mojam.com> <3BBA08C5.240EB4BC@ActiveState.com> <15290.3605.641031.46951@beluga.mojam.com> <3BBA271B.63B07CEF@ActiveState.com> Message-ID: <050d01c14b90$213834b0$b3fa42d5@hagrid> paul wrote: > > XML-RPC's relationship to Unicode is ill-defined. The spec that Dave Winer > > wrote requires all data to be US-ASCII, so XML-RPC isn't really > > XML-compliant. (You'll have to take up issues of standards compliance with > > Dave.) > > Most XML-RPC implementations support Unicode, Dave Winer > notwithstanding.
Plus, the XML-RPC spec says nothing to indicate that > XML-RPC documents may not be encoded in either of XML's two built-in > encodings (even if the data is restricted to ASCII values). the specification says that XML-RPC uses XML and HTTP. it doesn't say anything about a Dave-specific subset of XML or HTTP... (like so many other parts of the specification, the "string" type isn't exactly well-specified. the specification first says that strings contains ASCII characters, and later that "any characters are allowed in a string" and that "a string can be used to encode binary data") > Yes, an incomplete XML parser could be faster if it ignores Unicode, > ignores character references, and does not do all of the error checking > required by the spec. I'm not sure if this would really improve > performance anyhow. well, sgmlop is a bit faster than expat (up to 50%, in some tests). expat does a bit more error checking. > xmlrpclib on the other hand, is entirely written in Python. Is there a > Python XML-RPC implementation that uses no Python code but does > use a true XML parser? the _xmlrpclib accelerator (see the xmlrpclib.py source) uses expat, with a really fast C layer. judging from Skip's benchmarks, expat is a bit slower than the py-xmlrpc parser (which is why I asked). From paul@pfdubois.com Tue Oct 2 23:16:16 2001 From: paul@pfdubois.com (Paul F. Dubois) Date: Tue, 2 Oct 2001 15:16:16 -0700 Subject: [Python-Dev] Pmw broken under 2.2a4? Message-ID: <000101c14b8f$db4938a0$3d01a8c0@plstn1.sfba.home.com> Building Pmw 0.8.5 under 2.2a4 I got syntax errors of the 'inconsistent use of tabs and spaces' kind. I'm not sure who I should tell but I speculate that I just did. 
From paul@ActiveState.com Tue Oct 2 23:15:49 2001 From: paul@ActiveState.com (Paul Prescod) Date: Tue, 02 Oct 2001 15:15:49 -0700 Subject: [Python-Dev] Performance of various marshallers References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> <03aa01c14b15$e329ac00$b3fa42d5@hagrid> <15289.45881.920165.427067@beluga.mojam.com> <3BBA08C5.240EB4BC@ActiveState.com> <15290.3605.641031.46951@beluga.mojam.com> <3BBA271B.63B07CEF@ActiveState.com> <15290.14155.432881.855383@beluga.mojam.com> Message-ID: <3BBA3C95.1CC094CC@ActiveState.com> Skip Montanaro wrote: > >... > > Sure, and mean different things, but will always > mean the same thing in an XML-RPC context. There's no need to provide any > hooks. There are two different issues. One is parsing: taking a string of bytes and interpreting them as XML. The other is passing this information to the Python programmer. The handling of "hooks" is on the backend, passing the information to the Python programmer. I interpreted Fredrick's question as being about the front end: does it use a real XML parser or not. >... > Does py-xmlrpc have a ways to go? Sure. It's still pretty new software, so > give it time. You seem to be dismissing it completely because it's not as > mature as, say, Expat. I'm not asking it to be as mature as Expat. I'm asking why it didn't *use* Expat or some other parser. Expat would recognize structs and arrays and pass them to C code which builds Python objects. Then those Python objects can be passed to Python. >... > Paul> py-xmlrpc is probably faster because it doesn't call out to Python > Paul> code until the entire message has been parsed. xmlrpclib on the > Paul> other hand, is entirely written in Python. Is there a Python > Paul> XML-RPC implementation that uses no Python code but does use a > Paul> true XML parser? > > That's precisely why py-xmlrpc is faster. Should it behave some other way? 
> I don't think there is another XML-RPC parser out there that is available > from Python but that doesn't use Python. Okay, so we agree that the fast part is probably not so much the parser but the handing of data to Python. So why rewrite a parser? Nothing requires an Expat-using XML-RPC implementation to call back into Python for every element. It can collect the results in C and then call Python when it has values. >... > I don't understand see how you can't make that connection. XML-RPC has a > fixed vocabulary and never needs to look at intermediate results. Let me suggest an analogy. Someone writes "CGIPython". It uses a specially optimized parser designed for parsing only Python CGI scripts. Do you think it would run much faster than the regular Python parser? Well, syntactically CGI scripts are basically the same as ordinary Python programs so why would you *want* a specialized parser? Parsing angle brackets is the same whether they are in an XML-RPC message or a Docbook document, just as parsing Python is the same, whether it is a CGI or a GUI app. > ... It sounds > to me like all you have is a hammer so everything looks like a nail. There > are places for general-purpose XML parsers and places for special-purpose > XML parsers. In this particular context I only care about how fast I can > push objects between a client and server using XML-RPC. I don't personally see much benefit using XML if you don't adhere to the XML spec. Just perusing the code quickly I believe I've found a few bugs that it would not have had if it built on Expat or some other XML parser. 1. It doesn't handle ? syntax. 2. It doesn't handle (extra whitespace) 3. I strongly suspect it won't handle comments in the XML. 4. It won't handle the mandatory UTF-16 encoding from XML 5. It won't handle CDATA sections. 
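The list above can be tried against a general-purpose parser. What follows is a minimal sketch, not anything taken from py-xmlrpc or xmlrpclib: it assumes the modern Python 3 spelling of the Expat binding (xml.parsers.expat), and the names TEMPLATE and parse are made up for the illustration. The sample document combines an XML declaration, a comment, extra whitespace inside tags, and a CDATA section, and is fed to Expat once as UTF-8 and once as UTF-16.

```python
from xml.parsers.expat import ParserCreate

# Hypothetical sample document: XML declaration, a comment, whitespace
# before the closing '>' of each tag, and a CDATA section.
TEMPLATE = (
    '<?xml version="1.0" encoding="{enc}"?>\n'
    "<!-- a comment before the root element -->\n"
    "<value ><string ><![CDATA[a < b]]></string ></value >\n"
)


def parse(document):
    """Return (element names in document order, concatenated character data)."""
    tags, chunks = [], []
    parser = ParserCreate()
    parser.StartElementHandler = lambda name, attrs: tags.append(name)
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)  # True marks this as the final chunk
    return tags, "".join(chunks)


if __name__ == "__main__":
    for enc in ("utf-8", "utf-16"):
        # Encode the same text in each encoding; Python's "utf-16" codec
        # writes a byte-order mark, which Expat uses to detect the encoding.
        print(enc, parse(TEMPLATE.format(enc=enc).encode(enc)))
```

Both encodings should come back with the same element names and the same text, which is the behaviour a hand-written tokenizer would have to reproduce case by case.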
Paul Prescod From paul@ActiveState.com Tue Oct 2 23:19:05 2001 From: paul@ActiveState.com (Paul Prescod) Date: Tue, 02 Oct 2001 15:19:05 -0700 Subject: [Python-Dev] Performance of various marshallers References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> <03aa01c14b15$e329ac00$b3fa42d5@hagrid> <15289.45881.920165.427067@beluga.mojam.com> <3BBA08C5.240EB4BC@ActiveState.com> <15290.3605.641031.46951@beluga.mojam.com> <3BBA271B.63B07CEF@ActiveState.com> <050d01c14b90$213834b0$b3fa42d5@hagrid> Message-ID: <3BBA3D59.441625E9@ActiveState.com> Fredrik Lundh wrote: > >... > > the _xmlrpclib accelerator (see the xmlrpclib.py source) uses expat, > with a really fast C layer. judging from Skip's benchmarks, expat is a > bit slower than the py-xmlrpc parser (which is why I asked). I have a feeling py-xmlrpc will slow down a bit when it is internationalized: if (strncmp(*cp, "", 5) == 0) res = decodeInt(cp, ep, lines); else if (strncmp(*cp, "", 4) == 0) res = decodeI4(cp, ep, lines); .... Paul Prescod From guido@python.org Wed Oct 3 00:13:00 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 02 Oct 2001 19:13:00 -0400 Subject: [Python-Dev] Demo/dns In-Reply-To: Your message of "Tue, 02 Oct 2001 14:51:14 PDT." <20011002145114.A20139@glacier.arctrix.com> References: <200110022009.f92K9p301782@mira.informatik.hu-berlin.de> <200110022018.f92KIxV29996@odiug.digicool.com> <20011002145114.A20139@glacier.arctrix.com> Message-ID: <200110022313.TAA18864@cj20424-a.reston1.va.home.com> > Guido van Rossum wrote: > > It's unclear where to leave a pointer -- a Demo/dns directory without > > code feels a little strange. > > Misc/NEWS? > > Neil Good idea. Done.
:-) --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com (Skip Montanaro) Wed Oct 3 01:33:59 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 2 Oct 2001 19:33:59 -0500 Subject: [Python-Dev] Performance of various marshallers In-Reply-To: <3BBA3C95.1CC094CC@ActiveState.com> References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> <03aa01c14b15$e329ac00$b3fa42d5@hagrid> <15289.45881.920165.427067@beluga.mojam.com> <3BBA08C5.240EB4BC@ActiveState.com> <15290.3605.641031.46951@beluga.mojam.com> <3BBA271B.63B07CEF@ActiveState.com> <15290.14155.432881.855383@beluga.mojam.com> <3BBA3C95.1CC094CC@ActiveState.com> Message-ID: <15290.23799.884095.945168@beluga.mojam.com> >> That's precisely why py-xmlrpc is faster. Should it behave some >> other way? I don't think there is another XML-RPC parser out there >> that is available from Python but that doesn't use Python. Paul> Okay, so we agree that the fast part is probably not so much the Paul> parser but the handing of data to Python. So why rewrite a parser? Paul> Nothing requires an Expat-using XML-RPC implementation to call Paul> back into Python for every element. It can collect the results in Paul> C and then call Python when it has values. You're asking the wrong person. Shilad will be the only person who can describe his motivations. We happen to work in the same building, but we don't work for the same company. That's a coincidence about on par with the chances of winning the Powerball lottery. We never met each other formally until about a week ago. Not trying to put words in his mouth, but my guess would be that he was not approaching it as an XML problem, but as a parsing problem. >> I don't understand see how you can't make that connection. XML-RPC >> has a fixed vocabulary and never needs to look at intermediate >> results. Paul> Let me suggest an analogy. Someone writes "CGIPython". 
It uses a Paul> specially optimized parser designed for parsing only Python CGI Paul> scripts. Do you think it would run much faster than the regular Paul> Python parser? Bad analogy. CGI scripts can contain the entire realm of "stuff" that goes into any other Python program. XML-RPC encodings can't contain arbitrary XML tags or attributes. A better analogy would have been (Martin's I think) hypothetical Swallow - a subset of Python that could be efficiently compiled. Paul> I don't personally see much benefit using XML if you don't adhere Paul> to the XML spec. Just perusing the code quickly I believe I've Paul> found a few bugs that it would not have had if it built on Expat Paul> or some other XML parser. Paul, you have to stop looking at XML-RPC with your Elton John-style XML-colored glasses. XML-RPC is not meant to be some sort of highly structured hierarchical data representation that you can sniff around in with arbitrary XML tools of one sort or another. That its on-the-wire representation happens to be XML is almost ridiculously unimportant. Dave Winer created an RPC tool that used XML at about the same time every computer journalist was wetting their pants every time they heard the letters X-M-L. Many implementations were able to leverage existing XML parsing tools to get going quickly, and Dave got some well-deserved publicity that he and XML-RPC wouldn't have gotten if he'd chosen some other serliazation format like Pickle, or invented something new. Next step: make it go faster. Can that be done with standard XML tools? Yeah, I'm sure it can be. Not everybody approaches the problem with the same background you have though. Paul> 1. It doesn't handle ? syntax. Paul> 2. It doesn't handle (extra whitespace) Paul> 3. I strongly suspect it won't handle comments in the XML. Paul> 4. It won't handle the mandatory UTF-16 encoding from XML Paul> 5. It won't handle CDATA sections. Fine. I'm sure Shilad appreciates the input. 
I think your approach to bug detection and reporting could have been a bit less heavy handed. As for handling things like CDATA, UTF-16 and extra whitespace after tag names, I suspect some other XML-RPC packages would exhibit similar problems if they were exposed to a standards-toting XML gunslinger like yourself. That it's not a problem in practice is probably because the set of XML-RPC encoding and decoding software is fairly small and that the stuff that encodes into XML-RPC is fairly well-behaved. XML-RPC's widespread availability and practical interoperability (the XML-RPC website lists 48 implementations) probably owes more to the cooperative nature of the people involved than the purity of the parsers. Skip From skip@pobox.com (Skip Montanaro) Wed Oct 3 01:41:04 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 2 Oct 2001 19:41:04 -0500 Subject: [Python-Dev] Performance of various marshallers In-Reply-To: <3BBA3D59.441625E9@ActiveState.com> References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> <03aa01c14b15$e329ac00$b3fa42d5@hagrid> <15289.45881.920165.427067@beluga.mojam.com> <3BBA08C5.240EB4BC@ActiveState.com> <15290.3605.641031.46951@beluga.mojam.com> <3BBA271B.63B07CEF@ActiveState.com> <050d01c14b90$213834b0$b3fa42d5@hagrid> <3BBA3D59.441625E9@ActiveState.com> Message-ID: <15290.24224.866683.240585@beluga.mojam.com> Paul> I have a feeling py-xmlrpc will slow down a bit when it is Paul> internationalized: Paul> if (strncmp(*cp, "", 5) == 0) Paul> res = decodeInt(cp, ep, lines); Paul> else if (strncmp(*cp, "", 4) == 0) Paul> res = decodeI4(cp, ep, lines); Paul> .... Paul, If you want to find and fix bugs in py-xmlrpc or help the author improve the quality of his tools, please send your reports directly to Shilad Sen (shilad@sourcelight.com). Py-xmlrpc has nothing to do with the Python core. I apologize for even including it in the table I posted. 
Shilad didn't deserve any of the bad press you've given him here. Sending snickering notes to python-dev about the code is not helpful, and only serves to lessen the value I place on your other opinions. Skip From paul@ActiveState.com Wed Oct 3 02:11:47 2001 From: paul@ActiveState.com (Paul Prescod) Date: Tue, 02 Oct 2001 18:11:47 -0700 Subject: [Python-Dev] Performance of various marshallers References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> <03aa01c14b15$e329ac00$b3fa42d5@hagrid> <15289.45881.920165.427067@beluga.mojam.com> <3BBA08C5.240EB4BC@ActiveState.com> <15290.3605.641031.46951@beluga.mojam.com> <3BBA271B.63B07CEF@ActiveState.com> <15290.14155.432881.855383@beluga.mojam.com> <3BBA3C95.1CC094CC@ActiveState.com> <15290.23799.884095.945168@beluga.mojam.com> Message-ID: <3BBA65D3.3AE8FAA6@ActiveState.com> Skip Montanaro wrote: > >... > > Bad analogy. CGI scripts can contain the entire realm of "stuff" that goes > into any other Python program. XML-RPC encodings can't contain arbitrary > XML tags or attributes. A better analogy would have been (Martin's I think) > hypothetical Swallow - a subset of Python that could be efficiently > compiled. But there is no evidence that this subset of XML can be more efficiently parsed than any other. XML parsing consists primarily of recognizing angle brackets and a few other characters, and passing around some extra data. Any performance loss from a "full" XML parser will shrink as people submit bug reports that require a "simplified" XML parser to conform to the XML spec (Unicode, CDATA, etc.). I strongly agree that a dedicated C-written XML-RPC implementation can be faster than one written based on Python and Expat. I haven't yet seen evidence that you can both conform with the standards and get much of a speedup over one that is built on a fast XML Parser such as Eric Kidd's XML-RPC C or xmlrpc-epi (both on SourceForge). >... 
> Paul, you have to stop looking at XML-RPC with your Elton John-style > XML-colored glasses. XML-RPC is not meant to be some sort of highly > structured hierarchical data representation that you can sniff around in > with arbitrary XML tools of one sort or another. That its on-the-wire > representation happens to be XML is almost ridiculously unimportant. XML-RPC uses XML for exactly the same reason every other application of XML uses XML. Precisely so that you will not have to write yet another parser for it. That's the central reason *for* XML. That's the only advantage XML has over cPickle -- that you can be sure whatever language you have, it will have an XML parser available built in. > Fine. I'm sure Shilad appreciates the input. I think your > approach to bug detection and reporting could have been a bit > less heavy handed. I'm not trying to embarrass Shilad. The software isn't at 1.0 yet. Maybe he hasn't got around to choosing an XML parser. I'm trying to point out (more to you, than to him!) that there is a good reason to build on the work other people have done. If pyxmlrpc is faster today it is probably because it doesn't conform to the specs. When it does conform, it won't be faster anymore. > As for handling things like CDATA, UTF-16 and extra whitespace after tag > names, I suspect some other XML-RPC packages would exhibit similar problems > if they were exposed to a standards-toting XML gunslinger like yourself. > That it's not a problem in practice is probably because the set of XML-RPC > encoding and decoding software is fairly small and that the stuff that > encodes into XML-RPC is fairly well-behaved. Every XML-RPC implementation I have ever used (Python, Perl, C, C++, PHP) is based upon one pure XML parser or another. Most use Expat. 
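For anyone who wants to repeat the kind of comparison this thread started from, here is a rough sketch. It is not the benchmark Skip ran: it assumes the Python 3 standard library, where xmlrpc.client is the successor to xmlrpclib and cPickle has been folded into pickle, and the SAMPLE value and the via_* helper names are invented for the illustration. Absolute timings will depend on the machine and the data, so none are quoted here.

```python
import marshal
import pickle
import timeit
import xmlrpc.client

# Hypothetical payload; real workloads will behave differently.
SAMPLE = {"name": "spam", "ratio": 0.5, "values": list(range(100))}


def via_marshal(obj):
    """Round-trip obj through marshal."""
    return marshal.loads(marshal.dumps(obj))


def via_pickle(obj):
    """Round-trip obj through pickle."""
    return pickle.loads(pickle.dumps(obj))


def via_xmlrpc(obj):
    """Round-trip obj through the XML-RPC encoding (decoded via Expat)."""
    payload = xmlrpc.client.dumps((obj,))
    params, _method = xmlrpc.client.loads(payload)
    return params[0]


if __name__ == "__main__":
    for fn in (via_marshal, via_pickle, via_xmlrpc):
        seconds = timeit.timeit(lambda: fn(SAMPLE), number=200)
        print(f"{fn.__name__:12s} {seconds:.4f}s for 200 round trips")
```

Each helper returns a value equal to its input, so the three timings measure the same work: encode once, decode once.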
Paul Prescod From shilad@sourcelight.com Wed Oct 3 02:28:53 2001 From: shilad@sourcelight.com (Shilad Sen) Date: Tue, 2 Oct 2001 20:28:53 -0500 (CDT) Subject: [Python-Dev] Performance of various marshallers In-Reply-To: <15290.23799.884095.945168@beluga.mojam.com> from "Skip Montanaro" at Oct 02, 2001 07:33:59 PM Message-ID: <200110030128.f931Sr005236@jericho.sourcelight.com> Skip has been kind enough to copy me on the bulk of correspondence regarding py-xmlrpc versus other xmlrpc parsing options. py-xmlrpc began as a short hack to accomplish specific things that xmlrpclib couldn't easily accommodate. I used a hand-built parser because I thought it would be fun and easy (it was!). Paul, you are correct in that my library doesn't support the 5 items you mentioned. I am aware of these, but they are actually not officially supported by the spec either. XML-RPC is a bit strange in that the spec does not allow or require true XML. My library has been adopted far more than I would have guessed, and I have had many questions about things like SSL support (which is not up to spec either). As a result, I am almost finished with a rewrite that has all the transport and protocol components nicely split up. I have on my list of to-dos switching the hand-coded parser to expat. My own parser works just fine, though, and I haven't had any complaints so that is relatively low on the list. My library is certainly not as flexible as xmlrpclib in its current format. I'm hoping that the rewrite will move it to a nice place in the performance / flexibility spectrum. As a side effect, it will have a nice extensible standalone HTTP client and server that offers better performance for people who really need it. I am perfectly aware of py-xmlrpc's shortcomings. On the other hand it is exactly what the app we use needs, and I would be surprised if there aren't others who have similar needs.
My hope is that with the next major release, the library will move a bit closer to a place that suits people like Paul. Meanwhile, it works nicely for applications where performance requirements are absolutely critical. Shilad Sen > > >> That's precisely why py-xmlrpc is faster. Should it behave some > >> other way? I don't think there is another XML-RPC parser out there > >> that is available from Python but that doesn't use Python. > > Paul> Okay, so we agree that the fast part is probably not so much the > Paul> parser but the handing of data to Python. So why rewrite a parser? > Paul> Nothing requires an Expat-using XML-RPC implementation to call > Paul> back into Python for every element. It can collect the results in > Paul> C and then call Python when it has values. > > You're asking the wrong person. Shilad will be the only person who can > describe his motivations. We happen to work in the same building, but we > don't work for the same company. That's a coincidence about on par with the > chances of winning the Powerball lottery. We never met each other formally > until about a week ago. Not trying to put words in his mouth, but my guess > would be that he was not approaching it as an XML problem, but as a parsing > problem. > > >> I don't understand see how you can't make that connection. XML-RPC > >> has a fixed vocabulary and never needs to look at intermediate > >> results. > > Paul> Let me suggest an analogy. Someone writes "CGIPython". It uses a > Paul> specially optimized parser designed for parsing only Python CGI > Paul> scripts. Do you think it would run much faster than the regular > Paul> Python parser? > > Bad analogy. CGI scripts can contain the entire realm of "stuff" that goes > into any other Python program. XML-RPC encodings can't contain arbitrary > XML tags or attributes. A better analogy would have been (Martin's I think) > hypothetical Swallow - a subset of Python that could be efficiently > compiled. 
> > Paul> I don't personally see much benefit using XML if you don't adhere > Paul> to the XML spec. Just perusing the code quickly I believe I've > Paul> found a few bugs that it would not have had if it built on Expat > Paul> or some other XML parser. > > Paul, you have to stop looking at XML-RPC with your Elton John-style > XML-colored glasses. XML-RPC is not meant to be some sort of highly > structured hierarchical data representation that you can sniff around in > with arbitrary XML tools of one sort or another. That its on-the-wire > representation happens to be XML is almost ridiculously unimportant. Dave > Winer created an RPC tool that used XML at about the same time every > computer journalist was wetting their pants every time they heard the > letters X-M-L. Many implementations were able to leverage existing XML > parsing tools to get going quickly, and Dave got some well-deserved > publicity that he and XML-RPC wouldn't have gotten if he'd chosen some other > serliazation format like Pickle, or invented something new. Next step: make > it go faster. Can that be done with standard XML tools? Yeah, I'm sure it > can be. Not everybody approaches the problem with the same background you > have though. > > Paul> 1. It doesn't handle ? syntax. > > Paul> 2. It doesn't handle (extra whitespace) > > Paul> 3. I strongly suspect it won't handle comments in the XML. > > Paul> 4. It won't handle the mandatory UTF-16 encoding from XML > > Paul> 5. It won't handle CDATA sections. > > Fine. I'm sure Shilad appreciates the input. I think your approach to bug > detection and reporting could have been a bit less heavy handed. > > As for handling things like CDATA, UTF-16 and extra whitespace after tag > names, I suspect some other XML-RPC packages would exhibit similar problems > if they were exposed to a standards-toting XML gunslinger like yourself. 
> That it's not a problem in practice is probably because the set of XML-RPC > encoding and decoding software is fairly small and that the stuff that > encodes into XML-RPC is fairly well-behaved. > > XML-RPC's widespread availability and practical interoperability (the > XML-RPC website lists 48 implementations) probably owes more to the > cooperative nature of the people involved than the purity of the parsers. > > Skip > From barry@zope.com Wed Oct 3 02:39:04 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 2 Oct 2001 21:39:04 -0400 Subject: [Python-Dev] Cookie.py and `:' in the key Message-ID: <15290.27704.19609.960912@anthem.wooz.org> In Mailman, I use a version of Cookie.py written by Timothy dated from 1998. I'm now trying to see if I can get rid of my independent copy and just use Cookie.py in the Python 2.x standard library. I've hit a snag; in Mailman's copy, it is legal to have a `:' in the key name, but in Python's Cookie.py it isn't: -------------------- snip snip -------------------- % python Python 2.1.1 (#1, Aug 31 2001, 17:07:00) [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2 Type "copyright", "credits" or "license" for more information. >>> import Cookie >>> c = Cookie.Cookie() >>> c['foo:bar'] = 'hello' Traceback (most recent call last): File "", line 1, in ? File "/usr/local/lib/python2.1/Cookie.py", line 583, in __setitem__ self.__set(key, rval, cval) File "/usr/local/lib/python2.1/Cookie.py", line 576, in __set M.set(key, real_value, coded_value) File "/usr/local/lib/python2.1/Cookie.py", line 456, in set raise CookieError("Illegal key value: %s" % key) Cookie.CookieError: Illegal key value: foo:bar >>> % PYTHONPATH=/path/to/mailman/misc python Python 2.1.1 (#1, Aug 31 2001, 17:07:00) [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2 Type "copyright", "credits" or "license" for more information. 
>>> import Cookie >>> c = Cookie.Cookie() >>> c['foo:bar'] = 'hello' >>> print c Set-Cookie: foo:bar=hello; >>> -------------------- snip snip -------------------- I don't see any reason why `:' shouldn't be allowed in Set-Cookie: value, but maybe I'm missing something in the RFCs. This patch fixes the problem but perhaps not in the right way. Comments? -Barry -------------------- snip snip -------------------- Index: Cookie.py =================================================================== RCS file: /cvsroot/python/python/dist/src/Lib/Cookie.py,v retrieving revision 1.11 diff -u -r1.11 Cookie.py --- Cookie.py 2001/08/02 07:15:29 1.11 +++ Cookie.py 2001/10/03 01:38:39 @@ -249,7 +249,7 @@ # _LegalChars is the list of chars which don't require "'s # _Translator hash-table for fast quoting # -_LegalChars = string.ascii_letters + string.digits + "!#$%&'*+-.^_`|~" +_LegalChars = string.ascii_letters + string.digits + "!#$%&'*+-.^_`|~:" _Translator = { '\000' : '\\000', '\001' : '\\001', '\002' : '\\002', '\003' : '\\003', '\004' : '\\004', '\005' : '\\005', From paul@ActiveState.com Wed Oct 3 02:47:06 2001 From: paul@ActiveState.com (Paul Prescod) Date: Tue, 02 Oct 2001 18:47:06 -0700 Subject: [Python-Dev] Performance of various marshallers References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> <03aa01c14b15$e329ac00$b3fa42d5@hagrid> <15289.45881.920165.427067@beluga.mojam.com> <3BBA08C5.240EB4BC@ActiveState.com> <15290.3605.641031.46951@beluga.mojam.com> <3BBA271B.63B07CEF@ActiveState.com> <050d01c14b90$213834b0$b3fa42d5@hagrid> <3BBA3D59.441625E9@ActiveState.com> <15290.24224.866683.240585@beluga.mojam.com> Message-ID: <3BBA6E1A.363F4E79@ActiveState.com> I apologize if I embarrassed Shilad. I don't know him so I don't know how he will take a public critique of his code. For all I know, he agrees with me and merely hasn't got around to adding in an XML parser. 
On the one hand, I can see how it would be nicer to discuss it directly with you and him, but on the other, it is a real technical issue that deserves public discussion. I felt (and feel) that you've made a technical mistake in attributing py-xmlrpc's speed to its having a fixed tagset and I only posted code to demonstrate where the real speedup comes from. I've spent my whole life working around bugs in hand-rolled XML (and SGML) parsers that are supposed to be faster than general ones but end up not being so. I react almost as intemperately when someone tells me that their app embeds a new scripting language that they invented over the weekend. Although I do think that the current parsing approach taken in py-xmlrpc is flawed, I do think that the overall idea is good. It makes sense to parse XML-RPC purely in C without using Python callbacks. Paul Prescod From paul@ActiveState.com Wed Oct 3 02:57:35 2001 From: paul@ActiveState.com (Paul Prescod) Date: Tue, 02 Oct 2001 18:57:35 -0700 Subject: [Python-Dev] Performance of various marshallers References: <200110030128.f931Sr005236@jericho.sourcelight.com> Message-ID: <3BBA708F.D8554BBC@ActiveState.com> Shilad Sen wrote: > > Skip has been kind enough to copy me on the bulk of correspondence > regarding py-xmlrpc versus other xmlrpc parsing options. Thanks for your good-natured response. >... > Paul, you are correct in that my library doesn't support the 5 items you > mentioned. I am aware of these, but they are actually not officially > supported by the spec either. XML-RPC is a bit strange in that the spec > does not allow or require true XML. I think that if a spec claims to be based on XML and does not explicitly disclaim support for built-in XML features, then it allows them. For instance if it doesn't say that C syntax is illegal, then there is no reason to believe it is. 
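The features under dispute in this thread (an XML declaration, whitespace inside tags, comments, CDATA sections, UTF-16 input) can be checked against Expat directly. What follows is only a minimal sketch, written for present-day Python 3 and its xml.parsers.expat module rather than the Python 2.1 of this thread; the helper name collect_text and the two sample documents are made up for illustration and are not taken from py-xmlrpc or any XML-RPC library.

```python
import xml.parsers.expat


def collect_text(document):
    """Parse `document` with Expat and return its character data as one string."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # Expat may deliver text in several pieces, so gather them and join.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return "".join(chunks)


# The same payload written two ways. PLAIN is the style most encoders emit.
PLAIN = "<value><string>a &lt; b</string></value>"

# FANCY adds an XML declaration, whitespace inside tags, a comment,
# and a CDATA section, all of which are legal XML.
FANCY = (
    '<?xml version="1.0"?>'
    "<value ><string\n>"
    "<!-- a comment -->"
    "<![CDATA[a < b]]>"
    "</string ></value >"
)

if __name__ == "__main__":
    print(collect_text(PLAIN))
    print(collect_text(FANCY))
    # UTF-16 bytes with a byte-order mark, as produced by str.encode.
    print(collect_text(FANCY.encode("utf-16")))
```

All three calls should yield the same text, which is the practical meaning of "conforming" here: the application sees one payload however the sender chose to spell it. It says nothing about the relative speed of a hand-coded parser.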
> My library has been adopted far more than I would have guessed, and I > have had many questions about things like SSL support (which is not up > to spec either). As a result, I am almost finished with a rewrite that > has all the transport and protocol components nicely split up. I have > on my list of todo's switching the hand coded parser to expat. My own > parser works just fine, though, and I haven't had any complaints > so that is relatively low on the list. That's fine with me. If your simplified parser turns out to be significantly faster than Expat (too early to say) then you could even keep it around as an option when the client and the server are both known to be using the same subset of XML. > ... My hope is that with the next > major release, the library will move a bit closer to a place that suits > people like Paul. Meanwhile, it works nicely for applications where > performance requirements are absolutely critical. Did you consider wrapping one of the existing XML-RPC libraries written in C? When we needed a reentrant XML-RPC library for PHP, we wrapped Eric Kidd's xmlrpc-c. Paul Prescod From shilad@sourcelight.com Wed Oct 3 03:12:43 2001 From: shilad@sourcelight.com (Shilad Sen) Date: Tue, 2 Oct 2001 21:12:43 -0500 (CDT) Subject: [Python-Dev] Performance of various marshallers In-Reply-To: <3BBA708F.D8554BBC@ActiveState.com> from "Paul Prescod" at Oct 02, 2001 06:57:35 PM Message-ID: <200110030212.f932Chn05354@jericho.sourcelight.com> >From Paul: > > That's fine with me. If your simplified parser turns out to be > significantly faster than Expat (too early to say) then you could even > keep it around as an option when the client and the server are both > known to be using the same subset of XML. > That's exactly what I intended to do. > > Did you consider wrapping one of the existing XML-RPC libraries written > in C? When we needed a reentrant XML-RPC library for PHP, we wrapped > Eric Kidd's xmlrpc-c. 
> I actually seriously considered using Eric Kidd's library. We have a need for an event model that would have been pretty painful to support through the w3c libraries, and that is what made me decide to use my own. My event system is very similar to medusa's, while w3c's system is quite a bit more heavyweight. After spending some time with the w3c libraries, I also decided I really didn't feel great about the general coding style of the libraries. That's my own personal preference. I could definitely be persuaded otherwise. Shilad From skip@pobox.com (Skip Montanaro) Wed Oct 3 03:16:40 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 2 Oct 2001 21:16:40 -0500 Subject: [Python-Dev] Performance of various marshallers In-Reply-To: <3BBA65D3.3AE8FAA6@ActiveState.com> References: <15289.7206.266140.820041@beluga.mojam.com> <15289.10654.162126.780312@grendel.zope.com> <03aa01c14b15$e329ac00$b3fa42d5@hagrid> <15289.45881.920165.427067@beluga.mojam.com> <3BBA08C5.240EB4BC@ActiveState.com> <15290.3605.641031.46951@beluga.mojam.com> <3BBA271B.63B07CEF@ActiveState.com> <15290.14155.432881.855383@beluga.mojam.com> <3BBA3C95.1CC094CC@ActiveState.com> <15290.23799.884095.945168@beluga.mojam.com> <3BBA65D3.3AE8FAA6@ActiveState.com> Message-ID: <15290.29960.899139.647577@beluga.mojam.com> >> Paul, you have to stop looking at XML-RPC with your Elton John-style >> XML-colored glasses. XML-RPC is not meant to be some sort of highly >> structured hierarchical data representation that you can sniff around >> in with arbitrary XML tools of one sort or another. That its >> on-the-wire representation happens to be XML is almost ridiculously >> unimportant. Paul> XML-RPC uses XML for exactly the same reason every other Paul> application of XML uses XML. I disagree with that. Lots of applications use XML because it's got that pants-wetting capability I described earlier. >> Fine. I'm sure Shilad appreciates the input. 
I think your approach >> to bug detection and reporting could have been a bit less heavy >> handed. Paul> I'm not trying to embarrass Shilad. The software isn't at 1.0 Paul> yet. Maybe he hasn't got around to choosing an XML parser. Or maybe he has a different set of constraints than you. Paul> I'm trying to point out (more to you, than to him!) that there is Paul> a good reason to build on the work other people have done. If Paul> pyxmlrpc is faster today it is probably because it doesn't conform Paul> to the specs. When it does conform, it won't be faster anymore. Why point this out to me? I am essentially just an XML-RPC user, not an implementer. I happen to be interested in making my XML-RPC-using code run faster. If I have to make some sacrifices I could care less, as long as my clients and my servers can talk to one another. >> As for handling things like CDATA, UTF-16 and extra whitespace after >> tag names, I suspect some other XML-RPC packages would exhibit >> similar problems if they were exposed to a standards-toting XML >> gunslinger like yourself. That it's not a problem in practice is >> probably because the set of XML-RPC encoding and decoding software is >> fairly small and that the stuff that encodes into XML-RPC is fairly >> well-behaved. Paul> Every XML-RPC implementation I have ever used (Python, Perl, C, Paul> C++, PHP) is based upon one pure XML parser or another. Most use Paul> Expat. Oh well. 
S From skip@pobox.com (Skip Montanaro) Wed Oct 3 03:23:31 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 2 Oct 2001 21:23:31 -0500 Subject: [Python-Dev] Performance of various marshallers In-Reply-To: <3BBA708F.D8554BBC@ActiveState.com> References: <200110030128.f931Sr005236@jericho.sourcelight.com> <3BBA708F.D8554BBC@ActiveState.com> Message-ID: <15290.30371.513079.764350@beluga.mojam.com> Paul> I think that if a spec claims to be based on XML and does not Paul> explicitly disclaim support for built-in XML features, then it Paul> allows them. For instance if it doesn't say that C syntax is Paul> illegal, then there is no reason to believe it is. Paul, You probably know as well as anyone that the one and only person you should be talking to about XML-RPC and its XML compliance (or lack thereof) is Dave Winer. Feel free to read through the archives of the xmlrpc@yahoogroups.com mailing list if you haven't already. If you can move Dave from his current position, more power to you. You'll do something that many other people have been incapable of doing. I'm done with this topic. It's gotten way too far from python-dev-related topics. Probably should have cut it out of the cc list awhile back. Skip From guido@python.org Wed Oct 3 04:05:32 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 02 Oct 2001 23:05:32 -0400 Subject: [Python-Dev] Performance of various marshallers In-Reply-To: Your message of "Tue, 02 Oct 2001 21:23:31 CDT." <15290.30371.513079.764350@beluga.mojam.com> References: <200110030128.f931Sr005236@jericho.sourcelight.com> <3BBA708F.D8554BBC@ActiveState.com> <15290.30371.513079.764350@beluga.mojam.com> Message-ID: <200110030305.XAA26130@cj20424-a.reston1.va.home.com> > I'm done with this topic. It's gotten way too far from python-dev-related > topics. Probably should have cut it out of the cc list awhile back. Amen. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From timo@alum.mit.edu Wed Oct 3 06:18:36 2001 From: timo@alum.mit.edu (Timothy O'Malley) Date: Wed, 3 Oct 2001 01:18:36 -0400 Subject: [Python-Dev] Re: Cookie.py and `:' in the key In-Reply-To: <15290.27704.19609.960912@anthem.wooz.org> Message-ID: <1890D9DE-B7BE-11D5-B9BF-00306586E61E@alum.mit.edu> hola. On Tuesday, October 2, 2001, at 09:39 PM, Barry A. Warsaw wrote: > In Mailman, I use a version of Cookie.py written by Timothy dated from > 1998. I'm now trying to see if I can get rid of my independent > copy and just use Cookie.py in the Python 2.x standard library. According to a very strict reading of the appropriate specifications (RFC 2109 for cookies, which in turn references terms defined in RFC 2068 for HTTP), a colon is not legal in a value unless it is in a quoted value: Many HTTP/1.1 header field values consist of words separated by LWS or special characters. These special characters MUST be in a quoted string to be used within a parameter value. token = 1*<any CHAR except CTLs or tspecials> tspecials = "(" | ")" | "<" | ">" | "@" | "," | ";" | ":" | "\" | <"> | "/" | "[" | "]" | "?" | "=" | "{" | "}" | SP | HT Even so, I think I agree that a strict interpretation isn't very useful in practice. In this case, for instance, the intended value is clear and obvious. I've gone back and forth on this -- should the implementation be true to the spec or should it follow its own rules for clear and obvious? Should people desire the "clear and obvious" over the "strict interpretation", I think your fix is dead on. 
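The strict reading can be spelled out mechanically. The sketch below is written for present-day Python 3, not the Python 2.1 Cookie.py discussed here; TSPECIALS, is_token, STRICT_LEGAL and PATCHED_LEGAL are names invented for illustration. It assumes CHAR means US-ASCII and CTL means codes 0 through 31 plus 127, as RFC 2068 defines them, and it copies the two _LegalChars strings from Barry's patch.

```python
import string

# The tspecials from RFC 2068, with SP and HT written as ' ' and '\t'.
TSPECIALS = set('()<>@,;:\\"/[]?={} \t')


def is_token(text):
    """True if `text` is a non-empty run of US-ASCII characters
    containing no control characters and no tspecials."""
    return bool(text) and all(
        32 < ord(ch) < 127 and ch not in TSPECIALS for ch in text
    )


# _LegalChars before and after the patch posted earlier in the thread.
STRICT_LEGAL = string.ascii_letters + string.digits + "!#$%&'*+-.^_`|~"
PATCHED_LEGAL = STRICT_LEGAL + ":"

if __name__ == "__main__":
    for key in ("foo", "foo:bar"):
        strict = is_token(key)
        patched = all(ch in PATCHED_LEGAL for ch in key)
        print(key, "strict:", strict, "patched:", patched)
```

Under these assumptions the unpatched _LegalChars turns out to be exactly the set of token characters, so the existing behaviour is the strict interpretation, and the patch is a deliberate one-character relaxation of it.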
TimO From skip@pobox.com (Skip Montanaro) Wed Oct 3 08:40:28 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 3 Oct 2001 02:40:28 -0500 Subject: [Python-Dev] dummy __del__ in SocketServer.BaseRequestHandler Message-ID: <15290.49388.86639.812781@beluga.mojam.com> Given that the presence of a __del__ method keeps the garbage collector from reclaiming cyclic garbage, should SocketServer.BaseRequestHandler define a __del__ method that just executes "pass"? Commenting it out seems to have gotten rid of some "uncollectable" messages from the garbage collector. Every other __del__ method in the top level of the standard library actually does something. Skip From guido@python.org Wed Oct 3 12:53:42 2001 From: guido@python.org (Guido van Rossum) Date: Wed, 03 Oct 2001 07:53:42 -0400 Subject: [Python-Dev] dummy __del__ in SocketServer.BaseRequestHandler In-Reply-To: Your message of "Wed, 03 Oct 2001 02:40:28 CDT." <15290.49388.86639.812781@beluga.mojam.com> References: <15290.49388.86639.812781@beluga.mojam.com> Message-ID: <200110031153.HAA27628@cj20424-a.reston1.va.home.com> > Given that the presence of a __del__ method keeps the garbage collector from > reclaiming cyclic garbage, should SocketServer.BaseRequestHandler define a > __del__ method that just executes "pass"? Commenting it out seems to have > gotten rid of some "uncollectable" messages from the garbage collector. > Every other __del__ method in the top level of the standard library actually > does something. +1. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com (Skip Montanaro) Wed Oct 3 20:18:05 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 3 Oct 2001 14:18:05 -0500 Subject: [Python-Dev] a bit confused about the new type/class stuff Message-ID: <15291.25709.109749.875704@beluga.mojam.com> I have a class hierarchy I'm trying to migrate to 2.2 using the latest stuff from CVS on both the Gtk and Python sides of the fence. 
Conceptually its skeleton looks like class Object: ... class Widget(Object): ... class Button(Widget): ... (It's a bunch of wrappers around Gtk widgets. The wrappers happen to use delegation instead of instantiation, so my Button class is not subclassed from gtk.Button. In particular, it's *not* one of the new subclassable types.) At one point in my code I test to see if one of my Button instances is in a list of Widget instances and I get a TypeError: TypeError: Button.__cmp__(x,y) requires y to be a 'Button', not a 'instance' If I test for b's inclusion in l, even if I draw b from l: >>> l [