From barry@zope.com Thu Nov 1 00:15:30 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 31 Oct 2001 19:15:30 -0500
Subject: [Python-Dev] Patch submitted for cross-platform newline support
References: <20011031165523.85D36303181@snelboot.oratrix.nl>
Message-ID: <15328.37922.53988.805890@anthem.wooz.org>

>>>>> "JJ" == Jack Jansen writes:

    JJ> I'm also interested in discussing whether a patch like this is
    JJ> appropriate while we're in beta. On the one hand I would say
    JJ> it is, because the feature is disabled by default. On the
    JJ> other hand there are changes (albeit mainly cosmetic ones) in
    JJ> a large number of places. Another argument for allowing this
    JJ> even while in beta is that I really really want it for Mac OS
    JJ> X (but this might not be a very strong argument, I guess:-).

I tend toward the more conservative, especially with changes that
touch lots of .c files, so I'd say -1. But it looks from the patch
dialog that Guido's already approved of this in principle, so I'll
revise that to a -0.

-Barry

From skip@pobox.com (Skip Montanaro) Thu Nov 1 01:01:06 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Wed, 31 Oct 2001 19:01:06 -0600
Subject: [Python-Dev] statcache.lstat?
In-Reply-To: <200110312238.RAA01093@cj20424-a.reston1.va.home.com>
References: <15328.19926.368998.631005@beluga.mojam.com>
    <200110312120.QAA32561@cj20424-a.reston1.va.home.com>
    <15328.31849.14360.58802@beluga.mojam.com>
    <200110312238.RAA01093@cj20424-a.reston1.va.home.com>
Message-ID: <15328.40658.940923.74009@beluga.mojam.com>

    >> I can use os.lstat (or os.stat) directly. I'm working on a file
    >> selector widget written in Python (and PyGtk). As people traverse
    >> the directory tree, it seems to make sense to cache the stat results.

    Guido> But why bother? And why not let them see changes in the
    Guido> filesystem?

File selector goodies peek at all files in a directory, even the stuff
you don't care about, at the very least to segregate them into directory
and non-directory files. Caching stat info would probably help speed
them up. I was trying to speed things up a bit and saw statcache. I had
been using os.stat and os.lstat, so it was natural to wonder about the
absence of statcache.lstat. Maybe it's best to simply rely on the
underlying operating system's caching.

    >> If statcache is indeed a failed experiment, perhaps it should be
    >> deprecated.

    Guido> Fine with me.

Here's a proposed addition to PEP 4:

    Module name:   statcache
    Rationale:     Of limited usefulness and complicates the life of the
                   application programmer, who must manage the cache. Not
                   widely used by other core libraries (they use
                   os.stat instead). Also, it is not thread-safe.
    Date:          31-Oct-2001
    Documentation: TBD

Just say the word and I'll update it and submit a patch for the
documentation.

Oh, and congratulations on the imminent arrival. You will kill for a
nice nap in a couple weeks. ;-)

Skip

From guido@python.org Thu Nov 1 03:47:44 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 31 Oct 2001 22:47:44 -0500
Subject: [Python-Dev] Patch submitted for cross-platform newline support
In-Reply-To: Your message of "Wed, 31 Oct 2001 19:15:30 EST."
    <15328.37922.53988.805890@anthem.wooz.org>
References: <20011031165523.85D36303181@snelboot.oratrix.nl>
    <15328.37922.53988.805890@anthem.wooz.org>
Message-ID: <200111010347.WAA01754@cj20424-a.reston1.va.home.com>

> I tend toward the more conservative, especially with changes that
> touch lots of .c files, so I'd say -1. But it looks from the patch
> dialog that Guido's already approved of this in principle, so I'll
> revise that to a -0.

I'm not at all sure on whether this should be incorporated into 2.2.
I approve it (or something like it) for 2.3; for 2.2, I'm hesitant but
if Jack thinks it's needed for MacOS, and it's off by default, and a
thorough code review shows no problems with it as long as it's off, I
would be OK with it.

In other words, you're the release manager; it's your call.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Thu Nov 1 03:53:38 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 31 Oct 2001 22:53:38 -0500
Subject: [Python-Dev] statcache.lstat?
In-Reply-To: Your message of "Wed, 31 Oct 2001 19:01:06 CST."
    <15328.40658.940923.74009@beluga.mojam.com>
References: <15328.19926.368998.631005@beluga.mojam.com>
    <200110312120.QAA32561@cj20424-a.reston1.va.home.com>
    <15328.31849.14360.58802@beluga.mojam.com>
    <200110312238.RAA01093@cj20424-a.reston1.va.home.com>
    <15328.40658.940923.74009@beluga.mojam.com>
Message-ID: <200111010353.WAA01846@cj20424-a.reston1.va.home.com>

> Here's a proposed addition to PEP 4:
>
>     Module name:   statcache
>     Rationale:     Of limited usefulness and complicates the life of the
>                    application programmer, who must manage the cache. Not
>                    widely used by other core libraries (they use
>                    os.stat instead). Also, it is not thread-safe.
>     Date:          31-Oct-2001
>     Documentation: TBD
>
> Just say the word and I'll update it and submit a patch for the
> documentation.

If others on python-dev agree they can give the word.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From barry@zope.com Thu Nov 1 05:25:43 2001
From: barry@zope.com (Barry A.
    Warsaw)
Date: Thu, 1 Nov 2001 00:25:43 -0500
Subject: [Python-Dev] Patch submitted for cross-platform newline support
References: <20011031165523.85D36303181@snelboot.oratrix.nl>
    <15328.37922.53988.805890@anthem.wooz.org>
    <200111010347.WAA01754@cj20424-a.reston1.va.home.com>
Message-ID: <15328.56535.61658.906938@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum writes:

    >> I tend toward the more conservative, especially with changes
    >> that touch lots of .c files, so I'd say -1. But it looks from
    >> the patch dialog that Guido's already approved of this in
    >> principle, so I'll revise that to a -0.

    GvR> I'm not at all sure on whether this should be incorporated
    GvR> into 2.2. I approve it (or something like it) for 2.3; for
    GvR> 2.2, I'm hesitant but if Jack thinks it's needed for MacOS,
    GvR> and it's off by default, and a thorough code review shows no
    GvR> problems with it as long as it's off, I would be OK with it.

    GvR> In other words, you're the release manager; it's your call.

Sounds good. I'll reserve judgement for now.

-Barry

From barry@zope.com Thu Nov 1 05:31:09 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Thu, 1 Nov 2001 00:31:09 -0500
Subject: [Python-Dev] statcache.lstat?
References: <15328.19926.368998.631005@beluga.mojam.com>
    <200110312120.QAA32561@cj20424-a.reston1.va.home.com>
    <15328.31849.14360.58802@beluga.mojam.com>
    <200110312238.RAA01093@cj20424-a.reston1.va.home.com>
    <15328.40658.940923.74009@beluga.mojam.com>
    <200111010353.WAA01846@cj20424-a.reston1.va.home.com>
Message-ID: <15328.56861.833488.228610@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum writes:

    GvR> If others on python-dev agree they can give the word.

+1

From tim.one@home.com Thu Nov 1 07:42:39 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 1 Nov 2001 02:42:39 -0500
Subject: [Python-Dev] Slices and "==" optimization
In-Reply-To: <3BDFB17A.78584E7C@lemburg.com>
Message-ID:

Before we special-case strings more, we should fix what we've got:
Martin (IIRC) inserted a fast path at the start of do_richcmp early in
the 2.2 cycle:

    if (v->ob_type == w->ob_type
        && (f = v->ob_type->tp_compare) != NULL
        && !PyInstance_Check(v)) {
        int c;
        richcmpfunc f1;
        if ((f1 = RICHCOMPARE(v->ob_type)) != NULL) {
            /* If the type has richcmp, try it first.
               try_rich_compare would try it two-sided,
               which is not needed since we've a single
               type only. */
            res = (*f1)(v, w, op);
            if (res != Py_NotImplemented)
                return res;
            Py_DECREF(res);
        }
        c = (*f)(v, w);
        if (c < 0 && PyErr_Occurred())
            return NULL;
        return convert_3way_to_object(op, c);
    }

Unfortunately, strings lost their tp_compare slot (due to some other
"optimization"?), so despite that this is *trying* to special-case rich
compares of same-type builtin objects, for strings the

    v->ob_type->tp_compare != NULL

guard at the start fails today, so this fast path isn't taken for string
compares of any kind anymore. If it were taken, string_richcompare would
get out quickly (without a strcmp) for EQ compare of identical string
objects.

Note that saying

    x == x

is true regardless of the type of x would prevent defining a correct
IEEE-754 equality operator (IOW, if we're saying __eq__ is user-defined,
we have to let users define it any way they want -- and at least one
major std requires an insane <0.3 wink> definition of equality).

BTW, PyInstance_Check() there looks suspect now too, since instances of
new-style classes don't pass PyInstance_Check() (so *can* get into this
fast path).
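
The IEEE-754 point can be illustrated with a short sketch. It is written
for present-day Python 3, not the 2.2 interpreter under discussion, and is
not part of the original message. The helper name
eq_with_identity_shortcut is hypothetical; it stands in for an
interpreter-level rule that treats identical objects as equal without
calling __eq__.

```python
import math


def eq_with_identity_shortcut(a, b):
    """Hypothetical shortcut: identical objects count as equal, __eq__ is skipped."""
    return a is b or a == b


nan = float("nan")

# IEEE-754 equality: a NaN is not equal to anything, including itself.
plain = (nan == nan)

# The identity shortcut disagrees with IEEE-754 for the very same object.
shortcut = eq_with_identity_shortcut(nan, nan)

# For strings the shortcut is harmless: identical strings are always equal.
s = "spam"
same_string = eq_with_identity_shortcut(s, s)

print(plain, shortcut, same_string)
```

For strings the shortcut only saves work, which is why a per-type fast
path inside the string comparison is safe, whereas a blanket rule for
every type would change the result for floats.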

From just@letterror.com Thu Nov 1 08:27:21 2001
From: just@letterror.com (Just van Rossum)
Date: Thu, 1 Nov 2001 09:27:21 +0100
Subject: [Python-Dev] PEP 273: Import Modules from Zip Archives
In-Reply-To: <200110312338.SAA01368@cj20424-a.reston1.va.home.com>
Message-ID: <20011101092728-r01010800-42a901cc-0910-010c@10.0.0.23>

[Guido]
> - I'm not sure I care about having .so files inside packages on the
>   filesystem; they are useful in Zope, but for very hackish reasons.

[Just]
> Why? If I write a package which is mostly in Python, it feels very
> natural to put the C extensions also in the package.

[Guido]
> Yes, it does, and as long as it works, I have no problem with that.
> Distutils supports this too, AFAIK.

Ah, ok, I think I misread your comment above as "I'm not sure I care
about allowing .so files inside packages to begin with", hence my "why".

Just

From mal@lemburg.com Thu Nov 1 09:09:11 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 01 Nov 2001 10:09:11 +0100
Subject: [Python-Dev] method dispatch vs. switching (pickle faster in 2.2 ?)
References: <3BE06922.5C43E09F@lemburg.com>
    <200110312131.QAA32734@cj20424-a.reston1.va.home.com>
Message-ID: <3BE11137.123877E@lemburg.com>

Guido van Rossum wrote:
>
> > While hacking on an XML pickler, I found that pickle.py got nearly
> > twice as fast in 2.2 comparing to 2.1 and 2.0.
>
> What benchmark?

I was looking at the roundtrip speed of pickling a list of integers.

> > The code in pickle.py doesn't seem to have changed much. Anybody
> > know where that speedup came from ? Can somebody on another
> > (non-Linux) system please verify this.
>
> I believe DOM nodes are now new-style classes, for better or for worse
> (it might create problems when combining with classic mixins). Could
> that explain it?

No. I'm writing my own little beast here which does not use DOM,
expat or sgmlop.

The results of the approach which tries to avoid Python
function calls are interesting.
I moved from the usual
switch strategy of dispatching to instance methods to a
large for-loop with lots of "if x is y: ...". While I suspected
the latter to be faster on average, I found that this is
not the case. I still have to investigate where the performance
goes, but the result kind of surprised me.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/

From anthony@interlink.com.au Thu Nov 1 12:21:33 2001
From: anthony@interlink.com.au (Anthony Baxter)
Date: Thu, 01 Nov 2001 23:21:33 +1100
Subject: [Python-Dev] platform-specific bug fixes...
Message-ID: <200111011221.fA1CLX422459@mbuna.arbhome.com.au>

as an example, take Modules/termios.c: there's a bunch of #include
magic to (I assume) help the build process on different boxes. I don't
have access to all of these boxes to test it.

In general, I'm going to _assume_ that platform-specific fixes on the
trunk have been made and then tested to be sane - and that they're ok
for the branch. In particular, the Mac and OS/2 stuff could generously
be referred to as deep dark bad magic from where I am...

From guido@python.org Thu Nov 1 12:56:02 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 01 Nov 2001 07:56:02 -0500
Subject: [Python-Dev] method dispatch vs. switching (pickle faster in 2.2 ?)
In-Reply-To: Your message of "Thu, 01 Nov 2001 10:09:11 +0100."
    <3BE11137.123877E@lemburg.com>
References: <3BE06922.5C43E09F@lemburg.com>
    <200110312131.QAA32734@cj20424-a.reston1.va.home.com>
    <3BE11137.123877E@lemburg.com>
Message-ID: <200111011256.HAA03366@cj20424-a.reston1.va.home.com>

> Guido van Rossum wrote:
> >
> > > While hacking on an XML pickler, I found that pickle.py got nearly
> > > twice as fast in 2.2 comparing to 2.1 and 2.0.
> >
> > What benchmark?
>
> I was looking at the roundtrip speed of pickling a list of integers.
>
> > > The code in pickle.py doesn't seem to have changed much. Anybody
> > > know where that speedup came from ? Can somebody on another
> > > (non-Linux) system please verify this.
> >
> > I believe DOM nodes are now new-style classes, for better or for worse
> > (it might create problems when combining with classic mixins). Could
> > that explain it?
>
> No. I'm writing my own little beast here which does not use DOM,
> expat or sgmlop.
>
> The results of the approach which tries to avoid Python
> function calls are interesting. I moved from the usual
> switch strategy of dispatching to instance methods to a
> large for-loop with lots of "if x is y: ...". While I suspected
> the latter to be faster on average, I found that this is
> not the case. I still have to investigate where the performance
> goes, but the result kind of surprised me.

Sorry, it's too early for me to understand what the two alternatives
are. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From loewis@informatik.hu-berlin.de Thu Nov 1 13:45:18 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Thu, 1 Nov 2001 14:45:18 +0100 (MET)
Subject: [Python-Dev] Telling directories from files (Was: statcache.lstat?)
Message-ID: <200111011345.OAA27018@paros.informatik.hu-berlin.de>

> File selector goodies peek at all files in a directory, even the
> stuff you don't care about, at the very least to segregate them into
> directory and non-directory files. Caching stat info would probably
> help speed them up. I was trying to speed things up a bit and saw
> statcache.

Some of the recent systems have the file type in the directories
(44ffs, NTFS, ext2 with "filetype" feature); in addition, recent C
libraries expose the file type through getdents. E.g.
on Linux, struct dirent is

    struct dirent64
      {
        __ino64_t d_ino;
        __off64_t d_off;
        unsigned short int d_reclen;
        unsigned char d_type;
        char d_name[256];
      };

In this type, d_type can take the values

    DT_UNKNOWN DT_FIFO DT_CHR DT_DIR DT_BLK DT_REG DT_LNK DT_SOCK DT_WHT

Likewise, on Win32, FindFirstFile will return not only the file name,
but also whether it is a directory, access times, alternate names,
etc.

I always meant to expose the file type to Python, to speed up things
that only do stat to tell apart directories from non-directories. Of
course, doing this in a portable, backwards-compatible manner may
become a challenge, especially if you want os.listdir to expose this
information. I think I'll write a PEP.

Regards,
Martin

From com-nospam@ccraig.org Thu Nov 1 14:03:57 2001
From: com-nospam@ccraig.org (Christopher A. Craig)
Date: 01 Nov 2001 09:03:57 -0500
Subject: [Python-Dev] Future division detection
Message-ID:

I am in the process of porting my cRat module to take advantage of
some of the new features in Python 2.2 (and possibly making a patch
out of it to address PEP239 (though not for Python 2.2 obviously)).

While doing this I was thinking that I would change true_division on
ints and floats to return a rational and change the rational code to
return a long if the denominator is 1.

This works great, except that if future division is off then rationals
can suddenly become longs and do not automatically cast back. This
makes it virtually impossible to guarantee a correct result to nearly
any rational computation that involves a division. So I wanted to
know if there is some way to detect, at the object level, if the
CO_FUTURE_DIVISION feature is active.

rational'ly y'rs

--
Christopher A.
Craig

From jack@oratrix.nl Thu Nov 1 14:12:38 2001
From: jack@oratrix.nl (Jack Jansen)
Date: Thu, 01 Nov 2001 15:12:38 +0100
Subject: [Python-Dev] Patch submitted for cross-platform newline support
In-Reply-To: Message by "Tim Peters" ,
    Wed, 31 Oct 2001 18:44:44 -0500 ,
Message-ID: <20011101141243.95C821162D7@oratrix.oratrix.nl>

Recently, "Tim Peters" said:
> Radical idea: don't do anything to turn on "universal newlines" -- say it's
> just what "text mode" means in Python. Then you only have to worry about
> picking a letter to turn it off .

This is how I started. But I changed it because a file in universal
newline input mode is going to be slower than in normal text input
mode. Especially when I looked at the code for doing readline() on
Windows to squeeze out the last few nanoseconds I thought that this
should probably be an option, not the default.

--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.cwi.nl/~jack        | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm

From jim@interet.com Thu Nov 1 14:31:33 2001
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 01 Nov 2001 09:31:33 -0500
Subject: [Python-Dev] PEP 273: Import Modules from Zip Archives
References: <3BE0210F.12966.4496B695@localhost>
    <3BE03556.11078.44E5ECD7@localhost>
    <200110312241.RAA01122@cj20424-a.reston1.va.home.com>
Message-ID: <3BE15CC5.91101560@interet.com>

Guido van Rossum wrote:
>
> - Writing stuff back to .zip files is totally the wrong approach.
>
> - I don't care about having .so files inside packages *in zipfiles*.
>
> - I'm not sure I care about having .so files inside packages on the
>   filesystem; they are useful in Zope, but for very hackish reasons.
>
> - If the zip file has the .py file but no .pyc or the wrong .pyc, tant
>   pis. Let it be slower. (But if it has the .pyo, use that.)

Let me look at coding the above. It seems like a good approach.
I will update the PEP too.

JimA

From anthony@interlink.com.au Thu Nov 1 16:02:06 2001
From: anthony@interlink.com.au (Anthony Baxter)
Date: Fri, 02 Nov 2001 03:02:06 +1100
Subject: [Python-Dev] 2.1.2: patching Modules/
Message-ID: <200111011602.fA1G26x24888@mbuna.arbhome.com.au>

Plenty of fun on the ol' branch tonight. Here's some notes from my
patching run tonight. Feedback solicited.

ta,
Anthony

Wontfix:

  many files: docstring changes from, e.g:
      -"count(s, sub[, start[, end]]) -> int\n\
      -\n\
    to
      +"count(s, sub[, start[, end]]) -> int\n"
      +"\n"
    Did these actually fix anything?

  many files: compiler warning fixes (unless I'm there already). Got
    enough to do as it is.

  many files: config.h -> pyconfig.h: Don't see a point - if it's broken in
    2.1.1, it'll still be broken in 2.1.2. Or is this bad enough to fix?

  Modules/getpath.c: sys.executable gets it wrong (bug #424002)

  many files: CPPFLAGS broken out (patch #414991)

  Modules/parsermodule.c: fix for #431886, not allowing single test ,

  Modules/zlibmodule.c: allow threads in zlib, misc other fixes - diff
    is nearly the entire file. The fix for patch #403753 would be good,
    but is an incompatibility. Hm.

  many files: The RISC/OS patch. This seems scary.

  many files: SF [#466125] PyLong_AsLongLong works for any integer.
    Open to arguments, but it looks like a feature to me.

  Modules/resource.c: fix to enable on cygwin

Unsure:

  xreadlines.c:1.8
    date: 2001/08/29 23:50:42; author: nascheme; state: Exp; lines: +1 -1
    Remove bogus PyGC_HEAD_SIZE.
    this looks fine - someone want to just confirm?

  gcmodule: the lock to stop multiple calls to collect() - pretty sure
    I've got a working patch for this, but it's hard to test.

Still to-do (if someone with excess roundtuits ... :)

  Look at the sre fixes, work out which are bugfixes
  _curses* linux glibc2.0 fixes
  another couple of files in Modules/

--
Anthony Baxter
It's never too late to have a happy childhood.

From fdrake@acm.org Thu Nov 1 15:40:29 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 1 Nov 2001 10:40:29 -0500
Subject: [Python-Dev] pickle faster in 2.2 ?
In-Reply-To: <200110312131.QAA32734@cj20424-a.reston1.va.home.com>
References: <3BE06922.5C43E09F@lemburg.com>
    <200110312131.QAA32734@cj20424-a.reston1.va.home.com>
Message-ID: <15329.27885.780201.991410@grendel.zope.com>

Guido van Rossum writes:
 > I believe DOM nodes are now new-style classes, for better or for worse
 > (it might create problems when combining with classic mixins). Could
 > that explain it?

Using minidom specifically, NodeList objects are new-style, but the
Nodes are still all old-style. The __getattr__/__setattr__ stuff has
all been replaced with property-based implementation as well.

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From nas@python.ca Thu Nov 1 16:39:40 2001
From: nas@python.ca (Neil Schemenauer)
Date: Thu, 1 Nov 2001 08:39:40 -0800
Subject: [Python-Dev] 2.1.2: patching Modules/
In-Reply-To: <200111011602.fA1G26x24888@mbuna.arbhome.com.au>;
    from anthony@interlink.com.au on Fri, Nov 02, 2001 at 03:02:06AM +1100
References: <200111011602.fA1G26x24888@mbuna.arbhome.com.au>
Message-ID: <20011101083940.C26951@glacier.arctrix.com>

Anthony Baxter wrote:
> xreadlines.c:1.8
>   date: 2001/08/29 23:50:42; author: nascheme; state: Exp; lines: +1 -1
>   Remove bogus PyGC_HEAD_SIZE.
>   this looks fine - someone want to just confirm?

PyXReadlinesObject does not support GC so removing it shouldn't hurt
anything.

  Neil

From loewis@informatik.hu-berlin.de Thu Nov 1 17:24:58 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Thu, 1 Nov 2001 18:24:58 +0100 (MET)
Subject: [Python-Dev] 2.1.2: patching Modules/
Message-ID: <200111011724.SAA29749@paros.informatik.hu-berlin.de>

[\n\ -> \n" conversion]
> Did these actually fix anything?

Looking at the check-in message to mathmodule.c 2.61, we see

author: tim_one;
# Mechanical fiddling to make this easier to work with in my editor.

So I guess the answer is: "yes, it makes Tim's editor happy" :-) Apart
from that, it apparently didn't fix any further problems.

> many files: config.h -> pyconfig.h: Don't see a point - if it's broken in
> 2.1.1, it'll still be broken in 2.1.2.

Indeed. Furthermore, applications (including distutils) may rely on
knowing the name of the config.h; those would break with the change.

> Modules/zlibmodule.c: allow threads in zlib, misc other fixes - diff
> is nearly the entire file. The fix for patch #403753 would be good,
> but is an incompatibility. Hm.

I don't think this qualifies as a "bug fix". It fixes a problem, yes,
but applications have to be changed to make use of the feature.

> gcmodule: the lock to stop multiple calls to collect() - pretty sure
> I've got a working patch for this, but it's hard to test.

Why can't you use the patch as-is? It would apply without problems, if
generation0 hadn't been renamed to _PyGC_generation0.

> _curses*

If "didn't build before, does now" also is a candidate for inclusion,
I think you could more or less use _cursesmodule.c as-is. If you only
want fixes for systems on which it already worked, it will be more
difficult.

Regards,
Martin

From niemeyer@conectiva.com Thu Nov 1 18:40:10 2001
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Thu, 1 Nov 2001 16:40:10 -0200
Subject: [Python-Dev] 2.2 release
Message-ID: <20011101164009.A2318@ibook.distro.conectiva>

Hello!

We're deciding our schedule for the next release of Conectiva Linux. We'd
like to include, if possible, the next version of Python. Do you have
any prevision about the release date of the final version? I know it's
hard to predict, but we'd like just a hint to include or exclude the
possibility.

Thank you!

--
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]

From nas@python.ca Thu Nov 1 18:51:09 2001
From: nas@python.ca (Neil Schemenauer)
Date: Thu, 1 Nov 2001 10:51:09 -0800
Subject: [Python-Dev] 2.2 release
In-Reply-To: <20011101164009.A2318@ibook.distro.conectiva>;
    from niemeyer@conectiva.com on Thu, Nov 01, 2001 at 04:40:10PM -0200
References: <20011101164009.A2318@ibook.distro.conectiva>
Message-ID: <20011101105109.A27478@glacier.arctrix.com>

Gustavo Niemeyer wrote:
> Do you have any prevision about the release date of the final version?
> I know it's hard to predict, but we'd like just a hint to include or
> exclude the possibility.

Here's a hint:

    http://python.sourceforge.net/peps/pep-0251.html

Right now it looks like we will meet the release date.

  Neil

From guido@python.org Thu Nov 1 18:54:36 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 01 Nov 2001 13:54:36 -0500
Subject: [Python-Dev] 2.2 release
In-Reply-To: Your message of "Thu, 01 Nov 2001 16:40:10 -0200."
    <20011101164009.A2318@ibook.distro.conectiva>
References: <20011101164009.A2318@ibook.distro.conectiva>
Message-ID: <200111011854.fA1Isa924216@odiug.zope.com>

> Hello!
>
> We're deciding our schedule for the next release of Conectiva Linux. We'd
> like to include, if possible, the next version of Python. Do you have
> any prevision about the release date of the final version? I know it's
> hard to predict, but we'd like just a hint to include or exclude the
> possibility.
>
> Thank you!

Our release schedule is public:

    http://python.sourceforge.net/peps/pep-0251.html

This aims for Dec 19.
We may be a week or so off, but usually not
much beyond that.

No promises though. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip@pobox.com (Skip Montanaro) Thu Nov 1 19:10:42 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 1 Nov 2001 13:10:42 -0600
Subject: [Python-Dev] Problems with dbhash in Python-2.1? (fwd)
Message-ID: <15329.40498.282789.129235@beluga.mojam.com>

I forward this note from Roy Smith along because it might be a setup.py
bug that could be fixed easily for the 2.1.2 release. My guess is that
distutils couldn't find the necessary libraries or header files, so
declined to build bsddb. If Roy Smith can locate them on the solaris and
debian machines, the patch to setup.py should be pretty straightforward.

Roy, can you dig up that info?

Skip

Content-Description: forwarded message
Newsgroups: comp.lang.python
Organization: PANIX -- Public Access Networks Corp.
Message-ID: <9rs60o$b0n$1@panix2.panix.com>
From: roy@panix.com (Roy Smith)
To: python-list@python.org
Subject: Problems with dbhash in Python-2.1?
Date: 1 Nov 2001 13:58:32 -0500

I've got an application written in Python-2.0, which used the dbhash
module. I've had two different people report similar problems running it
under Python-2.1. One of them on Debian Linux, the other on Solaris-8.
The solaris guy reports the following stack trace:

Traceback (most recent call last):
  File "/export/home/emermels/source/src/tools/cvt1418.py", line 6, in ?
    import smic
  File "/export/home/emermels/source/src/tools/pylib/smic.py", line 9, in ?
    import dbhash
  File "/usr/local/lib/python2.1/dbhash.py", line 5, in ?
    import bsddb
ImportError: No module named bsddb

The Debian guy got some other strange error involving bsddb (which
unfortunately I didn't save, because when he down-graded to 2.0 the
problem went away).

Is this a known problem?
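
The ImportError in the traceback above says that the interpreter could
not find a bsddb module at all, which fits an extension that was never
built. As a minimal sketch, and assuming present-day Python 3 rather than
the 2.1 interpreter in the report, an application can probe for an
optional module before depending on it. The helper name module_available
is hypothetical and not part of the original message; bsddb is no longer
shipped with Python 3, so it will normally be reported as missing there.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None


# Report on a few top-level modules; only availability is checked,
# nothing is imported.
for name in ("bsddb", "zlib", "json"):
    status = "available" if module_available(name) else "missing"
    print(name, status)
```

Probing first lets a program fall back to another storage module or
print a clear message, instead of failing partway through an import
chain.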

--
http://mail.python.org/mailman/listinfo/python-list

From niemeyer@conectiva.com Thu Nov 1 19:27:26 2001
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Thu, 1 Nov 2001 17:27:26 -0200
Subject: [Python-Dev] 2.2 release
In-Reply-To: <200111011854.fA1Isa924216@odiug.zope.com>;
    from guido@python.org on Thu, Nov 01, 2001 at 01:54:36PM -0500
References: <20011101164009.A2318@ibook.distro.conectiva>
    <200111011854.fA1Isa924216@odiug.zope.com>
Message-ID: <20011101172726.B2415@ibook.distro.conectiva>

> Our release schedule is public:
>
> http://python.sourceforge.net/peps/pep-0251.html

I was not aware of this PEP. I won't bother you about this in the
future.

> This aims for Dec 19. We may be a week or so off, but usually not
> much beyond that.

This will break our first freeze step (we freeze in several steps), but
will probably be feasible.

> No promises though. :-)

Sure.. ;-)

Thank you!

--
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]

From fdrake@acm.org Thu Nov 1 19:27:07 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 1 Nov 2001 14:27:07 -0500
Subject: [Python-Dev] 2.2 release
In-Reply-To: <20011101172726.B2415@ibook.distro.conectiva>
References: <20011101164009.A2318@ibook.distro.conectiva>
    <200111011854.fA1Isa924216@odiug.zope.com>
    <20011101172726.B2415@ibook.distro.conectiva>
Message-ID: <15329.41483.505859.486318@grendel.zope.com>

Gustavo Niemeyer writes:
 > > Our release schedule is public:
 > >
 > > http://python.sourceforge.net/peps/pep-0251.html
 >
 > I was not aware of this PEP. I won't bother you about this in the
 > future.

You should be aware that for each major (X.Y) release, a new PEP is
drafted. You can get a list of all PEP documents at:

    http://python.sourceforge.net/peps/pep-0000.html

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From mclay@nist.gov Thu Nov 1 22:43:17 2001
From: mclay@nist.gov (Michael McLay)
Date: Thu, 1 Nov 2001 18:43:17 -0400
Subject: [Python-Dev] Change in evaluation order in new object model
Message-ID: <200111012242.RAA01504@email.nist.gov>

I was surprised by a change to the order of evaluation of members in the
new object type. I haven't found an explanation for why the change was
made. I was comfortable with the way it worked. Is there an advantage to
the change?

In the classic python model the interpreter looked in the instance
dictionary and if the name wasn't there it looked in the class
dictionary. The following illustrates this evaluation order.

>>> class C:
        def __init__(self):
            self.a = 4

>>> c = C()
>>> c.a
4
>>> C.a = 6
>>> c.a
4
>>> c.a = 8
>>> c.a
8
>>>

With the new slots mechanism the order has been reversed. The class
level dictionary is searched and then the slots are evaluated.

>>> class B(object):
        __slots__ = ['a','b','c']

>>> b = B()
>>> b.a = 4
>>> b.a
4
>>> B.a = 6
>>> b.a
6
>>> b.a = 8
Traceback (most recent call last):
  File "", line 1, in ?
b.a = 8 AttributeError: 'B' object attribute 'a' is read-only From Anthony Baxter Fri Nov 2 01:37:18 2001 From: Anthony Baxter (Anthony Baxter) Date: Fri, 02 Nov 2001 12:37:18 +1100 Subject: [Python-Dev] Re: 2.1.2: patching Modules/ In-Reply-To: Message from Martin von Loewis of "Thu, 01 Nov 2001 18:24:58 BST." <200111011724.SAA29749@paros.informatik.hu-berlin.de> Message-ID: <200111020137.fA21bI931413@mbuna.arbhome.com.au> >>> Martin von Loewis wrote > [\n\ -> \n" conversion] > > Did these actually fix anything? > So I guess the answer is: "yes, it makes Tim's editor happy" :-) Apart > from that, it apparently didn't fix any further problems. Fair nuff. I'm not completely up on the various intricacies of various bizarro C compilers so I thought I'd check. > Indeed. Furthermore, applications (including distutils) may rely on > knowing the name of the config.h; those would break with the change. Excellent point. > > [zlib fix] > I don't think this qualifies as a "bug fix". It fixes a problem, yes, > but applications have to be changed to make use of the feature. And unlike the sendall() fix, there's no sections of the std library triggering the bug. > Why can't you use the patch as-is? It would apply without problems, if > generation0 hadn't been renamed to _PyGC_generation0. Minor additional tweaks (making stuff global), but this is now in. > If "didn't build before, does now" also is a candidate for inclusion, > I think you could more or less use _cursesmodule.c as-is. If you only > want fixes for systems on which it already worked, it will be more > difficult. This issue is one of "not stopping it building where it did before..." I'm just going to wander through it and check for nasties... 
Anthony From thomas@xs4all.net Fri Nov 2 02:03:03 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Fri, 2 Nov 2001 03:03:03 +0100 Subject: [Python-Dev] Future of SSL In-Reply-To: <200110281220.HAA09692@cj20424-a.reston1.va.home.com> References: <20011027014211.A11092@lilith.hqd-internal> <200110270226.WAA23767@cj20424-a.reston1.va.home.com> <20011026235527.A2919@trump.amber.org> <200110271556.LAA25312@cj20424-a.reston1.va.home.com> <20011027214357.B6993@trump.amber.org> <200110281220.HAA09692@cj20424-a.reston1.va.home.com> Message-ID: <20011102030303.I701@xs4all.nl> On Sun, Oct 28, 2001 at 07:20:30AM -0500, Guido van Rossum wrote: > PS. One issue with adding more crypto to Python could be US export > issues. It's possible that new export limitations for crypto software > are made law by a congress that doesn't understand the issues, and > then the US Python distribution could be in trouble (even though our > site in the the Netherlands, we build the distributions here in the > US). Back at CNRI, we couldn't release the SSL wrappers, which don't > contain any crypto code but enable linking with it, before an > extensive and expensive legal review, and then we had to wait until > after a certain date, at which some of the crypto export restrictions > were lifted. Sorry for this fairly late response, but I've been slacking the python-dev mailbox for half a month (I just finished reading just over 600 mails, and boy, are my arms tired.) If we are really worried about having the SSL configure checks, let alone SSL hooks, we could minimize even that by providing a 'crypto' package that _replaces_ socket.py with one with SSL support. socket.py is a small dinky thing, after all, that imports most stuff from _socketmodule.so. 
The actual code would live in a separate module, and the entire thing could easily be made a separate patch -- so that if the US government goes medieval on us, we can easily separate the SSL part from the main tarball and place it on www.python.org by itself. A burden, but less so than having five developers in prison ;-) On the other hand, I would much prefer an 'ssl' module with an interface similar to the socket module, and to hell with backward compatibility :) And I'm also curious what effect the recent court ruling regarding the DeCSS distribution will have; from what I read, it states that source code is a form of expression and thus falls under the first amendment of the American constitution. It goes on to say that """Indeed, the [US] Supreme Court has never upheld a prior restraint [on pure speech], even faced with the competing interest of national security or the Sixth Amendment right to a fair trial.'""" If I were even remotely religious, I would pray (and beg humbly on my knees) to god that this decision stands in higher courts and is respected by other judges in (to us) similar cases, and recognized by whatever law-designing forces the US government has. Sleepless-ramblings-ly y'rs, -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From effbot@telia.com Fri Nov 2 14:08:28 2001 From: effbot@telia.com (Fredrik Lundh) Date: Fri, 2 Nov 2001 15:08:28 +0100 Subject: [Python-Dev] who's maintaining tools/idle? Message-ID: <00ba01c163a7$db2f4140$ced241d5@hagrid> just noticed that CVS is merging changes into icons/minusnode.gif every time I do a cvs update. doesn't sound right. turns out that minusnode.gif and plusnode.gif has been checked in as text files (no -kb option). how can I fix this? (I should know, but I only use perforce these days, and my memory isn't what it used to be...) From fdrake@acm.org Fri Nov 2 14:27:23 2001 From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 2 Nov 2001 09:27:23 -0500 Subject: [Python-Dev] who's maintaining tools/idle? In-Reply-To: <00ba01c163a7$db2f4140$ced241d5@hagrid> References: <00ba01c163a7$db2f4140$ced241d5@hagrid> Message-ID: <15330.44363.801180.278491@grendel.zope.com> Fredrik Lundh writes: > just noticed that CVS is merging changes into icons/minusnode.gif > every time I do a cvs update. doesn't sound right. > > turns out that minusnode.gif and plusnode.gif has been checked in > as text files (no -kb option). You can adjust the -k* options using "cvs admin". I just did: cvs admin -kb plusnode.gif minusnode.gif to fix it for those two files; you might get one more update to correct what you have now, but things should work as expected after that. Thanks for pointing this out! -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From mal@lemburg.com Fri Nov 2 16:22:34 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 02 Nov 2001 17:22:34 +0100 Subject: [Python-Dev] Change in evaluation order in new object model References: <200111012242.RAA01504@email.nist.gov> Message-ID: <3BE2C84A.B297DC7C@lemburg.com> Michael McLay wrote: > > I was suprised by a change to the order of evaluation of members in the new > object type. I haven't found an explanation for why the change was made. I > was comfortable with the way it worked. Is there an advantage to the change?. > > In the classic python model the interpreter looked in the instance dictionary > and if the name wasn't there it looked in the class dictionary. The > following illustrates this evaluation order. > ... > > With the new slots mechanism the order has been reversed. The class level > dictionary is searched and then the slots are evaluated. > > >>> class B(object): > __slots__ = ['a','b','c'] > > >>> b = B() > >>> b.a = 4 > >>> b.a > 4 > >>> B.a = 6 > >>> b.a > 6 > >>> b.a = 8 > Traceback (most recent call last): > File "", line 1, in ? 
> b.a = 8 > AttributeError: 'B' object attribute 'a' is read-only Could someone please first explain what these slots are used for in the first place :-? There must be some difference to standard class attributes... which is probably also the reason for the above behaviour (even though it does look like a bug to me).
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From DavidA@ActiveState.com Fri Nov 2 17:39:25 2001 From: DavidA@ActiveState.com (David Ascher) Date: Fri, 02 Nov 2001 09:39:25 -0800 Subject: [Python-Dev] Change in evaluation order in new object model References: <200111012242.RAA01504@email.nist.gov> <3BE2C84A.B297DC7C@lemburg.com> Message-ID: <3BE2DA4D.A811C6E7@ActiveState.com> "M.-A. Lemburg" wrote: > Could someone please first explain what these slots are used for > in the first place :-? There must be some difference to standard > class attributes... which is probably also the reason for the > above behaviour (even though it does look like a bug to me). My understanding is that the slots define the set of attributes-like things that instances of that class can have. It 'robs' the instances of such classes from having a __dict__ (at least conceptually), and provides much lighter weight attributes. Instances only need to store the _values_ in a struct/array, while the set of slot names and their order (in memory) is kept at the class level. This sort of thing is helpful from a memory point of view when you're dealing with e.g. large numbers of similar objects which all share the same attribute set (coordinates, Tk labels, whatnot). It doesn't explain (to me at least) why the lookup order is different. --david From andymac@bullseye.apana.org.au Fri Nov 2 12:16:29 2001 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Fri, 2 Nov 2001 23:16:29 +1100 (EDT) Subject: [Python-Dev] Re: OS/2 VAC++ patches In-Reply-To: Message-ID: On Tue, 30 Oct 2001, Andrew MacIntyre wrote: > 473749 - has several (minor IMO) stylistic issues (addition of OS/2 > #ifdefs). The intent of the changes looks OK, as do most of > the actual changes (those in OS/2 VAC specific files or > existing OS/2 #ifdefs). 
Michael has uploaded a revised patch addressing the stylistic points noted. > 474169 - looks good to go (changes inside existing OS/2 #ifdef) > 474500 - looks good to go (isolated to OS/2 specific file) 474500 has been committed (thanks Tim!). If possible I'd like to see both 473749 (the updated version) and 474169 make it into 2.2b2, to give 2.2 a chance to ship with the OS/2 VAC++ port buildable from the release sourceball (the EMX port won't make it into CVS for 2.2). Is someone (Tim?) prepared to consider committing these before 2.2b2 branches? Is there anything more I or Michael can do to make this happen? (as previously advised, I'm not going to be in a position to take on CVS commits until after 2.2b2) -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From mclay@nist.gov Fri Nov 2 18:02:27 2001 From: mclay@nist.gov (Michael McLay) Date: Fri, 2 Nov 2001 14:02:27 -0400 Subject: [Python-Dev] Change in evaluation order in new object model In-Reply-To: <3BE2C84A.B297DC7C@lemburg.com> References: <200111012242.RAA01504@email.nist.gov> <3BE2C84A.B297DC7C@lemburg.com> Message-ID: <200111021801.NAA19265@email.nist.gov> On Friday 02 November 2001 11:22 am, M.-A. Lemburg wrote: > Michael McLay wrote: > > I was suprised by a change to the order of evaluation of members in the > > new object type. I haven't found an explanation for why the change was > > made. I was comfortable with the way it worked. Is there an advantage > > to the change?. > > > > In the classic python model the interpreter looked in the instance > > dictionary and if the name wasn't there it looked in the class > > dictionary. The following illustrates this evaluation order. > > ... > > > > With the new slots mechanism the order has been reversed. The class > > level dictionary is searched and then the slots are evaluated. 
> >
> > >>> class B(object):
> >
> > __slots__ = ['a','b','c']
> >
> > >>> b = B()
> > >>> b.a = 4
> > >>> b.a
> >
> > 4
> >
> > >>> B.a = 6
> > >>> b.a
> >
> > 6
> >
> > >>> b.a = 8
> >
> > Traceback (most recent call last):
> > File "", line 1, in ?
> > b.a = 8
> > AttributeError: 'B' object attribute 'a' is read-only
>
> Could someone please first explain what these slots are used for
> in the first place :-? There must be some difference to standard
> class attributes... which is probably also the reason for the
> above behaviour (even though it does look like a bug to me).

Instances of classes defined with __slots__ do not use a __dict__ to hold
their attributes. Instead, each instance includes a table with one entry per
slot name. The "member descriptor" for each slot defines the offset into the
table for that slot. The size of each instance is reduced by the elimination
of the __dict__. The lookup of a member is also faster because it uses a
lookup of an offset instead of a dictionary lookup.

The following example shows some of the characteristics of slots.

>>> class B(object):
        __slots__ = ['a','b','c']

>>> b.__dict__
Traceback (most recent call last):
  File "", line 1, in ?
    b.__dict__
AttributeError: 'B' object has no attribute '__dict__'
>>> dir(b)
['__class__', '__delattr__', '__getattribute__', '__hash__', '__init__',
'__module__', '__new__', '__reduce__', '__repr__', '__setattr__',
'__slots__', '__str__', 'a', 'b', 'c']

A slot returns None if it is referenced before it is initialized:

>>> b = B()
>>> b.a
>>> b.a = 3
>>> b.a
3

If a reference is made to an undefined slot, an AttributeError is raised.

>>> b.d = 5
Traceback (most recent call last):
  File "", line 1, in ?
    b.d = 5
AttributeError: 'B' object has no attribute 'd'

The declaration of slots using __slots__ is somewhat awkward and Guido's
comment about the syntax indicates this may change.
One immediate problem with the syntax is that it doesn't support associating doc strings with slot names. Andrew described an example of how the descriptor[1] capability will allow Python to be extended in interesting ways. It turns out to be relatively easy to extend the descriptor capability. I submitted a patch yesterday that adds support for doc strings and optional type checking to the members defined by slots. The syntax to add the new capabilities is similar to the __slots__, but uses dictionaries to assign the values. class B(object): __slots__ = ['a','b','c'] __slot_docs__ = {'a':"doc string for a", 'b' : "doc string for b"} __slot_types__ = {'a':(int,str), 'c':int, } [1] http://www.amk.ca/python/2.2/index.html#SECTION000320000000000000000 From thomas.heller@ion-tof.com Fri Nov 2 18:07:16 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 2 Nov 2001 19:07:16 +0100 Subject: [Python-Dev] Change in evaluation order in new object model References: <200111012242.RAA01504@email.nist.gov> <3BE2C84A.B297DC7C@lemburg.com> Message-ID: <05cb01c163c9$349c4f40$e000a8c0@thomasnotebook> From: "M.-A. Lemburg" > Michael McLay wrote: > > > > I was suprised by a change to the order of evaluation of members in the new > > object type. I haven't found an explanation for why the change was made. I > > was comfortable with the way it worked. Is there an advantage to the change?. > > > > In the classic python model the interpreter looked in the instance dictionary > > and if the name wasn't there it looked in the class dictionary. The > > following illustrates this evaluation order. > > ... > > > > With the new slots mechanism the order has been reversed. The class level > > dictionary is searched and then the slots are evaluated. > > > > >>> class B(object): > > __slots__ = ['a','b','c'] > > > > >>> b = B() > > >>> b.a = 4 > > >>> b.a > > 4 > > >>> B.a = 6 > > >>> b.a > > 6 > > >>> b.a = 8 > > Traceback (most recent call last): > > File "", line 1, in ? 
> > b.a = 8 > > AttributeError: 'B' object attribute 'a' is read-only > > Could someone please first explain what these slots are used for > in the first place :-? There must be some difference to standard > class attributes... which is probably also the reason for the > above behaviour (even though it does look like a bug to me). The lookup order (instance dict, then class dict) hasn't changed; there is just no instance dict any longer. B.a is an attribute descriptor which retrieves the 'a' attribute from the instance. B.a = 6 doesn't change anything in the instance, it just masks the slot so that it isn't accessible any longer. Is this a bug, a feature or an implementation detail? No idea... Thomas From thomas.heller@ion-tof.com Fri Nov 2 18:20:31 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 2 Nov 2001 19:20:31 +0100 Subject: [Python-Dev] Change in evaluation order in new object model References: <200111012242.RAA01504@email.nist.gov> <3BE2C84A.B297DC7C@lemburg.com> <200111021801.NAA19265@email.nist.gov> Message-ID: <05ed01c163cb$11f93460$e000a8c0@thomasnotebook> > Andrew described an example of how the descriptor[1] capability will allow > Python to be extended in interesting ways. It turns out to be relatively > easy to extend the descriptor capability. I submitted a patch yesterday that > adds support for doc strings and optional type checking to the members > defined by slots. The syntax to add the new capabilities is similar to the > __slots__, but uses dictionaries to assign the values. > > class B(object): > __slots__ = ['a','b','c'] > __slot_docs__ = {'a':"doc string for a", 'b' : "doc string for b"} > __slot_types__ = {'a':(int,str), 'c':int, } > This is very interesting - I will definitely look at your patch. This can also be achieved by implementing custom attribute descriptors, without changes to the core. You just have to equip the class with them manually.
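[Editorial illustration] The remark about custom attribute descriptors can be sketched roughly as follows. This assumes a current Python 3 interpreter rather than 2.2, and the TypedAttribute class, its arguments, and the way it stores values in the instance __dict__ are hypothetical choices made for illustration; they are not taken from Michael's patch or from Thomas's project.

```python
class TypedAttribute(object):
    """Hypothetical data descriptor: carries a doc string and checks types."""

    def __init__(self, name, types, doc=None):
        self.name = name        # key used in the instance __dict__
        self.types = types      # a type or tuple of types accepted by __set__
        self.__doc__ = doc      # per-attribute doc string

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self         # accessed on the class: return the descriptor
        try:
            return obj.__dict__[self.name]
        except KeyError:
            raise AttributeError(self.name)

    def __set__(self, obj, value):
        # Defining __set__ makes this a data descriptor, so it takes
        # precedence over the entry of the same name in obj.__dict__.
        if not isinstance(value, self.types):
            raise TypeError("%s must be of type %r" % (self.name, self.types))
        obj.__dict__[self.name] = value


class B(object):
    # The class is equipped with the descriptors manually.
    a = TypedAttribute('a', (int, str), "doc string for a")
    c = TypedAttribute('c', int)


b = B()
b.a = 4          # accepted: int
b.a = "four"     # accepted: str
try:
    b.c = "not an int"
except TypeError as exc:
    rejected = str(exc)
```

Unlike __slots__, this sketch keeps the per-instance __dict__, so it buys the doc strings and type checks but none of the memory savings discussed in the thread.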
Currently I'm working on something similar (see http://lpfw.sf.net/, use the link titled 'Accessing and manipulating C data types'). Thomas From tim.one@home.com Fri Nov 2 18:32:30 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 2 Nov 2001 13:32:30 -0500 Subject: [Python-Dev] Change in evaluation order in new object model In-Reply-To: <3BE2C84A.B297DC7C@lemburg.com> Message-ID: [MAL] > Could someone please first explain what these slots are used for > in the first place :-? Please see my recent long reply to Michael, and don't confuse slots with __slots__ From tim.one@home.com Fri Nov 2 18:32:19 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 2 Nov 2001 13:32:19 -0500 Subject: [Python-Dev] Change in evaluation order in new object model In-Reply-To: <200111012242.RAA01504@email.nist.gov> Message-ID: [Michael McLay] > I was suprised by a change to the order of evaluation of members > in the new object type. I haven't found an explanation for why the > change was made. Read PEP 252, paying special attention to the section containing: When a dynamic attribute (one defined in a regular object's __dict__) has the same name as a static attribute (one defined by a meta-object in the inheritance graph rooted at the regular object's __class__), the static attribute has precedence if it is a descriptor that defines a __set__ method (see below); otherwise (if there is no __set__ method) the dynamic attribute has precedence. In other words, for data attributes (those with a __set__ method), the static definition overrides the dynamic definition, but for other attributes, dynamic overrides static. Rationale: we can't have a simple rule like "static overrides dynamic" or "dynamic overrides static", because ... > ... > With the new slots mechanism the order has been reversed. The > class level dictionary is searched and then the slots are evaluated. I should hope so! 
The *point* of __slots__ (which is what you're really talking about, not the general concept of "slots") is that the class, not the object, is responsible for doing the attribute name->storage_address mapping, and in intended use an object of a class with __slots__ doesn't even have a __dict__ (each __slot__ attribute is allocated at a fixed offset from the start of the object, saving tons of storage). >>> class C(object): ... __slots__ = ['a'] Now objects of type C don't have a dict: storage for one attribute 'a' is allocated directly in C objects. >>> c = C() >>> c.__dict__ Traceback (most recent call last): File "", line 1, in ? AttributeError: 'C' object has no attribute '__dict__' You can set and get 'a': >>> c.a = 12 >>> c.a 12 C.a is a special beast: >>> C.a What makes it special isn't that it came from __slots__, though, but that it has a __set__ method (reread the quoted text above until your eyes bleed : this is a deadly simple protocol, so simple that it can be hard to understand at first (shades of the metaclass hook and continuations, there)): >>> dir(C.a) ['__class__', '__delattr__', '__doc__', '__get__', '__getattribute__', ^^^^^^^ C.a.__get__ is called when c.a is referenced. '__hash__', '__init__', '__name__', '__new__', '__objclass__', '__reduce__', '__repr__', '__set__', '__setattr__', '__str__'] ^^^^^^^ C.a.__set__ is called when c.a is bound or del'ed. Objects of C type can't grow new attributes: >>> c.b =12 Traceback (most recent call last): File "", line 1, in ? AttributeError: 'C' object has no attribute 'b' > >>> class B(object): > __slots__ = ['a','b','c'] > > >>> b = B() > >>> b.a = 4 > >>> b.a > 4 > >>> B.a = 6 Here you overwrote the descriptor that allows b.a to mean something sensible (you nuked the thingie that maps 'a' to its storage address). Now B.a is an ordinary class attribute, and remember that b doesn't have a __dict__ (which you asked for, by using __slots__; you're not required to use __slots__). 
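[Editorial illustration] The effect described here, overwriting the descriptor that maps 'a' to its storage, can be reproduced with a minimal sketch. It assumes a current Python 3 interpreter, where the wording of the error message may differ from 2.2, so only the exception type is relied on. The variable names are illustrative.

```python
class B(object):
    __slots__ = ['a', 'b', 'c']


b = B()
b.a = 4

# Keep a reference to the member descriptor created for slot 'a'.
slot_descriptor = B.__dict__['a']

# Rebinding B.a replaces the descriptor with an ordinary class attribute.
B.a = 6
seen_through_class = b.a        # now found on the class, not in the slot

# b has no __dict__ and 'a' no longer has a descriptor with __set__,
# so the instance can no longer bind 'a'.
try:
    b.a = 8
    rebind_failed = False
except AttributeError:
    rebind_failed = True

# The value stored in the slot is still there; only the access path was masked.
still_stored = slot_descriptor.__get__(b, B)
```

Saving the descriptor first is only done here to show that the stored value survives; in normal use of __slots__ nothing should rebind the slot name on the class.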
> >>> b.a > 6 > >>> b.a = 8 > Traceback (most recent call last): > File "", line 1, in ? > b.a = 8 > AttributeError: 'B' object attribute 'a' is read-only I agree it's an odd msg, but I'm not sure it can do better easily: by overwriting B.a (which was nuts -- you're exploring pathologies here, not intended usage), you've left b as an object with an 'a' attribute inherited from its class, but also as an object that can't grow new attributes of its own. Python looks at "hmm, I *can't* set 'a', but I do *have* an 'a'", and comes up with "read-only". Try your example again without using __slots__ (you do *not* want __slots__ if you intend an object's namespace to be dynamic -- __slots__ announces that you guarantee the set of object attributes is fixed at class creation time): >>> class B(object): pass ... >>> b = B() >>> b.a = 4 >>> B.a = 6 >>> b.a 4 >>> b.a = 8 >>> b.a 8 >>> B.a 6 >>> IOW, don't use new features if you don't want new semantics, and things look much the same. If you want __slots__, though, there was no way to get its effect prior to 2.2 short of writing an ExtensionClass in C. From tim.one@home.com Fri Nov 2 18:38:17 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 2 Nov 2001 13:38:17 -0500 Subject: [Python-Dev] Change in evaluation order in new object model In-Reply-To: <200111021801.NAA19265@email.nist.gov> Message-ID: [Michael McLay] > ... > The lookup of a member is also faster because it uses a lookup of > an offset instead of a dictionary lookup. There's still a dict lookup: when you do obj.a where a is a __slot__ attribute of obj.__class__, obj.__class__.__dict__['a'] is looked up in order to get the descriptor for attribute 'a'. 
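[Editorial illustration] The dictionary lookup mentioned here can be made explicit with a small sketch, assuming a current Python 3 interpreter. It performs by hand roughly what attribute access does for a __slots__ attribute: find the descriptor in the class __dict__, then let the descriptor read or write the storage in the instance.

```python
class C(object):
    __slots__ = ['a']


c = C()
c.a = 12

# Step 1: the descriptor for 'a' is found by a dict lookup on the class.
descr = type(c).__dict__['a']

# Step 2: the descriptor reads the value from the instance's slot storage.
value = descr.__get__(c, type(c))

# Binding goes through the same descriptor's __set__ method.
descr.__set__(c, 13)
updated = c.a
```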
The fixed set of __slot__ attributes leaves a door open for future optimizations, though (e.g., if Python could *know* obj.__class__ at compile-time, and know that runtime code won't overwrite the 'a' descriptor in obj.__class__.__dict__, it could map obj.a directly to its storage offset (from the base of obj) at compile-time). From tim.one@home.com Fri Nov 2 19:43:33 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 2 Nov 2001 14:43:33 -0500 Subject: [Python-Dev] who's maintaining tools/idle? In-Reply-To: <00ba01c163a7$db2f4140$ced241d5@hagrid> Message-ID: [Fredrik Lundh] > just noticed that CVS is merging changes into icons/minusnode.gif > every time I do a cvs update. doesn't sound right. > > turns out that minusnode.gif and plusnode.gif has been checked in > as text files (no -kb option). > > how can I fix this? > > (I should know, but I only use perforce these days, and my memory > isn't what it used to be...) You're not going to believe this, but I swear it's true: this happens on Windows boxes when daylight saving time starts or ends, and sometimes when you change your computer's clock when traveling across timezones. The simplest fix is simply to delete the files from your local snapshot, and let a cvs update recreate them from scratch. If it's actually related to the lack of -kb (beats me!), Fred (Drake) knows how to get that flag set retroactively. From fdrake@acm.org Fri Nov 2 19:41:11 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 2 Nov 2001 14:41:11 -0500 Subject: [Python-Dev] who's maintaining tools/idle? In-Reply-To: References: <00ba01c163a7$db2f4140$ced241d5@hagrid> Message-ID: <15330.63191.34296.170455@grendel.zope.com> Tim Peters writes: > If it's actually related to the lack of -kb (beats me!), Fred (Drake) knows > how to get that flag set retroactively. 
Fredrik's diagnosis was good; that was exactly the behavior I'd expect to see on a platform with different line-termination characters than the server when dealing with CVS clients as finicky as those found on Windows. ;-) The fact that it was just these files certainly helps isolate the problem as well; what you see happens only with time-related changes and involves all the files currently in your checkout. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From jim@interet.com Fri Nov 2 20:22:17 2001 From: jim@interet.com (James C. Ahlstrom) Date: Fri, 02 Nov 2001 15:22:17 -0500 Subject: [Python-Dev] Caching directory files in import.c Message-ID: <3BE30079.D6A8FB52@interet.com> I have a new version of my zip importing code. As before, it reads the file names from zipfiles and records them in a global dictionary to speed up finding zip imports. But what about imports from directories? Looking at the code, I saw that I could do an os.listdir(path), and record the directory file names into the same dictionary. Then it would not be necessary to perform a large number of fopen()'s. The same dictionary lookup is used instead. Is this a good idea??? It seems it should be faster when a "large" percentage of files in a directory are imported. It should be slower when only one file is imported from a directory with many names. I think I remember people discussing this before. Is the speedup real and worth the slight amount of additional code? JimA From nas@python.ca Fri Nov 2 21:36:35 2001 From: nas@python.ca (Neil Schemenauer) Date: Fri, 2 Nov 2001 13:36:35 -0800 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Doc/lib libstatcache.tex,1.3,1.4 In-Reply-To: ; from fdrake@users.sourceforge.net on Fri, Nov 02, 2001 at 12:20:21PM -0800 References: Message-ID: <20011102133635.A29984@glacier.arctrix.com> Fred L. Drake wrote: > Add deprecation notice to statcache. Should a warning be added to the module as well? 
Neil From fdrake@acm.org Fri Nov 2 21:35:29 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 2 Nov 2001 16:35:29 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Doc/lib libstatcache.tex,1.3,1.4 In-Reply-To: <20011102133635.A29984@glacier.arctrix.com> References: <20011102133635.A29984@glacier.arctrix.com> Message-ID: <15331.4513.475211.429998@grendel.zope.com> Neil Schemenauer writes: > Should a warning be added to the module as well? I'm not sure; Guido likes to take his time getting those in. Since we don't really have a formal deprecation policy in practice, I'm inclined to let that go for a release. Given that we're talking about statcache, though, I certainly wouldn't object if someone adds the warning to the code! ;-) The filecmp module uses it if the user passes use_statcache=1 to the public functions, but I suspect that code could be changed to only import statcache if it needed to and suppress the warning in that case. Hmm. I'd better add a note about use_statcache eventually being deprecated; filecmp itself should not be deprecated. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From akuchlin@mems-exchange.org Fri Nov 2 22:08:45 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Fri, 2 Nov 2001 17:08:45 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules shamodule.c,2.15,2.16 In-Reply-To: ; from fdrake@users.sourceforge.net on Fri, Nov 02, 2001 at 02:04:19PM -0800 References: Message-ID: <20011102170845.B2477@ute.mems-exchange.org> On Fri, Nov 02, 2001 at 02:04:19PM -0800, Fred L. Drake wrote: >Simplfy the insint() macro to use PyModule_AddIntConstant(). Should this clean-up be done more generally? ute Modules>grep -l insint *.c dlmodule.c pcremodule.c selectmodule.c shamodule.c socketmodule.c ute Modules> --amk From fdrake@acm.org Fri Nov 2 22:05:30 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Fri, 2 Nov 2001 17:05:30 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules shamodule.c,2.15,2.16 In-Reply-To: <20011102170845.B2477@ute.mems-exchange.org> References: <20011102170845.B2477@ute.mems-exchange.org> Message-ID: <15331.6314.158425.158276@grendel.zope.com> Andrew Kuchling writes: > On Fri, Nov 02, 2001 at 02:04:19PM -0800, Fred L. Drake wrote: > >Simplfy the insint() macro to use PyModule_AddIntConstant(). > > Should this clean-up be done more generally? I'm not sure that "should" is the right term, but it has the advantage that it helps divorce the extension code from the implementation of modules, and should generally produced smaller compiled code without any drawbacks. I wouldn't object, for what that's worth. I'd certainly encourage it being done when an extension is being changed anyway. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From tim.one@home.com Sat Nov 3 07:02:57 2001 From: tim.one@home.com (Tim Peters) Date: Sat, 3 Nov 2001 02:02:57 -0500 Subject: [Python-Dev] 2.1.2: patching Modules/ In-Reply-To: <200111011724.SAA29749@paros.informatik.hu-berlin.de> Message-ID: > [\n\ -> \n" conversion] > Did these actually fix anything? No, but they can create problems for you, which is why I do them : if I'm doing "real work" on a module and find that mechanical issues are proving to be a non-trivial distraction, I try to check in the mechanical edits first, so they don't get mixed up with the real work. Ironic: my current editor understands Python's triple-quoted strings better than the Emacs python-mode does, but doesn't understand backslash continuation of strings at all. I did \n\ -> \n" even before, though, as the text doesn't line up right for the eyeball when only the first line of a continued string has a leading ". 
From tim.one@home.com Sun Nov 4 04:21:10 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 3 Nov 2001 23:21:10 -0500
Subject: [Python-Dev] RE: Future division detection
In-Reply-To:
Message-ID:

[Christopher A. Craig, playing with modifying the meaning of division]
> ...
> While doing this I was thinking that I would change true_division on
> ints and floats to return a rational and change the rational code to
> return a long if the denominator is 1. This works great, except that
> if future division is off then rationals can suddenly become longs and
> do not automatically cast back. This makes it virtually impossible to
> guarantee a correct result to nearly any rational computation that
> involves a division.
>
> So I wanted to know if there is some way to detect, at the object
> level, if the CO_FUTURE_DIVISION feature is active.

I'm unclear on what you're asking. In case it helps, note this section in
__future__.py:

# The CO_xxx symbols are defined here under the same names used by
# compile.h, so that an editor search will find them here. However,
# they're not exported in __all__, because they don't really belong to
# this module.
CO_NESTED = 0x0010 # nested_scopes
CO_GENERATOR_ALLOWED = 0x1000 # generators
CO_FUTURE_DIVISION = 0x2000 # division

A code object's co_flags member is a mask made up of these (among other)
bits:

>>> def f():
...     a/b
...
>>> hex(f.func_code.co_flags)
'0x3'
>>> from __future__ import division
>>> def f():
...     a/b
...
>>> hex(f.func_code.co_flags)
'0x2003'
>>>

So if you can get at a code object, you can tell whether it was compiled
with future division by checking its co_flags 0x2000 bit. This is internal
implementation detail, though, and there's NO GUARANTEE we won't reuse the
0x2000 bit for some other purpose in some future release.
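[Editorial sketch, not part of the original thread.] The co_flags check
described above can be written without hard-coding the bit, by asking the
__future__ module for a feature's compiler_flag. This is a minimal sketch
for a current Python 3 interpreter, where the attribute is __code__ rather
than func_code and, as far as I know, the division bit is no longer
recorded because true division is always on. For that reason the example
tests the "annotations" feature as a stand-in. The helper name
compiled_with is hypothetical.

```python
import __future__


def compiled_with(code, feature_name):
    """Return True if `code` carries the compiler flag of the named
    __future__ feature (hypothetical helper, for illustration only)."""
    feature = getattr(__future__, feature_name)
    return bool(code.co_flags & feature.compiler_flag)


# dont_inherit=True keeps compile() from picking up future flags that
# happen to be active in the code calling compile().
plain = compile("x = 1", "<plain>", "exec", dont_inherit=True)
future = compile(
    "from __future__ import annotations\nx = 1",
    "<future>",
    "exec",
    dont_inherit=True,
)

print(compiled_with(plain, "annotations"))
print(compiled_with(future, "annotations"))
```

The same caveat as in the message applies: which bits exist and what they
mean is an implementation detail, so looking the flag up by name is safer
than testing a literal such as 0x2000.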
From neal@metaslash.com Sun Nov 4 15:34:11 2001
From: neal@metaslash.com (Neal Norwitz)
Date: Sun, 04 Nov 2001 10:34:11 -0500
Subject: [Python-Dev] Memory leaks and uninitialized memory
Message-ID: <3BE55FF3.18C64627@metaslash.com>

I just submitted a bunch of patches to SF to correct memory leaks,
possible memory leaks, and uninitialized memory reads (UMR). The problems
were found by using purify on the regression tests.

There are still some problems that I have not been able to correct.
Here is a list of the biggest problems, the details are at the
end of this mail:

  test_long_future - leaks 43124 bytes (future division on longs)
  test_asynchat (and others) - leaks 2k+ socketmodule.c:633
  test___all__ (and others) - UMR marshal.c:438
  test_parser - leaks 5k - parsermodule.c:721 recurses to line 716

There are other smaller problems, but this mail is already quite long.
Once the bigger problems are resolved, I can work more on the small ones.

Let me know if there's more info necessary to fix these problems.
Also, if anyone wants to see the complete list of regression tests
run and the results, let me know.

What is the best format to post this info? Would there be a better
way to deal with these kinds of problems?
Neal -- test_long_future: there are memory leak problems with binary operations which use longs and future division: the following code leaks 32 bytes: >>> from __future__ import division >>> 5L / 3L here's the stack trace, using current CVS snapshot (from Sat): malloc [rtlib.o] muladd1 [longobject.c:51] PyLong_FromString [longobject.c:1027] parsenumber [compile.c:1096] com_atom [compile.c:1461] com_power [compile.c:1853] com_factor [compile.c:1982] com_term [compile.c:1994] com_arith_expr [compile.c:2027] com_shift_expr [compile.c:2053] com_and_expr [compile.c:2079] com_xor_expr [compile.c:2101] com_expr [compile.c:2123] com_comparison [compile.c:2177] com_and_test [compile.c:2252] com_test [compile.c:2353] and: muladd1 [longobject.c:51] PyLong_FromString [longobject.c:1027] parsenumber [compile.c:1096] com_atom [compile.c:1461] com_power [compile.c:1853] com_factor [compile.c:1982] com_term [compile.c:1992] com_arith_expr [compile.c:2027] com_shift_expr [compile.c:2053] com_and_expr [compile.c:2079] com_xor_expr [compile.c:2101] com_expr [compile.c:2123] com_comparison [compile.c:2177] com_and_test [compile.c:2252] com_test [compile.c:2353] here's a different stack trace if it helps: x_add [longobject.c:51] long_add [longobject.c:1408] binary_op1 [abstract.c:343] PyNumber_Add [abstract.c:578] eval_frame [ceval.c:960] PyEval_EvalCodeEx [ceval.c:2549] fast_function [ceval.c:3116] eval_frame [ceval.c:1996] PyEval_EvalCodeEx [ceval.c:2549] PyEval_EvalCode [ceval.c:483] PyImport_ExecCodeModuleEx [import.c:494] load_source_module [import.c:764] load_module [import.c:1348] import_submodule [import.c:1887] load_next [import.c:1743] test_asynchat: memory is allocated from socketmodule.c:633 (getaddr) it appears the memory is deallocated, but purify is still complaining here's the stack trace: malloc [rtlib.o] __IPv6_alloc [getipnodeby.c] getipnodebyname [getipnodeby.c] get_addr [getaddrinfo.c] getaddrinfo [libsocket.so.1] setipaddr [socketmodule.c:633] getsockaddrarg 
[socketmodule.c:821] PySocketSock_bind [socketmodule.c:1186] fast_cfunction [ceval.c:3086] eval_frame [ceval.c:1979] PyEval_EvalCodeEx [ceval.c:2549] PyEval_EvalCode [ceval.c:483] PyImport_ExecCodeModuleEx [import.c:494] load_source_module [import.c:764] load_module [import.c:1348] import_submodule [import.c:1887] test___all__: I'm not sure if this is a problem in Python or not. It's possible that the problem is atof()/strtod() reads the string in 4 byte increments. This is occurring while in: __big_float_times_power [libc.so.1] __decimal_to_binary_integer [libc.so.1] __decimal_to_unpacked [libc.so.1] decimal_to_double [libc.so.1] strtod [libc.so.1] r_object [marshal.c:438] r_object [marshal.c:524] r_object [marshal.c:595] PyMarshal_ReadObjectFromString [marshal.c:741] PyMarshal_ReadLastObjectFromFile [marshal.c:700] load_source_module [import.c:582] load_module [import.c:1348] import_submodule [import.c:1887] load_next [import.c:1743] import_module_ex [import.c:1594] PyImport_ImportModuleEx [import.c:1635] Reading 2 bytes from 0xffbe9034 on the stack. Address 0xffbe9034 is 452 bytes below frame pointer in function __decimal_to_unpacked. 
test_parser: This memory was allocated from: malloc [rtlib.o] PyNode_AddChild [node.c:35] build_node_children [parsermodule.c:716] build_node_children [parsermodule.c:721] build_node_children [parsermodule.c:721] build_node_children [parsermodule.c:721] build_node_children [parsermodule.c:721] build_node_children [parsermodule.c:721] build_node_children [parsermodule.c:721] build_node_children [parsermodule.c:721] build_node_children [parsermodule.c:721] build_node_children [parsermodule.c:721] build_node_children [parsermodule.c:721] build_node_children [parsermodule.c:721] build_node_children [parsermodule.c:721] build_node_children [parsermodule.c:721] Block of 20 bytes (69 times); last block at 0x4fc238 From fredrik@pythonware.com Sun Nov 4 16:13:58 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sun, 4 Nov 2001 17:13:58 +0100 Subject: [Python-Dev] Memory leaks and uninitialized memory References: <3BE55FF3.18C64627@metaslash.com> Message-ID: <006001c1654b$c80703a0$ced241d5@hagrid> hi neal, > I just submitted a bunch of patches to SF to correct memory leaks, > possible memory leaks, and uninitialized memory reads (UMR). (I tried to follow up over at sourceforge, but all I get is an error message saying "ERROR!" and nothing else...) I'm a bit puzzled over your proposed SRE patch. the patch changes return PyString_FromString(""); to result = PyString_FromString(""); Py_INCREF(result); return result; but both according the documentation and the implementation, FromString returns a new reference (usually another reference to the internal nullstring object). in other words, if there's a bug somewhere, I'm not sure it's in SRE. > What is the best format to post this info? Would there be a better > way to deal with these kinds or problems? keep on posting them to this list, until someone comes up with a better idea. 
From neal@metaslash.com Sun Nov 4 17:04:40 2001 From: neal@metaslash.com (Neal Norwitz) Date: Sun, 04 Nov 2001 12:04:40 -0500 Subject: [Python-Dev] Memory leaks and uninitialized memory References: <3BE55FF3.18C64627@metaslash.com> <006001c1654b$c80703a0$ced241d5@hagrid> Message-ID: <3BE57528.6A6829E1@metaslash.com> Fredrik Lundh wrote: > I'm a bit puzzled over your proposed SRE patch. > > the patch changes > > return PyString_FromString(""); > > to > > result = PyString_FromString(""); > Py_INCREF(result); > return result; > > in other words, if there's a bug somewhere, I'm not sure it's in > SRE. Hmmm, I thought that patch looked a bit suspect. I backed out this patch, then reran all the tests (143 of them) that this patch should have addressed. Nothing new was reported. So this patch must have been something I was playing with, but isn't correct. You can (try to) close this bug in SF. If I find a problem, I'll submit a better patch. Neal From tim.one@home.com Sun Nov 4 21:46:43 2001 From: tim.one@home.com (Tim Peters) Date: Sun, 4 Nov 2001 16:46:43 -0500 Subject: [Python-Dev] Memory leaks and uninitialized memory In-Reply-To: <3BE55FF3.18C64627@metaslash.com> Message-ID: [Neal Norwitz] > test_long_future: > there are memory leak problems with binary operations > which use longs and future division: > > the following code leaks 32 bytes: > > >>> from __future__ import division > >>> 5L / 3L However, from __future__ import division while 1: 5L / 3L can run all day without memory size increasing, so this "leak" is probably bogus (note that Python stores pointers to all sorts of malloc'ed memory into file-static vrbls, and tiny "leaks" are often-- at considerable cost --traced simply to that, e.g., a static Python string object constant got dynamically initialized). OTOH, if I change the tail end of test_long_future.py to while 1: test_true_division() it leaks like a sieve, so *something* is wrong there. 
I'll track it down; the routines that show up at the top of the stack traces appear to be blameless: > malloc [rtlib.o] > muladd1 [longobject.c:51] > PyLong_FromString [longobject.c:1027] > parsenumber [compile.c:1096] From tim.one@home.com Sun Nov 4 23:12:55 2001 From: tim.one@home.com (Tim Peters) Date: Sun, 4 Nov 2001 18:12:55 -0500 Subject: [Python-Dev] Memory leaks and uninitialized memory In-Reply-To: <3BE55FF3.18C64627@metaslash.com> Message-ID: [Neal Norwitz] > ... > There are still some problems that I have not be able to correct. > Here is a list of the biggest problems, the details are at the > end of this mail: > > test_long_future - leaks 43124 bytes (future division on longs) This one has been fixed -- or, at least, test_long_future.py in an infinite loop no longer grows. Thanks! I don't intend to look at more of these (too much else to do). If they're not resolved quickly, please open distinct bug reports for each, else they'll simply get lost. From neal@metaslash.com Mon Nov 5 01:12:11 2001 From: neal@metaslash.com (Neal Norwitz) Date: Sun, 04 Nov 2001 20:12:11 -0500 Subject: [Python-Dev] Memory leaks and uninitialized memory References: Message-ID: <3BE5E76B.5D5C7D2@metaslash.com> Tim Peters wrote: > I don't intend to look at more of these (too much else to do). If they're > not resolved quickly, please open distinct bug reports for each, else > they'll simply get lost. Tim: You wrote: Neal, if you do more of these, could you please limit them to one module per patch? Not all the suggested fixes have made sense, and the report gets to be a mess when only part of a patch can be applied. By module do you mean directory (Modules, Lib, Python, etc) or do you mean individual files? 
Neal From tim.one@home.com Mon Nov 5 02:24:55 2001 From: tim.one@home.com (Tim Peters) Date: Sun, 4 Nov 2001 21:24:55 -0500 Subject: [Python-Dev] Memory leaks and uninitialized memory In-Reply-To: <3BE5E76B.5D5C7D2@metaslash.com> Message-ID: [Neal] > By module do you mean directory (Modules, Lib, Python, etc) or do > you mean individual files? I mean one patch == the fewest number of files that must be changed simultaneously. This may vary from one (likely most common) to dozens (likely very uncommon), depending on the specific changes involved. I expect most "memory leak" fixes don't require coordinated changes across multiple files, and are better handled by one-file patches when that's possible. From tim.one@home.com Mon Nov 5 02:54:11 2001 From: tim.one@home.com (Tim Peters) Date: Sun, 4 Nov 2001 21:54:11 -0500 Subject: [Python-Dev] Re: OS/2 VAC++ patches In-Reply-To: Message-ID: [Andrew MacIntyre] > ... > If possible I'd like to see both 473749 (the updated version) I just now checked that in. > and 474169 I believe Martin checked that in previously. > make it into 2.2b2, to give 2.2 a chance to ship with the OS/2 VAC++ > port buildable from the release sourceball (the EMX port won't make it > into CVS for 2.2). That's all the OS/2 patches I know about, except for your EMX patches. > ... > (as previously advised, I'm not going to be in a position to take on CVS > commits until after 2.2b2) In that case we'll release 2.2b2 later tonight . Thanks for the patches! From tim.one@home.com Mon Nov 5 09:00:47 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 5 Nov 2001 04:00:47 -0500 Subject: [Python-Dev] Patch submitted for cross-platform newline support In-Reply-To: <20011031165523.85D36303181@snelboot.oratrix.nl> Message-ID: [Jack Jansen] > ... > I'm also interested in discussing whether a patch like this is > appropriate while we're in beta. On the one hand I would say it is, > because the feature is disabled by default. 
On the other hand there > are changes (albeit mainly cosmetic ones) in a large number of places. Not so many -- it's mostly in fileobject.c, which perhaps should not have surprised me. A prime reason we do "feature freeze" is to give adequate mindshare to the stuff already in the beta. I just spent 10 hours on a Sunday "dealing with" bugs and patches and 2.2 issues, including over an hour reviewing this patch. The 2.2 work I *wanted* to get out of the way this weekend never got started, and it's increasingly doubtful it ever will. If the patch were a no-brainer with no controversial aspects, and 2.2 were basically done in all other respects, it would be easy to say "sure"; as is, there are threading issues and (lack of) error-detection issues and (lack of) doc issues and user interface issues ... and this sure looks like something that *should* have had a PEP and wide community debate (not just in the Mac community). I like the idea of the patch, but the timing is really bad (esp. with Guido on leave this month, American holidays coming up fast, and a backlog of bug reports and patches growing in the wrong direction). OTOH, since it's off by default, "no harm done" is an arguable position. If it does go in, and you turn it on for the Mac, and we decide we need something different in 2.3, will we be doomed to play the "sorry, can't change it -- backward compatibility!" game? If so, I'm -1 on it for 2.2. If it's treated as purely experimental and subject to arbitrary potentially incompatible change, then -0 (the "-" still for mindshare-dilution reasons alone). From com-nospam@ccraig.org Mon Nov 5 16:13:37 2001 From: com-nospam@ccraig.org (Christopher A. Craig) Date: 05 Nov 2001 11:13:37 -0500 Subject: [Python-Dev] Re: Future division detection In-Reply-To: References: Message-ID: "Tim Peters" writes: > [Christopher A. Craig, playing with modifying the meaning of division] > > ...
> > While doing this I was thinking that I would change true_division on > > ints and floats to return a rational and change the rational code to > > return a long if the denominator is 1. This works great, except that > > if future division is off then rationals can suddenly become longs and > > do not automatically cast back. This makes it virtually impossible to > > guarantee a correct result to nearly any rational computation that > > involves a division. > > > > So I wanted to know if there is some way to detect, at the object > > level, if the CO_FUTURE_DIVISION feature is active. > > I'm unclear on what you're asking. In case it helps, note this section in > __future__.py: > Hmm, I thought I was being too verbose, I guess I was being too unclear. I have a rational module (in C) which I am trying to patch to be an object satisfying PEP-239. The problem is that I would like to have an rational cast back to a integer iff (1) the denominator is 1 and (2) future division is active. If I don't check (2) then I get the situation that if future division is inactive then `(rational('1/3')*3)/5` would yield 0 instead of 1/5. This makes rationals pretty much useless unless future division is active. What I would like to do is have a function that can check to see if the CO_FUTURE_DIVISION flag is set and if it is and the denominator is 1 then return the numerator, else return the rational. I have a strong suspicion that this can't be done, in which case I just won't do the cast back automatically. -- Christopher A. 
Craig

From skip@pobox.com (Skip Montanaro) Mon Nov 5 16:29:51 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Mon, 05 Nov 2001 17:29:51 +0100
Subject: [Python-Dev] Re: Future division detection
In-Reply-To:
References:
Message-ID:

    Christopher> What I would like to do is have a function that can check
    Christopher> to see if the CO_FUTURE_DIVISION flag is set and if it is
    Christopher> and the denominator is 1 then return the numerator, else
    Christopher> return the rational.

Yeah, you can do this. It's just a little awkward.

    % python
    Python 2.2b1+ (#9, Oct 29 2001, 14:53:15)
    [GCC 2.96 20000731 (Linux-Mandrake 8.0 2.96-0.48mdk)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from __future__ import division
    >>> def f():
    ...     pass
    ...
    >>> f.func_code.co_flags
    8195
    >>>
    % python
    Python 2.2b1+ (#9, Oct 29 2001, 14:53:15)
    [GCC 2.96 20000731 (Linux-Mandrake 8.0 2.96-0.48mdk)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> def f():
    ...     pass
    ...
    >>> f.func_code.co_flags
    3

Note how the code object's co_flags field changed. I think you just need to
& it with 2<<12. During import you could have something like this:

    def _junk(): pass
    future_div = not not (_junk.func_code.co_flags & (2<<12))
    del _junk

then test the value of future_div where you need it.

--
Skip Montanaro (skip@pobox.com)
http://www.mojam.com/
http://www.musi-cal.com/

P.S. I hope I guessed right about demangling your email address. If so,
I've just exposed it to all the little email harvesting gremlins. So
sorry... ;-)

From mwh@python.net Mon Nov 5 16:48:16 2001
From: mwh@python.net (Michael Hudson)
Date: 05 Nov 2001 11:48:16 -0500
Subject: [Python-Dev] Re: Future division detection
In-Reply-To: com-nospam@ccraig.org's message of "05 Nov 2001 11:13:37 -0500"
References:
Message-ID: <2mvggpp3jj.fsf@starship.python.net>

com-nospam@ccraig.org (Christopher A. Craig) writes:

> "Tim Peters" writes:
>
> > [Christopher A.
Craig, playing with modifying the meaning of division]
> > > ...
> > > While doing this I was thinking that I would change true_division on
> > > ints and floats to return a rational and change the rational code to
> > > return a long if the denominator is 1. This works great, except that
> > > if future division is off then rationals can suddenly become longs and
> > > do not automatically cast back. This makes it virtually impossible to
> > > guarantee a correct result to nearly any rational computation that
> > > involves a division.
> > >
> > > So I wanted to know if there is some way to detect, at the object
> > > level, if the CO_FUTURE_DIVISION feature is active.
> >
> > I'm unclear on what you're asking. In case it helps, note this section in
> > __future__.py:
> >
>
> Hmm, I thought I was being too verbose, I guess I was being too
> unclear. I have a rational module (in C) which I am trying to patch
> to be an object satisfying PEP-239. The problem is that I would like
> to have an rational cast back to a integer iff (1) the denominator is 1
> and (2) future division is active.
>
> If I don't check (2) then I get the situation that if future division
> is inactive then `(rational('1/3')*3)/5` would yield 0 instead of 1/5.
> This makes rationals pretty much useless unless future division is
> active.
>
> What I would like to do is have a function that can check to see if
> the CO_FUTURE_DIVISION flag is set and if it is and the denominator is
> 1 then return the numerator, else return the rational.
>
> I have a strong suspicion that this can't be done, in which case I
> just won't do the cast back automatically.

Does this:

int is_future_div(void)
{
    PyCompilerFlags cf;
    PyEval_MergeCompilerFlags(&cf);
    return cf.cf_flags & CO_FUTURE_DIVISION;
}

work?

You'll need to change this when future division becomes the default,
but I think it'll work today. This is a murky dark corner of the
interpreter, though -- so don't blame me when it breaks!

Cheers,
M.
--
  >> REVIEW OF THE YEAR, 2000 <<
  It was shit. Give us another one.
    -- NTK Know, 2000-12-29, http://www.ntk.net/

From thomas.heller@ion-tof.com Mon Nov 5 17:31:25 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Mon, 5 Nov 2001 18:31:25 +0100
Subject: [Python-Dev] __class_init__
Message-ID: <052501c1661f$b21e8b10$e000a8c0@thomasnotebook>

ExtensionClass did include a __class_init__ class-method,
which has been called at the end of class creation.

I've uploaded a (simple minded) patch to fileobject.c,
which implements the same behaviour for new style classes.
It is simple minded because it only iterates over
the tp_base member and not over tp_bases.

Any chance this could be included in Python 2.2?

http://sourceforge.net/tracker/index.php?func=detail&aid=478374&group_id=5470&atid=305470

Thomas

From mclay@nist.gov Mon Nov 5 17:54:10 2001
From: mclay@nist.gov (Michael McLay)
Date: Mon, 5 Nov 2001 12:54:10 -0500
Subject: [Python-Dev] PEP 252 comment
Message-ID: <200111051753.MAA24824@email.nist.gov>

The potential for optimizing the new class/type system in Python can be
improved by adding an additional constraint to the name lookup rules[1].
This new constraint only applies to the new name lookup rules for types
that include __slots__ definitions, so the change should not break
backward compatibility.

  - Any name defined by __slots__ is not allowed as a key in the type dict.

This rule would eliminate special type attribute names, like __dict__ and
__class__, from the names allowed in __slots__. The constraint would also
make it illegal to inject a name into the type dictionary from outside the
class definition if that name is one of the names in __slots__.

In the following example the constraint would allow the assignment to B.d,
but not the assignment to B.b. The assignment of 'c=5' would trigger a
TypeError because c is defined in __slots__.
The '__dict__' in __slots__ would also trigger a type error because it is a
special attribute name that is defined for all types.

>>> class B(object):
        a = 3
        c = 5
        __slots__ = ['b', '__dict__' , 'c']
        def __init__(self):
            self.b = 4

>>> B.d = 'fish'
>>> b = B()
>>> b.b
4
>>> b.d
'fish'
>>> B.b = 3.2
>>> b.b
3.2000000000000002
>>> b.b = 55
Traceback (most recent call last):
  File "", line 1, in ?
    b.b = 55
AttributeError: 'B' object attribute 'b' is read-only
>>>

The rule eliminates the need to look in the type dictionary if the name is
defined in __slots__. The cryptic AttributeError message in the example
would never occur.

A slightly more restrictive rule would extend the constraint to eliminate
any addition of names to the type dict from outside of the class
definition. This additional constraint would make the assignment to B.d
illegal in the previous example. This rule would make it possible to know
the complete list of all names that could be referenced by an object of the
given type.

[1] From http://python.sourceforge.net/peps/pep-0252.html

  In the more complicated case, there's a conflict between names
  stored in the instance dict and names stored in the type dict.
  If both dicts have an entry with the same key, which one should
  we return? Looking at classic Python for guidance, I find
  conflicting rules: for class instances, the instance dict
  overrides the class dict, *except* for the special attributes
  (like __dict__ and __class__), which have priority over the
  instance dict.

  - I resolved this with the following set of rules, implemented in
    PyObject_GenericGetAttr():

    1. Look in the type dict. If you find a *data* descriptor,
       use its get() method to produce the result. This takes
       care of special attributes like __dict__ and __class__.

    2. Look in the instance dict. If you find anything, that's
       it. (This takes care of the requirement that normally the
       instance dict overrides the class dict.)

    3.
Look in the type dict again (in reality this uses the saved result from step 1, of course). If you find a descriptor, use its get() method; if you find something else, that's it; if it's not there, raise AttributeError. From mclay@nist.gov Fri Nov 2 21:06:00 2001 From: mclay@nist.gov (Michael McLay) Date: Fri, 2 Nov 2001 17:06:00 -0400 Subject: [Python-Dev] Change in evaluation order in new object model In-Reply-To: References: Message-ID: <200111051753.MAA24813@email.nist.gov> On Friday 02 November 2001 01:32 pm, Tim Peters wrote: > [Michael McLay] > > > I was suprised by a change to the order of evaluation of members > > in the new object type. I haven't found an explanation for why the > > change was made. > > Read PEP 252, paying special attention to the section containing: > > When a dynamic attribute (one defined in a regular object's > __dict__) has the same name as a static attribute (one defined > by a meta-object in the inheritance graph rooted at the regular > object's __class__), the static attribute has precedence if it > is a descriptor that defines a __set__ method (see below); > otherwise (if there is no __set__ method) the dynamic attribute > has precedence. In other words, for data attributes (those > with a __set__ method), the static definition overrides the > dynamic definition, but for other attributes, dynamic overrides > static. > > Rationale: we can't have a simple rule like "static overrides > dynamic" or "dynamic overrides static", because ... > > > ... > > With the new slots mechanism the order has been reversed. The > > class level dictionary is searched and then the slots are evaluated. > > I should hope so! 
The *point* of __slots__ (which is what you're really > talking about, not the general concept of "slots") is that the class, not > the object, is responsible for doing the attribute name->storage_address > mapping, and in intended use an object of a class with __slots__ doesn't > even have a __dict__ (each __slot__ attribute is allocated at a fixed > offset from the start of the object, saving tons of storage). Yes, that is one of the reasons I want to use them. I've read the paragraph and my eyes are bleeding. I'm still trying to understand why the definition in that paragraph caused the order to be reversed. Why does the dynamic class attribute override an instance attribute? The attributes added by __slots__ have __set__ and __get__ methods so they should take precedence according to the definition. The __slots__ names could have been used prior to the names in the dictionary in the class, just like the __dict__ in an old class instance object was searched prior to the __dict__ in the class definition. > > > > >>> b = B() > > >>> b.a = 4 > > >>> b.a > > > > 4 > > > > >>> B.a = 6 > > Here you overwrote the descriptor that allows b.a to mean something > sensible (you nuked the thingie that maps 'a' > to its storage address). Now B.a is an ordinary class attribute, and > remember that b doesn't have a __dict__ (which you asked for, by using > __slots__; you're not required to use __slots__). Do you consider it a good thing that a class attribute can be introduced outside of the class definition? Given the nature of the changes being introduced with the new type system I would think the ability to add names to a class dictionary from outside the class would be turned off for the new type classes. If not for all types then at least for cases where the __slots__ mechanism is being used to define members. The slots mechanism also allows the same name to be used twice in the definition of a class.
>>> class B(object): __slots__ = ['a','b','c','c'] I suspect this is going to cause a slot to be created for the first 'c' and then covered up by the second 'c'. >>> class B(object): __slots__ = ['a','b','c'] >>> class C(B): __slots__ = ['d','a'] In this example the same thing will happen when names are redefined in a subclass. > > >>> b.a = 8 > > > > Traceback (most recent call last): > > File "", line 1, in ? > > b.a = 8 > > AttributeError: 'B' object attribute 'a' is read-only > > I agree it's an odd msg, but I'm not sure it can do better easily: by > overwriting B.a (which was nuts -- you're exploring pathologies here, not > intended usage), Why not make the overwriting of B.a generate an error message? > you've left b as an object with an 'a' attribute inherited > from its class, but also as an object that can't grow new attributes of its > own. Python looks at "hmm, I *can't* set 'a', but I do *have* an 'a'", and > comes up with "read-only". > > Try your example again without using __slots__ (you do *not* want __slots__ > if you intend an object's namespace to be dynamic -- __slots__ announces > that you guarantee the set of object attributes is fixed at class creation If I don't want the object's namespace to be dynamic then the class in which it is defined should not be dynamic either. > > time): > >>> class B(object): pass > > ... > > >>> b = B() > >>> b.a = 4 > >>> B.a = 6 > >>> b.a > > 4 > > >>> b.a = 8 > >>> b.a > > 8 > > >>> B.a > > 6 > > > IOW, don't use new features if you don't want new semantics, and things > look much the same. If you want __slots__, though, there was no way to get > its effect prior to 2.2 short of writing an ExtensionClass in C. I was looking for ways in which people could mess up data in objects from outside of those objects. The current semantics will allow people to do dumb things that will break code silently rather than raising an error. I need and like the new semantics. I have been extending them to allow optional type checking.
This will eliminate piles of ugly classes that wrap attribute access with isinstance type checks if they are added to the member descriptor and handled automatically by the set and get functions. From com-nospam@ccraig.org Mon Nov 5 17:56:36 2001 From: com-nospam@ccraig.org (Christopher A. Craig) Date: 05 Nov 2001 12:56:36 -0500 Subject: [Python-Dev] Re: Future division detection In-Reply-To: <2mvggpp3jj.fsf@starship.python.net> References: <2mvggpp3jj.fsf@starship.python.net> Message-ID: Michael Hudson writes: > Does this: > > int is_future_div(void) > { > PyCompilerFlags cf; > PyEval_MergeCompilerFlags(&cf); > return cf.cf_flags & CO_FUTURE_DIVISION; > } > > work? Yes! That's exactly the sort of thing I was looking for. I couldn't find the "PyEval_MergeCompilerFlags()" call. > You'll need to change this when future division becomes the default, > but I think it'll work today. This is a murky dark corner of the > interpreter, though -- so don't blame me when it breaks! I knew this, but my alternative was having to change behavior when future division becomes default, which would be much worse. When I decided to try to make my module into a patch and bend integers and longs to my will I knew that I was going to have to enter murky corners of the interpreter :-) -- Christopher A. Craig From tim.one@home.com Tue Nov 6 01:56:15 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 5 Nov 2001 20:56:15 -0500 Subject: [Python-Dev] RE: Program very slow to finish In-Reply-To: <3BE737A8.8CF867A5@chello.nl> Message-ID: Speed freaks should look up this thread on c.l.py; also a related SF bug report I recently closed as "Won't Fix". Roeland Rengelink set up a simple test that builds increasingly large dicts, timing the per-item creation and destruction times. 
This was in response to another poster who bumped into the "geez, my program seems to take *forever* to exit" annoyance, where final decref'ing of many objects *can* take hours to complete (normally people only notice this at program exit, but it can happen whenever a large number of objects get freed). Roeland found the creation time per dict element on his Linux system was pretty steady, but destruction time per element grew disturbingly with dict size. I found the same on Win98SE, but the degeneration in destruction time per element was milder than on his Linux test. I fiddled dict deallocation on my box to do everything *except* call free() when a refcount hit zero (the dict contained only string objects, so free() was the only thing left out -- strings have a trivial destructor). So the memory leaked, but per-element destruction time no longer increased with dict size, i.e. "the problem" on my box was entirely due to MS free() behavior. I suggested to Roeland that he try rebuilding his Python with PyMalloc enabled, just to see what would happen. This is what happened (time is average microseconds per dict entry, as computed from time.time() deltas captured across whole-dict operations): > Well, ain't that nice > > 2.2b1 --with-pymalloc > > size: 10000, creation: 29.94, destruction: 0.61 > size: 20000, creation: 30.10, destruction: 0.64 > size: 50000, creation: 30.73, destruction: 0.71 > size: 100000, creation: 30.72, destruction: 0.68 > size: 200000, creation: 30.95, destruction: 0.69 > size: 500000, creation: 30.62, destruction: 0.67 > size: 1000000, creation: 30.71, destruction: 0.68 > > malloc is faster too ;) This is what he saw earlier, using his platform malloc/free: > All times in micro-seconds per item. For the code see end of this post.
> (Linux 2.2.14, 128M RAM, Cel 333 MHz) > > size: 10000, creation: 31.00, destruction: 1.49 > size: 20000, creation: 31.10, destruction: 1.57 > size: 50000, creation: 32.77, destruction: 1.76 > size: 100000, creation: 32.00, destruction: 1.92 > size: 200000, creation: 32.59, destruction: 2.38 > size: 500000, creation: 32.12, destruction: 4.35 > size: 1000000, creation: 32.25, destruction: 10.47 Can any Python-Dev'er make time to dig into the advisability of making PyMalloc the default? I only took time for this because I'm out sick today, and was looking for something mindless to occupy my fevered thoughts; alas, it paid off . I recall there are still thread issues wrt PyMalloc, and there *were* some reports that PyMalloc was slower on some platforms. Against that, I'm the guy who usually gets stuck trying to explain the inexplicable, and malloc/free performance are so critical to Python performance that it's always been "a problem" that we have no idea how system malloc/free behave across platforms (although I suppose it's "a feature" that I know how to crash Win9X by provoking problems with its malloc ). I can't make time for it in the 2.2 timeframe, though. factors-of-2-to-15-are-worth-a-little-effort-if-they're-real-ly y'rs - tim From barry@zope.com Tue Nov 6 04:11:55 2001 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 5 Nov 2001 23:11:55 -0500 Subject: [Python-Dev] Re: [Zope.Com Geeks] Re: Program very slow to finish References: <3BE737A8.8CF867A5@chello.nl> Message-ID: <15335.25355.731350.856289@anthem.wooz.org> >>>>> "TP" == Tim Peters writes: TP> Roeland found the creation time per dict element on his Linux TP> system was pretty steady, but destruction time per element TP> grew disturbingly with dict size. I found the same on TP> Win98SE, but the degeneration in destruction time per element TP> was milder than on his Linux test. Very interesting! I think I've noticed the same thing with some Pipermail (Mailman's archiver) work I've been doing. 
When doing a from-scratch re-generation of the archives of say, python-list (280+MB), it creates some really big dicts, and when it goes to free them (apparently in gc), the disk just gets hammered. Relatively little CPU is being used (it's like 80-90% idle), but the machine is almost unresponsive. This on a 2.4.2 kernel with 256MB. I'll try building a 2.2 with pymalloc and see what happens. -Barry From tim.one@home.com Tue Nov 6 06:13:16 2001 From: tim.one@home.com (Tim Peters) Date: Tue, 6 Nov 2001 01:13:16 -0500 Subject: [Python-Dev] RE: [Zope.Com Geeks] Re: Program very slow to finish In-Reply-To: <15335.25355.731350.856289@anthem.wooz.org> Message-ID: [Barry] > Very interesting! I think I've noticed the same thing with some > Pipermail (Mailman's archiver) work I've been doing. When doing a > from-scratch re-generation of the archives of say, python-list > (280+MB), it creates some really big dicts, and when it goes to free > them (apparently in gc), the disk just gets hammered. Relatively > little CPU is being used (it's like 80-90% idle), but the machine is > almost unresponsive. This on a 2.4.2 kernel with 256MB. Maybe related, but not the same: Roeland was careful to run tests small enough that the disk didn't get involved. His slowdowns were on all-in-RAM cases, and he reported high CPU utilization throughout. Once the disk gets involved, it can be a true nightmare: while we traverse the dict in cache-friendly left-to-right sequential order, the memory it points *at* jumps all over creation from one slot to the next. Several years ago I amusedly waited 3 hours for a Python program to terminate on my old Win95 box; I would have waited longer except the disk grinding was interfering with hearing the TV . We could likely "fix that", via sorting the dict entries by pointed-at memory address before decref'ing (waste a second to save a day); note that there's already a superstitious feeble approximation to that in list_dealloc. 
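The all-in-RAM measurements discussed in this thread are easy to approximate. What follows is a minimal sketch, not Roeland's actual script (which is not reproduced here): the names time_dict and report, the key format, and the use of time.perf_counter() rather than time.time() are all assumptions made for illustration.

```python
import time


def time_dict(size):
    """Return (creation, destruction) cost in microseconds per element.

    Hypothetical helper, not the script used in the thread.
    """
    start = time.perf_counter()
    d = {}
    for i in range(size):
        # 7-character string keys; the dict holds the only reference to
        # each key, so freeing the dict frees every key as well.
        d["%07d" % i] = None
    created = time.perf_counter() - start

    start = time.perf_counter()
    del d  # last reference: the dict and all its keys are released here
    destroyed = time.perf_counter() - start

    scale = 1e6 / size
    return created * scale, destroyed * scale


def report(size):
    """Format one result line in the style quoted in this thread."""
    creation, destruction = time_dict(size)
    return "size: %d, creation: %.2f, destruction: %.2f" % (
        size, creation, destruction)


if __name__ == "__main__":
    for n in (10000, 20000, 50000, 100000):
        print(report(n))
```

Absolute numbers depend on the machine and the allocator in use; the thing to watch is whether destruction time per element stays flat as size grows.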
> I'll try building a 2.2 with pymalloc and see what happens. Well, I'm waiting . From barry@zope.com Tue Nov 6 06:41:21 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 6 Nov 2001 01:41:21 -0500 Subject: [Python-Dev] RE: [Zope.Com Geeks] Re: Program very slow to finish References: <15335.25355.731350.856289@anthem.wooz.org> Message-ID: <15335.34321.928299.606678@anthem.wooz.org> >>>>> "TP" == Tim Peters writes: >> I'll try building a 2.2 with pymalloc and see what happens. TP> Well, I'm waiting . Me too! We'll see what it looks like tomorrow. So far it's been interesting, but I'm too tired to babysit it any longer. if-it's-still-thrashing-in-the-morning-/you/-have-to-change-its-diaper- ly y'rs, -Barry From just@letterror.com Tue Nov 6 11:05:22 2001 From: just@letterror.com (Just van Rossum) Date: Tue, 6 Nov 2001 12:05:22 +0100 Subject: [Python-Dev] RE: [Zope.Com Geeks] Re: Program very slow to finish In-Reply-To: <15335.34321.928299.606678@anthem.wooz.org> Message-ID: <20011106120529-r01010800-97fb58e4-0920-010c@10.0.0.23> Here are some results on a 400 Mhz G4 PPC, 256 Megs of RAM, OSX 10.1, CVS Python: without pymalloc: size: 10000, creation: 14.73 (usec/elem), destruction: 1.46 (usec/elem) size: 20000, creation: 21.90 (usec/elem), destruction: 1.73 (usec/elem) size: 50000, creation: 18.10 (usec/elem), destruction: 1.82 (usec/elem) size: 100000, creation: 16.99 (usec/elem), destruction: 1.96 (usec/elem) size: 200000, creation: 16.29 (usec/elem), destruction: 2.11 (usec/elem) size: 400000, creation: 26.00 (usec/elem), destruction: 2.61 (usec/elem) size: 600000, creation: 21.78 (usec/elem), destruction: 3.64 (usec/elem) size: 800000, creation: 24.34 (usec/elem), destruction: 2.90 (usec/elem) size: 1000000, creation: 20.47 (usec/elem), destruction: 3.36 (usec/elem) with pymalloc: size: 10000, creation: 15.21 (usec/elem), destruction: 0.67 (usec/elem) size: 20000, creation: 19.58 (usec/elem), destruction: 0.48 (usec/elem) size: 50000, creation: 18.16 
(usec/elem), destruction: 0.64 (usec/elem) size: 100000, creation: 17.54 (usec/elem), destruction: 0.66 (usec/elem) size: 200000, creation: 17.04 (usec/elem), destruction: 0.74 (usec/elem) size: 400000, creation: 16.90 (usec/elem), destruction: 0.69 (usec/elem) size: 600000, creation: 16.13 (usec/elem), destruction: 0.63 (usec/elem) size: 800000, creation: 16.74 (usec/elem), destruction: 0.68 (usec/elem) size: 1000000, creation: 16.31 (usec/elem), destruction: 0.64 (usec/elem) 266 Mhz G3 PPC, 160 Megs of RAM, OSX 10.0.4, CVS Python: without pymalloc: size: 10000, creation: 22.72 (usec/elem), destruction: 1.98 (usec/elem) size: 20000, creation: 21.17 (usec/elem), destruction: 2.12 (usec/elem) size: 50000, creation: 22.80 (usec/elem), destruction: 2.26 (usec/elem) size: 100000, creation: 22.58 (usec/elem), destruction: 2.38 (usec/elem) size: 200000, creation: 22.68 (usec/elem), destruction: 2.54 (usec/elem) size: 400000, creation: 22.79 (usec/elem), destruction: 2.93 (usec/elem) size: 600000, creation: 21.09 (usec/elem), destruction: 3.54 (usec/elem) size: 800000, creation: 24.75 (usec/elem), destruction: 18.47 (usec/elem) size: 1000000, creation: 119.68 (usec/elem), destruction: 435.15 (usec/elem) with pymalloc: size: 10000, creation: 20.40 (usec/elem), destruction: 0.58 (usec/elem) size: 20000, creation: 19.77 (usec/elem), destruction: 0.62 (usec/elem) size: 50000, creation: 20.86 (usec/elem), destruction: 0.75 (usec/elem) size: 100000, creation: 21.00 (usec/elem), destruction: 0.76 (usec/elem) size: 200000, creation: 21.21 (usec/elem), destruction: 0.76 (usec/elem) size: 400000, creation: 21.32 (usec/elem), destruction: 0.79 (usec/elem) size: 600000, creation: 20.35 (usec/elem), destruction: 0.71 (usec/elem) size: 800000, creation: 21.44 (usec/elem), destruction: 0.80 (usec/elem) size: 1000000, creation: 20.80 (usec/elem), destruction: 0.76 (usec/elem) From nas@python.ca Tue Nov 6 17:02:26 2001 From: nas@python.ca (Neil Schemenauer) Date: Tue, 6 Nov 2001 09:02:26 
-0800 Subject: [Python-Dev] RE: Program very slow to finish In-Reply-To: ; from tim.one@home.com on Mon, Nov 05, 2001 at 08:56:15PM -0500 References: <3BE737A8.8CF867A5@chello.nl> Message-ID: <20011106090226.B7312@glacier.arctrix.com> Tim Peters wrote: > Can any Python-Dev'er make time to dig into the advisability of making > PyMalloc the default? I only took time for this because I'm out sick today, > and was looking for something mindless to occupy my fevered thoughts; alas, > it paid off . I recall there are still thread issues wrt PyMalloc, > and there *were* some reports that PyMalloc was slower on some platforms. The problem is with extension modules. We can make sure code in CVS always has the big lock held when calling PyMalloc. We can't be sure that extension modules are safe. The other, more serious problem is that many extension modules allocate memory with PyObject_New() and free it with free() or PyMem_DEL() instead of PyObject_Del() or PyObject_DEL(). If pymalloc is enabled then memory allocated by a PyObject_* function must be freed by a PyObject_* function. mxDateTime is the first module I ran into that does this but there are many others I'm sure. I think almost all of the C modules I have written do it. The code in xxmodule.c used to do it as well. no-solutions-only-more-problems-hey-i'm-sick-today-too-ly y'rs Neil From mal@lemburg.com Tue Nov 6 17:29:39 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 06 Nov 2001 18:29:39 +0100 Subject: [Python-Dev] RE: Program very slow to finish References: <3BE737A8.8CF867A5@chello.nl> <20011106090226.B7312@glacier.arctrix.com> Message-ID: <3BE81E03.374B37E6@lemburg.com> Neil Schemenauer wrote: > > Tim Peters wrote: > > Can any Python-Dev'er make time to dig into the advisability of making > > PyMalloc the default? I only took time for this because I'm out sick today, > > and was looking for something mindless to occupy my fevered thoughts; alas, > > it paid off .
I recall there are still thread issues wrt PyMalloc, > > and there *were* some reports that PyMalloc was slower on some platforms. > > The problem is with extension modules. We can make sure code in CVS > always has the big lock held when calling PyMalloc. We can't be sure > that extension modules are safe. > > The other, more serious problem is that many extension modules allocate > memory with PyObject_New() and free it with free() or PyMem_DEL() > instead of PyObject_Del() or PyObject_DEL(). If pymalloc is enabled > then memory allocated by a PyObject_* function must be freed by a > PyObject_* function. > > mxDateTime is the first module I ran into that does this but there are > many others I'm sure. I think almost all of the C modules I have > written do it. The code in xxmodule.c used to do it as well. What is considered the "right" approach for this ? Should Python objects *always* be deallocated using one of PyObject_Del() and PyObject_DEL() or is PyMem_DEL() usable as well ? The reason I'm asking is that the mxDateTime objects I'm deallocating are actually unreferenced objects on a free list, so PyObject_Del() will probably bomb on them. Also, what should be done with failing constructors ? I usually use PyMem_DEL() to prevent the deallocator from being called but still free the allocated memory. Is that the correct approach ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From andymac@bullseye.apana.org.au Tue Nov 6 09:26:12 2001 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Tue, 6 Nov 2001 20:26:12 +1100 (EDT) Subject: [Python-Dev] RE: Program very slow to finish In-Reply-To: Message-ID: On Mon, 5 Nov 2001, Tim Peters wrote: > Can any Python-Dev'er make time to dig into the advisability of making > PyMalloc the default?
I only took time for this because I'm out sick today, > and was looking for something mindless to occupy my fevered thoughts; alas, > it paid off . I recall there are still thread issues wrt PyMalloc, > and there *were* some reports that PyMalloc was slower on some platforms. > Against that, I'm the guy who usually gets stuck trying to explain the > inexplicable, and malloc/free performance are so critical to Python > performance that it's always been "a problem" that we have no idea how > system malloc/free behave across platforms (although I suppose it's "a > feature" that I know how to crash Win9X by provoking problems with its > malloc ). I can't make time for it in the 2.2 timeframe, though. My experience with the OS/2+EMX port indicates that on this platform, PyMalloc is (in its current form) about 70% slower than the EMX malloc(). At least, when I used it for _all_ interpreter memory management... As I recall, the test suite run time didn't change much with PyMalloc enabled for object allocation only (fractionally slower IIRC) - the standard WITH_PYMALLOC setup. FWIW I'm reasonably sure I had threads enabled for this testing too, and no problems were observed with the threads test. Due to the benefits PyMalloc would bring to this platform, it has been on my todo list to research the performance hit. My quick glance at the code suggested a couple of things that a sophisticated optimiser should have dealt with, but I've not had the chance to verify that this optimisation is actually happening. The other issue with PyMalloc has to do with C extensions - NumPy in particular had a problem due to (I think) inconsistent use of the Python memory interfaces, long since fixed of course. Other less widely used extensions may still have issues in this regard, but we need to provoke some corrective action on this front too.
I'm tempted to suggest releasing the Win32 binary of 2.2b2 with PyMalloc, just to see what you might be able to shake out, but I don't have to support it... ;-) One idea that occurs to me, but probably has more warts than benefits, is to offer a PYTHON22.DLL compiled with PyMalloc (complete with "Experimental!" labelling all over it) as a drop-in replacement for the standard DLL. A note that points out its possible benefits wrt the sort of problem that kicked this off might garner enough testing to make some progress. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From guido@python.org Tue Nov 6 18:07:29 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 06 Nov 2001 13:07:29 -0500 Subject: [Python-Dev] Born! Message-ID: <200111061807.NAA22341@cj20424-a.reston1.va.home.com> http://www.python.org/~guido/orlijn/ --Guido van Rossum (home page: http://www.python.org/~guido/) From nas@python.ca Tue Nov 6 18:24:24 2001 From: nas@python.ca (Neil Schemenauer) Date: Tue, 6 Nov 2001 10:24:24 -0800 Subject: [Python-Dev] RE: Program very slow to finish In-Reply-To: <3BE81E03.374B37E6@lemburg.com>; from mal@lemburg.com on Tue, Nov 06, 2001 at 06:29:39PM +0100 References: <3BE737A8.8CF867A5@chello.nl> <20011106090226.B7312@glacier.arctrix.com> <3BE81E03.374B37E6@lemburg.com> Message-ID: <20011106102424.A7610@glacier.arctrix.com> M.-A. Lemburg wrote: > What is considered the "right" approach for this ? Should Python > objects *always* be deallocated using one of PyObject_Del() and > PyObject_DEL() or is PyMem_DEL() usable as well ? PyMem_DEL should not be used since it could be using a different allocator than PyObject_New. > The reason I'm asking is that the mxDateTime objects I'm > deallocating are actually unreferenced objects on a > free list, so PyObject_Del() will probably bomb on them. Why would it bomb?
It doesn't do anything special as far as I can tell except free memory. Vladimir went a bit overboard with the pre-processor, IMHO, so it's a little hard to tell for certain. > Also, what should be done with failing constructors ? I usually > use PyMem_DEL() to prevent the deallocator from being called but > still free the allocated memory. Is that the correct approach ? PyObject_Del() doesn't call the deallocator function. Neil From nas@python.ca Tue Nov 6 18:32:17 2001 From: nas@python.ca (Neil Schemenauer) Date: Tue, 6 Nov 2001 10:32:17 -0800 Subject: [Python-Dev] Born! In-Reply-To: <200111061807.NAA22341@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Nov 06, 2001 at 01:07:29PM -0500 References: <200111061807.NAA22341@cj20424-a.reston1.va.home.com> Message-ID: <20011106103217.B7610@glacier.arctrix.com> Guido van Rossum wrote: > http://www.python.org/~guido/orlijn/ Congratulations Guido and Kim. Neil From loewis@informatik.hu-berlin.de Tue Nov 6 19:37:11 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Tue, 6 Nov 2001 20:37:11 +0100 (MET) Subject: [Python-Dev] RE: Program very slow to finish Message-ID: <200111061937.fA6JbBv08900@paros.informatik.hu-berlin.de> > Can any Python-Dev'er make time to dig into the advisability of > making PyMalloc the default? I was looking into making pymalloc the default a few weeks ago, when I noticed that allocation is pretty efficient on Linux. I studied glibc malloc a bit to see that it is indeed quite clever about allocation, even in the presence of MT locks (with per-arena locks, and recording a per-thread arena in a thread-local variable). I believe the main difference in deallocation speed (which I didn't notice then) comes from the attempt to combine subsequent memory blocks in glibc; pymalloc doesn't attempt to do so. Also, there *is* a lock acquire/release cycle in glibc malloc, which may account for some overhead, too.
Looking at obmalloc.c, I see one aspect that I'd consider questionable: detection of pool-vs-system-malloc uses a heuristic, namely offset = (off_t )p & POOL_SIZE_MASK; pool = (poolp )((block *)p - offset); if (pool->pooladdr != pool || pool->magic != (uint )POOL_MAGIC) { _SYSTEM_FREE(p); return; } So if the pool header has the pool magic and points to itself, it really is a pool header. That may result in a system malloc being recognized as a pool malloc, under rare circumstances. As a result, the system heap will be corrupted. Another aspect that may be troubling is that obmalloc never returns memory to the system. It just detects free pools, and is willing to rearrange a free pool for use with a different object size. Of course, there is little chance that a complete arena will ever become empty, so this is probably not that bad. Regards, Martin From tim.one@home.com Tue Nov 6 19:38:11 2001 From: tim.one@home.com (Tim Peters) Date: Tue, 6 Nov 2001 14:38:11 -0500 Subject: [Python-Dev] RE: [Zope.Com Geeks] Re: Program very slow to finish In-Reply-To: <20011106120529-r01010800-97fb58e4-0920-010c@10.0.0.23> Message-ID: [Just van Rossum] > ... > 266 Mhz G3 PPC, 160 Megs of RAM, OSX 10.0.4, CVS Python: > > without pymalloc: > ... > size: 600000, creation: 21.09, destruction: 3.54 > size: 800000, creation: 24.75, destruction: 18.47 > size: 1000000, creation: 119.68, destruction: 435.15 > > with pymalloc: > ... > size: 600000, creation: 20.35, destruction: 0.71 > size: 800000, creation: 21.44, destruction: 0.80 > size: 1000000, creation: 20.80, destruction: 0.76 Looks like you ran out of RAM at the end there, when using the system malloc. PyMalloc has low memory overhead per object allocated so long as small blocks are requested.
Since Roeland's test uses 7-character string keys, most requests should be for 28-byte string-object chunks: 4 type pointer 4 refcount 4 character count 4 cached hash code 4 interned string pointer 7 characters 1 trailing 0 byte -- 28 It will round that up to 32, but that's essentially all the waste. The system malloc likely adds at least enough more to store the size of the allocated block too (at free() time, PyMalloc infers the size from the memory address). Curious: the PyMalloc comments (obmalloc.c) say requests through 256 bytes are handled internally. But as I read the *code*, SMALL_REQUEST_THRESHOLD is actually 64 on 32-bit boxes, and 96 or 128 on 64-bit boxes: #define ALIGNMENT 8 #define _PYOBJECT_THRESHOLD ((SIZEOF_LONG + SIZEOF_VOID_P) * ALIGNMENT) #define SMALL_REQUEST_THRESHOLD _PYOBJECT_THRESHOLD This should probably be boosted! Since Vladimir wrote this: 1. gc-able objects grew 12 bytes of gc overhead. 2. The smallest dict possible now has 8 slots embedded in the dict object (so consumes at least 8*12 == 96 bytes for that alone, so dict requests are probably never handled directly by PyMalloc anymore). 3. Type objects have grown. 4. The new __slots__ mechanism will likely become heavily used in memory-conscious code, and creates oodles of new possibilities for heavy allocation of a variety of "small block" sizes we didn't see often before. From tim.one@home.com Tue Nov 6 21:47:52 2001 From: tim.one@home.com (Tim Peters) Date: Tue, 6 Nov 2001 16:47:52 -0500 Subject: [Python-Dev] FW: Program very slow to finish Message-ID: FYI, a good illustration of why tuning malloc per-platform is a bottomless pit. -----Original Message----- From: python-list-admin@python.org On Behalf Of Alexei Zverovitch wrote in news:3BE41CBE.94AED22@nospamco.com: > Python 2.1.1 (#3, Oct 25 2001, 12:54:40) [C] on osf1V4 Since you're running Digital Unix, you might want to try tweaking the __fast_free_max et al variables used by the system malloc(). 
'man malloc' is your friend (I believe you'll need to re-link the python executable if you want to change those variables). We've had a similar problem recently when a (C++) program was taking ages to free() .5 million small structures. It turned out that most of the time was spent by free() coalescing memory blocks as they were being deallocated. Increasing __fast_free_max solved the problem (IIRC the execution time was reduced by several orders of magnitude). You may be seeing the same (or similar) behaviour. Cheers Alexei -- alexei (at) barclays (dot) net -- http://mail.python.org/mailman/listinfo/python-list From tim.one@home.com Tue Nov 6 22:30:57 2001 From: tim.one@home.com (Tim Peters) Date: Tue, 6 Nov 2001 17:30:57 -0500 Subject: [Python-Dev] RE: Program very slow to finish In-Reply-To: <20011106090226.B7312@glacier.arctrix.com> Message-ID: [Neil Schemenauer] > The problem is with extension modules. We can make sure code in CVS > always has the big lock held when calling PyMalloc. We can't be sure > that extension modules are safe. So can that *ever* be solved? Some part of the puzzle went unaddressed here. > The other, more serious problem is that many extension modules allocate > memory with PyObject_New() and free it with free() or PyMem_DEL() > instead of PyObject_Del() or PyObject_DEL(). Umm, OK, PyObject_DEL() ... is a macro ... which redirects to PyObject_FREE() ... which is a macro ... which redirects to PyCore_OBJECT_FREE() ... which is a macro ... which redirects to PyCore_OBJECT_FREE_FUNC() ... which is a macro ... which redirects to _PyCore_ObjectFree (WITH_PYMALLOC) or PyCore_FREE_FUNC (without). PyCore_FREE_FUNC is a macro ... which redirects to free(). And _PyCore_ObjectFree ... doesn't exist. I must have missed an #undef in there somewhere ... 
ah, OK, in the WITH_PYMALLOC case, PyCore_OBJECT_FREE_FUNC(== _PyCore_ObjectFree) appears in obmalloc.c's #define _THIS_FREE PyCore_OBJECT_FREE_FUNC so that _THIS_FREE is actually _PyCore_ObjectFree; then obmalloc.c's void _THIS_FREE(void *p) { expands to void PyCore_OBJECT_FREE_FUNC(void *p) { which expands again to void _PyCore_ObjectFree(void *p) { and we're done. Couldn't be simpler . Then PyMem_DEL ... oh, forget it. Now I remember why I gave up on this last time I looked at it -- it's Preprocessor Hell. > If pymalloc is enabled then memory allocated by a PyObject_* function > must be freed by a PyObject_* function. Damn, you're good. > mxDateTime is the first module I ran into that does this but there are > many others I'm sure. I think almost all of the C modules I have > written do it. The code in xxmodule.c used to do it as well. > > no-solutions-only-more-problems-hey-i'm-sick-today-too-ly y'rs Neil Big Hammer? Change every one of the existing guys to resolve to malloc() and free(). Then declare them all obsolete, define a much smaller new set of names, edit the core to use the new guys, and non-core modules get stuck with malloc/free until they're rewritten too. It sucks, but we'd be in better shape today if that *had* been done the last time this got reworked. From tim.one@home.com Tue Nov 6 23:07:56 2001 From: tim.one@home.com (Tim Peters) Date: Tue, 6 Nov 2001 18:07:56 -0500 Subject: [Python-Dev] RE: Program very slow to finish In-Reply-To: <20011106102424.A7610@glacier.arctrix.com> Message-ID: [M.-A. Lemburg wrote:] > What is considered the "right" approach for this ? Should Python > objects *always* be deallocated using one of PyObject_Del() and > PyObject_DEL() or is PyMem_DEL() usable as well ? I agree with everything Neil said. Note that the xxx_del/DEL functions are not destructors, nor do they call destructors. Despite their arguably odd names, they don't even care whether you pass a pointer to a Python object.
Without pymalloc enabled, chase down the layers of macros and all any of them do is call free(), although PyMem_DEL -> PyMem_FREE -> PyCore_FREE -> PyCore_FREE_FUNC and there's *some* scheme or other hiding in here that's supposed to let users define their own expansion for at least one of the steps in that chain. It's complicated, and I'm not sure anyone understands all the design details anymore. From Anthony Baxter Tue Nov 6 23:09:45 2001 From: Anthony Baxter (Anthony Baxter) Date: Wed, 07 Nov 2001 10:09:45 +1100 Subject: [Python-Dev] FW: Program very slow to finish In-Reply-To: Message from "Tim Peters" of "Tue, 06 Nov 2001 16:47:52 CDT." Message-ID: <200111062309.fA6N9js30208@mbuna.arbhome.com.au> Is it worth collecting all these different platform specific tweaks into a single file? Anthony >>> "Tim Peters" wrote > FYI, a good illustration of why tuning malloc per-platform is a bottomless > pit. -- Anthony Baxter It's never too late to have a happy childhood. From paul-python@svensson.org Tue Nov 6 23:16:23 2001 From: paul-python@svensson.org (Paul Svensson) Date: Tue, 6 Nov 2001 18:16:23 -0500 (EST) Subject: [Python-Dev] RE: Program very slow to finish In-Reply-To: Message-ID: On Tue, 6 Nov 2001, Tim Peters wrote: >Umm, OK, PyObject_DEL() ... is a macro ... which redirects to >PyObject_FREE() ... which is a macro ... which redirects to >PyCore_OBJECT_FREE() ... which is a macro ... which redirects to >PyCore_OBJECT_FREE_FUNC() ... which is a macro ... which redirects to >_PyCore_ObjectFree (WITH_PYMALLOC) or PyCore_FREE_FUNC (without). >PyCore_FREE_FUNC is a macro ... which redirects to free(). And >_PyCore_ObjectFree ... doesn't exist. I must have missed an #undef in there >somewhere ... (-- etc ad nauseam) >Big Hammer? Change every one of the existing guys to resolve to malloc() >and free(). 
Then declare them all obsolete, define a much smaller new set >of names, edit the core to use the new guys, and non-core modules get stuck >with malloc/free until they're rewritten too. It sucks, but we'd be in >better shape today if that *had* been done the last time this got reworked. +1, with the option of using a bigger hammer /Paul From martin@v.loewis.de Tue Nov 6 23:42:13 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Wed, 7 Nov 2001 00:42:13 +0100 Subject: [Python-Dev] FW: Program very slow to finish Message-ID: <200111062342.fA6NgDQ02104@mira.informatik.hu-berlin.de> > Is it worth collecting all these different platform specific tweaks > into a single file? I don't think so. Tuning malloc is only possible if you know the access pattern, and if you can experiment. E.g. in an MT application, those parameters may need completely different values from the ones in a single-threaded application. Or, an application allocating many strings might have different requirements than an application allocating many numbers (which are fixed-size). These parameters are offered to applications, to quiet the application developers that have been asking for them all these years, without knowing what they'd get when they can tune the parameters. If you don't know the application (such as when implementing a Python interpreter), they are useless. As with all computational-complexity problems: You can change the constants. You cannot change the complexity class of an algorithm with tuning. Regards, Martin From barry@zope.com Tue Nov 6 23:49:11 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 6 Nov 2001 18:49:11 -0500 Subject: [Python-Dev] RE: Program very slow to finish References: Message-ID: <15336.30455.873021.605178@anthem.wooz.org> Uncle Timmie: >> Big Hammer? Change every one of the existing guys to resolve >> to malloc() and free().
Then declare them all obsolete, define >> a much smaller new set of names, edit the core to use the new >> guys, and non-core modules get stuck with malloc/free until >> they're rewritten too. It sucks, but we'd be in better shape >> today if that *had* been done the last time this got reworked. >>>>> "PS" == Paul Svensson writes: PS> +1, with the option of using a bigger hammer PEP it for 2.3. -Barry From tim.one@home.com Wed Nov 7 00:12:12 2001 From: tim.one@home.com (Tim Peters) Date: Tue, 6 Nov 2001 19:12:12 -0500 Subject: [Python-Dev] RE: Program very slow to finish In-Reply-To: <200111061937.fA6JbBv08900@paros.informatik.hu-berlin.de> Message-ID: [Martin von Loewis] > I was looking into making pymalloc the default a few weeks ago, when I > noticed that allocation is pretty efficient on Linux. I studied glibc > malloc a bit to see that it is indeed quite clever about allocation, > even in the presence of MT locks (with per-arena locks, and recording > a per-thread arena in a thread-local variable). Is this Doug Lea's malloc implementation ? Can you tell whether it was the one Roeland used in his "before" test case run? > I believe the main difference in deallocation speed (which I didn't > notice then) comes from the attempt to combine subsequent memory > blocks in glibc; pymalloc doesn't attempt to do so. That would be my guess; I traced bad deallocation performance to free() coalescing in a different system long ago, and it seems a common problem. > Also, there *is* a lock acquire/release cycle in glibc malloc, which > may account for some overhead, too. Sure. Doug Lea sez (see above) "If you are using malloc in a concurrent program, you would be far better off obtaining ptmalloc ..."; slapping a lock around a thread-naive malloc is merely correct.
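The pool-versus-system-malloc test in obmalloc.c that this thread discusses comes down to masking an address down to its 4KB pool boundary and inspecting whatever sits there. Here is a toy model of that arithmetic, under loud assumptions: memory is a Python dict mapping addresses to fake header tuples, the POOL_MAGIC value is a placeholder rather than the real constant, and pool_base and looks_like_pool_block are made-up names. It is not obmalloc's code, only a sketch of why the check is a heuristic.

```python
POOL_SIZE = 4 * 1024            # 4KB pools, as stated in the thread
POOL_SIZE_MASK = POOL_SIZE - 1
POOL_MAGIC = 0x12345678         # placeholder, NOT the real constant


def pool_base(p):
    """Round address p down to the start of its POOL_SIZE-aligned region."""
    offset = p & POOL_SIZE_MASK
    return p - offset


def looks_like_pool_block(memory, p):
    """Mimic the check: does the aligned address hold a plausible header?

    `memory` maps an address to a (pooladdr, magic) tuple; addresses with
    no entry stand for arbitrary non-header bytes.
    """
    pool = pool_base(p)
    header = memory.get(pool)
    return header is not None and header == (pool, POOL_MAGIC)


# A real pool at 0x10000 and a block handed out from inside it.
memory = {0x10000: (0x10000, POOL_MAGIC)}
print(looks_like_pool_block(memory, 0x10040))   # True: genuine pool block

# A large system-malloc block covering 0x20000; no header-like bytes there.
print(looks_like_pool_block(memory, 0x20010))   # False: handed to free()

# The rare bad case: user data at 0x20000 happens to mimic a header.
memory[0x20000] = (0x20000, POOL_MAGIC)
print(looks_like_pool_block(memory, 0x20010))   # True: misclassified
```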
> Looking at obmalloc.c, I see one aspect that I'd consider > questionable: detection of pool-vs-system-malloc uses a heuristic, > namely > > offset = (off_t )p & POOL_SIZE_MASK; > pool = (poolp )((block *)p - offset); > if (pool->pooladdr != pool || pool->magic != (uint )POOL_MAGIC) { > _SYSTEM_FREE(p); > return; > } > > So if the pool header has the pool magic and points to itself, it > really is a pool header. That may result in a system malloc being > recognized as a pool malloc, under rare circumstances. As a result, > the system heap will be corrupted. Yes. I believe Vladimir offered to buy us lunch if that ever happened in a real program, so he's betting at least US$4 that it won't -- and nothing else in Python is backed by that much hard cash. The tradeoff is that he gets a lot of comparative storage efficiency out of this trick, and some speed too (no per-object doubly-linked free lists to maintain, no need to store size fields). I don't see a way to make it bulletproof without giving that up, short of over-allocating "large requests" enough so that a hidden pool_header struct can be stuffed at a POOL_SIZE-aligned address. Then the above could become offset = (off_t )p & POOL_SIZE_MASK; pool = (poolp )((block *)p - offset); assert(pool->pooladdr == pool); /* pool->magic member no longer exists; use, e.g., pool->szidx == (uint)-1 as a flag to mean "this block came from the system malloc". */ if (pool->szidx == (uint)-1) { _SYSTEM_FREE(p); return; } POOL_SIZE is 4KB, though, and that's a lot of wasted space. OTOH, a "large object arena" could be introduced too, and carved up much like the current one. I suppose the bottom line is that you just can't make address tricks bulletproof unless you arrange-- by hook or by crook --to control the addresses you pass out (so you never call malloc at all, or wrap what it does). > Another aspect that may be troubling is that obmalloc never returns > memory to the system.
Ah, but it does for "large objects" obtained via malloc() -- they get free()d individually. > It just detects free pools, and is willing to rearrange a free pool for > use with a different object size. Of course, there is little chance > that a complete arena will ever become empty, so this is probably not > that bad. It's also nothing really new, and *could* lead to an improvement: the hope was that PyMalloc could be made fast enough that Python could drop all its fiddly little type-specific free lists. Those never return memory either, and, worse, can't even recycle memory across types; for example, if some phase of your program uses a million floats at one time, and the next phase only retains their sum, you're stuck forever with space for a million (minus 1 ) dead float objects, sitting in floatobject.c's block_list. From tim.one@home.com Wed Nov 7 00:17:22 2001 From: tim.one@home.com (Tim Peters) Date: Tue, 6 Nov 2001 19:17:22 -0500 Subject: [Python-Dev] FW: Program very slow to finish In-Reply-To: <200111062309.fA6N9js30208@mbuna.arbhome.com.au> Message-ID: [Tim] > FYI, a good illustration of why tuning malloc per-platform is a > bottomless pit. [Anthony Baxter] > Is it worth collecting all these different platform specific tweaks > into a single file? Not if I have to do it, and I think it's unlikely Python is going to grow masses of platform-specific malloc-fiddling in any case. Vladimir wrote PyMalloc so we could do a good job regardless of platform. From nas@python.ca Wed Nov 7 04:50:58 2001 From: nas@python.ca (Neil Schemenauer) Date: Tue, 6 Nov 2001 20:50:58 -0800 Subject: [Python-Dev] RE: Program very slow to finish In-Reply-To: ; from tim.one@home.com on Tue, Nov 06, 2001 at 05:30:57PM -0500 References: <20011106090226.B7312@glacier.arctrix.com> Message-ID: <20011106205058.B8437@glacier.arctrix.com> Tim Peters wrote: > Big Hammer? Change every one of the existing guys to resolve to malloc() > and free(). 
Then declare them all obsolete, define a much smaller new set > of names, edit the core to use the new guys, and non-core modules get stuck > with malloc/free until they're rewritten too. Sounds good to me. Neil From mal@lemburg.com Wed Nov 7 08:29:31 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 07 Nov 2001 09:29:31 +0100 Subject: [Python-Dev] RE: Program very slow to finish References: <20011106090226.B7312@glacier.arctrix.com> <20011106205058.B8437@glacier.arctrix.com> Message-ID: <3BE8F0EB.D4E4813@lemburg.com> Neil Schemenauer wrote: > > Tim Peters wrote: > > Big Hammer? Change every one of the existing guys to resolve to malloc() > > and free(). Then declare them all obsolete, define a much smaller new set > > of names, edit the core to use the new guys, and non-core modules get stuck > > with malloc/free until they're rewritten too. > > Sounds good to me. +1 ... even though I've already changed all free()s to PyMem_DEL() and then to PyObject_Del(), changing them back to free() wouldn't be much trouble ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Wed Nov 7 08:21:42 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 07 Nov 2001 09:21:42 +0100 Subject: [Python-Dev] RE: Program very slow to finish References: <3BE737A8.8CF867A5@chello.nl> <20011106090226.B7312@glacier.arctrix.com> <3BE81E03.374B37E6@lemburg.com> <20011106102424.A7610@glacier.arctrix.com> Message-ID: <3BE8EF16.3C3AE8FF@lemburg.com> Neil Schemenauer wrote: > > M.-A. Lemburg wrote: > > What is considered the "right" approach for this ? Should Python > > objects *always* be deallocated using one of PyObject_Del() and > > PyObject_DEL() or is PyMem_DEL() usable as well ? > > PyMem_DEL should not be used since it could be using a different > allocator than PyObject_New.
> > > The reason I'm asking is that the mxDateTime objects I'm > > deallocating are actually unreferenced objects on a > > free list, so PyObject_Del() will probably bomb on them. > > Why would it bomb? It doesn't do anything special as far as I can tell > except free memory. Vladimir went a bit overboard with the > pre-processor, IMHO, so it's a little hard to tell for certain. > > > Also, what should be done with failing constructors ? I usually > > use PyMem_DEL() to prevent the deallocator from being called but > > still free the allocated memory. Is that the correct approach ? > > PyObject_Del() doesn't call the deallocator function. Thanks for the clarifications. I'll turn to using PyObject_Del() in all mx Extensions then. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From skip@pobox.com (Skip Montanaro) Wed Nov 7 09:39:42 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 07 Nov 2001 10:39:42 +0100 Subject: [Python-Dev] Proposal - KeyboardInterrupt should inherit directly from Exception Message-ID: Consider this code posted to c.l.py in the past day or two: try: x = float(s) result = 1 except: result = 0 to which Andrew Kuchling replied: It's better to catch ValueError and not all exceptions. Else what would happen if you get a MemoryError or a KeyboardInterrupt? All well and good.
There are situations other than in throwaway code where you do want a more-or-less catchall except clause (any time you are executing arbitrary code for which you don't know all the exceptions that might be raised), but you generally have to remember to code it as try: fragile code except (SystemExit, KeyboardInterrupt): raise except: recover I have a simple proposal: Change the exception class hierarchy slightly, so that exceptions you generally will want to re-raise don't inherit from StandardError. Currently, SystemExit, StopIteration and Warning inherit directly from Exception. I suggest that KeyboardInterrupt should also inherit from Exception, and not StandardError. That way, the standard catch all except clause can be try: fragile code except StandardError: recover and use of bare except clauses can be discouraged more strongly than it currently is. (Maybe MemoryError should inherit directly from Exception as well because recovery opportunities, should that arise, are going to be minimal and fall into a decidedly different set of options than, say, recovering from invalid numeric input by a user.) Pro: Programmers won't have to remember to consider SystemExit and KeyboardInterrupt when coding catch-all excepts. Con: Slightly changes the semantic implications of "except StandardError". I doubt this is used much, but I see that it is used by test_cgi.py and xml/dom/domreg.py. Both of those cases would have to be recoded. (Actually, I think the domreg.py case may be an error, since it doesn't catch and re-raise KeyboardInterrupt...) Thoughts?
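A minimal sketch of the catch-all idiom described in this message, written in Python 3 syntax so that it runs on a current interpreter. StandardError does not exist there, so Exception stands in for it. On Python 3 the explicit re-raise clause is redundant, because SystemExit and KeyboardInterrupt no longer derive from Exception; it is kept only to mirror the message. The names run_guarded and parse_float are hypothetical.

```python
def run_guarded(fragile, recover):
    """Call fragile(); hand any ordinary error to recover()."""
    try:
        return fragile()
    except (SystemExit, KeyboardInterrupt):
        # "Stop the program" exceptions pass through untouched.
        raise
    except Exception as exc:
        # Everything else is treated as a recoverable failure.
        return recover(exc)


def parse_float(text):
    """Return float(text), or None when the conversion fails."""
    return run_guarded(lambda: float(text), lambda exc: None)
```

With this shape, parse_float("abc") yields None, while a KeyboardInterrupt raised inside the guarded call still propagates to the caller.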
-- Skip Montanaro (skip@pobox.com) http://www.mojam.com/ http://www.musi-cal.com/ From MarkH@ActiveState.com Wed Nov 7 11:41:32 2001 From: MarkH@ActiveState.com (Mark Hammond) Date: Wed, 7 Nov 2001 22:41:32 +1100 Subject: [Python-Dev] PEP 273: Import Modules from Zip Archives In-Reply-To: <3BDD91CC.9F0ECB07@interet.com> Message-ID: Sorry for the delay: Jim writes: > On Windows, the directory is the directory of sys.executable. Any chance this can be in sys.prefix, else the directory of sys.executable if sys.prefix is empty? The reason is for embedding situations - sys.executable may not be a reasonable watermark. We recently had a bug regarding os.popen() on Windows for the exact same reason, and a patch was recently checked in that goes to great lengths to ensure sys.prefix is always valid even in these embedding situations. Mark. From gward@python.net Wed Nov 7 13:33:49 2001 From: gward@python.net (Greg Ward) Date: Wed, 7 Nov 2001 08:33:49 -0500 Subject: [Python-Dev] Proposal - KeyboardInterrupt should inherit directly from Exception In-Reply-To: References: Message-ID: <20011107083349.A19554@gerg.ca> On 07 November 2001, Skip Montanaro said: > I have a simple proposal: Change the exception class hierarchy > slightly, so that exceptions you generally will want to re-raise don't > inherit from StandardError. Currently, SystemExit, StopIteration and > Warning inherit directly from Exception. I suggest that > KeyboardInterrupt should also inherit from Exception, and not > StandardError. Sounds sensible to me. > That way, the standard catch all except clause can be > > try: > fragile code > except StandardError: > recover So "fragile code" would be the main loop of a GUI or server, or an eval or exec of user-supplied code (eg. a config file that happens to be Python source) -- that sort of thing? Those are the only legitimate places that I can think of "except: ...", and changing that to "except StandardError: ..." 
seems like a tiny hardship that buys a little something. Hmmm... does anyone else habitually write if __name__ == "__main__": try: main() except KeyboardInterrupt: sys.exit("interrupted") And is anyone else sick of doing this? Greg -- Greg Ward - nerd gward@python.net http://starship.python.net/~gward/ There are no stupid questions -- only stupid people. From fdrake@acm.org Wed Nov 7 14:41:08 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 7 Nov 2001 09:41:08 -0500 Subject: [Python-Dev] Proposal - KeyboardInterrupt should inherit directly from Exception In-Reply-To: <20011107083349.A19554@gerg.ca> References: <20011107083349.A19554@gerg.ca> Message-ID: <15337.18436.488131.971317@grendel.zope.com> On 07 November 2001, Skip Montanaro said: > I have a simple proposal: Change the exception class hierarchy > slightly, so that exceptions you generally will want to re-raise don't > inherit from StandardError. Currently, SystemExit, StopIteration and > Warning inherit directly from Exception. I suggest that > KeyboardInterrupt should also inherit from Exception, and not > StandardError. Sounds reasonable to me. Greg Ward writes: > Hmmm... does anyone else habitually write > > if __name__ == "__main__": > try: > main() > except KeyboardInterrupt: > sys.exit("interrupted") It must be you, Greg! ;-) The only thing I can think of that I do similar to that is: import errno if __name__ == "__main__": # do setup stuff... ... # output result to file... try: write_result() # or whatever it really is... except IOError, e: if e.errno != errno.EPIPE: raise -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From jim@interet.com Wed Nov 7 14:40:12 2001 From: jim@interet.com (James C. Ahlstrom) Date: Wed, 07 Nov 2001 09:40:12 -0500 Subject: [Python-Dev] PEP 273: Import Modules from Zip Archives References: Message-ID: <3BE947CC.C1F84625@interet.com> Mark Hammond wrote: > Jim writes: > > > On Windows, the directory is the directory of sys.executable. 
> > Any chance this can be in sys.prefix, else the directory of sys.executable > if sys.prefix is empty? > > The reason is for embedding situations - sys.executable may not be a > reasonable watermark. We recently had a bug regarding os.popen() on Windows > for the exact same reason, and a patch was recently checked in that goes to > great lengths to ensure sys.prefix is always valid even in these embedding > situations. Hmmm...., you are right. Sys.executable doesn't really work for embedding. But sys.prefix is obtained from a search of the directory structure for a "landmark" file, namely os.py. When the Python library is in a zip file, it is likely that no landmark files will be found, and sys.prefix will contain garbage. Since sys.prefix is searched for, its name is unpredictable. We need a known location for python22.zip. How about using the full path name of pythonXX.dll with the last three characters changed to "zip"? This associates the libraries with the DLL, which is more logical than associating them with the executable. And the file name is identical but with "zip" instead of "dll". Does this work, and solve all embedding problems? JimA From jim@interet.com Wed Nov 7 15:18:08 2001 From: jim@interet.com (James C. Ahlstrom) Date: Wed, 07 Nov 2001 10:18:08 -0500 Subject: [Python-Dev] Caching directory files in import.c References: <3BE30079.D6A8FB52@interet.com> Message-ID: <3BE950B0.D9B6FC8C@interet.com> "James C. Ahlstrom" wrote: > Looking at the code, I saw that I could do an os.listdir(path), > and record the directory file names into the same dictionary. > Then it would not be necessary to perform a large number of > fopen()'s. The same dictionary lookup is used instead. Well, I didn't get a lot of HellNo's, so I added the code. The new patch for import.c is now part of Patch 476047. There are a lot of other changes too. Please take a look. No, it is not done yet. Don't install it. 
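A rough sketch of the caching idea described above: list a directory once, remember the names, and answer later "does this file exist?" questions from the dictionary instead of trying to open each candidate. This is not the code from Patch 476047, which is C inside import.c; the names _dir_cache and file_exists are hypothetical, and the sketch assumes the directory contents do not change while the cache is alive.

```python
import os

# Hypothetical cache: directory path -> set of file names found there.
_dir_cache = {}


def file_exists(directory, name):
    """Report whether name is in directory, listing the directory only once."""
    names = _dir_cache.get(directory)
    if names is None:
        try:
            names = set(os.listdir(directory))
        except OSError:
            # Missing or unreadable directory: remember it as empty.
            names = set()
        _dir_cache[directory] = names
    return name in names
```

The price of the saved system calls is staleness: a file created after the first lookup is not seen until the cache entry is dropped.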
JimA From mal@lemburg.com Wed Nov 7 15:23:32 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 07 Nov 2001 16:23:32 +0100 Subject: [Python-Dev] switch-based programming in Python Message-ID: <3BE951F4.3B79913C@lemburg.com> Now I know that this has been brought up quite a few times in the past. Still, with the slowness of Python method calls, a switch-based coding style would be a nice way to implement fast token based processing of data in Python rather than C. Currently, dispatching of execution based on the value of one variable is usually implemented by having some dictionary of possible values and then calling methods which implement the different branches of execution. This works well for code which uses medium sized methods, but fails badly for small ones such as code which is often used in method callback based parsers. The alternative is using lengthy if x == 'one': ... elif x == 'two': ... elif x == 'three': ... else: ...default case... constructs. Wouldn't it make sense to enable the byte code compiler to take the above construct and turn it into a dictionary based switch statement ? I'm not talking about adding syntax to the language, it would just be nice to have the compiler recognize this kind of code (somehow; perhaps with some extra help) and produce optimized code for it, possibly using new opcodes for the switching operation. Thoughts ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Wed Nov 7 15:25:18 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 07 Nov 2001 16:25:18 +0100 Subject: [Python-Dev] Problem in SRE ? Message-ID: <3BE9525E.E5534D83@lemburg.com> I've just run across a strange problem with SRE.
The following code does not run with the current CVS version -- the program simply exits without notice even though there doesn't seem to be an exit() or abort() call in SRE. Note that sys.version is '2.2b1+ (#59, Nov 7 2001, 12:57:29) \n[GCC pgcc-2.95.2 19991024 (release)]' Any ideas ? --
import sys, re, string

_sys_version_parser = re.compile('([\w.]+)\s*'
                                 '\(#(\d+),\s*([\w ]+),\s*([\w :]+)\)\s*'
                                 '\[([^\]]+)\]?')
_sys_version_cache = None

def _sys_version():

    """ Returns a parsed version of Python's sys.version as tuple
        (version, buildno, builddate, compiler) referring to the Python
        version, build number, build date/time as string and the compiler
        identification string.

        Note that unlike the Python sys.version, the returned value
        for the Python version will always include the patchlevel (it
        defaults to '.0').

    """
    global _sys_version_cache
    import sys, re, time

    if _sys_version_cache is not None:
        return _sys_version_cache
    print sys.version
    version, buildno, builddate, buildtime, compiler = \
             _sys_version_parser.match(sys.version).groups()
    buildno = int(buildno)
    builddate = builddate + ' ' + buildtime
    l = string.split(version, '.')
    if len(l) == 2:
        l.append('0')
        version = string.join(l, '.')
    _sys_version_cache = (version, buildno, builddate, compiler)
    return _sys_version_cache

-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From skip@pobox.com (Skip Montanaro) Wed Nov 7 15:45:18 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 07 Nov 2001 16:45:18 +0100 Subject: [Python-Dev] Proposal - KeyboardInterrupt should inherit directly from Exception In-Reply-To: <20011107083349.A19554@gerg.ca> References: <20011107083349.A19554@gerg.ca> Message-ID: >>>>> "Greg" == Greg Ward writes: Skip> That way, the standard catch all except clause can be Skip> Skip> try: Skip> fragile code Skip> except
StandardError: Skip> recover Greg> So "fragile code" would be the main loop of a GUI or server, Greg> or an eval or exec of user-supplied code (eg. a config file Greg> that happens to be Python source) -- that sort of thing? That's precisely the context this arose in here in Vienna. I just stole Andrew's example from c.l.py because it was easier to paste that than to type my own. Greg> Hmmm... does anyone else habitually write Greg> if __name__ == "__main__": Greg> try: Greg> main() Greg> except KeyboardInterrupt: Greg> sys.exit("interrupted") This suggests one other enhancement to me. Just as raising SystemExit doesn't generate a traceback, perhaps the default handling of KeyboardInterrupt could be configurable so I could set (for example): sys.gen_kbi_traceback = 0 sys.gen_kbi_message = 0 to suppress traceback and/or message output without having to explicitly catch it. The default for both would be 1 to remain compatible with current behavior. If you caught the exception, these would have no effect. Skip From mwh@python.net Wed Nov 7 15:47:35 2001 From: mwh@python.net (Michael Hudson) Date: 07 Nov 2001 10:47:35 -0500 Subject: [Python-Dev] Problem in SRE ? In-Reply-To: "M.-A. Lemburg"'s message of "Wed, 07 Nov 2001 16:25:18 +0100" References: <3BE9525E.E5534D83@lemburg.com> Message-ID: <2m1yjaoa5k.fsf@starship.python.net> "M.-A. Lemburg" writes: > I've just run across a strange problem with SRE. The following > code does not run with the current CVS version -- the program > simply exits without notice even though there doesn't seem > to be an exit() or abort() call in SRE. Well, your regexp doesn't work for the given version: > Note that sys.version is > '2.2b1+ (#59, Nov 7 2001, 12:57:29) \n[GCC pgcc-2.95.2 19991024 > (release)]' > > Any ideas ? > > -- > import sys, re, string > _sys_version_parser = re.compile('([\w.]+)\s*' You need a \+? in here somewhere! Or a + inside the [].
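For illustration, a small sketch of the first suggested fix: an optional \+ placed after the version group. It uses the pattern and the sys.version string quoted in this thread, written with raw strings in Python 3 syntax; the names OLD_PARSER, NEW_PARSER and SAMPLE are hypothetical, and the exact spot where the \+? goes is an assumption, since the message only says "somewhere".

```python
import re

# Pattern as posted: the first group cannot absorb the "+" in "2.2b1+".
OLD_PARSER = re.compile(
    r'([\w.]+)\s*'
    r'\(#(\d+),\s*([\w ]+),\s*([\w :]+)\)\s*'
    r'\[([^\]]+)\]?')

# Same pattern with an optional "+" allowed right after the version number.
NEW_PARSER = re.compile(
    r'([\w.]+)\+?\s*'
    r'\(#(\d+),\s*([\w ]+),\s*([\w :]+)\)\s*'
    r'\[([^\]]+)\]?')

# The sys.version value quoted in the original report.
SAMPLE = ('2.2b1+ (#59, Nov 7 2001, 12:57:29) '
          '\n[GCC pgcc-2.95.2 19991024 (release)]')

old_match = OLD_PARSER.match(SAMPLE)   # None: no match at all
new_match = NEW_PARSER.match(SAMPLE)   # match object with five groups
```

Because match() returns None for the posted pattern, calling .groups() on the result raises AttributeError rather than returning anything.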
With this change, works fine on all Pythons I have here: [mwh@starship mwh]$ /usr/local/bin/python foo4.py 2.1.1 (#1, Aug 23 2001, 22:12:58) [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] ('2.1.1', 1, 'Aug 23 2001 22:12:58', 'GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)') [mwh@starship mwh]$ /usr/bin/python foo4.py 1.5.2 (#1, Dec 21 2000, 15:29:08) [GCC egcs-2.91.66 19990314/Linux (egcs- ('1.5.2', 1, 'Dec 21 2000 15:29:08', 'GCC egcs-2.91.66 19990314/Linux (egcs-') [mwh@starship mwh]$ ~/bin/python foo4.py 2.2a4+ (#1, Oct 19 2001, 03:56:59) [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] ('2.2a4.0', 1, 'Oct 19 2001 03:56:59', 'GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)') [mwh@starship mwh]$ ~/src/python/dist/src/build/python foo4.py 2.2b1+ (#1, Nov 7 2001, 05:07:34) [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] ('2.2b1.0', 1, 'Nov 7 2001 05:07:34', 'GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)') Cheers, M. -- 40. There are two ways to write error-free programs; only the third one works. -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html From niemeyer@conectiva.com Wed Nov 7 17:38:05 2001 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Wed, 7 Nov 2001 15:38:05 -0200 Subject: [Python-Dev] xxsubtype builtin? Message-ID: <20011107153805.A881@ibook.distro.conectiva> Hi there!! Python 2.2b1+ (#1, Nov 6 2001, 21:52:55) [GCC 2.95.3 20010315 (release) (conectiva)] on linux-i386 Type "help", "copyright", "credits" or "license" for more information. >>> import sys sys.builtin_module_names ('__builtin__', '__main__', '_sre', '_symtable', 'exceptions', 'gc', 'imp', 'marshal', 'new', 'posix', 'signal', 'sys', 'thread', 'xxsubtype') >>> Should xxsubtype be a builtin module? The line mentioning it in Setup.dist is uncommented by default.
Maybe it's there just for testing purposes? Thanks! -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From mal@lemburg.com Wed Nov 7 17:57:48 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 07 Nov 2001 18:57:48 +0100 Subject: [Python-Dev] Problem in SRE ? References: <3BE9525E.E5534D83@lemburg.com> <2m1yjaoa5k.fsf@starship.python.net> Message-ID: <3BE9761C.1C34DC8E@lemburg.com> Michael Hudson wrote: > > "M.-A. Lemburg" writes: > > > I've just run across a strange problem with SRE. The following > > code does not run with the current CVS version -- the program > > simply exits without notice even though there doesn't seem > > to be an exit() or abort() call in SRE. > > Well, your regexp doesn't work for the given version: > > > Note that sys.version is > > '2.2b1+ (#59, Nov 7 2001, 12:57:29) \n[GCC pgcc-2.95.2 19991024 > > (release)]' > > > > Any ideas ? > > > > -- > > import sys, re, string > > _sys_version_parser = re.compile('([\w.]+)\s*' > > You need a \+? in here somewhere! Or a + inside the []. True and thanks for the hint, but still: why does Python exit ? I'd expect a None return, or rather an attribute error since I'm asking for the .groups() method of None. Something is either wrong with my compiler or some attribute lookup code (or both).
> With this change, works fine on all Pythons I have here: > > [mwh@starship mwh]$ /usr/local/bin/python foo4.py > 2.1.1 (#1, Aug 23 2001, 22:12:58) > [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] > ('2.1.1', 1, 'Aug 23 2001 22:12:58', 'GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)') > [mwh@starship mwh]$ /usr/bin/python foo4.py > 1.5.2 (#1, Dec 21 2000, 15:29:08) [GCC egcs-2.91.66 19990314/Linux (egcs- > ('1.5.2', 1, 'Dec 21 2000 15:29:08', 'GCC egcs-2.91.66 19990314/Linux (egcs-') > [mwh@starship mwh]$ ~/bin/python foo4.py > 2.2a4+ (#1, Oct 19 2001, 03:56:59) > [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] > ('2.2a4.0', 1, 'Oct 19 2001 03:56:59', 'GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)') > [mwh@starship mwh]$ ~/src/python/dist/src/build/python foo4.py > 2.2b1+ (#1, Nov 7 2001, 05:07:34) > [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] > ('2.2b1.0', 1, 'Nov 7 2001 05:07:34', 'GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)') -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mwh@python.net Wed Nov 7 18:25:10 2001 From: mwh@python.net (Michael Hudson) Date: 07 Nov 2001 13:25:10 -0500 Subject: [Python-Dev] Problem in SRE ? In-Reply-To: "M.-A. Lemburg"'s message of "Wed, 07 Nov 2001 18:57:48 +0100" References: <3BE9525E.E5534D83@lemburg.com> <2m1yjaoa5k.fsf@starship.python.net> <3BE9761C.1C34DC8E@lemburg.com> Message-ID: <2madxya16h.fsf@starship.python.net> "M.-A. Lemburg" writes: > Michael Hudson wrote: > > > > "M.-A. Lemburg" writes: > > > > > import sys, re, string > > > _sys_version_parser = re.compile('([\w.]+)\s*' > > > > You need a \+? in here somewhere! Or a + inside the []. > > True and thanks for the hint, but still: why does Python exit ? 
Well, even if I leave it out, it doesn't exit for me: $ ~/src/python/dist/src/build/python foo4.py 2.2b1+ (#1, Nov 7 2001, 05:07:34) [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] Traceback (most recent call last): File "foo4.py", line 37, in ? print _sys_version() File "foo4.py", line 26, in _sys_version version, buildno, builddate, buildtime, compiler = \ AttributeError: 'NoneType' object has no attribute 'groups' That's from this morning. How are you testing? I can't provoke any weird behaviour whatever I do. This is on NT4, btw. Are you sure there's nothing eating the exception? Cheers, M. -- at any rate, I'm satisfied that not only do they know which end of the pointy thing to hold, but where to poke it for maximum effect. -- Eric The Read, asr, on google.com From mal@lemburg.com Wed Nov 7 18:08:41 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 07 Nov 2001 19:08:41 +0100 Subject: [Python-Dev] Problem in SRE ? References: <3BE9525E.E5534D83@lemburg.com> <2m1yjaoa5k.fsf@starship.python.net> <3BE9761C.1C34DC8E@lemburg.com> Message-ID: <3BE978A9.20212FFB@lemburg.com> Solved... "M.-A. Lemburg" wrote: > > Michael Hudson wrote: > > > > "M.-A. Lemburg" writes: > > > > > I've just run across a strange problem with SRE. The following > > > code does not run with the current CVS version -- the program > > > simply exits without notice even though there doesn't seem > > > to be an exit() or abort() call in SRE. > > > > Well, your regexp doesn't work for the given version: > > > > > Note that sys.version is > > > '2.2b1+ (#59, Nov 7 2001, 12:57:29) \n[GCC pgcc-2.95.2 19991024 > > > (release)]' > > > > > > Any ideas ? > > > > > > -- > > > import sys, re, string > > > _sys_version_parser = re.compile('([\w.]+)\s*' > > > > You need a \+? in here somewhere! Or a + inside the []. > > True and thanks for the hint, but still: why does Python exit ? > > I'd expect a None return, or rather an attribute error since > I'm asking for the .groups() method of None.
> > Something is either wrong with my compiler or some attribute lookup > code (or both). Sorry about the mixup: the application which was using the code was masking the AttributeError and did the sys.exit(). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Wed Nov 7 21:05:42 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 07 Nov 2001 22:05:42 +0100 Subject: [Python-Dev] Proposal - KeyboardInterrupt should inherit directly from Exception References: <20011107083349.A19554@gerg.ca> Message-ID: <3BE9A226.B1D7F31E@lemburg.com> Skip Montanaro wrote: > > Greg> Hmmm... does anyone else habitually write > > Greg> if __name__ == "__main__": > Greg> try: > Greg> main() > Greg> except KeyboardInterrupt: > Greg> sys.exit("interrupted") > > This suggests one other enhancement to me. Just as raising SystemExit > doesn't generate a traceback, perhaps the default handling of > KeyboardInterrupt could be configurable so I could set (for example): > > sys.gen_kbi_traceback = 0 > sys.gen_kbi_message = 0 > > to suppress traceback and/or message output without having to do > explicitly catch it. The default for both would be 1 to remain > compatible with current behavior. If you caught the exception, these > would have no effect. Isn't this already possible using one of the display hooks in sys, e.g. sys.excepthook() ?! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From martin@v.loewis.de Wed Nov 7 21:21:01 2001 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Wed, 7 Nov 2001 22:21:01 +0100 Subject: [Python-Dev] switch-based programming in Python Message-ID: <200111072121.fA7LL1K02432@mira.informatik.hu-berlin.de> > Wouldn't it make sense to enable the byte code compiler to take the > above construct and turn it into a dictionary based switch statement > ? That won't work. You cannot know what type "x" has, so you don't know in advance how "x == 'one'" is evaluated. Regards, Martin From greg@cosc.canterbury.ac.nz Wed Nov 7 23:29:24 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 08 Nov 2001 12:29:24 +1300 (NZDT) Subject: [Python-Dev] Re: Future division detection In-Reply-To: Message-ID: <200111072329.MAA11508@s454.cosc.canterbury.ac.nz> com-nospam@ccraig.org (Christopher A. Craig): > I would like > to have an rational cast back to a integer iff (1) the denominator is 1 > and (2) future division is active. > > If I don't check (2) then I get the situation that if future division > is inactive then `(rational('1/3')*3)/5` would yield 0 instead of > 1/5. I think what you're asking is impossible. Remember that "future division is active" is a compile-time, per-module notion. Some parts of the code may be doing old-style divisions and other parts new-style divisions. In the above example, at the time when you do the multiplication by 3, there is no way of knowing what sort of division might be performed on the result at some later time, maybe in another module where the future division option was set differently. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Thu Nov 8 00:04:21 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 08 Nov 2001 13:04:21 +1300 (NZDT) Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <3BE951F4.3B79913C@lemburg.com> Message-ID: <200111080004.NAA11516@s454.cosc.canterbury.ac.nz> "M.-A. Lemburg" : > Wouldn't it make sense to enable the byte code compiler to take > the above construct and turn it into a dictionary based > switch statement ? That's an interesting idea. +1 on giving it some more thought. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From andymac@bullseye.apana.org.au Wed Nov 7 20:47:46 2001 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Thu, 8 Nov 2001 07:47:46 +1100 (EDT) Subject: [Python-Dev] RE: Program very slow to finish In-Reply-To: Message-ID: On Tue, 6 Nov 2001, Andrew MacIntyre wrote: > One idea that occurs to me, but probably has more warts than benefits, is > to offer a PYTHON22.DLL compiled with PyMalloc (complete with > "Experimental!" labelling all over it) as a drop-in replacement for the > standard DLL. A note that points out its possible benefits wrt the sort > of problem that kicked this off might garner enough testing to make some > progress. Forget this idea - the exported entry points for the DLL will be different :-( -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From barry@zope.com Thu Nov 8 03:17:10 2001 From: barry@zope.com (Barry A. 
Warsaw) Date: Wed, 7 Nov 2001 22:17:10 -0500 Subject: [Python-Dev] Proposal - KeyboardInterrupt should inherit directly from Exception References: Message-ID: <15337.63798.224470.209674@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: SM> I suggest that KeyboardInterrupt should also inherit from SM> Exception, and not StandardError. It doesn't sound completely unreasonable, but I'd be -1 on it for Python 2.2. >>>>> "GW" == Greg Ward writes: GW> Hmmm... does anyone else habitually write | if __name__ == "__main__": | try: | main() | except KeyboardInterrupt: | sys.exit("interrupted") No, but I sometimes put "pass" in the except-suite of a KeyboardInterrupt. GW> And is anyone else sick of doing this? Not really. I only do it for one or two daemon main loops, generally never for plain scripts. Hitting C-c and seeing the traceback is usually fine, and often exactly what I want! E.g. I want to know that I had to kill it in the take_yer_time_figgering_this_one_out() method. >>>>> "Fred" == Fred L Drake, Jr writes: Fred> import errno | # output result to file... | try: | write_result() # or whatever it really is... | except IOError, e: | if e.errno != errno.EPIPE: | raise I do stuff like this all the time, although it's usually OSError. I love that OSError and IOError have a common base class! I've often wanted all the errno's to be transformed into subclasses of IOError/OSError, so I could just do something like: try: os.mkdir(...) except OSErrorEEXIST: pass # any other OSError propagates up -Barry From barry@zope.com Thu Nov 8 03:43:02 2001 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 7 Nov 2001 22:43:02 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.287,2.288 References: <15337.37707.681014.720550@slothrop.digicool.com> Message-ID: <15337.65350.808408.163204@anthem.wooz.org> [Followups redirected to python-dev... 
-BAW] >>>>> "MAL" == M writes: | Modified Files: | ceval.c | Log Message: | Add fast-path for comparing interned (true) string objects. MAL> This patch boosts performance for comparing identical string MAL> object by some 20% on my machine while not causing any MAL> noticable slow-down for other operations (according to tests MAL> done with pybench). >>>>> "JH" == Jeremy Hylton writes: JH> Hey! Don't do that. I had a similar reaction! :) JH> The last time this came up, I thought there was a pretty clear JH> conclusion that we did not want to make thise change. Tim's objection is here: http://mail.python.org/pipermail/python-dev/2001-October/018274.html Since there isn't concensus, I don't think it should go into Python 2.2 at this late date. Please back this one out MAL! -Barry From mal@lemburg.com Thu Nov 8 08:43:56 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 08 Nov 2001 09:43:56 +0100 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.287,2.288 References: <15337.37707.681014.720550@slothrop.digicool.com> <15337.65350.808408.163204@anthem.wooz.org> Message-ID: <3BEA45CC.9E948F2B@lemburg.com> "Barry A. Warsaw" wrote: > > [Followups redirected to python-dev... -BAW] > > >>>>> "MAL" == M writes: > > | Modified Files: > | ceval.c > | Log Message: > | Add fast-path for comparing interned (true) string objects. > > MAL> This patch boosts performance for comparing identical string > MAL> object by some 20% on my machine while not causing any > MAL> noticable slow-down for other operations (according to tests > MAL> done with pybench). > > >>>>> "JH" == Jeremy Hylton writes: > > JH> Hey! Don't do that. > > I had a similar reaction! :) Sorry about that. > JH> The last time this came up, I thought there was a pretty clear > JH> conclusion that we did not want to make thise change. 
> > Tim's objection is here: > > http://mail.python.org/pipermail/python-dev/2001-October/018274.html > > Since there isn't concensus, I don't think it should go into Python > 2.2 at this late date. Ok. Even though I can't really understand why nobody seems to be doing any benchmarking... all the talk has been theoretical except for Martin who posted usage numbers (not timings). I did run benchmark tests and checked in the patch only because I didn't find any significant change in performance for code not triggering the fast path. Well, perhaps we can look into this again for 2.3. I believe that some Python applications could benefit from this kind of small enhancement (also see the other thread "switch programming in Python"). > Please back this one out MAL! Done. Sorry guys, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Thu Nov 8 08:50:59 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 08 Nov 2001 09:50:59 +0100 Subject: [Python-Dev] switch-based programming in Python References: <200111072121.fA7LL1K02432@mira.informatik.hu-berlin.de> Message-ID: <3BEA4773.6FE53E12@lemburg.com> "Martin v. Loewis" wrote: > > > Wouldn't it make sense to enable the byte code compiler to take the > > above construct and turn it into a dictionary based switch statement > > ? > > That won't work. You cannot know what type "x" has, so you don't know > in advance how "x == 'one'" is evaluated. But you do know that x won't change from one compare to the next, so a single dictionary lookup could replace the equality tests (provided that x is hashable). As mentioned in the posting: the compiler probably has to be given some extra information to enable this sort of optimization. I'm not sure how the information could be "encoded", though. 
Suggestions are appreciated, as always :-) What do you think about the general idea, BTW ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Thu Nov 8 08:59:17 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 08 Nov 2001 09:59:17 +0100 Subject: [Python-Dev] __slots__ (Change in evaluation order in new object model) References: Message-ID: <3BEA4965.5589BDFE@lemburg.com> Tim Peters wrote: > > [Michael McLay] > > ... > > The lookup of a member is also faster because it uses a lookup of > > an offset instead of a dictionary lookup. > > There's still a dict lookup: when you do obj.a where a is a __slot__ > attribute of obj.__class__, obj.__class__.__dict__['a'] is looked up in > order to get the descriptor for attribute 'a'. The fixed set of __slot__ > attributes leaves a door open for future optimizations, though (e.g., if > Python could *know* obj.__class__ at compile-time, and know that runtime > code won't overwrite the 'a' descriptor in obj.__class__.__dict__, it could > map obj.a directly to its storage offset (from the base of obj) at > compile-time). That'd be cool :-) Say, would it also be possible to use __slots__ for methods ? Or even make all methods defined in the class automagically become __slots__ members ? (As I understood your explanations, __slots__ would not interfere with class attributes, only instance attributes, so this should be possible, right ?) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From martin@v.loewis.de Thu Nov 8 09:12:37 2001 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Thu, 8 Nov 2001 10:12:37 +0100 Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <3BEA4773.6FE53E12@lemburg.com> (mal@lemburg.com) References: <200111072121.fA7LL1K02432@mira.informatik.hu-berlin.de> <3BEA4773.6FE53E12@lemburg.com> Message-ID: <200111080912.fA89Cbn01413@mira.informatik.hu-berlin.de> > > That won't work. You cannot know what type "x" has, so you don't know > > in advance how "x == 'one'" is evaluated. > > But you do know that x won't change from one compare to the next, > so a single dictionary lookup could replace the equality tests > (provided that x is hashable). How do you know x won't change? I certainly can write a class where it does. The key issue is that x must be hashable. If it is (including the constraint that a==b implies hash(a)==hash(b)), then I agree that this transformation would work. > What do you think about the general idea, BTW ? I'm also uncertain that this would give any speed-up. I assume you want to generate a dictionary {rhs-string : byte-code-address} or the like. I'm not convinced that the dictionary lookup + computed goto is necessarily faster than the compare sequence; this could be established only by implementing it (you don't need to implement the parser/compiler aspects, just the changes to ceval.c). There are also some security aspects here: I assume you'll put the dictionary into the constant pool (co_consts). Of course, a dictionary is not const, so somebody may change the dictionary, thus letting you jump to code positions which were not intended as jump targets. Finally, I guess tools analysing byte code will be confused. So I'm -1 on it until I see that it actually does any good. Then I'm -0 until I see that it does that good in real-life applications (of course, your application would be one, but I'd like to see a second one :-) Regards, Martin From martin@v.loewis.de Thu Nov 8 09:31:16 2001 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Thu, 8 Nov 2001 10:31:16 +0100 Subject: [Python-Dev] python/dist/src/Lib/lib-tk tkFileDialog.py,1.2,1.3 Message-ID: <200111080931.fA89VG301500@mira.informatik.hu-berlin.de> [addition of tkFileDialog.askdirectory] > ahem. I did check in a tkDirectoryChooser module last > week... Sorry, didn't notice (obviously); you didn't put anything into Misc/NEWS or the SF patch, either... So which one would you like to preserve? I notice that there is a naming inconsistency also: it is tkFileDialog, but tkDirectory*Chooser*... I'd propose to migrate tkDirectoryChooser.Chooser._fixresult into tkFileDialog.Directory._fixresult, and remove the new module. Please let me know what you think. Regards, Martin From mwh@starship.python.net Thu Nov 8 10:24:53 2001 From: mwh@starship.python.net (Michael Hudson) Date: Thu, 08 Nov 2001 05:24:53 -0500 Subject: [Python-Dev] Proposal - KeyboardInterrupt should inherit directly from Exception References: <20011107083349.A19554@gerg.ca> Message-ID: Skip Montanaro writes: > [KeyboardInterrupt] Parenthetically, could someone explain why this doesn't work[1]: >>> import signal >>> def interrupt(signum, frame): ... sys.exit(1) ... >>> signal.signal(signal.SIGINT, interrupt) >>> ^C KeyboardInterrupt >>> ^C KeyboardInterrupt >>> ^C KeyboardInterrupt and yet: >>> def testit(): ... signal.signal(signal.SIGINT, interrupt) ... while 1: ... pass ... >>> testit() ^C[mwh@starship mwh]$ I think I hate readline. Cheers, M. [1] Actually I know, on thinking about it, but it still sucks. Is this worth working around? -- You have run into the classic Dmachine problem: your machine has become occupied by a malevolent spirit. Replacing hardware or software will not fix this - you need an exorcist. 
-- Tim Bradshaw, comp.lang.lisp From thomas@xs4all.net Thu Nov 8 10:50:18 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 8 Nov 2001 11:50:18 +0100 Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <3BE951F4.3B79913C@lemburg.com> References: <3BE951F4.3B79913C@lemburg.com> Message-ID: <20011108115018.A474@xs4all.nl> On Wed, Nov 07, 2001 at 04:23:32PM +0100, M.-A. Lemburg wrote: > Currently, dispatching of execution based on the value of > one variable is usually implemented by having some dictionary > of possible values and then calling method which implement the > different branches of execution. This works well for code which > uses medium sized methods, but fails badly for small ones such > as code which is often used in method callback based parsers. Why does it not work well for small-sized methods ? And how do you define 'well' ? Would a lengthy if/elif/elif/else construct work better ? Why ? And if not, what _would_ work better ? Or did you mean 'small numbers of methods' ? > The alternative is using lengthy > if x == 'one': ... > elif x == 'two': ... > elif x == 'three': ... > else: ...default case... > constructs. > Wouldn't it make sense to enable the byte code compiler to take > the above construct and turn it into a dictionary based > switch statement ? Frankly, no, it wouldn't make sense. Not only do we not know the type of 'x' at compile time, but we also don't know how often 'x' changes type when you start comparing it with any kind of object. What _might_ work is something like: x_key = str(x) if x == 'one': ... [etc] But that would require a type inference mechanism first, and a is-str-being-masked-or-has-builtins-been-modified check. I think your best bet for adding those is in an external (python-written) bytecode-optimizer with complicated flow analysis. 
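
The objection raised here and in Martin's earlier reply (the compiler cannot know what `x == 'one'` will do) can be made concrete. The following is only a sketch, written for present-day Python 3 rather than the Python 2.2 under discussion. The class `CaseBlind` and the helpers `chain` and `table` are hypothetical names invented for illustration. It shows an object for which the if/elif chain and a dictionary lookup disagree, which is why a silent compiler rewrite of one into the other would not be safe in general.

```python
class CaseBlind:
    """Hypothetical type whose == ignores case.

    In Python 3, defining __eq__ without __hash__ leaves instances
    unhashable, so they cannot be used as dictionary keys.
    """

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        return isinstance(other, str) and self.text.lower() == other.lower()


def chain(x):
    # The if/elif form: each test calls x.__eq__ in turn.
    if x == 'one':
        return 1
    elif x == 'two':
        return 2
    else:
        return 0


TABLE = {'one': 1, 'two': 2}


def table(x):
    # The "optimized" form: a single hash lookup.
    return TABLE.get(x, 0)


x = CaseBlind('ONE')
print(chain(x))                    # the chain takes the first branch
try:
    print(table(x))
except TypeError as exc:           # hashing x fails before any comparison
    print('dict lookup failed:', exc)
print(chain('two'), table('two'))  # plain strings behave identically
```

For plain strings the two forms agree, which is the case the proposed optimization targets. The difficulty is only that the compiler cannot prove at compile time that `x` will be a plain string.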
If the problem with dict-based switches is the clumsiness of declaration,
maybe something like this would improve matters:

disp = dispatcher.dispatcher(
    one=lambda x: ...
    two=lambda x: ...
    three=lambda x: ...
    four=func_for_four
    five=lambda: pass)

disp.dispatch(x)

Or maybe you'd prefer using strings containing code fragments and exec'ing
them. I'm not sure what you want the if/else to actually do. Personally, I
either need a function call in there (in which case a dispatch table calling
the function directly, sometimes with apply() or lambda-wrapper tricks, does
fine) or some kind of variable assignment, in which case a simple dict
lookup works just as fine. Then again, I don't write that much Python code.

> I'm not talking about adding syntax to the language, it would
> just be nice to have the compiler recognize this kind of code
> (somehow; perhaps with some extra help) and produce optimized
> code for it, possibly using new opcodes for the switching
> operation.

I personally wouldn't be adverse to a switch-like syntax, as long as we
define it like a dict dispatch (the argument is evaluated once, it should be
hashable, and all the 'cases' should be hashable -- preferably even
compile-time constants.) I like the idea, I'm just not sure if there's
enough use for it.

-- 
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From mal@lemburg.com Thu Nov 8 10:55:10 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 08 Nov 2001 11:55:10 +0100
Subject: [Python-Dev] switch-based programming in Python
References: <200111072121.fA7LL1K02432@mira.informatik.hu-berlin.de> <3BEA4773.6FE53E12@lemburg.com> <200111080912.fA89Cbn01413@mira.informatik.hu-berlin.de>
Message-ID: <3BEA648E.AD3968D1@lemburg.com>

"Martin v. Loewis" wrote:
>
> > > That won't work. You cannot know what type "x" has, so you don't know
> > > in advance how "x == 'one'" is evaluated.
> > > > But you do know that x won't change from one compare to the next, > > so a single dictionary lookup could replace the equality tests > > (provided that x is hashable). > > How do you know x won't change? I certainly can write a class where it > does. You mean one where calling the compare slot causes the object value to change ? Ok, you can always construct a case where this fails due to the dynamic nature of Python -- that's why the compiler will probably need some extra information to do the right thing here. > The key issue is that x must be hashable. If it is (including the > constraint that a==b implies hash(a)==hash(b)), then I agree that this > transformation would work. Dito. > > What do you think about the general idea, BTW ? > > I'm also uncertain that this would give any speed-up. I assume you > want to generate a dictionary {rhs-string : byte-code-address} or the > like. I'm not convinced that the dictionary lookup + computed goto is > necessarily faster than the compare sequence; It would be for large if...elif...elif...else switches which is what I'm after here. These constructs are currently not used so much in Python because of them being rather slow (O(n) on average rather than O(1) for perfect hash tables). > this could be > established only by implementing it (you don't need to implement > the parser/compiler aspects, just the changes to ceval.c). Hmm. How is that supposed to work ? I would like the compiler to generate different code for these "switch" statements. It would also have to generate the hash table and store it in the constants. > There are also some security aspects here: I assume you'll put the > dictionary into the constant pool (co_consts). Of course, a dictionary > is not const, so somebody may change the dictionary, thus letting you > jump to code positions which were not intended as jump targets. We'd need a perfect hash table object for this which would have to be read-only by nature. 
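
A read-only table of the kind asked for here can be approximated in present-day Python 3 with `types.MappingProxyType`. This is only a sketch under stated assumptions: the helper `make_jump_table` and the offsets are hypothetical, the proxy is an ordinary hash table rather than a perfect hash, and it is read-only only as long as nobody keeps a reference to the underlying dict.

```python
from types import MappingProxyType


def make_jump_table(targets):
    # Copy the mapping, then expose only a read-only view of the copy.
    # The private copy is unreachable from outside, so the view cannot
    # be changed through normal item assignment.
    return MappingProxyType(dict(targets))


# Hypothetical case-value -> code-offset table.
jump_table = make_jump_table({'one': 14, 'two': 22, 'three': 30})

print(jump_table.get('two', 38))    # a known case
print(jump_table.get('nine', 38))   # falls back to the default offset

try:
    jump_table['four'] = 99          # mutation is rejected
except TypeError as exc:
    print('rejected:', exc)
```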
> Finally, I guess tools analysing byte code will be confused. True, since we'd probably need some new opcodes for this. > So I'm -1 on it until I see that it actually does any good. Then I'm > -0 until I see that it does that good in real-life applications (of > course, your application would be one, but I'd like to see a second > one :-) Well, just try to write an XML parser using mxTextTools and the taggin engine which then generates a tag list to be processed in Python by an if..elif...else "switch" statement and compare the speed to a method call based one. You'll note the difference in performance (and have a second application ;-). This is just one aspect, though. I think that a lot more state machine like code could be written in Python if well-performing "switches" would be possible in Python. That would keep the requirement to write C code for fast execution small and reduce the need for callbacks a lot. The net result for these application would be a significant win in performance and flexibility. Now how could the compiler be provided with the needed information... ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From thomas@xs4all.net Thu Nov 8 11:18:53 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 8 Nov 2001 12:18:53 +0100 Subject: [Python-Dev] Proposal - KeyboardInterrupt should inherit directly from Exception In-Reply-To: References: <20011107083349.A19554@gerg.ca> Message-ID: <20011108121853.C474@xs4all.nl> On Thu, Nov 08, 2001 at 05:24:53AM -0500, Michael Hudson wrote: > >>> signal.signal(signal.SIGINT, interrupt) > > >>> ^C > KeyboardInterrupt [..] > I think I hate readline. Exactly :) > [1] Actually I know, on thinking about it, but it still sucks. > Is this worth working around? 
I don't think so, but it's worth keeping in mind when writing your readline replacement, or your REPL/interactive-interpreter replacement. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From skip@pobox.com (Skip Montanaro) Thu Nov 8 11:31:42 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 8 Nov 2001 12:31:42 +0100 Subject: [Python-Dev] Proposal - KeyboardInterrupt should inherit directly from Exception In-Reply-To: <3BE9A226.B1D7F31E@lemburg.com> References: <20011107083349.A19554@gerg.ca> <3BE9A226.B1D7F31E@lemburg.com> Message-ID: <15338.27934.487154.659925@beluga.mojam.com> Skip> Just as raising SystemExit doesn't generate a traceback, perhaps Skip> the default handling of KeyboardInterrupt could be configurable so Skip> I could set ... mal> Isn't this already possible using one of the display hooks in mal> sys, e.g. sys.excepthook() ?! Right you are. I wasn't even aware of its existence. Skip From mwh@python.net Thu Nov 8 11:32:53 2001 From: mwh@python.net (Michael Hudson) Date: 08 Nov 2001 06:32:53 -0500 Subject: [Python-Dev] Proposal - KeyboardInterrupt should inherit directly from Exception In-Reply-To: Thomas Wouters's message of "Thu, 8 Nov 2001 12:18:53 +0100" References: <20011107083349.A19554@gerg.ca> <20011108121853.C474@xs4all.nl> Message-ID: <2meln9v6oq.fsf@starship.python.net> Thomas Wouters writes: > On Thu, Nov 08, 2001 at 05:24:53AM -0500, Michael Hudson wrote: > > [1] Actually I know, on thinking about it, but it still sucks. > > Is this worth working around? > > I don't think so, but it's worth keeping in mind when writing your > readline replacement, or your REPL/interactive-interpreter > replacement. Well, it works in pyrepl but that's mainly because it's implemented in a sensible language with a sane approach to exceptions and so doesn't really have to worry about signals (apart from SIGWINCH...). I really ought to release the working version of that I have. 
Cheers, M. -- ZAPHOD: You know what I'm thinking? FORD: No. ZAPHOD: Neither do I. Frightening isn't it? -- The Hitch-Hikers Guide to the Galaxy, Episode 11 From skip@pobox.com (Skip Montanaro) Thu Nov 8 11:43:13 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 8 Nov 2001 12:43:13 +0100 Subject: [Python-Dev] Proposal - KeyboardInterrupt should inherit directly from Exception In-Reply-To: <15337.63798.224470.209674@anthem.wooz.org> References: <15337.63798.224470.209674@anthem.wooz.org> Message-ID: <15338.28625.936895.572364@beluga.mojam.com> SM> I suggest that KeyboardInterrupt should also inherit from SM> Exception, and not StandardError. BAW> It doesn't sound completely unreasonable, but I'd be -1 on it for BAW> Python 2.2. Yeah, that's why I didn't ask "I know we're in feature freeze, but can we please please please make this one itty bitty change? Please please please? Did I forget to mention please please please?" ;-) BAW> I do stuff like this all the time, although it's usually OSError. BAW> I love that OSError and IOError have a common base class! I've BAW> often wanted all the errno's to be transformed into subclasses of BAW> IOError/OSError, so I could just do something like: BAW> try: BAW> os.mkdir(...) BAW> except OSErrorEEXIST: BAW> pass BAW> # any other OSError propagates up Why not make them all subclasses of OSError that are also attributes of the OSError class. You'd then have try: os.mkdir(...) except OSError.EEXIST: pass # any other OSError propagates up which seems a little less builtin namespace polluting to me. After all, there are quite a few signals, yes? Skip From mal@lemburg.com Thu Nov 8 11:32:39 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 08 Nov 2001 12:32:39 +0100 Subject: [Python-Dev] switch-based programming in Python References: <3BE951F4.3B79913C@lemburg.com> <20011108115018.A474@xs4all.nl> Message-ID: <3BEA6D57.E381D6AB@lemburg.com> Thomas Wouters wrote: > > On Wed, Nov 07, 2001 at 04:23:32PM +0100, M.-A. Lemburg wrote: > > > Currently, dispatching of execution based on the value of > > one variable is usually implemented by having some dictionary > > of possible values and then calling method which implement the > > different branches of execution. This works well for code which > > uses medium sized methods, but fails badly for small ones such > > as code which is often used in method callback based parsers. > > Why does it not work well for small-sized methods ? The time needed for the call overhead is significant compared to the execution time for the code of the small-sized method, e.g. like in: def handle_data(self, data): self.data.append(data) > And how do you define > 'well' ? 'well' == fast :-) > Would a lengthy if/elif/elif/else construct work better ? Why ? Yes, because it doesn't involve calling methods, no execution frames have to be setup, no arguments need to be passed in, state can be managed in local variables, etc. > And > if not, what _would_ work better ? Or did you mean 'small numbers of > methods' ? No. > > The alternative is using lengthy > > if x == 'one': ... > > elif x == 'two': ... > > elif x == 'three': ... > > else: ...default case... > > constructs. > > > Wouldn't it make sense to enable the byte code compiler to take > > the above construct and turn it into a dictionary based > > switch statement ? > > Frankly, no, it wouldn't make sense. Not only do we not know the type of 'x' > at compile time, but we also don't know how often 'x' changes type when you > start comparing it with any kind of object. What _might_ work is something > like: > > x_key = str(x) > if x == 'one': ... 
> [etc] > > But that would require a type inference mechanism first, and a > is-str-being-masked-or-has-builtins-been-modified check. I think your best > bet for adding those is in an external (python-written) bytecode-optimizer > with complicated flow analysis. Hmm, I'd rather not use a special compiler for this... ideal would be a simple mechanism to pass the needed information ("please compile this using a dictionary dispatch functionality") to the existing compiler. > If the problem with dict-based switches is the clumsiness of declaration, > maybe something like this would improve matters: > > disp = dispatcher.dispatcher( > one=lambda x: ... > two=lambda x: ... > three=lambda x: ... > four=func_for_four > five=lambda: pass) > > disp.dispatch(x) It would if the lambdas would be inlined by the compiler. Otherwise, you'd have the same call overhead as for method callbacks. > Or maybe you'd prefer using strings containing code fragments and exec'ing > them. You'd have to compile the strings each time... exec'uting code objects would be better, but then you again have to go through the trouble of initializing locals etc. > I'm not sure what you want the if/else to actually do. Personally, I > either need a function call in there (in which case a dispatch table calling > the function directly, sometimes with apply() or lambda-wrapper tricks, does > fine) or some kind of variable assignment, in which case a simple dict > lookup works just as fine. Then again, I don't write that much Python code. You don't ? > > I'm not talking about adding syntax to the language, it would > > just be nice to have the compiler recognize this kind of code > > (somehow; perhaps with some extra help) and produce optimized > > code for it, possibly using new opcodes for the switching > > operation. 
>
> I personally wouldn't be adverse to a switch-like syntax, as long as we
> define it like a dict dispatch (the argument is evaluated once, it should be
> hashable, and all the 'cases' should be hashable -- preferably even
> compile-time constants.) I like the idea, I'm just not sure if there's
> enough use for it.

That's the idea.

There's enough need in it for my applications, so I'd go through
the trouble of writing the code for it, provided I get the OK
and help from python-dev.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/

From skip@pobox.com (Skip Montanaro) Thu Nov 8 12:01:44 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 8 Nov 2001 13:01:44 +0100
Subject: [Python-Dev] switch-based programming in Python
In-Reply-To: <3BEA4773.6FE53E12@lemburg.com>
References: <200111072121.fA7LL1K02432@mira.informatik.hu-berlin.de> <3BEA4773.6FE53E12@lemburg.com>
Message-ID: <15338.29736.970751.455689@beluga.mojam.com>

    mal> As mentioned in the posting: the compiler probably has to be given
    mal> some extra information to enable this sort of optimization. I'm not
    mal> sure how the information could be "encoded", though. Suggestions
    mal> are appreciated, as always :-)

I don't think you need anything extra if the RHS of the == is a hashable
literal of some sort and the LHS is always the same simple variable or
subscript expression. If the compiler can recognize the structure (that may
be a big "if"), all you need is a dictionary of offsets stored in the
function's constants. You just execute the equivalent of

        offset = jumptable.get(x, E)
        jumpby offset
    label 'one':
        ...
        jumpto endofswitch
    label 'two':
        ...
        jumpto endofswitch
    ...
    else:
        ...
        jumpto endofswitch

"E" is the offset of the else clause. If none exists, E == the offset of
endofswitch.
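
The jump-table scheme above can be simulated in ordinary Python 3 to see how control would flow. This is a toy sketch, not the interpreter's actual machinery. The opcodes `emit` and `jumpto`, the function `run`, and the list-index "offsets" are all assumptions made for illustration, and absolute offsets are used where the pseudocode's `jumpby` suggests relative ones.

```python
def run(x):
    # Toy "bytecode": a flat list of (opcode, argument) pairs.
    code = [
        ('emit', 'one'),  ('jumpto', 6),   # offsets 0-1: label 'one'
        ('emit', 'two'),  ('jumpto', 6),   # offsets 2-3: label 'two'
        ('emit', 'else'), ('jumpto', 6),   # offsets 4-5: else clause
    ]
    jumptable = {'one': 0, 'two': 2}       # case value -> offset
    E = 4                                   # offset of the else clause
    endofswitch = len(code)

    out = []
    pc = jumptable.get(x, E)                # one hash lookup picks the branch
    while pc != endofswitch:
        op, arg = code[pc]
        if op == 'emit':
            out.append(arg)
            pc += 1
        else:                               # 'jumpto'
            pc = arg
    return out


for value in ('one', 'two', 'seven'):
    print(value, '->', run(value))
```

However many cases there are, selecting the branch costs a single `jumptable.get` call, whereas the if/elif chain needs up to one comparison per case. An unhashable `x` would raise `TypeError` at the lookup, which the if/elif form would not do.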
It's possible that you can just analyze the parse tree before generating the code. If it matches the desired pattern you transform it to a switch-like structure. Of course, this presumes you use a parse tree as input to code generation, unlike the current C-based compiler (but like the Python-based compiler does?) Skip From fdrake@acm.org Thu Nov 8 13:38:58 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 8 Nov 2001 08:38:58 -0500 Subject: [Python-Dev] python/dist/src/Lib/lib-tk tkFileDialog.py,1.2,1.3 In-Reply-To: <200111080931.fA89VG301500@mira.informatik.hu-berlin.de> References: <200111080931.fA89VG301500@mira.informatik.hu-berlin.de> Message-ID: <15338.35570.360546.425684@grendel.zope.com> Martin v. Loewis writes: > I'd propose to migrate tkDirectoryChooser.Chooser._fixresult into > tkFileDialog.Directory._fixresult, and remove the new module. I like this better too. Though it's something of a style issue, I really don't like ending up with a whole lot of really small modules that are so closely related. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From fredrik@pythonware.com Thu Nov 8 13:02:32 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 8 Nov 2001 14:02:32 +0100 Subject: [Python-Dev] Re: python/dist/src/Lib/lib-tk tkFileDialog.py,1.2,1.3 References: <200111080931.fA89VG301500@mira.informatik.hu-berlin.de> Message-ID: <027901c16855$a13616f0$0900a8c0@spiff> > Sorry, didn't notice (obviously); you didn't put anything into > Misc/NEWS or the SF patch, either... didn't know about the patch (and whoever submitted the patch didn't know about the directory chooser module, which has been in circulation for quite a while...) > So which one would you like to preserve? I notice that there is a > naming inconsistency also: it is tkFileDialog, but > tkDirectory*Chooser*... 
it's inherited from Tk (they use "choosers" for colours and directories, but not for files) > I'd propose to migrate tkDirectoryChooser.Chooser._fixresult into > tkFileDialog.Directory._fixresult, and remove the new module. that'll break my documentation, but I can fix that if you fix the code. From skip@pobox.com (Skip Montanaro) Thu Nov 8 14:34:47 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 8 Nov 2001 15:34:47 +0100 Subject: [Python-Dev] Re: complex() bug or feature? (fwd) Message-ID: <15338.38919.143785.962636@beluga.mojam.com> The topic of the behavior and documentation of complex() in the face of string args came up in c.l.py. I believe the current behavior and documentation are both in error. I have a patch, test case and doc fixes that I still need to run. Looks like a probably bug fix for 2.2b2 to me. I'll submit a bug report and patch and assign to Tim. Skip ------- start of forwarded message (RFC 934 encapsulation) ------- From: Skip Montanaro Sender: python-list-admin@python.org To: "Steve Holden" Cc: python-list@python.org Subject: Re: complex() bug or feature? Date: Thu, 08 Nov 2001 15:23:58 +0100 Reply-To: skip@pobox.com (Skip Montanaro) Steve> Well the documentation should really make it clear that the Steve> single string argument case is completely different from Steve> the single numeric argument case. The former uses an Steve> implied zero as the imaginary component, whereas the latter Steve> extracts the imaginary component from the string. Steve> And yes, the implementation *should* raise an exception Steve> with a string first argument and any second argument. But Steve> the docs could use clarification. I've got a patch for the code I need to test and will generate a patch for the doc, then submit both later today. I suspect this will make it into 2.2. 
- -- Skip Montanaro (skip@pobox.com) http://www.mojam.com/ http://www.musi-cal.com/ ------- end ------- From barry@zope.com Thu Nov 8 15:05:15 2001 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 8 Nov 2001 10:05:15 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.287,2.288 References: <15337.37707.681014.720550@slothrop.digicool.com> <15337.65350.808408.163204@anthem.wooz.org> <3BEA45CC.9E948F2B@lemburg.com> Message-ID: <15338.40747.469446.516603@anthem.wooz.org> >>>>> "M" == M writes: M> Well, perhaps we can look into this again for 2.3. Let's do. >> Please back this one out MAL! M> Done. Thanks MAL! -Barry From barry@zope.com Thu Nov 8 15:18:55 2001 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 8 Nov 2001 10:18:55 -0500 Subject: [Python-Dev] Proposal - KeyboardInterrupt should inherit directly from Exception References: <15337.63798.224470.209674@anthem.wooz.org> <15338.28625.936895.572364@beluga.mojam.com> Message-ID: <15338.41567.871707.308417@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: SM> Yeah, that's why I didn't ask "I know we're in feature freeze, SM> but can we please please please make this one itty bitty SM> change? Please please please? Did I forget to mention please SM> please please?" ;-) :) BAW> I do stuff like this all the time, although it's usually BAW> OSError. I love that OSError and IOError have a common base BAW> class! I've often wanted all the errno's to be transformed BAW> into subclasses of IOError/OSError, so I could just do BAW> something like: BAW> try: os.mkdir(...) except OSErrorEEXIST: pass # any other BAW> OSError propagates up SM> Why not make them all subclasses of OSError that are also SM> attributes of the OSError class. You'd then have | try: | os.mkdir(...) | except OSError.EEXIST: | pass | # any other OSError propagates up SM> which seems a little less builtin namespace polluting to me. SM> After all, there are quite a few signals, yes? 
Y'know, that's got a certain appeal, although perhaps they ought to be added to EnvironmentError, since I think most can show up as a result of both IOError and OSError. Hmm, okay, I work this up and squeeze it into Python 2.2... ...just kidding. :) But it'd be worth looking into for Py2.3. -Barry From thomas@xs4all.net Thu Nov 8 15:24:25 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 8 Nov 2001 16:24:25 +0100 Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <3BEA6D57.E381D6AB@lemburg.com> References: <3BE951F4.3B79913C@lemburg.com> <20011108115018.A474@xs4all.nl> <3BEA6D57.E381D6AB@lemburg.com> Message-ID: <20011108162425.D474@xs4all.nl> On Thu, Nov 08, 2001 at 12:32:39PM +0100, M.-A. Lemburg wrote: > Thomas Wouters wrote: > > Would a lengthy if/elif/elif/else construct work better ? Why ? > Yes, because it doesn't involve calling methods, no execution > frames have to be setup, no arguments need to be passed in, > state can be managed in local variables, etc. > > I'm not sure what you want the if/else to actually do. Personally, I > > either need a function call in there (in which case a dispatch table calling > > the function directly, sometimes with apply() or lambda-wrapper tricks, does > > fine) or some kind of variable assignment, in which case a simple dict > > lookup works just as fine. Then again, I don't write that much Python code. > You don't ? Is that sarcasm ? :) No, I don't. My actual job, the part I get paid for, doesn't (yet) involve writing Python. It's part C and Perl, part system design, and part administration. So Python is just a hobby. > > I personally wouldn't be adverse to a switch-like syntax, as long as we > > define it like a dict dispatch (the argument is evaluated once, it should be > > hashable, and all the 'cases' should be hashable -- preferably even > > compile-time constants.) I like the idea, I'm just not sure if there's > > enough use for it. > That's the idea. 
> There's enough need in it for my applications, so I'd go through
> the trouble of writing the code for it, provided I get the OK
> and help from python-dev.

The writing part would be very tricky. I don't think you can do it without
syntax support, at least not reliably, even if 'without syntax support' is
some kind of directive statement to signal that a particular
if/elif/elif/else chain should be converted to a jump table behind the
scene.

For new syntax, I'd imagine something like this:

switch EXPR:
case CONSTANT:
    [suite]
case CONSTANT:
    [suite]
...
else:

EXPR is a normal Python expression. CONSTANT should be a hashable constant
(with a persistent hash value, duh) so we don't have to re-hash all the
cases when entering the switch. The 'else' would function like a 'default:'
case in C's switch. I'm not sure on the naming of 'switch' and 'case', nor
about the indentation-level of the 'cases'. And what to do about
fallthrough? It's commonly accepted (or at least argued :) as a design flaw
that C's switch() defaults to fallthrough. Bytecodewise it should probably
turn something like:

def whatis(x):
    switch(x):
    case 'one':
        print '1'
    case 'two':
        print '2'
    case 'three':
        print '3'
    else:
        print "D'oh!"

Into (omitting POP_TOP's and SET_LINENO's):

          6 LOAD_FAST                0 (x)
          9 LOAD_CONST               1 (switch-table-1)
         12 SWITCH                  26 (to 38)    # or maybe 'SWITCH '

         14 LOAD_CONST               2 ('1')
         17 PRINT_ITEM
         18 PRINT_NEWLINE
         19 JUMP                    43

         22 LOAD_CONST               3 ('2')
         25 PRINT_ITEM
         26 PRINT_NEWLINE
         27 JUMP                    43

         30 LOAD_CONST               4 ('3')
         33 PRINT_ITEM
         34 PRINT_NEWLINE
         35 JUMP                    43

         38 LOAD_CONST               5 ("D'oh!")
         41 PRINT_ITEM
         42 PRINT_NEWLINE

    >>   43 LOAD_CONST               0 (None)
         46 RETURN_VALUE

Where the 'SWITCH' opcode would jump to 14, 22, 30 or 38 depending on 'x'.

PEP, anyone ? :)

--
Thomas Wouters
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
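
A minimal sketch, in present-day Python, of the dict-dispatch reading of the
switch described above: the argument is evaluated once, hashed once, and
looked up in a table of constant cases, the 'else' acts as the default, and
there is no fallthrough. The names _CASES, _default and whatis are
illustrative only; this is neither the proposed syntax nor the SWITCH opcode,
and the suites return strings instead of printing them.

```python
# Hypothetical emulation of the proposed switch with a dict as the jump table.
# Each hashable constant maps to the "suite" to run, modelled as a function.
_CASES = {
    'one': lambda: '1',
    'two': lambda: '2',
    'three': lambda: '3',
}


def _default():
    # Plays the role of the 'else:' branch.
    return "D'oh!"


def whatis(x):
    # x is hashed exactly once; exactly one suite runs (no fallthrough).
    return _CASES.get(x, _default)()


if __name__ == '__main__':
    for value in ('one', 'two', 'three', 'four'):
        print(value, '->', whatis(value))
```

The table is built once at import time, which mirrors the idea of storing the
switch table in the function's constants rather than rebuilding it on every
call.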
From mwh@python.net Thu Nov 8 15:26:38 2001 From: mwh@python.net (Michael Hudson) Date: 08 Nov 2001 10:26:38 -0500 Subject: [Python-Dev] Re: complex() bug or feature? (fwd) In-Reply-To: Skip Montanaro's message of "Thu, 8 Nov 2001 15:34:47 +0100" References: <15338.38919.143785.962636@beluga.mojam.com> Message-ID: <2mwv11wafl.fsf@starship.python.net> Skip Montanaro writes: > The topic of the behavior and documentation of complex() in the face of > string args came up in c.l.py. I believe the current behavior and > documentation are both in error. I have a patch, test case and doc fixes > that I still need to run. Looks like a probably bug fix for 2.2b2 to me. > I'll submit a bug report and patch and assign to Tim. Unlike the one I just submitted and assigned to Fred? http://sourceforge.net/tracker/index.php?func=detail&aid=479551&group_id=5470&atid=305470 Does yours do things significantly differently? I'm not too enamoured with the error message in mine. using-my-unfair-advantage-of-being-in-europe-ly y'rs M. -- ARTHUR: Ford, you're turning into a penguin, stop it. -- The Hitch-Hikers Guide to the Galaxy, Episode 2 From Anthony Baxter Thu Nov 8 15:37:48 2001 From: Anthony Baxter (Anthony Baxter) Date: Fri, 09 Nov 2001 02:37:48 +1100 Subject: [Python-Dev] Proposal - KeyboardInterrupt should inherit directly from Exception In-Reply-To: Message from Michael Hudson of "Thu, 08 Nov 2001 05:24:53 CDT." Message-ID: <200111081537.fA8FbmT01915@mbuna.arbhome.com.au> >>> Michael Hudson wrote > Parenthetically, could someone explain why this doesn't work[1]: > [keyboard interrupt vs sigint] > I think I hate readline. readline messes up the signals & bitsnpieces far, far too much. In the unified messaging server at work, we reimplemented our own history &c thing in the console, because that way we knew what it was doing. Anthony. (look for another large mass of checkins in the next day on the branch) -- Anthony Baxter It's never too late to have a happy childhood. 
From martin@v.loewis.de Thu Nov 8 15:43:32 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 8 Nov 2001 16:43:32 +0100 Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <3BEA648E.AD3968D1@lemburg.com> (mal@lemburg.com) References: <200111072121.fA7LL1K02432@mira.informatik.hu-berlin.de> <3BEA4773.6FE53E12@lemburg.com> <200111080912.fA89Cbn01413@mira.informatik.hu-berlin.de> <3BEA648E.AD3968D1@lemburg.com> Message-ID: <200111081543.fA8FhWl01271@mira.informatik.hu-berlin.de> > > this could be > > established only by implementing it (you don't need to implement > > the parser/compiler aspects, just the changes to ceval.c). > > Hmm. How is that supposed to work ? I would like the compiler > to generate different code for these "switch" statements. It > would also have to generate the hash table and store it in > the constants. You would need to generate the byte code literally. Come up with a byte code definition for this feature, support it in ceval, then come up with a byte string that is an example for a large optimized "if" chain, and generate a code object from it. Adapting Lib/compiler may be helpful in generating byte code more flexible. Only when it is known that this byte code that you generated more or less manually indeed performs significantly faster, only then it would be worthwhile looking into parser/compiler support for that byte code. > Well, just try to write an XML parser using mxTextTools and the > taggin engine which then generates a tag list to be processed in > Python by an if..elif...else "switch" statement and > compare the speed to a method call based one. You'll note the > difference in performance (and have a second application ;-). If it is unacceptably inefficient, perhaps this approach to XML parsers is doomed to fail... > This is just one aspect, though. I think that a lot more state > machine like code could be written in Python if well-performing > "switches" would be possible in Python. 
I think people have successfully used dictionaries of functions for that. Why do you insist on generating long "if" chains? > Now how could the compiler be provided with the needed > information... ? I still think this is the last question to ask. First define the opcodes, and show us that they really do speed up things, then look into the syntax needed to support this optimization. If you think you need an annotation, you may just as well propose to introduce a switch statement into the language. switch x: case 'foo': ... case 'bar': ... case 42: ... Regards, Martin From mal@lemburg.com Thu Nov 8 15:45:07 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 08 Nov 2001 16:45:07 +0100 Subject: [Python-Dev] switch-based programming in Python References: <3BE951F4.3B79913C@lemburg.com> <20011108115018.A474@xs4all.nl> <3BEA6D57.E381D6AB@lemburg.com> <20011108162425.D474@xs4all.nl> Message-ID: <3BEAA883.E72D6B54@lemburg.com> Thomas Wouters wrote: > > On Thu, Nov 08, 2001 at 12:32:39PM +0100, M.-A. Lemburg wrote: > > Thomas Wouters wrote: > > > > Would a lengthy if/elif/elif/else construct work better ? Why ? > > > Yes, because it doesn't involve calling methods, no execution > > frames have to be setup, no arguments need to be passed in, > > state can be managed in local variables, etc. > > > > I'm not sure what you want the if/else to actually do. Personally, I > > > either need a function call in there (in which case a dispatch table calling > > > the function directly, sometimes with apply() or lambda-wrapper tricks, does > > > fine) or some kind of variable assignment, in which case a simple dict > > > lookup works just as fine. Then again, I don't write that much Python code. > > > You don't ? > > Is that sarcasm ? :) No, I don't. My actual job, the part I get paid for, > doesn't (yet) involve writing Python. It's part C and Perl, part system > design, and part administration. So Python is just a hobby. 
You should change that ;-) BTW, how did you get XS4ALL into funding the www.python.org traffic, if they don't heavily depend on Python ? > > > I personally wouldn't be adverse to a switch-like syntax, as long as we > > > define it like a dict dispatch (the argument is evaluated once, it should be > > > hashable, and all the 'cases' should be hashable -- preferably even > > > compile-time constants.) I like the idea, I'm just not sure if there's > > > enough use for it. > > > That's the idea. > > > There's enough need in it for my applications, so I'd go through > > the trouble of writing the code for it, provided I get the OK > > and help from python-dev. > > The writing part would be very tricky. I don't think you can do it without > syntax support, at least not reliably, even if 'without syntax support' is > some kind of directive statement to signal that a particulare > if/elif/elif/else chain should be converted to a jump table behind the > scene. Well, I tried to avoid syntax changes for two reasons: 1. new keywords are a problem (even though I like your proposed syntax very much) 2. old code should be able to benefit from the new feature I think that Skip's proposal would go a long way (sketching here a bit): It should be possible for the compiler to detect an if-elif-else construct which has the following signature: if x == 'first':... elif x == 'second':... else:... (ie. LHS always the same variable, RHS some hashable immutable builtin type) The compiler could then setup a perfect hash table, store it in the constants and add some opcode which triggers the following run-time behaviour: At runtime, the interpreter would check x for being one of the well-known immutable types (strings, unicode, numbers) and use the hash table for finding the right opcode snippet. > For new syntax, I'd imagine something like this: > > switch EXPR: > case CONSTANT: > [suite] > case CONSTANT: > [suite] > ... > else: > > EXPR is a normal Python expression. 
CONSTANT should be a hashable constant
> (with a persistent hash value, duh) so we don't have to re-hash all the
> cases when entering the switch. The 'else' would function like a 'default:'
> case in C's switch. I'm not sure on the naming of 'switch' and 'case', nor
> about the indentation-level of the 'cases'. And what to do about
> fallthrough? It's commonly accepted (or at least argued :) as a design flaw
> that C's switch() defaults to fallthrough. Bytecodewise it should probably
> turn something like:

I think you missed some indents in your example. I added them again,
removing the parens around x and tweaked the formatting a bit (also
note the addition of a few breaks).

def whatis(x):

    switch x:
        case 'one':
            print '1'
            break
        case 'two':
            print '2'
            # fall through
        case 'three':
            print '3'
            break
        else:
            print "D'oh!"

Turns out that this looks very Pythonic :-)

> Into (omitting POP_TOP's and SET_LINENO's):
>
>           6 LOAD_FAST                0 (x)
>           9 LOAD_CONST               1 (switch-table-1)
>          12 SWITCH                  26 (to 38)    # or maybe 'SWITCH '
>
>          14 LOAD_CONST               2 ('1')
>          17 PRINT_ITEM
>          18 PRINT_NEWLINE
>          19 JUMP                    43
>
>          22 LOAD_CONST               3 ('2')
>          25 PRINT_ITEM
>          26 PRINT_NEWLINE
>          27 JUMP                    43
>
>          30 LOAD_CONST               4 ('3')
>          33 PRINT_ITEM
>          34 PRINT_NEWLINE
>          35 JUMP                    43
>
>          38 LOAD_CONST               5 ("D'oh!")
>          41 PRINT_ITEM
>          42 PRINT_NEWLINE
>
>     >>   43 LOAD_CONST               0 (None)
>          46 RETURN_VALUE
>
> Where the 'SWITCH' opcode would jump to 14, 22, 30 or 38 depending on 'x'.
> PEP, anyone ? :)

Sure smells like PEP-time :-)

If I get some more positive feedback on this, I'll start looking
into this.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/

From martin@v.loewis.de Thu Nov 8 15:55:16 2001
From: martin@v.loewis.de (Martin v.
Loewis) Date: Thu, 8 Nov 2001 16:55:16 +0100 Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <15338.29736.970751.455689@beluga.mojam.com> (message from Skip Montanaro on Thu, 8 Nov 2001 13:01:44 +0100) References: <200111072121.fA7LL1K02432@mira.informatik.hu-berlin.de> <3BEA4773.6FE53E12@lemburg.com> <15338.29736.970751.455689@beluga.mojam.com> Message-ID: <200111081555.fA8FtGc01312@mira.informatik.hu-berlin.de> > I don't think you need anything extra if the RHS of the == is a hashable > literal of some sort and the LHS is always the same simple variable or > subscript expression. If the compiler can recognize the structure (that may > be a big "if"), all you need is a dictionary of offsets stored in the > function's constants. You just execute the equivalent of > > offset = jumptable.get(x, E) That won't work: It maybe that x is not hashable, even though it compares equal with the RHS values. Even if it was hashable, you'd change the language semantics: In the original code, you call __cmp__, say, 20 times; in the modified code, you call __hash__ once and __cmp__ perhaps also once. If __cmp__ has side effects, you get a language change. Regards, Martin From mwh@python.net Thu Nov 8 16:00:58 2001 From: mwh@python.net (Michael Hudson) Date: 08 Nov 2001 11:00:58 -0500 Subject: [Python-Dev] PyObject_GenericGetAttr vs cygwin Message-ID: <2mzo5xntfp.fsf@starship.python.net> CVS doesn't build cleanly on cygwin at the moment. The problem is that the address of a DL_IMPORT()ed function is not a compile time constant when building shared libraries, so when a type wants to use PyObject_GenericGetAttr as it's tp_getattro it shouldn't include it in the PyTypeObject definition (similar to why we now always write PyObject_HEAD_INIT(NULL) ). This means that (currently) cPickle and socket don't build as shared libraries. Two solutions: (1) build them statically. This works, but is hardly a long term solution. 
(2) change them to poke the relavent things into the type object at module load time. Shall I just do (2)? Anyways, I thought I should mention this as an issue. Cheers, M. -- C is not clean -- the language has _many_ gotchas and traps, and although its semantics are _simple_ in some sense, it is not any cleaner than the assembly-language design it is based on. -- Erik Naggum, comp.lang.lisp From mal@lemburg.com Thu Nov 8 16:03:59 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 08 Nov 2001 17:03:59 +0100 Subject: [Python-Dev] switch-based programming in Python References: <200111072121.fA7LL1K02432@mira.informatik.hu-berlin.de> <3BEA4773.6FE53E12@lemburg.com> <200111080912.fA89Cbn01413@mira.informatik.hu-berlin.de> <3BEA648E.AD3968D1@lemburg.com> <200111081543.fA8FhWl01271@mira.informatik.hu-berlin.de> Message-ID: <3BEAACEF.23F89989@lemburg.com> "Martin v. Loewis" wrote: > > > > this could be > > > established only by implementing it (you don't need to implement > > > the parser/compiler aspects, just the changes to ceval.c). > > > > Hmm. How is that supposed to work ? I would like the compiler > > to generate different code for these "switch" statements. It > > would also have to generate the hash table and store it in > > the constants. > > You would need to generate the byte code literally. Come up with a > byte code definition for this feature, support it in ceval, then come > up with a byte string that is an example for a large optimized "if" > chain, and generate a code object from it. Adapting Lib/compiler may > be helpful in generating byte code more flexible. > > Only when it is known that this byte code that you generated more or > less manually indeed performs significantly faster, only then it would > be worthwhile looking into parser/compiler support for that byte code. Ok. 
> > Well, just try to write an XML parser using mxTextTools and the > > taggin engine which then generates a tag list to be processed in > > Python by an if..elif...else "switch" statement and > > compare the speed to a method call based one. You'll note the > > difference in performance (and have a second application ;-). > > If it is unacceptably inefficient, perhaps this approach to XML > parsers is doomed to fail... It is not unacceptably inefficient. Indeed this approach already outperforms the method callback based one using the current Python versions and long if-elif-else statements. What I'm argueing for is that we make it perform even better ;-) > > This is just one aspect, though. I think that a lot more state > > machine like code could be written in Python if well-performing > > "switches" would be possible in Python. > > I think people have successfully used dictionaries of functions for > that. Why do you insist on generating long "if" chains? I don't "insist" on any programming technique. It just happens that I have made very good experience with this kind of approach in both C and Python (except that Python could be made smarter when it comes to finding the right code snippet to execute). > > Now how could the compiler be provided with the needed > > information... ? > > I still think this is the last question to ask. First define the > opcodes, and show us that they really do speed up things, then look > into the syntax needed to support this optimization. I'll do that, but only if people on python-dev agree that it's worth trying (I'm +1 on it, obviously). There's not much point investing time into something which then get's lots of -1's just because I can't convince you guys of the benefits, whether its performance, gaining new grounds for Python development (writing low-level fast parsers in Python) or simply introducing a different way of approaching a common problem (method callbacks vs. switches). 
> If you think you need an annotation, you may just as well propose to > introduce a switch statement into the language. True, but that would probably be even harder to get accepted on python-dev (or would it ;-) ? > switch x: > case 'foo': > ... > case 'bar': > ... > case 42: > ... -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Thu Nov 8 16:12:18 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 08 Nov 2001 17:12:18 +0100 Subject: [Python-Dev] switch-based programming in Python References: <200111072121.fA7LL1K02432@mira.informatik.hu-berlin.de> <3BEA4773.6FE53E12@lemburg.com> <15338.29736.970751.455689@beluga.mojam.com> <200111081555.fA8FtGc01312@mira.informatik.hu-berlin.de> Message-ID: <3BEAAEE2.892DEA58@lemburg.com> "Martin v. Loewis" wrote: > > > I don't think you need anything extra if the RHS of the == is a hashable > > literal of some sort and the LHS is always the same simple variable or > > subscript expression. If the compiler can recognize the structure (that may > > be a big "if"), all you need is a dictionary of offsets stored in the > > function's constants. You just execute the equivalent of > > > > offset = jumptable.get(x, E) > > That won't work: It maybe that x is not hashable, even though it > compares equal with the RHS values. > > Even if it was hashable, you'd change the language semantics: In the > original code, you call __cmp__, say, 20 times; in the modified code, > you call __hash__ once and __cmp__ perhaps also once. If __cmp__ has > side effects, you get a language change. Good point. Now would such a change be acceptable if the optimization would only be triggered for builtin immuatble types on both sides of the "==" ? 
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/

From skip@pobox.com (Skip Montanaro) Thu Nov 8 16:13:28 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 8 Nov 2001 17:13:28 +0100
Subject: [Python-Dev] FAQ 4.98 should point to Tim's FP tutorial section
Message-ID: <15338.44840.150894.977212@beluga.mojam.com>

If someone remembers when the time comes, FAQ question 4.98 should probably
refer to the new tutorial section Tim wrote on floating point. I went to
make the change but realized I can't really point to it yet (it's still only
available in the less permanent development docs) and am conveniently
blanking on the FAQ wizard password... (spam? inquisition? ???) :-(

Skip

From niemeyer@conectiva.com Thu Nov 8 16:21:07 2001
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Thu, 8 Nov 2001 14:21:07 -0200
Subject: [Python-Dev] Python's footprint
Message-ID: <20011108142106.A2559@ibook.distro.conectiva>

Hello!

I've been thinking about ways to reduce python's footprint, so we'd be
able to include the interpreter in environments requiring reduced
applications. One of the tests I've done was to remove every inlined
documentation. Please, take a look at the results using a stripped
binary of python 2.2:

With inline docs:
-rwxrwxr-x    1 niemeyer niemeyer   634452 Nov  8 17:05 python

Without inline docs:
-rwxrwxr-x    1 niemeyer niemeyer   576852 Nov  8 17:12 python

It means that about 10% of python's executable is documentation. Now
I'm wondering if something like a DOCSTRING("foo") macro would be valid
in that case. If the user disabled it through --disable-doc, for example,
DOCSTRING() would return "".

Thanks!
--
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]

From skip@pobox.com (Skip Montanaro) Thu Nov 8 16:23:09 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 08 Nov 2001 17:23:09 +0100
Subject: [Python-Dev] switch-based programming in Python
In-Reply-To: <3BEAACEF.23F89989@lemburg.com>
References: <200111072121.fA7LL1K02432@mira.informatik.hu-berlin.de>
    <3BEA4773.6FE53E12@lemburg.com>
    <200111080912.fA89Cbn01413@mira.informatik.hu-berlin.de>
    <3BEA648E.AD3968D1@lemburg.com>
    <200111081543.fA8FhWl01271@mira.informatik.hu-berlin.de>
    <3BEAACEF.23F89989@lemburg.com>
Message-ID:

    >> If you think you need an annotation, you may just as well propose to
    >> introduce a switch statement into the language.

    mal> True, but that would probably be even harder to get accepted on
    mal> python-dev (or would it ;-) ?

    >> switch x:
    >>   case 'foo':
    >>     ...
    >>   case 'bar':
    >>     ...
    >>   case 42:
    >>     ...

If you restrict the case values to hashable literals do you need "case"?
One new keyword would be easier than two for Guido to swallow...

One other post I saw in this thread used explicit breaks as is required in
C. I would get rid of that. When the current case's code ends, control
flow should just jump to the end of the switch. No other block in Python
falls through like that does it? Leaving out the break statement can also
be a subtle source of errors in C code and can probably be eliminated
without much loss of expressiveness. Besides, switches (especially those
used to implement state machines) are often executed inside loops.
If break is used to terminate the current case, it's not available to break out of the enclosing loop and you're stuck with using a try/except/raise combination or setting some state variable and checking it at the bottom of each loop. Skip From esr@thyrsus.com Thu Nov 8 17:17:56 2001 From: esr@thyrsus.com (Eric S. Raymond) Date: Thu, 8 Nov 2001 12:17:56 -0500 Subject: [Python-Dev] switch-based programming in Python In-Reply-To: ; from montanaro@tttech.com on Thu, Nov 08, 2001 at 05:23:09PM +0100 References: <200111072121.fA7LL1K02432@mira.informatik.hu-berlin.de> <3BEA4773.6FE53E12@lemburg.com> <200111080912.fA89Cbn01413@mira.informatik.hu-berlin.de> <3BEA648E.AD3968D1@lemburg.com> <200111081543.fA8FhWl01271@mira.informatik.hu-berlin.de> <3BEAACEF.23F89989@lemburg.com> Message-ID: <20011108121756.A10988@thyrsus.com> Skip Montanaro : > One other post I saw in this thread used explicit breaks as is required in > C. I would get rid of that. When the current case's code ends, control > flow should just jump to the end of the switch. I disagree. Such fallthrough is very useful when writing state machines, which is a significant part of the utility of a case statement. -- Eric S. Raymond I don't like the idea that the police department seems bent on keeping a pool of unarmed victims available for the predations of the criminal class. -- David Mohler, 1989, on being denied a carry permit in NYC From jason@tishler.net Thu Nov 8 16:44:30 2001 From: jason@tishler.net (Jason Tishler) Date: Thu, 8 Nov 2001 11:44:30 -0500 Subject: [Python-Dev] PyObject_GenericGetAttr vs cygwin In-Reply-To: <2mzo5xntfp.fsf@starship.python.net> Message-ID: <20011108114430.A816@dothill.com> Michael, On Thu, Nov 08, 2001 at 11:00:58AM -0500, Michael Hudson wrote: > Two solutions: > > (1) build them statically. This works, but is hardly a long term > solution. > (2) change them to poke the relavent things into the type object at > module load time. > > Shall I just do (2)? 
Yes, please try option 2 above. If successful, would you be willing to submit the patch to the SourceForge Python patch collector? Thanks, Jason From martin@v.loewis.de Thu Nov 8 16:44:20 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 8 Nov 2001 17:44:20 +0100 Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <3BEAAEE2.892DEA58@lemburg.com> (mal@lemburg.com) References: <200111072121.fA7LL1K02432@mira.informatik.hu-berlin.de> <3BEA4773.6FE53E12@lemburg.com> <15338.29736.970751.455689@beluga.mojam.com> <200111081555.fA8FtGc01312@mira.informatik.hu-berlin.de> <3BEAAEE2.892DEA58@lemburg.com> Message-ID: <200111081644.fA8GiKP01438@mira.informatik.hu-berlin.de> > Now would such a change be acceptable if the optimization would > only be triggered for builtin immuatble types on both sides of > the "==" ? It may be even acceptable with the language change if the procedures for introducing incompatible language changes are followed (i.e. add a __future__ import in one version, and only use the optimization if the future is imported; in the next version, decide that the future is there). When there is clearly no language change, the change would be acceptable to me if it would cause no "significant" slow-down if the optimization wasn't triggered. I assume you'll need a run-time test, so it is likely that there is atleast a small slow-down due to the additional test. Of course, only experiments can show whether the slow-down is "significant". Regards, Martin From martin@v.loewis.de Thu Nov 8 16:51:32 2001 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Thu, 8 Nov 2001 17:51:32 +0100 Subject: [Python-Dev] switch-based programming in Python In-Reply-To: (message from Skip Montanaro on Thu, 08 Nov 2001 17:23:09 +0100) References: <200111072121.fA7LL1K02432@mira.informatik.hu-berlin.de> <3BEA4773.6FE53E12@lemburg.com> <200111080912.fA89Cbn01413@mira.informatik.hu-berlin.de> <3BEA648E.AD3968D1@lemburg.com> <200111081543.fA8FhWl01271@mira.informatik.hu-berlin.de> <3BEAACEF.23F89989@lemburg.com> Message-ID: <200111081651.fA8GpW901465@mira.informatik.hu-berlin.de> > >> switch x: > >> case 'foo': > >> ... > >> case 'bar': > >> ... > >> case 42: > >> ... > > If you restrict the case values to hashable literals do you need "case"? I'm not going to draft a language extension here :-) Syntactically, you don't need it - even if you allow arbitrary complex expressions: you probably wouldn't allow any statements directly nested into the switch. It may be that switch x: if 'foo': ... elif 'bar': ... is also acceptable, and doesn't need the new keyword. > One new keyword would be easier than two for Guido to swallow... You'd introduce it through a __future__ import, so it wouldn't matter if it is one or two. Regards, Martin From thomas@xs4all.net Thu Nov 8 16:51:48 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 8 Nov 2001 17:51:48 +0100 Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <3BEAA883.E72D6B54@lemburg.com> References: <3BE951F4.3B79913C@lemburg.com> <20011108115018.A474@xs4all.nl> <3BEA6D57.E381D6AB@lemburg.com> <20011108162425.D474@xs4all.nl> <3BEAA883.E72D6B54@lemburg.com> Message-ID: <20011108175148.E474@xs4all.nl> On Thu, Nov 08, 2001 at 04:45:07PM +0100, M.-A. Lemburg wrote: > BTW, how did you get XS4ALL into funding the www.python.org traffic, if > they don't heavily depend on Python ? That's a complicated story. 
I'll be happy to explain it over a beer at IPC10 (if we both make it there ;P) but the short version is that XS4ALL is not an ordinary company, we use a lot of opensource software, and my boss suggested it. The traffic is peanuts, by the way, I think the more costly part is the rackspace in our system room. > I think that Skip's proposal would go a long way (sketching here > a bit): > It should be possible for the compiler to detect an if-elif-else > construct which has the following signature: > if x == 'first':... > elif x == 'second':... > else:... [..] > At runtime, the interpreter would check x for being one of the > well-known immutable types (strings, unicode, numbers) and > use the hash table for finding the right opcode snippet. Hmm... I don't think this will have as much impact as you think. But testing it like Martin suggested would be a good idea, and the compiler/interpreter is a fun thing to play and experiment with. [ About my switch proposal ] > I think you missed some indents in your example. I added them again, > removing the parens around x and tweaked the formatting a bit (also > note the addition of a few breaks). Actually, no, all but the parentheses were intentional. I don't like needing the break (hence my comments about fallthrough) and I think the switch, case and else should all be indented to the same level, just like 'if/elif/else'. > def whatis(x): > switch x: > case 'one': > print '1' > break > Turns out that this look very Pythonic :-) I like my version better, with the exception of the parentheses around 'x' in 'switch(x):' :) > Sure smells like PEP-time :-) Aye, but lets do it while Guido is still on paternity leave so we at least get to finish the proposal before it's -1'ed :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
From tim@zope.com Thu Nov 8 17:20:02 2001
From: tim@zope.com (Tim Peters)
Date: Thu, 8 Nov 2001 12:20:02 -0500
Subject: [Python-Dev] FAQ 4.98 should point to Tim's FP tutorial section
In-Reply-To: <15338.44840.150894.977212@beluga.mojam.com>
Message-ID:

[Skip Montanaro]
> If someone remembers when the time comes,

You're elected by unanimous acclaim.

> FAQ question 4.98 should probably refer to the new tutorial section Tim
> wrote on floating point. I went to make the change but realized I can't
> really point to it yet (it's still only available in the less permanent
> development docs)

I expect the URL is more stable than you might guess -- it will also be in future versions of the development docs, and node numbers in the Tutorial don't change often.

> and am conveniently blanking on the FAQ wizard password... (spam?
> inquisition? ???) :-(

It's the same as your name, except the fourth letter is moved to the second position, and "ki" is replaced by "am".

Spam-hinting-ly y'rs - caseful tim

From skip@pobox.com (Skip Montanaro) Thu Nov 8 17:42:01 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 8 Nov 2001 18:42:01 +0100
Subject: [Python-Dev] FAQ 4.98 should point to Tim's FP tutorial section
In-Reply-To:
References: <15338.44840.150894.977212@beluga.mojam.com>
Message-ID: <15338.50153.899381.784764@beluga.mojam.com>

Tim> [Skip Montanaro]
>> If someone remembers when the time comes,

Tim> You're elected by unanimous acclaim.

Okay, thanks. I'm subject to bitrot due to my ever advancing age, so I've asked the 'at' command to remind me.

Tim> It's the same as your name, except the fourth letter is moved to
Tim> the second position, and "ki" is replaced by "am".
i'd-tatto-that-on-my-right-eyelid-but-i-close-it-when-i-wink-ly, y'rs, Skip From paul-python@svensson.org Thu Nov 8 17:43:56 2001 From: paul-python@svensson.org (Paul Svensson) Date: Thu, 8 Nov 2001 12:43:56 -0500 (EST) Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <20011108121756.A10988@thyrsus.com> Message-ID: On Thu, 8 Nov 2001, Eric S. Raymond wrote: >Skip Montanaro : >> One other post I saw in this thread used explicit breaks as is required in >> C. I would get rid of that. When the current case's code ends, control >> flow should just jump to the end of the switch. > >I disagree. Such fallthrough is very useful when writing state machines, >which is a significant part of the utility of a case statement. +0 on the idea of some kind of switch statement, but -1 on bringing C's glorified computed GOTO into Python. Maybe it would be a good idea to use some other keyword(s), so as to avoid confusion with C altogether, e.g select/when instead of switch/case. /Paul From martin@v.loewis.de Thu Nov 8 17:54:46 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 8 Nov 2001 18:54:46 +0100 Subject: [Python-Dev] Re: python/dist/src/Lib/lib-tk tkFileDialog.py,1.2,1.3 In-Reply-To: <027901c16855$a13616f0$0900a8c0@spiff> (fredrik@pythonware.com) References: <200111080931.fA89VG301500@mira.informatik.hu-berlin.de> <027901c16855$a13616f0$0900a8c0@spiff> Message-ID: <200111081754.fA8Hskn01633@mira.informatik.hu-berlin.de> > > I'd propose to migrate tkDirectoryChooser.Chooser._fixresult into > > tkFileDialog.Directory._fixresult, and remove the new module. > > that'll break my documentation, but I can fix that if you fix > the code. Done. I left the class name as tkFileDialog.Directory, and the convenience function as tkFileDialog.askdirectory. 
Regards, Martin From tim@zope.com Thu Nov 8 18:04:46 2001 From: tim@zope.com (Tim Peters) Date: Thu, 8 Nov 2001 13:04:46 -0500 Subject: [Python-Dev] Re: python/dist/src/Lib/lib-tk tkFileDialog.py,1.2,1.3 In-Reply-To: <027901c16855$a13616f0$0900a8c0@spiff> Message-ID: BTW, does our Windows distro support this stuff? Just curious. There's one open report about the Windows distro not supporting tix, which I believe, and reluctantly don't intend to do anything about (no time). From fredrik@pythonware.com Thu Nov 8 18:57:51 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 8 Nov 2001 19:57:51 +0100 Subject: [Python-Dev] switch-based programming in Python References: <3BE951F4.3B79913C@lemburg.com> <20011108115018.A474@xs4all.nl> <3BEA6D57.E381D6AB@lemburg.com> <20011108162425.D474@xs4all.nl> <3BEAA883.E72D6B54@lemburg.com> <20011108175148.E474@xs4all.nl> Message-ID: <00ce01c16887$4bab5d80$ced241d5@hagrid> thomas wrote: > Aye, but lets do it while Guido is still on paternity leave so we at least > get to finish the proposal before it's -1'ed :) if you need a -1, you can get one from me. From fredrik@pythonware.com Thu Nov 8 18:57:15 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 8 Nov 2001 19:57:15 +0100 Subject: [Python-Dev] Re: python/dist/src/Lib/lib-tk tkFileDialog.py,1.2,1.3 References: Message-ID: <00cd01c16887$4b298350$ced241d5@hagrid> tim wrote: > BTW, does our Windows distro support this stuff? Just curious. There's one > open report about the Windows distro not supporting tix, which I believe, > and reluctantly don't intend to do anything about (no time). our does. depends on what Tk version you're using. 
(iirc, you'll need 8.3.2 or later) From mwh@python.net Thu Nov 8 18:52:01 2001 From: mwh@python.net (Michael Hudson) Date: 08 Nov 2001 13:52:01 -0500 Subject: [Python-Dev] PyObject_GenericGetAttr vs cygwin In-Reply-To: Jason Tishler's message of "Thu, 8 Nov 2001 11:44:30 -0500" References: <20011108114430.A816@dothill.com> Message-ID: <2mvgglqeni.fsf@starship.python.net> Jason Tishler writes: > Michael, > > On Thu, Nov 08, 2001 at 11:00:58AM -0500, Michael Hudson wrote: > > Two solutions: > > > > (1) build them statically. This works, but is hardly a long term > > solution. > > (2) change them to poke the relavent things into the type object at > > module load time. > > > > Shall I just do (2)? > > Yes, please try option 2 above. If successful, would you be willing to > submit the patch to the SourceForge Python patch collector? It works. Do you want to see the patches or shall I just check the changes in? BTW, _cursesmodule.c doesn't compile; you get things like: Warning: resolving _stdscr by linking to __imp__stdscr (auto-import) Warning: resolving _LINES by linking to __imp__LINES (auto-import) Warning: resolving _COLS by linking to __imp__COLS (auto-import) Warning: resolving _newscr by linking to __imp__newscr (auto-import) Warning: resolving _COLORS by linking to __imp__COLORS (auto-import) Warning: resolving _COLOR_PAIRS by linking to __imp__COLOR_PAIRS (auto-import) build/temp.cygwin-1.3.3-i686-2.2/_cursesmodule.o: In function `PyCurses_InitScr': /cygdrive/c/src/python/dist/src/Modules/_cursesmodule.c:1842: undefined reference to `acs_map' /cygdrive/c/src/python/dist/src/Modules/_cursesmodule.c:1843: undefined reference to `acs_map' /cygdrive/c/src/python/dist/src/Modules/_cursesmodule.c:1844: undefined reference to `acs_map' /cygdrive/c/src/python/dist/src/Modules/_cursesmodule.c:1845: undefined reference to `acs_map' /cygdrive/c/src/python/dist/src/Modules/_cursesmodule.c:1846: undefined reference to `acs_map' 
build/temp.cygwin-1.3.3-i686-2.2/_cursesmodule.o:/cygdrive/c/src/python/dist/src/Modules/_cursesmodule.c:1847: more undefined references to `acs_map' follow and contrary to README, test_poll and threads seem to work fine. test_strftime is still bust, though: 158 tests OK. 1 test failed: test_strftime 26 tests skipped: test_al test_bsddb test_cd test_cl test_curses test_dbm test_dl test_gl test_imgfile test_largefile test_linuxaudiodev test_locale test_minidom test_nis test_ntpath test_openpty test_pty test_pyexpat test_sax test_socket_ssl test_socketserver test_sunaudiodev test_sundry test_unicode_file test_winreg test_winsound Ask someone to teach regrtest.py about which tests are expected to get skipped on cygwin. Cheers, M. -- [Perl] combines all the worst aspects of C and Lisp: a billion different sublanguages in one monolithic executable. It combines the power of C with the readability of PostScript. -- Jamie Zawinski From skip@pobox.com (Skip Montanaro) Thu Nov 8 19:05:41 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 8 Nov 2001 20:05:41 +0100 Subject: [Python-Dev] switch & bytecodehacks? Message-ID: <15338.55173.761770.654977@beluga.mojam.com> Could bytecodehacks be made to do the heavy lifting for MAL's switch performance testing? Might be easier than manually creating code strings. Skip From mal@lemburg.com Thu Nov 8 19:30:52 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 08 Nov 2001 20:30:52 +0100 Subject: [Python-Dev] switch-based programming in Python References: <3BE951F4.3B79913C@lemburg.com> <20011108115018.A474@xs4all.nl> <3BEA6D57.E381D6AB@lemburg.com> <20011108162425.D474@xs4all.nl> <3BEAA883.E72D6B54@lemburg.com> <20011108175148.E474@xs4all.nl> <00ce01c16887$4bab5d80$ced241d5@hagrid> Message-ID: <3BEADD6C.5F48F9C4@lemburg.com> Fredrik Lundh wrote: > > thomas wrote: > > Aye, but lets do it while Guido is still on paternity leave so we at least > > get to finish the proposal before it's -1'ed :) > > if you need a -1, you can get one from me. Please explain... -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From tim@zope.com Thu Nov 8 20:18:10 2001 From: tim@zope.com (Tim Peters) Date: Thu, 8 Nov 2001 15:18:10 -0500 Subject: [Python-Dev] PyObject_GenericGetAttr vs cygwin In-Reply-To: <2mvgglqeni.fsf@starship.python.net> Message-ID: [Michael Hudson] >>> (2) change them to poke the relavent things into the type object at >>> module load time. >>> >>> Shall I just do (2)? [Jason Tishler] >> Yes, please try option 2 above. If successful, would you be willing to >> submit the patch to the SourceForge Python patch collector? [Michael Hudson] > It works. Do you want to see the patches or shall I just check the > changes in? Michael, I'm assuming the patches aren't Cygwin-specific, in which case please just check them in. And thank you! > BTW, _cursesmodule.c doesn't compile; Is this supposed to be bad news ? 
count-your-blessings-if-it-did-compile-then-you'd-have-to-figure-out- why-it-doesn't-work-ly y'rs - tim From jason@tishler.net Thu Nov 8 20:20:52 2001 From: jason@tishler.net (Jason Tishler) Date: Thu, 8 Nov 2001 15:20:52 -0500 Subject: [Python-Dev] PyObject_GenericGetAttr vs cygwin In-Reply-To: <2mvgglqeni.fsf@starship.python.net> Message-ID: <20011108152052.B816@dothill.com> --CGQ4QxZ4DCp/f9YC Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Michael, On Thu, Nov 08, 2001 at 01:52:01PM -0500, Michael Hudson wrote: > It works. Do you want to see the patches or shall I just check the > changes in? If they are just the standard "PyObject_HEAD_INIT(NULL)" style fix, then please just commit them. > BTW, _cursesmodule.c doesn't compile; you get things like: > > Warning: resolving _stdscr by linking to __imp__stdscr (auto-import) > [snip] > build/temp.cygwin-1.3.3-i686-2.2/_cursesmodule.o: In function `PyCurses_InitScr': > /cygdrive/c/src/python/dist/src/Modules/_cursesmodule.c:1842: undefined reference to `acs_map' > [snip] I'm getting a feeling of deja vu from the above, but I can't quite remember... I believe that the above is related to the auto-import features recently added to Cygwin's binutils. Please send me your /etc/setup/installed.db via private email. In the meantime, I will dig some on my own. > and contrary to README, The Cygwin section of Python README has become stale -- I need to submit a doco patch. Sigh... > test_poll The test_poll problem was fixed by: http://sources.redhat.com/ml/cygwin-patches/2001-q3/msg00109.html and released in Cygwin 1.3.4. > and threads seem to work fine. There is still one known Cygwin pthreads hang. If interested, see the following for the current state of affairs: http://sources.redhat.com/ml/cygwin-developers/2001-10/msg00193.html > test_strftime is still bust, though: The test_strftime problem was fixed by: http://sources.redhat.com/ml/newlib/2001/msg00504.html and released in Cygwin 1.3.4. 
I've attached the latest README, since it is the most complete description of the current state. Thanks, Jason --CGQ4QxZ4DCp/f9YC Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="python-2.1.1.README" $Id: README,v 1.10 2001/09/26 13:28:31 jt Exp $ Abstract: This is the README for the Cygwin Python distribution. As of Python 2.1, Cygwin Python is built with a DLL core very similar to how Win32 Python is built. This enables Cygwin Python to support building shared extensions with the traditional Misc/Makefile.pre.in and the newer distutils methods. As of Python 2.1.1-2, the Cygwin Python port is essentially complete (at least for Windows NT 4.0 and 2000). The most notable changes are the addition of the _tkinter module and the elimination of the test_poll hang. I was tempted to enable threading in the 2.1.1-2 release too. Unfortunately, there is still one known problem with Cygwin's pthreads support. So, I opted for the more prudent choice which is to continue to disable threading until this problem is resolved. See the issues section for more details, if interested. Requirements: The following packages or later are required to build and/or execute Cygwin Python: Cygwin 1.3.3-2 gcc 2.95.3-1 The following packages or later are required to build and/or execute some of the standard Cygwin Python extension modules: gdbm-1.8.0-3 (gdbm) ncurses-5.2-5 (_curses and _curses_panel) readline 4.2-3 (readline) zlib 1.1.3-6 (zlib) tcltk-20001125-1 (_tkinter) Install: Cygwin Python does not require any special installation procedures. However, to use the _tkinter module you must define the following environment variables: $ export TCL_LIBRARY=$(cygpath -w /usr/share/tcl8.0) $ export TK_LIBRARY=$(cygpath -w /usr/share/tk8.0) since tcltk-20001125-1 is a native Win32 (i.e., not Cygwin) application. Source: The Python source builds OOTB under Cygwin. However, there are a few minor issues so the source has been patched to correct them. 
The following patches have been submitted to Python CVS: http://sourceforge.net/tracker/index.php?func=detail&aid=429442&group_id=5470&atid=305470 http://sourceforge.net/tracker/index.php?func=detail&aid=443669&group_id=5470&atid=305470 http://sourceforge.net/tracker/index.php?func=detail&aid=459385&group_id=5470&atid=305470 http://sourceforge.net/tracker/index.php?func=detail&aid=462255&group_id=5470&atid=305470 http://sourceforge.net/tracker/index.php?func=detail&aid=462258&group_id=5470&atid=305470 for consideration and have been accepted into Python CVS. Hence, these minor issues will be resolved in Python 2.2. I also added the following files to the source archive: CYGWIN-PATCHES/README CYGWIN-PATCHES/build.sh CYGWIN-PATCHES/python.patch and renamed the original source archive to match Cygwin's setup.exe naming conventions. To restore the Python source to its original state, perform the following: $ cd python-2.1.1 $ patch -R -p1 --CGQ4QxZ4DCp/f9YC-- From gward@python.net Thu Nov 8 21:51:05 2001 From: gward@python.net (Greg Ward) Date: Thu, 8 Nov 2001 16:51:05 -0500 Subject: [Python-Dev] Python's footprint In-Reply-To: <20011108142106.A2559@ibook.distro.conectiva> References: <20011108142106.A2559@ibook.distro.conectiva> Message-ID: <20011108165105.A29947@gerg.ca> On 08 November 2001, Gustavo Niemeyer said: > It means that about 10% of python's executable is documentation. Interesting! I wonder what the corresponding figure for .pyc files in the std library is. > Now I'm > wondering if something like a DOCSTRING("foo") macro would be valid in > that case. If the user disabled it trough --disable-doc, for example, > DOCSTRING() would return "". I think it would have to be a bit fancier than that; wouldn't you also have to specify the name of the C identifier into which that documentation is put? That's doable in an all-ANSI-C world, but trickier than DOCSTRING("foo"). Anyways, that sounds like a useful idea. 
It would probably be a big patch that touches lots of files, so it's unlikely to get into Python 2.2. You might consider whipping up a patch now to get it under consideration early in 2.3's life-cycle. Greg -- Greg Ward - Unix weenie gward@python.net http://starship.python.net/~gward/ All things are possible -- except skiing through a revolving door. From mal@lemburg.com Thu Nov 8 22:17:15 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 08 Nov 2001 23:17:15 +0100 Subject: [Python-Dev] Python's footprint References: <20011108142106.A2559@ibook.distro.conectiva> <20011108165105.A29947@gerg.ca> Message-ID: <3BEB046B.24FBB6FA@lemburg.com> Greg Ward wrote: > > On 08 November 2001, Gustavo Niemeyer said: > > It means that about 10% of python's executable is documentation. > > Interesting! I wonder what the corresponding figure for .pyc files in > the std library is. > > > Now I'm > > wondering if something like a DOCSTRING("foo") macro would be valid in > > that case. If the user disabled it trough --disable-doc, for example, > > DOCSTRING() would return "". > > I think it would have to be a bit fancier than that; wouldn't you also > have to specify the name of the C identifier into which that > documentation is put? That's doable in an all-ANSI-C world, but > trickier than DOCSTRING("foo"). > > Anyways, that sounds like a useful idea. It would probably be a big > patch that touches lots of files, so it's unlikely to get into Python > 2.2. You might consider whipping up a patch now to get it under > consideration early in 2.3's life-cycle. Even better: why not work together with Martin to have the doc-strings localized ?! 
(One of the possible languages could then be the emtpy one ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From barry@zope.com Thu Nov 8 22:23:24 2001 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 8 Nov 2001 17:23:24 -0500 Subject: [Python-Dev] Documentation TODO for Python 2.2 Message-ID: <15339.1500.338680.203793@anthem.wooz.org> A few days ago, we (Pythonlabs) sat down and tried to figure out what still needs to be documented for Python 2.2. We came up with an impressive list. :) It's clear that, left up to us, there's no way we'll be able to get it all done, so we're turning to you to help! Here's the plan we came up with. For each of the items in the list below, we'll add a bug report to SF in the documentation category. Actually, Fred will do this, and it will auto-assign to him. If you find a topic that you can volunteer to document, and you are a SF committer, then simply assign the bug report to yourself and go for it. If you are not a SF committer, the bug report will stay assigned to Fred, and you should coordinate the assignments with him. Fred's going to oversee this effort, so any questions you might have (e.g. you're not sure where the documentation for new feature X should be added), please coordinate with him. In the list below, some stuff are new language features, some are new C API functions, etc. Some things will need to go in the language ref, some in the library ref. Some items are already documented to some degree, and may just need a little work to clean up, or make consistent with other sections of the manuals. We've already done a first pass at claiming some of the items in the list. Any help you can provide will be very much appreciated. (Note that the "big one" is documenting new-style classes. 
This is a large bite because classic classes aren't really adequately documented, and it doesn't make much sense to document new-style classes independent of classic ones. Guido is the most qualified to document new-style classes, but it's doubtful he'll have time, so I volunteer to do it instead. Or at least take a crack at it.)

The numbers are rough estimates in days. Items that are claimed have that person's initials next to them.

Thanks,
-Barry

PEPs 'n' stuff 4.5
    - iterators .5
    - generators .5
    - nested scopes 1 (JH)
    - future division .5
    - new builtins .5
    - list comprehensions (< .5)
    - unifying long & int .5
    - extended call syntax (< .5)

Library 2.5
    - compiler package 1.5 (JH)
    - hotshot 1 (FD)

Classes 6 (GvR or BAW)
    - old style classes
    - distinction between kinds of classes (new-style and old-style)
    - new __ methods
    - method resolution order
    - metatypes
    - coercion
    - subclassing builtin types
    - slots
    - properties / descriptors
    - D.B. hook
    - attribute lookup rules

C API 4
    - new __ slots 1
    - subclassing types 1 (JH?)
    - METH_O and friends (?)
    - writing a new builtin type 2 (JH)

Modules
    - pydoc
    - smtpd (BAW)
    - all the new modules that have been added and not documented

18+ days, and that's being optimistic. :)

From niemeyer@conectiva.com Thu Nov 8 22:53:14 2001
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Thu, 8 Nov 2001 20:53:14 -0200
Subject: [Python-Dev] Python's footprint
In-Reply-To: <3BEB046B.24FBB6FA@lemburg.com>; from mal@lemburg.com on Thu, Nov 08, 2001 at 11:17:15PM +0100
References: <20011108142106.A2559@ibook.distro.conectiva> <20011108165105.A29947@gerg.ca> <3BEB046B.24FBB6FA@lemburg.com>
Message-ID: <20011108205314.A9862@ibook.distro.conectiva>

--sdtB3X0nJg68CQEu
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Nov 08, 2001 at 11:17:15PM +0100, M.-A.
Lemburg wrote:
> Greg Ward wrote:
> > On 08 November 2001, Gustavo Niemeyer said:
> > > wondering if something like a DOCSTRING("foo") macro would be valid in
> > > that case. If the user disabled it through --disable-doc, for example,
> > > DOCSTRING() would return "".
> >
> > I think it would have to be a bit fancier than that; wouldn't you also
> > have to specify the name of the C identifier into which that
> > documentation is put? That's doable in an all-ANSI-C world, but
> > trickier than DOCSTRING("foo").

What I had in mind would be something like:

    static char module_doc[] = DOCSTRING("module documentation");

And do something like:

    #if Py_INLINE_DOCS
    #define DOCSTRING(x) x
    #else
    #define DOCSTRING(x) ""
    #endif

> > Anyways, that sounds like a useful idea. It would probably be a big
> > patch that touches lots of files, so it's unlikely to get into Python
> > 2.2. You might consider whipping up a patch now to get it under
> > consideration early in 2.3's life-cycle.

Indeed, it would touch lots of files. On the other hand, changes introduced by the patch would be very trivial and shouldn't affect python's stability at all.

> Even better: why not work together with Martin to have the doc-strings
> localized ?! (One of the possible languages could then be the empty
> one ;-)

I could work with Martin on that, for sure. But this fact doesn't change the need for DOCSTRING(). I18n usually keeps the default version of the string in the object to ensure that something is shown, even if you don't have any i18n files at all.

Best regards!
-- 
Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]

--sdtB3X0nJg68CQEu
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE76wzaIlOymmZkOgwRAju7AJkBjaFLlanwCa6wrGypkFsBpJ8GmACfc4Ot
qJcF+Ng0FcaOJJQBRbwMcDU=
=ucTf
-----END PGP SIGNATURE-----

--sdtB3X0nJg68CQEu--

From tim@zope.com Thu Nov 8 23:02:10 2001
From: tim@zope.com (Tim Peters)
Date: Thu, 8 Nov 2001 18:02:10 -0500
Subject: [Python-Dev] Re: python/dist/src/Lib/lib-tk tkFileDialog.py,1.2,1.3
In-Reply-To: <00cd01c16887$4b298350$ced241d5@hagrid>
Message-ID:

[Tim]
> BTW, does our Windows distro support this stuff? Just curious.
> There's one open report about the Windows distro not supporting tix,
> which I believe, and reluctantly don't intend to do anything about (no
> time).

[/F]
> our does.
>
> depends on what Tk version you're using. (iirc, you'll need
> 8.3.2 or later)

We've been using 8.3.2 for as long as I've been building the installer. I don't know anything about tix, though. If somebody wants to see it in the PythonLabs Windows distro too, it's up to them to make it happen.

From barry.alan.scott@ntlworld.com Fri Nov 9 01:11:37 2001
From: barry.alan.scott@ntlworld.com (Barry Scott)
Date: Fri, 9 Nov 2001 01:11:37 -0000
Subject: [Python-Dev] switch-based programming in Python
In-Reply-To:
Message-ID: <001601c168bb$7b108da0$070210ac@private>

The fall through is the source of too many defects in C/C++ code. And it's rarely used in the wild according to a report on this subject a few years ago.

Goto in python would be a terrible thing.

Barry

-----Original Message-----
From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On Behalf Of Paul Svensson
Sent: 08 November 2001 17:44
To: python-dev@python.org
Subject: Re: [Python-Dev] switch-based programming in Python

On Thu, 8 Nov 2001, Eric S.
Raymond wrote: >Skip Montanaro : >> One other post I saw in this thread used explicit breaks as is required in >> C. I would get rid of that. When the current case's code ends, control >> flow should just jump to the end of the switch. > >I disagree. Such fallthrough is very useful when writing state machines, >which is a significant part of the utility of a case statement. +0 on the idea of some kind of switch statement, but -1 on bringing C's glorified computed GOTO into Python. Maybe it would be a good idea to use some other keyword(s), so as to avoid confusion with C altogether, e.g select/when instead of switch/case. /Paul _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev From greg@cosc.canterbury.ac.nz Fri Nov 9 02:17:53 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 09 Nov 2001 15:17:53 +1300 (NZDT) Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <200111081651.fA8GpW901465@mira.informatik.hu-berlin.de> Message-ID: <200111090217.PAA12184@s454.cosc.canterbury.ac.nz> "Martin v. Loewis" : > switch x: > if 'foo': > ... > elif 'bar': > ... I don't like that, because the 'if' has a different meaning from usual because of being inside a construct that is perhaps some distance away visually. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Fri Nov 9 02:27:26 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 09 Nov 2001 15:27:26 +1300 (NZDT) Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <20011108121756.A10988@thyrsus.com> Message-ID: <200111090227.PAA12188@s454.cosc.canterbury.ac.nz> "Eric S. Raymond" : > I disagree. 
Such fallthrough is very useful when writing state machines, > which is a significant part of the utility of a case statement. It only handles a special case, though, where one branch is a strict postfix of another. It's no use if the factoring you want to do is any more complicated. Also, it's backwards to require you to explicitly state where you *don't* want fallthrough, when it's most often not what you want. I'm about -100 on fallthrough. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Fri Nov 9 02:36:59 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 09 Nov 2001 15:36:59 +1300 (NZDT) Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <20011108162425.D474@xs4all.nl> Message-ID: <200111090236.PAA12192@s454.cosc.canterbury.ac.nz> Thomas Wouters : > switch EXPR: > case CONSTANT: > [suite] > case CONSTANT: > [suite] > ... > else: Looks good, except that I'd indent the cases as well, i.e. switch EXPR: case CONSTANT: [suite] case CONSTANT: [suite] else: [suite] To my mind the cases are logically a subordinate part of the switch statement, and the indentation should reflect that. Some alternatives: Using only one keyword: case EXPR: CONSTANT: [suite] CONSTANT: [suite] else: [suite] Using two, but reminding one less of C: case EXPR: of CONSTANT: [suite] of CONSTANT: [suite] else: [suite] Possible refinements: * Multiple values in a case CONSTANT, CONSTANT, ..., CONSTANT: * Ranges in a case CONSTANT..CONSTANT: although that would require something other than a dict, maybe a binary search. Also could lead to arguments about whether the endpoint should be inclusive or exclusive! Maybe it should be spelt range(CONSTANT, CONSTANT): ? 
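The two refinements above (several constants per case, and ranges found by binary search rather than a dict) can be sketched in ordinary Python. This is only an illustration under stated assumptions: `make_table` and `RangeSwitch` are hypothetical names, ranges are taken to be half-open [lo, hi) like Python's own range(), and they are assumed not to overlap. The inclusive/exclusive question raised above is simply decided by fiat here.

```python
import bisect


def make_table(cases):
    """Build a dispatch dict from (tuple_of_constants, value) pairs.

    Every constant listed for a case maps to that case's value, which
    models 'CONSTANT, CONSTANT, ..., CONSTANT:' selecting a single suite.
    """
    table = {}
    for constants, value in cases:
        for constant in constants:
            table[constant] = value
    return table


class RangeSwitch:
    """Range cases looked up by bisection instead of hashing.

    Assumes half-open, non-overlapping ranges given as (lo, hi, value).
    """

    def __init__(self, ranges, default):
        # Sort by lower bound so the list of starts can be bisected.
        self._ranges = sorted(ranges, key=lambda r: r[0])
        self._starts = [lo for lo, hi, value in self._ranges]
        self._default = default

    def lookup(self, x):
        # Index of the last range whose lower bound is <= x, or -1.
        i = bisect.bisect_right(self._starts, x) - 1
        if i >= 0:
            lo, hi, value = self._ranges[i]
            if lo <= x < hi:
                return value
        return self._default  # plays the role of the 'else' suite
```

With `RangeSwitch([(0, 10, 'low'), (10, 20, 'mid')], 'other')`, the value 10 belongs to the second range and 20 falls through to the default, which is what the half-open convention implies.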
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From greg@cosc.canterbury.ac.nz Fri Nov 9 02:50:32 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 09 Nov 2001 15:50:32 +1300 (NZDT)
Subject: [Python-Dev] switch-based programming in Python
Message-ID: <200111090250.PAA12195@s454.cosc.canterbury.ac.nz>

As an example of a real-life application which might benefit from a fast switch-statement in Python, I'd like to offer the following excerpt from Plex, which runs as part of a tight loop once per input character.

        if input_state == 1:
            cur_pos = next_pos
            # Begin inlined: c = self.read_char()
            buf_index = next_pos - buf_start_pos
            if buf_index < buf_len:
                c = buffer[buf_index]
                next_pos = next_pos + 1
            else:
                discard = self.start_pos - buf_start_pos
                data = self.stream.read(0x1000)
                buffer = self.buffer[discard:] + data
                self.buffer = buffer
                buf_start_pos = buf_start_pos + discard
                self.buf_start_pos = buf_start_pos
                buf_len = len(buffer)
                buf_index = buf_index - discard
                if data:
                    c = buffer[buf_index]
                    next_pos = next_pos + 1
                else:
                    c = ''
            # End inlined: c = self.read_char()
            if c == '\n':
                cur_char = EOL
                input_state = 2
            elif not c:
                cur_char = EOL
                input_state = 4
            else:
                cur_char = c
        elif input_state == 2:
            cur_char = '\n'
            input_state = 3
        elif input_state == 3:
            cur_line = cur_line + 1
            cur_line_start = cur_pos = next_pos
            cur_char = BOL
            input_state = 1
        elif input_state == 4:
            cur_char = EOF
            input_state = 5
        else: # input_state = 5
            cur_char = ''
        # End inlined self.next_char()
    else: # not new_state
        if trace: #TRACE#
            print "blocked" #TRACE#
        # Begin inlined: action = self.back_up()
        if backup_state:
            (action, cur_pos, cur_line, cur_line_start,
             cur_char, input_state, next_pos) = backup_state
        else:
            action = None
        break # while 1
        # End inlined: action = self.back_up()

Greg
Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From itojun@iijlab.net Fri Nov 9 02:51:29 2001 From: itojun@iijlab.net (itojun@iijlab.net) Date: Fri, 09 Nov 2001 11:51:29 +0900 Subject: [Python-Dev] configure.in linkage error Message-ID: <529.1005274289@itojun.org> at least on NetBSD-current and 1.5.2, - sys/socket.h requires sys/types.h - NULL is not declared in sys/socket.h nor netdb.h therefore, the following patch is necessary to make getaddrinfo(3) detection successful. NetBSD starfruit.itojun.org 1.5Y NetBSD 1.5Y (STARFRUIT) #3: Tue Nov 6 17:27:18 JST 2001 itojun@starfruit.itojun.org:/usr/home/itojun/NetBSD/src/sys/arch/i386/compile/STARFRUIT i386 itojun Index: configure.in =================================================================== RCS file: /cvsroot/python/python/dist/src/configure.in,v retrieving revision 1.279 diff -u -r1.279 configure.in --- configure.in 2001/10/31 12:11:47 1.279 +++ configure.in 2001/11/09 02:50:03 @@ -1431,8 +1431,10 @@ # for [no]getaddrinfo in netdb.h. AC_MSG_CHECKING(for getaddrinfo) AC_TRY_LINK([ +#include #include #include +#include ],[ getaddrinfo(NULL, NULL, NULL, NULL); ], [ From MarkH@ActiveState.com Fri Nov 9 03:28:18 2001 From: MarkH@ActiveState.com (Mark Hammond) Date: Fri, 9 Nov 2001 14:28:18 +1100 Subject: [Python-Dev] PEP 273: Import Modules from Zip Archives In-Reply-To: <3BE947CC.C1F84625@interet.com> Message-ID: [James] > But sys.prefix is obtained from a search of the directory > structure for a "landmark" file, namely os.py. When the Python > library is in a zip file, it is likely that no landmark files will > be found, and sys.prefix will contain garbage. Good point! :) > > Since sys.prefix is searched for, its name is unpredictable. We > need a known location for python22.zip. 
> > How about using the full path name of pythonXX.dll with the last three > characters changed to "zip"? This associates the libraries with the DLL, > which is more logical than associating them with the executable. And the > file name is identical but with "zip" instead of "dll". > > Does this work, and solve all embedding problems? Sounds OK to me. Mark. From fdrake@acm.org Fri Nov 9 08:23:16 2001 From: fdrake@acm.org (Fred L. Drake) Date: Fri, 9 Nov 2001 03:23:16 -0500 (EST) Subject: [Python-Dev] [development doc updates] Message-ID: <20011109082316.750AD28697@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ A variety of miscellaneous updates have been made over the past several days, most involving typographical or grammatical corrections. Some small clarifications have been made. From skip@pobox.com (Skip Montanaro) Fri Nov 9 08:25:09 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 09 Nov 2001 09:25:09 +0100 Subject: [Python-Dev] Python's footprint In-Reply-To: <20011108165105.A29947@gerg.ca> References: <20011108142106.A2559@ibook.distro.conectiva> <20011108165105.A29947@gerg.ca> Message-ID: Greg> wouldn't you also have to specify the name of the C Greg> identifier into which that documentation is put? I think the intent is to declare the __doc__ variables using DOCSTRING: static char int_doc[] = DOCSTRING("int(x[, base]) -> integer\n\ \n\ Convert a string or number to an integer, if possible. A floating point\n\ argument will be truncated towards zero (this does not include a string\n\ representation of a floating point number!) When converting a string, use\n\ the optional base. It is an error to supply a base when converting a\n\ non-string."); Note that if implemented, the DOCSTRING macro will need a Py_ prefix. 
-- Skip Montanaro (skip@pobox.com) http://www.mojam.com/ http://www.musi-cal.com/ From skip@pobox.com (Skip Montanaro) Fri Nov 9 08:28:15 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 09 Nov 2001 09:28:15 +0100 Subject: [Python-Dev] Python's footprint In-Reply-To: <3BEB046B.24FBB6FA@lemburg.com> References: <20011108142106.A2559@ibook.distro.conectiva> <20011108165105.A29947@gerg.ca> <3BEB046B.24FBB6FA@lemburg.com> Message-ID: mal> Even better: why not work together with Martin to have the mal> doc-strings localized ?! (One of the possible languages could mal> then be the emtpy one ;-) I realize there's a smiley there, but wouldn't such a transformation affect all string literals in the application (e.g. most exception messages)? Skip From skip@pobox.com (Skip Montanaro) Fri Nov 9 08:30:57 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 09 Nov 2001 09:30:57 +0100 Subject: [Python-Dev] Documentation TODO for Python 2.2 In-Reply-To: <15339.1500.338680.203793@anthem.wooz.org> References: <15339.1500.338680.203793@anthem.wooz.org> Message-ID: BAW> we (Pythonlabs) sat down and tried to figure out what still needs BAW> to be documented for Python 2.2. We came up with an impressive BAW> list. :) BAW> Here's the plan we came up with.... How about broadcasting this request to c.l.py? Skip From mal@lemburg.com Fri Nov 9 08:13:05 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 09 Nov 2001 09:13:05 +0100 Subject: [Python-Dev] switch-based programming in Python References: <001601c168bb$7b108da0$070210ac@private> Message-ID: <3BEB9011.CD6FA455@lemburg.com> Barry Scott wrote: > > The fall through is the source of too many defects in C/C++ code. > And its rarely used in the wild according to report on this subject > a few years ago. Ok, let's drop the fallthrough (it's not good structured programming practice anyway, even though it can help in C). > Goto in python would be a terrible thing. 
Sure would ;-) But where did you find a mention of "goto" ?

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/

From mal@lemburg.com Fri Nov 9 08:19:50 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 09 Nov 2001 09:19:50 +0100
Subject: [Python-Dev] switch-based programming in Python
References: <200111072121.fA7LL1K02432@mira.informatik.hu-berlin.de>
 <3BEA4773.6FE53E12@lemburg.com>
 <15338.29736.970751.455689@beluga.mojam.com>
 <200111081555.fA8FtGc01312@mira.informatik.hu-berlin.de>
 <3BEAAEE2.892DEA58@lemburg.com>
 <200111081644.fA8GiKP01438@mira.informatik.hu-berlin.de>
Message-ID: <3BEB91A6.3FE993C0@lemburg.com>

"Martin v. Loewis" wrote:
>
> > Now would such a change be acceptable if the optimization would
> > only be triggered for builtin immutable types on both sides of
> > the "==" ?
>
> It may be even acceptable with the language change if the procedures
> for introducing incompatible language changes are followed (i.e. add a
> __future__ import in one version, and only use the optimization if the
> future is imported; in the next version, decide that the future is
> there).

Perhaps we could approach other types in one of the next versions
(after 2.3). If possible, I'd like to avoid creating new syntax.

I think having strings, unicode and numbers as first candidates
would already go a long way.

> When there is clearly no language change, the change would be
> acceptable to me if it would cause no "significant" slow-down if the
> optimization wasn't triggered. I assume you'll need a run-time test,
> so it is likely that there is at least a small slow-down due to the
> additional test. Of course, only experiments can show whether the
> slow-down is "significant".
I'm sure we'll find some definition of "significant" which matches these criteria ;-) Seriously, the optimization should only be triggered for large enough if-elif-else constructs -- otherwise it's likely that the dictionary lookup would take longer than the sequential compares. By tuning the optimization trigger level we can assure that the optimization will be a net win with high probability. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Fri Nov 9 08:33:24 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 09 Nov 2001 09:33:24 +0100 Subject: [Python-Dev] switch-based programming in Python References: <200111072121.fA7LL1K02432@mira.informatik.hu-berlin.de> <3BEA4773.6FE53E12@lemburg.com> <200111080912.fA89Cbn01413@mira.informatik.hu-berlin.de> <3BEA648E.AD3968D1@lemburg.com> <200111081543.fA8FhWl01271@mira.informatik.hu-berlin.de> <3BEAACEF.23F89989@lemburg.com> Message-ID: <3BEB94D4.2CC924DB@lemburg.com> Skip Montanaro wrote: > > >> If you think you need an annotation, you may just as well propose to > >> introduce a switch statement into the language. > > mal> True, but that would probably be even harder to get accepted on > mal> python-dev (or would it ;-) ? > > >> switch x: > >> case 'foo': > >> ... > >> case 'bar': > >> ... > >> case 42: > >> ... > > If you restrict the case values to hashable literals do you need "case"? > One new keyword would be easier than two for Guido to swallow... > > One other post I saw in this thread used explicit breaks as is required in > C. I would get rid of that. When the current case's code ends, control > flow should just jump to the end of the switch. No other block in Python > falls through like that does it? 
Leaving out the break statement can also > be a subtle source of errors in C code and can probably be eliminated > without much loss of expressiveness. Besides, switches (especially those > used to implement state machines) are often executed inside loops. If break > is used to terminate the current case, it's not available to break out of > the enclosing loop and you're stuck with using a try/except/raise > combination or setting some state variable and checking it at the bottom of > each loop. Very good points ! I think everybody agreed on dropping the fallthrough and break idea. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Fri Nov 9 08:30:26 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 09 Nov 2001 09:30:26 +0100 Subject: [Python-Dev] switch-based programming in Python References: <3BE951F4.3B79913C@lemburg.com> <20011108115018.A474@xs4all.nl> <3BEA6D57.E381D6AB@lemburg.com> <20011108162425.D474@xs4all.nl> <3BEAA883.E72D6B54@lemburg.com> <20011108175148.E474@xs4all.nl> Message-ID: <3BEB9422.B4CEFC51@lemburg.com> Thomas Wouters wrote: > > On Thu, Nov 08, 2001 at 04:45:07PM +0100, M.-A. Lemburg wrote: > > > BTW, how did you get XS4ALL into funding the www.python.org traffic, if > > they don't heavily depend on Python ? > > That's a complicated story. I'll be happy to explain it over a beer at IPC10 > (if we both make it there I probably won't :-( > ;P) but the short version is that XS4ALL is not an > ordinary company, we use a lot of opensource software, and my boss suggested > it. The traffic is peanuts, by the way, I think the more costly part is the > rackspace in our system room. Interesting; I would have thought that the traffic would at least cost as much as the rack space (in Germany we pay EUR 6 / GB traffic). 
> > I think that Skip's proposal would go a long way (sketching here > > a bit): > > > It should be possible for the compiler to detect an if-elif-else > > construct which has the following signature: > > > if x == 'first':... > > elif x == 'second':... > > else:... > > [..] > > > At runtime, the interpreter would check x for being one of the > > well-known immutable types (strings, unicode, numbers) and > > use the hash table for finding the right opcode snippet. > > Hmm... I don't think this will have as much impact as you think. But testing > it like Martin suggested would be a good idea, and the compiler/interpreter > is a fun thing to play and experiment with. Well, for that application space I'm after this would most probably make a difference (you typically have >10 cases in the if-elif-else). I think I'll make this a holiday experiment and then see what the real gain is. > [ About my switch proposal ] > > I think you missed some indents in your example. I added them again, > > removing the parens around x and tweaked the formatting a bit (also > > note the addition of a few breaks). > > Actually, no, all but the parentheses were intentional. I don't like needing > the break (hence my comments about fallthrough) and I think the switch, case > and else should all be indented to the same level, just like 'if/elif/else'. Ok. > > def whatis(x): > > switch x: > > case 'one': > > print '1' > > break > > > Turns out that this look very Pythonic :-) > > I like my version better, with the exception of the parentheses around 'x' > in 'switch(x):' :) > > > Sure smells like PEP-time :-) > > Aye, but lets do it while Guido is still on paternity leave so we at least > get to finish the proposal before it's -1'ed :) Let's make it a two part PEP: one part should focus on the optimization idea and the other one on a new syntax. 
That'll turn the -1 into a -0.5 which gets rounded towards 0 and then makes a difference ;) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Fri Nov 9 08:32:00 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 09 Nov 2001 09:32:00 +0100 Subject: [Python-Dev] switch & bytecodehacks? References: <15338.55173.761770.654977@beluga.mojam.com> Message-ID: <3BEB9480.A9BAD9EC@lemburg.com> Skip Montanaro wrote: > > Could bytecodehacks be made to do the heavy lifting for MAL's switch > performance testing? Might be easier than manually creating code strings. Could work... how hard is it getting the idea behind bytecodehacks ? (I've never looked into this package) Thanks for the hint, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Fri Nov 9 08:37:41 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 09 Nov 2001 09:37:41 +0100 Subject: [Python-Dev] Python's footprint References: <20011108142106.A2559@ibook.distro.conectiva> <20011108165105.A29947@gerg.ca> <3BEB046B.24FBB6FA@lemburg.com> Message-ID: <3BEB95D5.AE0501FB@lemburg.com> Skip Montanaro wrote: > > mal> Even better: why not work together with Martin to have the > mal> doc-strings localized ?! (One of the possible languages could > mal> then be the emtpy one ;-) > > I realize there's a smiley there, but wouldn't such a transformation > affect all string literals in the application (e.g. most exception > messages)? No, just doc-strings. Changing exception messages could cause problems due to programs relying on them (not good style, but likely in use...). 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From thomas.heller@ion-tof.com Fri Nov 9 08:43:49 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 9 Nov 2001 09:43:49 +0100 Subject: [Python-Dev] Documentation TODO for Python 2.2 References: <15339.1500.338680.203793@anthem.wooz.org> Message-ID: <03a201c168fa$b936bed0$e000a8c0@thomasnotebook> > C API 4 > - new __ slots 1 > - subclassing types 1 (JH?) > - METH_O and friends (?) > - writing a new builtin type 2 (JH) - documenting all functions declared in Python header files ? Thomas From skip@pobox.com (Skip Montanaro) Fri Nov 9 08:50:55 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 09 Nov 2001 09:50:55 +0100 Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <200111090217.PAA12184@s454.cosc.canterbury.ac.nz> References: <200111081651.fA8GpW901465@mira.informatik.hu-berlin.de> <200111090217.PAA12184@s454.cosc.canterbury.ac.nz> Message-ID: Greg> "Martin v. Loewis" : >> switch x: >> if 'foo': >> ... >> elif 'bar': >> ... Greg> I don't like that, because the 'if' has a different meaning from Greg> usual because of being inside a construct that is perhaps some Greg> distance away visually. How about: switch x: if 'foo': ... if 'bar': ... if 'baz': ... else: ... instead? Skip From skip@pobox.com (Skip Montanaro) Fri Nov 9 09:00:37 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 9 Nov 2001 10:00:37 +0100 Subject: [Python-Dev] switch & bytecodehacks? In-Reply-To: <3BEB9480.A9BAD9EC@lemburg.com> References: <15338.55173.761770.654977@beluga.mojam.com> <3BEB9480.A9BAD9EC@lemburg.com> Message-ID: <15339.39733.652305.565293@beluga.mojam.com> >> Could bytecodehacks be made to do the heavy lifting for MAL's switch >> performance testing? 
Might be easier than manually creating code >> strings. mal> Could work... how hard is it getting the idea behind bytecodehacks mal> ? (I've never looked into this package) Nor have I, but Michael Hudson is the author. I'm sure his gears are already turning... ;-) Skip From mwh@python.net Fri Nov 9 10:23:39 2001 From: mwh@python.net (Michael Hudson) Date: 09 Nov 2001 05:23:39 -0500 Subject: [Python-Dev] PyObject_GenericGetAttr vs cygwin In-Reply-To: Jason Tishler's message of "Thu, 8 Nov 2001 15:20:52 -0500" References: <20011108152052.B816@dothill.com> Message-ID: <2md72s4504.fsf@starship.python.net> Jason Tishler writes: > --CGQ4QxZ4DCp/f9YC > Content-Type: text/plain; charset=us-ascii > Content-Disposition: inline > > Michael, > > On Thu, Nov 08, 2001 at 01:52:01PM -0500, Michael Hudson wrote: > > It works. Do you want to see the patches or shall I just check the > > changes in? > > If they are just the standard "PyObject_HEAD_INIT(NULL)" style fix, then > please just commit them. Done. > > BTW, _cursesmodule.c doesn't compile; you get things like: > > > > Warning: resolving _stdscr by linking to __imp__stdscr (auto-import) > > [snip] > > build/temp.cygwin-1.3.3-i686-2.2/_cursesmodule.o: In function `PyCurses_InitScr': > > /cygdrive/c/src/python/dist/src/Modules/_cursesmodule.c:1842: undefined reference to `acs_map' > > [snip] > > I'm getting a feeling of deja vu from the above, but I can't quite > remember... I believe that the above is related to the auto-import > features recently added to Cygwin's binutils. Please send me your > > /etc/setup/installed.db > > via private email. In the meantime, I will dig some on my own. OK, this will follow (from a different machine). > > and contrary to README, > > The Cygwin section of Python README has become stale -- I need to submit > a doco patch. Sigh... > > > test_poll > > The test_poll problem was fixed by: > > http://sources.redhat.com/ml/cygwin-patches/2001-q3/msg00109.html > > and released in Cygwin 1.3.4. OK... 
> > and threads seem to work fine. > > There is still one known Cygwin pthreads hang. If interested, see the > following for the current state of affairs: > > http://sources.redhat.com/ml/cygwin-developers/2001-10/msg00193.html makes little sense to me, I'm afraid. Haven't had test_thread die on me yet, but then I've only run it a few times. > > test_strftime is still bust, though: > > The test_strftime problem was fixed by: > > http://sources.redhat.com/ml/newlib/2001/msg00504.html > > and released in Cygwin 1.3.4. ... but if test_poll works, how come this doesn't? How do I find out which version of cygwin I have? > I've attached the latest README, since it is the most complete description > of the current state. Thanks. Cheers, M. -- NUTRIMAT: That drink was individually tailored to meet your personal requirements for nutrition and pleasure. ARTHUR: Ah. So I'm a masochist on a diet am I? -- The Hitch-Hikers Guide to the Galaxy, Episode 9 From thomas@xs4all.net Fri Nov 9 11:10:35 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Fri, 9 Nov 2001 12:10:35 +0100 Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <200111090236.PAA12192@s454.cosc.canterbury.ac.nz> References: <20011108162425.D474@xs4all.nl> <200111090236.PAA12192@s454.cosc.canterbury.ac.nz> Message-ID: <20011109121035.F474@xs4all.nl> On Fri, Nov 09, 2001 at 03:36:59PM +1300, Greg Ewing wrote: > Thomas Wouters : > > switch EXPR: > > case CONSTANT: > > [suite] > > case CONSTANT: > > [suite] > > ... > > else: > Looks good, except that I'd indent the cases as well, i.e. > switch EXPR: > case CONSTANT: > [suite] > case CONSTANT: > [suite] > else: > [suite] > To my mind the cases are logically a subordinate part of the > switch statement, and the indentation should reflect that. Hmm. Perhaps. I'm not entirely convinced but it's not big an issue. 
> Some alternatives: > Using only one keyword: > case EXPR: > CONSTANT: > [suite] > CONSTANT: > [suite] > else: > [suite] Can't be done in a LL(1) parser such as Python's. (And Guido already stated that even if we switch to a more capable parser/tokenizer, we still need to maintain this restriction for the sake of other tools parsing Python, e.g. IDLE.) > Using two, but reminding one less of C: > case EXPR: > of CONSTANT: > [suite] > of CONSTANT: > [suite] > else: > [suite] I've been considering this... I'm not sure I like it. The 'case' instead of 'switch' has precedence (shell) but the 'of' is totally new. Not that that's necessarily a problem, but to me, the switch+case naming is a lot easier to remember. However, I am a C programmer, so I'm bound to remember it well :) > Possible refinements: > * Multiple values in a case > CONSTANT, CONSTANT, ..., CONSTANT: Meaning what ? Any one of them ? That would solve one part of the fallthrough problem, but would require tuple-constants to be parenthesised. It's probably the most pythonic solution, though. The part of the fallthrough problem it solves is where you want multiple values to trigger the same suite. In C that is: switch(spam) { case SPAM: case HAM: case EGGS: .... but that works due to fallthrough. Doing it like that for Python (and requiring 'case SPAM: pass' for the 'case' that requires an empty body) doesn't strike me as terribly elegant or Pythonic. So yes, I think 'alternatives' (so to speak) should be expressed like that, much like it is in the 'except' case. (no pun intended.) > * Ranges in a case > CONSTANT..CONSTANT: Would require range-literals. PEP 204 :-) Also keep in mind 'CONSTANT' can be any hashable constant (regular or unicode strings, ints, longs, floats, tuples.) How do you do a range of floats ? > although that would require something other than a dict, > maybe a binary search. Implementation detail. 
:)

> Also could lead to arguments about
> whether the endpoint should be inclusive or exclusive!

The range-literal PEP should solve that.

> Maybe it should be spelt

>   range(CONSTANT, CONSTANT):

Hrm, possible.... but a tad obscure, not to mention hackish in the
implementation. 'range(1, 10)' is almost an expression, but we can't allow
expressions (even 'constant-expressions' such as C defines them) because we
don't do any constant folding in Python. I'd say not to go there now. It can
always be added later (just like constant-expressions.)

The range issue could possibly be solved using a programming trick (I mean
technique) such as:

ranges = [0 for x in range(10)] + [1 for x in range(10,20)] + [... etc]

switch ranges[num]:
case 0:
    num_in_range(10)
case 1:
    num_in_range(10,20)
case 2:
    ... etc.

For more advanced range-switching, you'd make a class to help in the
range-to-index translation.

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From fdrake@acm.org Fri Nov 9 13:24:16 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 9 Nov 2001 08:24:16 -0500
Subject: [Python-Dev] Documentation TODO for Python 2.2
In-Reply-To: <03a201c168fa$b936bed0$e000a8c0@thomasnotebook>
References: <15339.1500.338680.203793@anthem.wooz.org>
 <03a201c168fa$b936bed0$e000a8c0@thomasnotebook>
Message-ID: <15339.55552.742808.534619@grendel.zope.com>

Thomas Heller writes:
 > - documenting all functions declared in Python header files ?

A good start for this would be creating a list of what's defined
that isn't documented. This should include functions, macros,
constants, and types.
Did you just volunteer, or are my ears playing tricks on me? ;)

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From AdrianLeu@Kelseus.com Fri Nov 9 12:21:10 2001
From: AdrianLeu@Kelseus.com (Adrian Leu)
Date: Fri, 9 Nov 2001 12:21:10 -0000
Subject: [Python-Dev] Embedding python in C++! Please HELP !
Message-ID: <4317512F97B4D311A7EA0050DA3DA5100B7C22@ThisAddressDoesNotExist> Hi ! I have recently started using Python, mainly attracted by the easy way in which I can write a parser for some project I have. However, the problems occurred while trying to use the debug version. I have: * built Python from the source and made sure that all the _d libraries are in the right directories (./DLLs and /.libs); * created a small module which uses _winreg to set some registry keys; * written a small C++ application that uses basic: Py_Initialize(); ... PyImport_ImportModule(my_module) PyObject_CallMethod(my_module, my_method, etc) to basically load my module and call a method associated to it. However, although everything is working OK in Release mode, in Debug mode things are not working. My module cannot be imported for some reason or another. I have tried to use Python_d.exe to do the same thing interactively and what I get is that _winreg cannot be imported. Every time I try to do this, I get a: ImportError: No module named _winreg error. although the module is there. I am getting very frustrated about this. Is it something I am doing wrong? Also, I would like to know where could I get the source for win32 extensions. I would like to build these extensions and use them later to create a small window which I can open to enter some text. Please, please help ! It must be something trivial that I miss here. Thanks. Adrian. 
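
One thing worth checking in a situation like the one Adrian describes is which file names the running interpreter will accept for extension modules, since a debug build of Python on Windows looks for differently named binaries (with a _d suffix) than a release build does. The sketch below is only a diagnostic aid, not a confirmed explanation of this report. It assumes a current Python 3, and the helper names extension_suffixes and is_debug_build are invented here; interpreters of the 2.2 era exposed similar information through imp.get_suffixes().

```python
import importlib.machinery
import sys


def extension_suffixes():
    """Return the file-name suffixes this interpreter accepts for extension modules."""
    return list(importlib.machinery.EXTENSION_SUFFIXES)


def is_debug_build():
    """Guess whether this is a debug build of CPython.

    sys.gettotalrefcount is normally only present in debug builds.
    """
    return hasattr(sys, "gettotalrefcount")


if __name__ == "__main__":
    print("debug build:", is_debug_build())
    print("extension suffixes:", extension_suffixes())
    print("module search path:")
    for entry in sys.path:
        print("   ", entry)
```

Running this under both the release and the debug executable and comparing the output shows whether the two builds expect the same file names and search the same directories.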
From skip@pobox.com (Skip Montanaro) Fri Nov 9 12:45:38 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Fri, 9 Nov 2001 13:45:38 +0100
Subject: [Python-Dev] switch-based programming in Python
In-Reply-To: <20011109121035.F474@xs4all.nl>
References: <20011108162425.D474@xs4all.nl>
 <200111090236.PAA12192@s454.cosc.canterbury.ac.nz>
 <20011109121035.F474@xs4all.nl>
Message-ID: <15339.53234.882023.906932@beluga.mojam.com>

    >> switch EXPR:
    >>     case CONSTANT:
    >>         [suite]
    >>     case CONSTANT:
    >>         [suite]
    >>     else:
    >>         [suite]

    >> To my mind the cases are logically a subordinate part of the switch
    >> statement, and the indentation should reflect that.

    Thomas> Hmm. Perhaps. I'm not entirely convinced but it's not big an
    Thomas> issue.

Well, if nothing else, I think python-mode (and maybe other Python-aware
editors?) would have to make switch a special case, because it would be the
only statement with a colon at the end that *didn't* indent its subordinate
clauses.

    >> * Multiple values in a case
    >>     CONSTANT, CONSTANT, ..., CONSTANT:

    Thomas> Meaning what ? Any one of them ? That would solve one part of
    Thomas> the fallthrough problem, but would require tuple-constants to be
    Thomas> parenthesised. It's probably the most pythonic solution,
    Thomas> though.

We already have some places where to use tuples you have to parenthesize
them. Perhaps this is another case of that. When unparenthesized, it
represents a series of alternatives. When it does have parens it's a tuple:

    switch point:
        if (0,0):
            do_origin()
        if (10,10):
            do_corner()
        if None:
            do_invalid()
        else:
            do_general(point)

Skip

From mwh@python.net Fri Nov 9 13:05:09 2001
From: mwh@python.net (Michael Hudson)
Date: 09 Nov 2001 08:05:09 -0500
Subject: [Python-Dev] switch & bytecodehacks?
In-Reply-To: Skip Montanaro's message of "Thu, 8 Nov 2001 20:05:41 +0100" References: <15338.55173.761770.654977@beluga.mojam.com> Message-ID: <2mr8r82iyi.fsf@starship.python.net> Skip Montanaro writes: > Could bytecodehacks be made to do the heavy lifting for MAL's switch > performance testing? Might be easier than manually creating code strings. ... and from the archives as python.net mail delivery is bust (grr) ... > Nor have I, but Michael Hudson is the author. I'm sure his gears > are already turning... ;-) Nope. Haven't worked on bytecodehacks in about 18 months. It lost it's appeal eventually. It doesn't even work properly with Python 2.0, let alone what's in CVS. It is surely easier to work with the compiler package (and if it's not, I'm not sure what the point of it is...). It doesn't help that I don't really care about the proposed optimization... Cheers, M. -- NUTRIMAT: That drink was individually tailored to meet your personal requirements for nutrition and pleasure. ARTHUR: Ah. So I'm a masochist on a diet am I? -- The Hitch-Hikers Guide to the Galaxy, Episode 9 From mal@lemburg.com Fri Nov 9 14:05:59 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 09 Nov 2001 15:05:59 +0100 Subject: [Python-Dev] switch-based programming in Python References: <20011108162425.D474@xs4all.nl> <200111090236.PAA12192@s454.cosc.canterbury.ac.nz> <20011109121035.F474@xs4all.nl> <15339.53234.882023.906932@beluga.mojam.com> Message-ID: <3BEBE2C7.B8A30360@lemburg.com> [cases for tuples in switches] Guys, this is getting overboard... let's first hammer on the general idea and then start thinking about a space or comma here and there :-) I might find some time later today to write up the various bits and pieces as a PEP tonight or on the weekend. 
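
The general idea under discussion in this thread, replacing a long if-elif-else chain over one hashable value with a single dictionary lookup, can be sketched in ordinary Python. This is only an illustration of the technique, not the compiler change proposed here: the names whatis_chain, whatis_table, _TABLE and _FAST_TYPES are made up for the example, and the code targets a current Python 3 rather than the 2.2 interpreter being discussed.

```python
# Hypothetical illustration; none of these names come from the thread.

def whatis_chain(x):
    """Sequential compares: the cost grows with the position of the match."""
    if x == "one":
        return "1"
    elif x == "two":
        return "2"
    elif x == 3:
        return "three"
    else:
        return "other"


# One entry per case constant; the values stand in for the case bodies.
_TABLE = {"one": "1", "two": "2", 3: "three"}

# Only take the fast path for exact built-in immutable types, so that
# objects with a custom __eq__ or __hash__ keep the if-elif semantics.
_FAST_TYPES = (str, bytes, int, float)


def whatis_table(x):
    """Same result as whatis_chain, using one hash lookup when that is safe."""
    if type(x) in _FAST_TYPES:
        return _TABLE.get(x, "other")
    return whatis_chain(x)


if __name__ == "__main__":
    for value in ["one", "two", 3, 3.0, "nope", None, [1]]:
        assert whatis_chain(value) == whatis_table(value)
        print(repr(value), "->", whatis_table(value))
```

Whether the table version is actually faster depends on the number of cases; for short chains the sequential compares may well win, so any real comparison should be measured (for example with the timeit module) at the chain length of interest.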
Stay tuned, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From thomas.heller@ion-tof.com Fri Nov 9 14:23:11 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 9 Nov 2001 15:23:11 +0100 Subject: [Python-Dev] Documentation TODO for Python 2.2 References: <15339.1500.338680.203793@anthem.wooz.org><03a201c168fa$b936bed0$e000a8c0@thomasnotebook> <15339.55552.742808.534619@grendel.zope.com> Message-ID: <019601c1692a$21c7a700$e000a8c0@thomasnotebook> From: "Fred L. Drake, Jr." > > Thomas Heller writes: > > - documenting all functions declared in Python header files ? > > A good start for this would be creating a list of what's defined > that isn't documented. This should include functions, macros, > constants, and types. > Did you just volunteer, or are my ears playing tricks on me? ;) Yes, I'll try. Time to learn etags and combine it with a python filter (or does anyone have another suggestion)? Thomas From Donald Beaudry Fri Nov 9 14:58:17 2001 From: Donald Beaudry (Donald Beaudry) Date: Fri, 09 Nov 2001 09:58:17 -0500 Subject: [Python-Dev] switch-based programming in Python References: <20011108162425.D474@xs4all.nl> <200111090236.PAA12192@s454.cosc.canterbury.ac.nz> <20011109121035.F474@xs4all.nl> Message-ID: <200111091458.JAA23416@localhost.localdomain> Here's another, when EXPR: in CONSTANT_TUPLE: [suite] in CONSTANT_TUPLE: [suite] ... else: [suite] ...and no fall-through, please. -- Donald Beaudry Ab Initio Software Corp. 201 Spring Street donb@init.com Lexington, MA 02421 ...So much code, so little time... From fdrake@acm.org Fri Nov 9 16:40:49 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Fri, 9 Nov 2001 11:40:49 -0500 Subject: [Python-Dev] Documentation TODO for Python 2.2 In-Reply-To: <019601c1692a$21c7a700$e000a8c0@thomasnotebook> References: <15339.1500.338680.203793@anthem.wooz.org> <03a201c168fa$b936bed0$e000a8c0@thomasnotebook> <15339.55552.742808.534619@grendel.zope.com> <019601c1692a$21c7a700$e000a8c0@thomasnotebook> Message-ID: <15340.1809.325711.80242@grendel.zope.com> Thomas Heller writes: > Time to learn etags and combine it with a python filter > (or does anyone have another suggestion)? That sounds quite reasonable to me! -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From newsletters@the-financial-news.org Fri Nov 9 15:22:27 2001 From: newsletters@the-financial-news.org (The Financial News) Date: Fri, 9 Nov 2001 16:22:27 +0100 Subject: [Python-Dev] Production Mini-plants in mobile containers Message-ID: This is a multi-part message in MIME format --=_NextPart_2rfkindysadvnqw3nerasdf Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable The Financial News, November 2001 Production Mini-plants in mobile containers =22...Science Network will supply to countries and developing regions the technology and the necessary support for the production in series of Mini-plants in mobile=20containers (40-foot). The Mini-plant system is designed in such a way that all the production machinery is fixed on the platform of the container, with all wiring,=20piping, and installation parts; that is to say, they are fully equipped... 
and the mini-plant is ready for production.=22 More than 700 portable production systems: Bakeries, Steel Nails, Welding Electrodes, Tire Retreading, Reinforcement Bar Bending for Construction Framework,=20Sheeting for Roofing, Ceilings and Fa=E7ades, Plated Drums, Aluminum Buckets, Injected Polypropylene Housewares, Pressed Melamine Items (Glasses, Cups,=20Plates, Mugs, etc.), Mufflers, Construction Electrically Welded Mesh, Plastic Bags and Packaging, Mobile units of medical assistance, Sanitary Material,=20Hypodermic Syringes, Hemostatic Clamps, etc.=20 For more information: Mini-plants in mobile containers By Steven P. Leibacher, The Financial News, Editor ------------------------------------------------------------------------- If you received this in error or would like to be removed from our list, please return us indicating: remove or un-subscribe in 'subject' field, Thanks. Editor =A9 2001 The Financial News. All rights reserved. --=_NextPart_2rfkindysadvnqw3nerasdf Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
--=_NextPart_2rfkindysadvnqw3nerasdf-- From thomas.heller@ion-tof.com Fri Nov 9 16:10:23 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 9 Nov 2001 17:10:23 +0100 Subject: [Python-Dev] Documentation TODO for Python 2.2 References: <15339.1500.338680.203793@anthem.wooz.org> <03a201c168fa$b936bed0$e000a8c0@thomasnotebook> <15339.55552.742808.534619@grendel.zope.com> <019601c1692a$21c7a700$e000a8c0@thomasnotebook> <15340.1809.325711.80242@grendel.zope.com> Message-ID: <021001c16939$1b403690$e000a8c0@thomasnotebook>

So here is a script which prints out a list of undocumented symbols.

The first few lines of its output are as follows:

d PyArg_GetInt
d PyArg_NoArgs
d PyCF_MASK
d PyCF_MASK_OBSOLETE
t PyCFunctionObject
d PyCFunction_Check
p PyCFunction_Fini
d PyCFunction_GET_FLAGS
d PyCFunction_GET_FUNCTION
d PyCFunction_GET_SELF
p PyCFunction_GetFunction
t PyCellObject
d PyCell_Check
d PyCell_GET
d PyCell_SET
t PyClassObject
d PyClass_Check
t PyCodeObject
p PyCode_Addr2Line
d PyCode_Check
p PyCode_New
t PyCompilerFlags

and here is the script itself (or should I upload it to SF?).

Thomas

# undoc.py
# Thomas Heller, 11/2001
#
"""This script prints out a list of undocumented symbols found in
Python include files, prefixed by their tag kind.

First, a temporary file is written which contains all Python include
files, with DL_IMPORT simply removed. This file is passed to ctags,
and the output is parsed into a dictionary mapping symbol names to tag
kinds.

Then, the .tex files from Python docs are read into a giant string.

Finally all symbols not found in the docs are written to standard
output, prefixed with their tag kind.
"""

# Source directory of Python
SRCDIR = r"c:\sf\python\dist\src"
# Which kind of tags do we need?
TAG_KINDS = "dpt"

# Doc sections to use
DOCSECTIONS = ["api", "ext"]

# end of customization section

# I'm using EXUBERANT CTAGS here - see
# http://ctags.sourceforge.net
#
# ctags fields are separated by tabs.
# The first field is the name, the last field the type:
# d macro definitions (and #undef names)
# e enumerators
# f function definitions
# g enumeration names
# m class, struct, or union members
# n namespaces
# p function prototypes and declarations [off]
# s structure names
# t typedefs
# u union names
# v variable definitions
# x extern and forward variable declarations [off]

import os, glob, re, sys, tempfile

INCDIR = os.path.join(SRCDIR, "Include")
DOCDIR = os.path.join(SRCDIR, "Doc")

def findnames(file, prefix=""):
    names = {}
    for line in file.readlines():
        if line[0] == '!':
            continue
        fields = line.split()
        name, tag = fields[0], fields[-1]
        if tag == 'd' and name.endswith('_H'):
            continue
        if name.startswith(prefix):
            names[name] = tag
    return names

def print_undoc_symbols(prefix="Py"):
    incfile = tempfile.mktemp(".h")

    fp = open(incfile, "w")

    for file in glob.glob(os.path.join(INCDIR, "*.h")):
        text = open(file).read()
        # remove all DL_IMPORT, they will confuse ctags
        text = re.sub("DL_IMPORT", "", text)
        fp.write(text)
    fp.close()

    docs = []

    for sect in DOCSECTIONS:
        for file in glob.glob(os.path.join(DOCDIR, sect, "*.tex")):
            docs.append(open(file).read())

    docs = "\n".join(docs)

    fp = os.popen("ctags --c-types=%s -f - %s" % (TAG_KINDS, incfile))
    dict = findnames(fp, prefix)
    names = dict.keys()
    names.sort()
    for name in names:
        if docs.find(name) == -1:
            print dict[name], name
    os.remove(incfile)

if __name__ == '__main__':
    print_undoc_symbols()

# --- EOF ---

From skip@pobox.com (Skip Montanaro) Fri Nov 9 16:15:57 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 09 Nov 2001 17:15:57 +0100 Subject: [Python-Dev] Documentation TODO for Python 2.2 In-Reply-To: <021001c16939$1b403690$e000a8c0@thomasnotebook> References: <15339.1500.338680.203793@anthem.wooz.org> <03a201c168fa$b936bed0$e000a8c0@thomasnotebook> <15339.55552.742808.534619@grendel.zope.com> <019601c1692a$21c7a700$e000a8c0@thomasnotebook> <15340.1809.325711.80242@grendel.zope.com>
<021001c16939$1b403690$e000a8c0@thomasnotebook> Message-ID: Thomas> and here is the script itself (or should I upload it to SF?). I'd just tuck it away in Tools/ somewhere. Skip From mwh@python.net Fri Nov 9 16:16:09 2001 From: mwh@python.net (Michael Hudson) Date: 09 Nov 2001 11:16:09 -0500 Subject: [Python-Dev] Documentation TODO for Python 2.2 In-Reply-To: "Thomas Heller"'s message of "Fri, 9 Nov 2001 17:10:23 +0100" References: <15339.1500.338680.203793@anthem.wooz.org> <03a201c168fa$b936bed0$e000a8c0@thomasnotebook> <15339.55552.742808.534619@grendel.zope.com> <019601c1692a$21c7a700$e000a8c0@thomasnotebook> <15340.1809.325711.80242@grendel.zope.com> <021001c16939$1b403690$e000a8c0@thomasnotebook> Message-ID: <2mn11v6hti.fsf@starship.python.net> "Thomas Heller" writes: > So here is a script which prints out a list of > undocumented symbols. > > The first few lines of its output are as follows: [...] > and here is the script itself (or should I upload it to SF?). Put it in Tools/scripts/, I'd have thought. Cheers, M. -- On the other hand, the following areas are subject to boycott in reaction to the rampant impurity of design or execution, as determined after a period of study, in no particular order: ... http://www.naggum.no/profile.html From thomas.heller@ion-tof.com Fri Nov 9 16:20:54 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 9 Nov 2001 17:20:54 +0100 Subject: [Python-Dev] Documentation TODO for Python 2.2 References: <15339.1500.338680.203793@anthem.wooz.org> <03a201c168fa$b936bed0$e000a8c0@thomasnotebook> <15339.55552.742808.534619@grendel.zope.com> <019601c1692a$21c7a700$e000a8c0@thomasnotebook> <15340.1809.325711.80242@grendel.zope.com> <021001c16939$1b403690$e000a8c0@thomasnotebook> <2mn11v6hti.fsf@starship.python.net> Message-ID: <02b201c1693a$93bf9330$e000a8c0@thomasnotebook> > Put it in Tools/scripts/, I'd have thought. > I would feel better if someone could test it on his installation before. 
Thomas From fdrake@acm.org Fri Nov 9 17:15:34 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 9 Nov 2001 12:15:34 -0500 Subject: [Python-Dev] Documentation TODO for Python 2.2 In-Reply-To: <2mn11v6hti.fsf@starship.python.net> References: <15339.1500.338680.203793@anthem.wooz.org> <03a201c168fa$b936bed0$e000a8c0@thomasnotebook> <15339.55552.742808.534619@grendel.zope.com> <019601c1692a$21c7a700$e000a8c0@thomasnotebook> <15340.1809.325711.80242@grendel.zope.com> <021001c16939$1b403690$e000a8c0@thomasnotebook> <2mn11v6hti.fsf@starship.python.net> Message-ID: <15340.3894.298501.254410@grendel.zope.com> Michael Hudson writes: > Put it in Tools/scripts/, I'd have thought. Actually, I'd stick it in Doc/tools/, since its usefulness is limited to the Python documentation. From there, it can use its own location to find the other directories so it doesn't need to be sensitive to the current directory and doesn't need configuration constants to find the headers and documentation. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From thomas.heller@ion-tof.com Fri Nov 9 16:52:58 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 9 Nov 2001 17:52:58 +0100 Subject: [Python-Dev] Documentation TODO for Python 2.2 References: <15339.1500.338680.203793@anthem.wooz.org> <03a201c168fa$b936bed0$e000a8c0@thomasnotebook> <15339.55552.742808.534619@grendel.zope.com> <019601c1692a$21c7a700$e000a8c0@thomasnotebook> <15340.1809.325711.80242@grendel.zope.com> <021001c16939$1b403690$e000a8c0@thomasnotebook> <2mn11v6hti.fsf@starship.python.net> <15340.3894.298501.254410@grendel.zope.com> Message-ID: <036401c1693f$0e0b4400$e000a8c0@thomasnotebook> > Actually, I'd stick it in Doc/tools/, since its usefulness is > limited to the Python documentation. From there, it can use its own > location to find the other directories so it doesn't need to be > sensitive to the current directory and doesn't need configuration > constants to find the headers and documentation.
Checked in - Doc/tools/undoc_symbols.py but-I-wont-document-all-these-symbols-found-ly Thomas From thomas.heller@ion-tof.com Fri Nov 9 17:43:02 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 9 Nov 2001 18:43:02 +0100 Subject: [Python-Dev] __class_init__ References: <052501c1661f$b21e8b10$e000a8c0@thomasnotebook> Message-ID: <052701c16946$0d58dfc0$e000a8c0@thomasnotebook> I'll retract this request. Thomas ----- Original Message ----- From: "Thomas Heller" To: Sent: Monday, November 05, 2001 6:31 PM Subject: [Python-Dev] __class_init__ > ExtensionClass did include a __class_init__ class-method, > which has been called at the end of class creation. > > I've uploaded a (simple minded) patch to fileobject.c, > which implements the same behaviour for new style classes. > It is simple minded because it only iterates over the > tp_base member and not over tp_bases. Any chance this could > be included in Python 2.2? > > http://sourceforge.net/tracker/index.php?func=detail&aid=478374&group_id=5470&atid=305470 > > Thomas > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From martin@v.loewis.de Fri Nov 9 17:45:15 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 9 Nov 2001 18:45:15 +0100 Subject: [Python-Dev] Embedding python in C++! Please HELP ! Message-ID: <200111091745.fA9HjFo01505@mira.informatik.hu-berlin.de> > Is it something I am doing wrong? I think it is two things that you do wrong: 1. You are posting this question to python-dev. Please don't, use python-list@python.org instead. 2. You should build all extension modules for debug as well. In debug mode, importing winreg will look for winreg_d.pyd. 
HTH, Martin From tim.one@home.com Fri Nov 9 17:57:23 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 9 Nov 2001 12:57:23 -0500 Subject: [Python-Dev] test_email failing Message-ID: C:\Code\python\PCbuild>python ../lib/test/regrtest.py test_email test_email test test_email failed -- Traceback (most recent call last): File "../lib/test\test_email.py", line 928, in test_formatdate gtime = time.strptime(gdate.split()[4], '%H:%M:%S') AttributeError: 'module' object has no attribute 'strptime' 1 test failed: test_email strptime is not available on all boxes (not even on all Unix boxes). From tim.one@home.com Fri Nov 9 18:00:27 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 9 Nov 2001 13:00:27 -0500 Subject: [Python-Dev] test_file failing on Windows Message-ID: C:\Code\python\PCbuild>python ../lib/test/test_file.py bad error message for invalid mode: [Errno 0] Error: '@test' C:\Code\python\PCbuild> From jeremy@zope.com Fri Nov 9 18:03:17 2001 From: jeremy@zope.com (Jeremy Hylton) Date: Fri, 9 Nov 2001 13:03:17 -0500 (EST) Subject: [Python-Dev] Re: test_file failing on Windows In-Reply-To: References: Message-ID: <15340.6757.570508.925091@slothrop.digicool.com> Heh. Didn't expect it would work, but knew I'd find out soon enough. Jeremy From paul@svensson.org Fri Nov 9 18:46:23 2001 From: paul@svensson.org (Paul Svensson) Date: Fri, 9 Nov 2001 13:46:23 -0500 (EST) Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <200111091458.JAA23416@localhost.localdomain> Message-ID: On Fri, 9 Nov 2001, Donald Beaudry wrote: >Here's another, > > when EXPR: > in CONSTANT_TUPLE: > [suite] > in CONSTANT_TUPLE: > [suite] > ... > else: > [suite] > >...and no fall-through, please. No comment on the choice of keywords, but now that I see it, you're absolutely right on the indentation of the "else". /Paul From barry@wooz.org Fri Nov 9 19:32:20 2001 From: barry@wooz.org (Barry A. 
Warsaw) Date: Fri, 9 Nov 2001 14:32:20 -0500 Subject: [Python-Dev] Re: test_email failing References: Message-ID: <15340.12100.741251.891990@anthem.wooz.org> >>>>> "TP" == Tim Peters writes: TP> strptime is not available on all boxes (not even on all Unix TP> boxes). Dang. I forgot. Try it now. From jeremy@zope.com Fri Nov 9 19:32:42 2001 From: jeremy@zope.com (Jeremy Hylton) Date: Fri, 9 Nov 2001 14:32:42 -0500 (EST) Subject: [Python-Dev] Re: test_file failing on Windows In-Reply-To: References: Message-ID: <15340.12122.864950.526560@slothrop.digicool.com> open_the_file() uses errno to check the return value of fopen(). On Windows, should it use GetLastError() ? It seems that errno usually does the right thing on Windows, in that it's usually set to the expected error when fopen() fails. The change I made checks for EINVAL on Unix, which means the mode argument was invalid. It looks like Windows sets errno to 0 in this case. I wondered if GetLastError() would provide a more helpful error code. Jeremy From barry@zope.com Fri Nov 9 19:50:06 2001 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 9 Nov 2001 14:50:06 -0500 Subject: [Python-Dev] test_socket.py failing Message-ID: <15340.13166.651180.173268@anthem.wooz.org> % ./python Lib/test/test_socket.py socket.error anthem 192.168.1.2 anthem.wooz.org ['anthem', 'www.wooz.org', 'www'] ['192.168.1.2'] ['anthem.wooz.org', 'anthem', 'www.wooz.org', 'www'] 23 Traceback (most recent call last): File "Lib/test/test_socket.py", line 104, in ? socket.getnameinfo(('x', 0, 0, 0), 0) socket.error: IPv4 sockaddr must be 2 tuple From barry@zope.com Fri Nov 9 19:53:01 2001 From: barry@zope.com (Barry A. 
Warsaw) Date: Fri, 9 Nov 2001 14:53:01 -0500 Subject: [Python-Dev] test_socket.py failing References: <15340.13166.651180.173268@anthem.wooz.org> Message-ID: <15340.13341.191267.432021@anthem.wooz.org>

>>>>> "BAW" == Barry A Warsaw writes:

    BAW> % ./python Lib/test/test_socket.py socket.error anthem
    BAW> 192.168.1.2 anthem.wooz.org ['anthem', 'www.wooz.org', 'www']
    BAW> ['192.168.1.2'] ['anthem.wooz.org', 'anthem', 'www.wooz.org',
    BAW> 'www'] 23 Traceback (most recent call last):
       |   File "Lib/test/test_socket.py", line 104, in ?
       |     socket.getnameinfo(('x', 0, 0, 0), 0)
    BAW> socket.error: IPv4 sockaddr must be 2 tuple

This patch "fixes" the test, but I'm not sure it's right.

-Barry

-------------------- snip snip --------------------
Index: test_socket.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/test/test_socket.py,v
retrieving revision 1.21
diff -u -r1.21 test_socket.py
--- test_socket.py 2001/11/02 23:34:52 1.21
+++ test_socket.py 2001/11/09 19:52:32
@@ -102,7 +102,7 @@
 try:
     # On some versions, this crashes the interpreter.
     socket.getnameinfo(('x', 0, 0, 0), 0)
-except socket.gaierror:
+except socket.error:
     pass
 
 canfork = hasattr(os, 'fork')

From tim.one@home.com Fri Nov 9 20:36:00 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 9 Nov 2001 15:36:00 -0500 Subject: [Python-Dev] Re: test_email failing In-Reply-To: <15340.12100.741251.891990@anthem.wooz.org> Message-ID:

test_email and test_file pass on Windows again (thanks!). Now test_compile fails:

test_compile
test test_compile produced unexpected output:
**********************************************************************
*** line 2 of expected output missing:
- testing complex args
**********************************************************************
1 test failed: test_compile

I expect it's due to failing to regenerate the expected-output file, but whoever changed it can resolve that faster than me.
From martin@v.loewis.de Fri Nov 9 20:47:56 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 9 Nov 2001 21:47:56 +0100 Subject: [Python-Dev] test_socket.py failing Message-ID: <200111092047.fA9KluP07717@mira.informatik.hu-berlin.de> > This patch "fixes" the test, but I'm not sure it's right. It is surprising that the code got that far. Do you have a computer named "x" in your environment? If yes, your patch is right. If no, it appears something is wrong with your getaddrinfo implementation (as called in socketmodule.c:2562). What system are you using? If you have the time to investigate, I'd be curious to find out why getaddrinfo doesn't give an error. If you don't have the time, just commit the patch. Regards, Martin From tim.one@home.com Fri Nov 9 20:57:43 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 9 Nov 2001 15:57:43 -0500 Subject: [Python-Dev] RE: test_file failing on Windows In-Reply-To: <15340.12122.864950.526560@slothrop.digicool.com> Message-ID: [Jeremy] > open_the_file() uses errno to check the return value of fopen(). On > Windows, should it use GetLastError() ? The MS docs don't document anything about fopen failure modes, beyond that it returns NULL then. Strongly doubt GetLastError would help, as that's really for Win32 API calls ... OK, it's a bug in MS's internal _wopenfile.c: if the first character of the mode string is not in "rwa", that routine returns NULL without setting errno. OTOH, if the *prefix* of the mode string is OK, it simply ignores trailing garbage (which, BTW, is OK by the C std): >>> open('ga', 'wb and then a bunch of useless crap') >>> open('ga', 'w!+ and the + is ignored') >>> > It seems that errno usually does the right thing on Windows, in that > it's usually set to the expected error when fopen() fails. Yes, but their error-in-mode code is braindead.
The lower-level MS _tsopen does try to set errno on a bad mode:

    switch( oflag & (_O_RDONLY | _O_WRONLY | _O_RDWR) ) {

        case _O_RDONLY:         /* read access */
            fileaccess = GENERIC_READ;
            break;
        case _O_WRONLY:         /* write access */
            fileaccess = GENERIC_WRITE;
            break;
        case _O_RDWR:           /* read and write access */
            fileaccess = GENERIC_READ | GENERIC_WRITE;
            break;
        default:                /* error, bad oflag */
            errno = EINVAL;
            _doserrno = 0L;     /* not an OS error */
            return -1;
    }

but the calling routine can't pass it a bad oflag so the EINVAL never triggers. > The change I made checks for EINVAL on Unix, which means the mode > argument was invalid. It looks like Windows sets errno to 0 in this > case. Actually not, and this looks like an arguable bug in our code: we should explicitly set errno to 0 before calling fopen. The errno we see after fopen fails on Windows may be whatever value it had before the call. I'll do that. > I wondered if GetLastError() would provide a more helpful error code. Sorry, not in this case. From barry@zope.com Fri Nov 9 21:13:03 2001 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 9 Nov 2001 16:13:03 -0500 Subject: [Python-Dev] test_socket.py failing References: <200111092047.fA9KluP07717@mira.informatik.hu-berlin.de> Message-ID: <15340.18143.29846.820277@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: >> This patch "fixes" the test, but I'm not sure it's right. MvL> It is surprising that the code got that far. Do you have a MvL> computer named "x" in your environment? If yes, your patch is MvL> right. I believe the answer is a qualified "yes" because of the funky wildcarding going on in my ISP's nameserver: >>> socket.gethostbyname('x') '66.92.162.103' MvL> If you have the time to investigate, I'd be curious to find MvL> out why getaddrinfo doesn't give an error. If you don't have MvL> the time, just commit the patch. Thanks, I committed the patch.
-Barry From jack@oratrix.nl Fri Nov 9 21:30:42 2001 From: jack@oratrix.nl (Jack Jansen) Date: Fri, 09 Nov 2001 22:30:42 +0100 Subject: [Python-Dev] RE: test_file failing on Windows In-Reply-To: Message by "Tim Peters" , Fri, 9 Nov 2001 15:57:43 -0500 , Message-ID: <20011109213047.00D5E1162D7@oratrix.oratrix.nl> Recently, "Tim Peters" said: > [Jeremy] > > open_the_file() uses errno to check the return value of fopen(). On > > Windows, should it use GetLastError() ? > > The MS docs don't document anything about fopen failure modes, beyond that > it returns NULL then. Same here with CodeWarrior on the Mac: stdio errors return NULL or -1 and that is it, errno isn't touched, not even for fopen() file not found, etc. If the ANSI standard requires errno to be set and people can point me to the right section I can submit an error report... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From tim.one@home.com Fri Nov 9 22:50:42 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 9 Nov 2001 17:50:42 -0500 Subject: [Python-Dev] RE: test_file failing on Windows In-Reply-To: <20011109213047.00D5E1162D7@oratrix.oratrix.nl> Message-ID: >> The MS docs don't document anything about fopen failure modes, >> beyond that it returns NULL then. [Jack Jansen] > Same here with CodeWarrior on the Mac: stdio errors return NULL or -1 > and that is it, errno isn't touched, not even for fopen() file not > found, etc. MS does set errno in most cases; the failure to set it for bad fopen() mode strings appears to be a bug in their code. > If the ANSI standard requires errno to be set and people can point me > to the right section I can submit an error report... No such luck, Jack: errno has always been mostly folklore in the C std, and is almost pure folklore in C99. 
The only mandatory defined errno values are EDOM, ERANGE and EILSEQ now, and under C99 a system is never required to set EDOM anymore, and ERANGE is required in only a handful of string->number conversion routines now. But if CodeWarrior claims conformance with any number of "OS-like" stds, the latter have elaborate errno requirements; e.g., see http://www.opengroup.org/onlinepubs/7908799/xsh/fopen.html From tim.one@home.com Fri Nov 9 23:07:23 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 9 Nov 2001 18:07:23 -0500 Subject: [Python-Dev] __class_init__ In-Reply-To: <052701c16946$0d58dfc0$e000a8c0@thomasnotebook> Message-ID: [Thomas Heller] > I'll retract this request. Why? I haven't had time to look into it yet, but it's definitely on my agenda. Why don't you want it anymore? From barry.alan.scott@ntlworld.com Fri Nov 9 23:37:49 2001 From: barry.alan.scott@ntlworld.com (Barry Scott) Date: Fri, 9 Nov 2001 23:37:49 -0000 Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <3BEB9011.CD6FA455@lemburg.com> Message-ID: <000f01c16977$8ad368e0$070210ac@private> switch with fall through is basically a goto in thin disguise. BArry -----Original Message----- From: M.-A. Lemburg [mailto:mal@lemburg.com] Sent: 09 November 2001 08:13 To: barry.alan.scott@ntlworld.com Cc: python-dev@python.org Subject: Re: [Python-Dev] switch-based programming in Python Barry Scott wrote: > > The fall through is the source of too many defects in C/C++ code. > And its rarely used in the wild according to report on this subject > a few years ago. Ok, let's drop the fallthrough (it's not good structured programming practice anyway, even though it can help in C). > Goto in python would be a terrible thing. Sure would ;-) But where did you find a mention of "goto" ? 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From tim.one@home.com Fri Nov 9 23:51:44 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 9 Nov 2001 18:51:44 -0500 Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <200111081651.fA8GpW901465@mira.informatik.hu-berlin.de> Message-ID: [unattributed] > One new keyword would be easier than two for Guido to swallow... [Martin v. Loewis] > You'd introduce it through a __future__ import, so it wouldn't matter > if it is one or two. No, __future__ is a mechanism for barely tolerating incompatible change, not for inviting it. The only keyword added since "assert" is "yield", and it's the only one planned; every new keyword is going to break someone's code, and new keywords are still resisted mightily. However, I'm looking forward to seeing how Guido manages to add nice syntax for classmethods, staticmethods, properties, metatype selection, super, and __slots__ in 2.3 without any other new keywords . From tim.one@home.com Fri Nov 9 23:57:45 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 9 Nov 2001 18:57:45 -0500 Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <20011109121035.F474@xs4all.nl> Message-ID: [Thomas Wouters] > How do you do a range of floats ? Bring flowers, and buy them all nice dinners. Try not to be *too* obvious that you're out to do them, though. knows-his-floats-ly y'rs - tim From Prabhu Ramachandran Sat Nov 10 08:30:25 2001 From: Prabhu Ramachandran (Prabhu Ramachandran) Date: Sat, 10 Nov 2001 14:00:25 +0530 Subject: [Python-Dev] Proposal for a modified import mechanism. Message-ID: <15340.58785.94412.122093@monster.linux.in> Hi, Sorry about the cross posting. Over the last couple of weeks I described a problem and inconsistency with the way Python imports modules. 
For more details look at this thread: http://mail.python.org/pipermail/python-list/2001-November/070719.html In short, currently, import allows one to use non-absolute module names for modules that are in the current directory. If a module is not in the same directory, import then looks for modules in sys.path. Consequently, dealing with packages that are re-nested is a pain. Complex package structure also causes problems. I'd like to note that I was also not the only person who suffered from this issue -- four others on mentioned similar problems and some asked me to let them know if I found a solution. Subsequently, I proposed another approach that first looks in the local directory and then walks up the current package tree looking for modules before looking at sys.path. I also modified knee.py to obtain a working solution. More information is here: http://mail.python.org/pipermail/python-list/2001-November/071212.html the threading is messed up and starts here: http://mail.python.org/pipermail/python-list/2001-November/071218.html You can find the new module and a simple test package here: http://av.stanford.edu/~prabhu/download/ There is also a slightly enhanced version of knee.py included that supports caching module lookup failures suggested by Rainer and Gordon in: http://mail.python.org/pipermail/python-list/2001-November/071218.html it also fixes a bug where the parent package is an extension module. I therefore have a working import style that seems to handle importing packages in a more natural(?) and consistent(?) manner. 
I've also tested it out with a large package like scipy (http://www.scipy.org) with no trouble or significant performance problems: http://mail.python.org/pipermail/python-list/2001-November/071325.html I'd like to ask the Python developers if they'd consider (a) changing the way the current import works to do what I proposed, or, (b) add a new keyword like 'rimport' (or something) that does this recursive search through parent packages and loads modules. This was actually suggested by Gordon McMillan. Gordon actually suggested something stronger -- import only supports absolute names, rimport is relative import and rrimport is a recursive relative import. But this would break the current import since import currently supports some relative lookup. So maybe import and rimport is a workable solution? (c) patch the existing knee.py with my fixes. Note: these fixes have nothing to do with the recursive module lookup stuff -- knee.py is merely an improved version of the older one. Thanks for listening patiently and sorry again for all the cross posting. prabhu From mal@lemburg.com Sat Nov 10 18:16:16 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 10 Nov 2001 19:16:16 +0100 Subject: [Python-Dev] First Draft: PEP "Switching on Multiple Values" Message-ID: <3BED6EF0.BA8EBFAF@lemburg.com> Here's a first draft of a PEP on the subject we recently discussed on python-dev. I intend to do a few more rounds here and then post the PEP for final discussion to python-list and python-dev. Barry, could you assign a PEP number ? Thanks. --

PEP: 02XX
Title: Switching on Multiple Values
Version: $Revision: 1.0 $
Author: mal@lemburg.com (Marc-Andre Lemburg)
Status: Draft
Type: Standards Track
Python-Version: 2.3
Created: 10-Nov-2001
Post-History:

Abstract

    This PEP proposes strategies to enhance Python's performance with respect to handling switching on a single variable having one of multiple possible values.
Problem

    Up to Python 2.1, the typical way of writing multi-value switches has been to use long switch constructs of the following type:

        if x == 'first state':
            ...
        elif x == 'second state':
            ...
        elif x == 'third state':
            ...
        elif x == 'fourth state':
            ...
        else:
            # default handling
            ...

    This works fine for short switch constructs, since the overhead of repeated loading of a local (the variable x in this case) and comparing it to some constant is low (it has a complexity of O(n) on average). However, when using such a construct to write a state machine such as is needed for writing parsers the number of possible states can easily reach 10 or more cases.

    The current solution to this problem lies in using a dispatch table to find the case implementing method to execute depending on the value of the switch variable (this can be tuned to have a complexity of O(1) on average, e.g. by using perfect hash tables). This works well for state machines which require complex and lengthy processing in the different case methods. It does not perform well for ones which only process one or two instructions per case, e.g.

        def handle_data(self, data):
            self.stack.append(data)

    A nice example of this is the state machine implemented in pickle.py which is used to serialize Python objects. Other prominent cases include XML SAX parsers and Internet protocol handlers.

Proposed Solutions

    This PEP proposes two different but not necessarily conflicting solutions:

    1. Adding an optimization to the Python compiler and VM which detects the above if-elif-else construct and generates special opcodes for it which use a read-only dictionary for storing jump offsets.

    2. Adding new syntax to Python which mimics the C style switch statement.

    The first solution has the benefit of not adding new keywords to the language, while the second looks cleaner. Both involve some run-time overhead to assure that the switching variable is immutable and hashable.
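    For illustration only (this sketch is not part of the draft), the two idioms contrasted in the Problem section might look as follows. The state names, the handler functions and the DISPATCH table are all made up for the example, and the code says nothing about how either proposed solution would be implemented in the compiler or VM.

```python
# Hypothetical handlers; the names are illustrative, not taken from pickle.py.
def handle_first(data):
    return "first:" + data

def handle_second(data):
    return "second:" + data

def handle_default(data):
    return "default:" + data

def switch_if_elif(state, data):
    # Linear chain: each test compares `state` again, O(n) in the number of cases.
    if state == 'first state':
        return handle_first(data)
    elif state == 'second state':
        return handle_second(data)
    else:
        return handle_default(data)

# Dispatch table: a single hash lookup selects the handler, O(1) on average,
# at the price of one function call per case.
DISPATCH = {
    'first state': handle_first,
    'second state': handle_second,
}

def switch_dispatch(state, data):
    return DISPATCH.get(state, handle_default)(data)
```

    Both functions return the same result for every state. The dispatch version trades the repeated comparisons for a call per case, which is the overhead the Problem section describes for cases that only execute one or two instructions.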
Solution 1: Optimizing if-elif-else

    XXX This section currently only sketches the design.

    Issues:

    The new optimization should not change the current Python semantics (by reducing the number of __cmp__ calls and adding __hash__ calls in if-elif-else constructs which are affected by the optimization). To assure this, switching can only safely be implemented either if a "from __future__" style flag is used, or the switching variable is one of the builtin immutable types: int, float, string, unicode, etc.

    To prevent post-modifications of the jump-table dictionary (which could be used to reach protected code), the jump-table will have to be a read-only type (e.g. a read-only dictionary).

    The optimization should only be used for if-elif-else constructs which have a minimum number of n cases (where n is a number which has yet to be defined depending on performance tests).

    Implementation:

    It should be possible for the compiler to detect an if-elif-else construct which has the following signature:

        if x == 'first':...
        elif x == 'second':...
        else:...

    (i.e. the left hand side always references the same variable, the right hand side some hashable immutable builtin type)

    The compiler could then setup a read-only (perfect) hash table, store it in the constants and add an opcode SWITCH which triggers the following run-time behaviour:

    At runtime, SWITCH would check x for being one of the well-known immutable types (strings, unicode, numbers) and use the hash table for finding the right opcode snippet.

Solution 2: Adding a switch statement to Python

    XXX This section currently only sketches the design.

    Syntax:

        switch EXPR:
            case CONSTANT:
                [suite]
            case CONSTANT:
                [suite]
            ...
            else:
                [suite]

    (modulo indentation variations)

    Implementation:

    The compiler would have to generate code similar to this:

        def whatis(x):
            switch(x):
                case 'one':
                    print '1'
                case 'two':
                    print '2'
                case 'three':
                    print '3'
                else:
                    print "D'oh!"
    into (omitting POP_TOP's and SET_LINENO's):

         6  LOAD_FAST         0 (x)
         9  LOAD_CONST        1 (switch-table-1)
        12  SWITCH           26 (to 38)

        14  LOAD_CONST        2 ('1')
        17  PRINT_ITEM
        18  PRINT_NEWLINE
        19  JUMP 43

        22  LOAD_CONST        3 ('2')
        25  PRINT_ITEM
        26  PRINT_NEWLINE
        27  JUMP 43

        30  LOAD_CONST        4 ('3')
        33  PRINT_ITEM
        34  PRINT_NEWLINE
        35  JUMP 43

        38  LOAD_CONST        5 ("D'oh!")
        41  PRINT_ITEM
        42  PRINT_NEWLINE

      >>43  LOAD_CONST        0 (None)
        46  RETURN_VALUE

    Where the 'SWITCH' opcode would jump to 14, 22, 30 or 38 depending
    on 'x'.

    Issues:

    The switch statement should not implement fall-through behaviour
    (as does the switch statement in C). Each case defines a complete
    and independent suite, much like in an if-elif-else statement. This
    also enables using break in switch statements inside loops.

    There have been other proposals for the syntax which reuse
    existing keywords and avoid adding two new ones ("switch" and
    "case"). Others have argued that the keywords should use new terms
    to avoid confusion with the C keywords of the same name but
    slightly different semantics (e.g. fall-through without
    break). Some of the proposed variants:

        case EXPR:
            of CONSTANT:
                [suite]
            of CONSTANT:
                [suite]
            else:
                [suite]

        case EXPR:
            if CONSTANT:
                [suite]
            if CONSTANT:
                [suite]
            else:
                [suite]

        when EXPR:
            in CONSTANT_TUPLE:
                [suite]
            in CONSTANT_TUPLE:
                [suite]
            ...
            else:
                [suite]

    The switch statement could be extended to allow tuples of values
    for one section (e.g. case 'a', 'b', 'c': ...). Another proposed
    extension would allow ranges of values (e.g. case 10..14: ...).
    These should probably be postponed, but already kept in mind when
    designing and implementing a first version.

Scope

    XXX Explain "from __future__ import switch"

Credits

    Martin von Löwis (issues with the optimization)
    Thomas Wouters (switch statement + byte code compiler example)
    Skip Montanaro (dispatching ideas)
    Donald Beaudry (switch syntax)
    Greg Ewing (switch syntax)

Copyright

    This document has been placed in the public domain.
Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/

From arigo@ulb.ac.be Sat Nov 10 19:41:47 2001
From: arigo@ulb.ac.be (Armin Rigo)
Date: Sat, 10 Nov 2001 20:41:47 +0100 (MET)
Subject: [Python-Dev] Psyco 0.3.1
Message-ID:

Hello everybody,

Finally, the first usable version of Psyco. It compiles small but
realistic code examples.

   http://homepages.ulb.ac.be/~arigo/psyco/psyco-0.3.1.tgz

The included text files can also be read at
http://homepages.ulb.ac.be/~arigo/psyco/psyco-0.3.1/README.txt and
http://homepages.ulb.ac.be/~arigo/psyco/psyco-0.3.1/ISSUES.txt .

Not all Python bytecodes are implemented, but you can already do quite a
lot. The missing ones are the ones that cannot be really optimized any
more, so they should be easy to implement as a call to a run-time
function.

I used Python 2.2b1's new typing system to figure out when an attribute is
a method. It should still compile with older Python versions but you will
not get method access optimization.

There are a few cases in which Psyco does not exactly follow Python's
semantics, mainly with globals/builtins. Builtins are assumed never to
change, and we assume that the user will not later add or remove a global
variable that will shadow or expose a builtin. This issue and others are
discussed in detail in ISSUES.txt.

Produced code buffers are never freed. This is a problem for larger
examples. See ISSUES.txt. This file will also give you cool examples of
Python code which are known tricks to speed up the interpreter but which
are no longer needed -- Psyco is able to optimize them correctly. One such
nice example is 'range', which is now actually faster than 'xrange', and
completely transparent (for x in sequence is equivalent to for i in
range(len(sequence)): x = sequence[i]).
Inspecting the produced machine code is now possible if you have GDB (the
GNU debugger) installed. There is a script that presents the produced
cross-calling code buffers as cross-linked HTML pages.

And finally... (drums...) some timing results. This is for my notebook
Intel PIII 700Mhz with all optimization turned on. Note however that none
of these small examples is typical of Python code; they are all small
compact algorithms not using complex data structures.

f1() is designed specifically for the purpose of showing how fast Psyco is
;-) It works on integers only, and is not at all a real-world algorithm.

f2() and f3() are too trivial to be timed.

f4() is a function that loads a list of files and counts the number of
times each character appears. Ok, this is also a case in which Psyco will
clearly win.

f5() is like f4() but completely loads the files in memory, whereas f4()
loads them line by line.

f6() computes the factorial modulo p, using long integers, whose
operations are not optimized by Psyco.

f7() is the Mandelbrot set. Float/complex not optimized by Psyco. In the
latter two examples the gain is only on the stuff around the actual
computations.

The table is in seconds, as returned by time.clock(), so the numbers don't
include delays for loading the (relatively large) code of Psyco itself nor
any swapping.

            Psyco, first call   Psyco, next calls   Python -OO

    go1()   0.03                0.02 to 0.03        3.63
    go4()   0.35                0.33                3.59
    go5()   0.29                0.27                3.47
    go6()   0.87                0.85                1.03 to 1.07
    go7()   2.16 to 2.18        2.16 to 2.18        2.83

Interesting, isn't it ?

A bientot,

Armin.

From ej@ee.duke.edu Sat Nov 10 19:03:02 2001
From: ej@ee.duke.edu (eric)
Date: Sat, 10 Nov 2001 14:03:02 -0500
Subject: [Python-Dev] Re: Proposal for a modified import mechanism.
References: <3BED6E5B.8E142057@arakne.com>
Message-ID: <033301c16a1a$59c1c690$c300a8c0@ericlaptop>

I have to agree with Prabhu on this one.
The current behavior of import, while fine for standard modules and even simple packages with a single level, is sub-optimal for packages that contain sub-packages. The proposed behavior solves the problem. Handling the packaging issues in SciPy was difficult, and even resulted in a (not always popular) decision to build and overwrite the Numeric package on machines that install SciPy. Prabhu's import doesn't resolve all the issues (I think packages may just be difficult...), but it would have solved this one. The proposed import allows us to put our own version of Numeric in the top SciPy directory. Then all SciPy sub-packages would grab this one instead of an existing site-packages/Numeric. That makes SciPy self-contained and allows people to try it out without worrying that it might break their current installation. There are other solutions to this problem, but Prabhu's fix is by far the easiest and most robust. Prabhu's import also has some other nice benefits. Some of the sub-packages in SciPy are useful outside of SciPy. Also sometimes it is easier to develop a packages outside of the SciPy framework. It would be nice to be able to develop a module or package 'foo' outside of SciPy and then move it into SciPy at a later date. However, every SciPy sub-package that referred to foo prior to its inclusion in SciPy now has to be updated from 'import foo' to 'import scipy.foo'. These kind of issues make it very painful and time consuming to rearrange package structures or move modules and sub-packages in and out of the package. Simplifying this will improves package development. > I'm personnally against anything that enlarges the search path uselessly; Hopefully I've explained why it is useful for complex packages. > because the obvious reason of increased name space collision, increased > run-time overhead etc... I'm missing something here because I don't understand why this increases name space collision. 
If the objection is to the fact that SciPy can have a
version of Numeric in it that masks a Numeric installed in site-packages, I
guess I consider this a feature, not a bug. After all, this is already the
behavior for single-level packages; extending it to multi-level packages
seems natural. If this isn't your objection, please explain.

The current runtime overhead isn't so bad. Prabhu sent me a few numbers on
the SciPy import (which contains maybe 10-15 nested packages). I attached
them below -- the overhead is less than 10%. It should be negligible for
standard modules as only packages are really affected (right Prabhu?).

$ python
>>> import time
>>> s = time.time (); import scipy; print time.time()-s
1.37971198559
>>>

$ python
>>> import my_import
>>> import time
>>> s = time.time (); import scipy; print time.time()-s
1.48667407036

There may be technical issues under the covers that make this hairier than
it appears, but, from the standpoint of someone working on a large
multi-level package, it looks like a good idea.

see ya,
eric

----- Original Message -----
From: "Frederic Giacometti"
To:
Cc: ; ;
Sent: Saturday, November 10, 2001 1:13 PM
Subject: Re: Proposal for a modified import mechanism.

>
>
> >
> > I'd like to ask the Python developers if they'd consider
> >
> > (a) changing the way the current import works to do what I
> > proposed, or,
> >
> > (b) add a new keyword like 'rimport' (or something) that does this
> > recursive search through parent packages and loads modules. This
> > was actually suggested by Gordon McMillan. Gordon actually
> > suggested something stronger -- import only supports absolute
> > names, rimport is relative import and rrimport is a recursive
> > relative import. But this would break the current import since
> > import currently supports some relative lookup. So maybe import
> > and rimport is a workable solution?
> > I'd rather introduce a __parent__ module attribute (in addition to the > existing __name__) so that, for instance, the following would do your job: > > from __parent__.__parent__.toto import something > > In its spirit, this is similar to the '..' of the file systems. > > For top-level modules, __parent__ would be None. > > I'm personnally against anything that enlarges the search path uselessly; > because the obvious reason of increased name space collision, increased > run-time overhead etc... > > Frederic Giacometti > > > From frederic.giacometti@arakne.com Sat Nov 10 21:43:54 2001 From: frederic.giacometti@arakne.com (Frederic Giacometti) Date: Sat, 10 Nov 2001 13:43:54 -0800 Subject: [Python-Dev] Re: Proposal for a modified import mechanism. References: <3BED6E5B.8E142057@arakne.com> <033301c16a1a$59c1c690$c300a8c0@ericlaptop> Message-ID: <3BED9F9A.30F77DB3@arakne.com> eric wrote: > I have to agree with Prabhu on this one. The current behavior of import, > while fine for standard modules and even simple packages with a single > level, is sub-optimal for packages that contain sub-packages. The proposed > behavior solves the problem. > > Handling the packaging issues in SciPy was difficult, and even resulted in a > (not always popular) decision to build and overwrite the Numeric package on > machines that install SciPy. Prabhu's import doesn't resolve all the issues > (I think packages may just be difficult...), but it would have solved this > one. The proposed import allows us to put our own version of Numeric in the > top SciPy directory. Then all SciPy sub-packages would grab this one > instead of an existing site-packages/Numeric. But then, this is not an import problem. If you use Numeric, you call Numeric. If you call something other than Numeric, just give a different name, and all the confusion will go away. 
If you're worried that you've already encoded the Numeric name 50 times into 300 files; run a python script over these 300 files; this will do the renaming of the 15.000 occurences of the Numeric name. > That makes SciPy > self-contained and allows people to try it out without worrying that it > might break their current installation. There are other solutions to this > problem, but Prabhu's fix is by far the easiest and most robust. And then, in maintenance/integration phase, sometimes 'Numeric' will call Numeric, some other times it will your package ? What if somebody, for some reason I know nothing of (e.g. probably some integration) wants to call Numeric and your Numeric package in the same module ? Wish them tough luck to sort out this poisoned gift.... > Prabhu's import also has some other nice benefits. Some of the sub-packages > in SciPy are useful outside of SciPy. Also sometimes it is easier to > develop a packages outside of the SciPy framework. It would be nice to be > able to develop a module or package 'foo' outside of SciPy and then move it > into SciPy at a later date. However, every SciPy sub-package that referred > to foo prior to its inclusion in SciPy now has to be updated from 'import > foo' to 'import scipy.foo'. These kind of issues make it very painful and > time consuming to rearrange package structures or move modules and > sub-packages in and out of the package. There are basic python scripts which do this painlessly. If you're really working on a large project, there's a project architect which normally would take care of such things, and for whom this should not be a too much of a problem. > Simplifying this will improves > package development. > > > I'm personnally against anything that enlarges the search path uselessly; > > Hopefully I've explained why it is useful for complex packages. 
Python helps in many areas, but expecting it to palliate for the package design and architecture flaws that inexorably surface anytimes something non-trivial is developped, might be somehow at the edge. Python has not yet replaced the need for relevant software architects. > > > because the obvious reason of increased name space collision, increased > > run-time overhead etc... > > I'm missing something here because I don't understand why this increases > name space collision. If the objection is to the fact that SciPy can have a > version of Numeric in it that masks a Numeric installed in site-packages, I > guess I consider this a feature, not a bug. Actually, it is normally worse than a bug: it is a source of bug tomorrow in your application - of all the bugs you'll have when your programmer will be confusing the two Numeric packages, as well as all the mainteance and integration problems you'll have down the line -. But by then, hopefully for you, you'll be somewhere else... The sad reality of most projects :(( > Afterall, this is already the > behavior for single level packages, extending it to multi-level packages > seems natural. If this isn't your objection, please explain. > > The current runtime overhead isn't so bad. Prabhu sent me a few numbers on > the SciPy import (which contains maybe 10-15 nested packages). I attached > them below -- the overhead is less than 10%. It should be negligible for > standard modules as only packages are really affected (right Prabhu?). And that's how, when you cumulate of the overheads for all new features, you get potenially +100-200% overhead on the new releases. Albeit all the efforts of the Python team, Python 2.0 is up to 70% slower than python 1.5.2; Python 2.1.1 is up to 30% slower than python 2.0, and so on... 
So, +10% on only such a minor features is anything but negligible :((( FG From gmcm@hypernet.com Sat Nov 10 21:50:32 2001 From: gmcm@hypernet.com (Gordon McMillan) Date: Sat, 10 Nov 2001 16:50:32 -0500 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. In-Reply-To: <033301c16a1a$59c1c690$c300a8c0@ericlaptop> Message-ID: <3BED5AD8.24350.78407262@localhost> eric wrote: [Frederic Giacometti] > > because the obvious reason of increased name space collision, > > increased run-time overhead etc... > > I'm missing something here because I don't understand why this > increases name space collision. Currently, os.py in a package masks the real one from anywhere inside the package. This would extend that to anywhere inside any nested subpackage. Whether that's a "neat" or a "dirty" trick is pretty subjective. The wider the namespace you can trample on, the more it tends to be "dirty". > If the objection is to the fact > that SciPy can have a version of Numeric in it that masks a > Numeric installed in site-packages, I guess I consider this a > feature, not a bug. Afterall, this is already the behavior for > single level packages, extending it to multi-level packages seems > natural. If this isn't your objection, please explain. Well, it's a feature that can crash Python. If the package (which the user has, and you have a hijacked, incompatible copy of) contains an extension module, all kinds of nasty things can happen when both are loaded. Submit patches to the package authors, or require a specific version, or write a wrapper that adapts to different versions or fork or do without. This is definitely a dirty trick. > The current runtime overhead isn't so bad. Under anything near normal usage, no - packages structures are nearly always shallow. It wouldn't be much work to construct a case where time spent in import doubled, however. 
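For illustration, the difference in lookup work between the two
strategies can be sketched as follows. This is not knee.py or Prabhu's
my_import code; both function names are hypothetical, and the sketch
only lists the fully qualified names each strategy would try for a bare
"import name" issued from a module inside a given package.

```python
def current_candidates(package, name):
    """Names tried today: the importing package first, then the top level."""
    candidates = []
    if package:
        candidates.append(package + '.' + name)
    candidates.append(name)
    return candidates


def recursive_candidates(package, name):
    """Names tried by a recursive search through every parent package."""
    parts = package.split('.') if package else []
    candidates = []
    while parts:
        candidates.append('.'.join(parts + [name]))
        parts.pop()  # step up to the parent package
    candidates.append(name)  # finally the absolute, top-level name
    return candidates
```

Under these assumptions, importing a top-level module from a package
nested d levels deep costs d failed lookups before the absolute name is
tried, where the current scheme costs one. A deep enough hierarchy that
mostly imports top-level modules is the kind of case meant above.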
When the "try relative, then try absolute" strategy was introduced with packages, it added insignificant overhead. It's not so insignificant now. When (and if) the standard library moves to a package structure, it's possilbe it will be seen as a burden. - Gordon From gmcm@hypernet.com Sat Nov 10 23:06:48 2001 From: gmcm@hypernet.com (Gordon McMillan) Date: Sat, 10 Nov 2001 18:06:48 -0500 Subject: [Python-Dev] Proposal for a modified import mechanism. In-Reply-To: <15340.58785.94412.122093@monster.linux.in> Message-ID: <3BED6CB8.27669.78864436@localhost> Prabhu wrote: [Just singling out one section, since I've said plenty on this subject at other points in these threads] [Prabhu works on knee.py] > it also fixes a bug where the parent package is an extension > module. Python provides no support for an extension module being a package parent module. More precisely, I think the fact that an extension module can be made to behave like a package parent module is an accident. There is special code in import for modules named __init__, and the code is bypassed for extension modules. I suspect you'd have to provide a pretty strong justification before this would become supported behavior. - Gordon From Prabhu Ramachandran Sun Nov 11 04:35:13 2001 From: Prabhu Ramachandran (Prabhu Ramachandran) Date: Sun, 11 Nov 2001 10:05:13 +0530 Subject: [Python-Dev] Proposal for a modified import mechanism. In-Reply-To: <3BED6CB8.27669.78864436@localhost> References: <15340.58785.94412.122093@monster.linux.in> <3BED6CB8.27669.78864436@localhost> Message-ID: <15342.1.830165.25074@monster.linux.in> >>>>> "GMcM" == Gordon McMillan writes: GMcM> [Prabhu works on knee.py] >> it also fixes a bug where the parent package is an extension >> module. GMcM> Python provides no support for an extension module being a GMcM> package parent module. More precisely, I think the fact that GMcM> an extension module can be made to behave like a package GMcM> parent module is an accident. 
There is special code in
    GMcM> import for modules named __init__, and the code is bypassed
    GMcM> for extension modules.

    GMcM> I suspect you'd have to provide a pretty strong
    GMcM> justification before this would become supported behavior.

I guess this was unclear. My addition is extremely simple and does not
do anything new. Here is an illustration:

>>> import knee
>>> import Numeric.array
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.1/knee.py", line 17, in import_hook
    m = load_tail(q, tail)
  File "/usr/local/lib/python2.1/knee.py", line 68, in load_tail
    m = import_module(head, mname, m)
  File "/usr/local/lib/python2.1/knee.py", line 97, in import_module
    parent and parent.__path__)
AttributeError: 'Numeric' module has no attribute '__path__'
>>>

Point is, there is a line in knee.py (line 97) that assumes that there
is a __path__ attribute for the passed parent. However, if parent is
an extension module there is none. So I simply modified it. Here is
the diff.

$ diff knee.py /usr/local/lib/python2.1/knee.py
98,101c98
<     except (ImportError, AttributeError):
<         # extension modules dont have a __path__ attribute.
<         # caching failures.
<         sys.modules[fqname] = None
---
>     except ImportError:

In fact that is all I changed in knee.py! Which is why I said the
changes are very small. Maybe I should have shown a patch but the
mail was already long.

prabhu

From ej@ee.duke.edu Sun Nov 11 04:44:44 2001
From: ej@ee.duke.edu (eric)
Date: Sat, 10 Nov 2001 23:44:44 -0500
Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism.
References: <3BED5AD8.24350.78407262@localhost>
Message-ID: <037a01c16a6b$993421a0$c300a8c0@ericlaptop>

Hey Gordon,

> eric wrote:
>
> [Frederic Giacometti]
>
> > > because the obvious reason of increased name space collision,
> > > increased run-time overhead etc...
> >
> > I'm missing something here because I don't understand why this
> > increases name space collision.
> > Currently, os.py in a package masks the real one from > anywhere inside the package. This would extend that to > anywhere inside any nested subpackage. Whether that's a > "neat" or a "dirty" trick is pretty subjective. The wider the > namespace you can trample on, the more it tends to be "dirty". Yeah, I guess I come down on the "neat" side in this one. If I have a module or package called 'common' at the top level of a deep hierarchy, I'd like all sub-packages to inherit it. That seems intuitive to me and inline with the concept of a 'package'. Perhaps the hijacking of the Numeric example strikes a nerve, but inheriting the 'common' module shouldn't be so contentious. Also, if someone has the gall to hijack os.py at the top of your package directory structure, it seems very likely you want this new behavior everywhere within your package. I have a feeling this discussion has been around the block a few times when packages were first being developed... > > > If the objection is to the fact > > that SciPy can have a version of Numeric in it that masks a > > Numeric installed in site-packages, I guess I consider this a > > feature, not a bug. Afterall, this is already the behavior for > > single level packages, extending it to multi-level packages seems > > natural. If this isn't your objection, please explain. > > Well, it's a feature that can crash Python. If the package > (which the user has, and you have a hijacked, incompatible > copy of) contains an extension module, all kinds of nasty > things can happen when both are loaded. This I need to know about. How does this happen? So you have two extension modules, with the same name, one living in a package and the other living in site-packages. If you import both of these, their namespaces don't conflict in some strange way do they? 
Or are you talking about passing a structure (like a numeric array) generated in one ext module into a routine in the other ext module (expecting a different format of a numeric array) and then getting some strange (seg-fault even) kind of behavior? Anyway, I'd like a few more details for reasons orthogonal to this discussion. > Submit patches to the package authors, or require a specific > version, or write a wrapper that adapts to different versions or > fork or do without. This is definitely a dirty trick. We've done the "require specific version" option here, and "conveniently" upgraded the user's Numeric package for them. The problem is that some people use old versions of Numeric in production code, and don't want to risk an upgrade -- but still want to try out SciPy. I consider our solution a dirtier trick than encapsulating things completely within SciPy. I also don't think any nasty things could happen in this specific situation of having two relatively recent Numerics loaded up, but I could be wrong. > > > The current runtime overhead isn't so bad. > > Under anything near normal usage, no - packages structures > are nearly always shallow. It wouldn't be much work to > construct a case where time spent in import doubled, however. > > When the "try relative, then try absolute" strategy was > introduced with packages, it added insignificant overhead. It's > not so insignificant now. When (and if) the standard library > moves to a package structure, it's possilbe it will be seen as a > burden. I haven't followed the discussion as to whether the standard library will move to packages, but I'll be surpised if it does. I very much like the encapsulation offered by packages, but have found their current incarnation difficult to develop compared to simple modules. The current discussion concerns only one of the issues. As for overhead, I thought I'd get a couple more data points from distutils and xml since they are standard packages. 
The distutils import is pretty much a 0% hit. However, the xml import is
*much* slower -- a factor of 3.5. That's a huge hit and worth complaining
about. I don't know if this can be optimized or not. If not, it may be a
show stopper, even if the philosophical argument was uncontested.

eric

import speed numbers below:

C:\temp>python
ActivePython 2.1, build 210 ActiveState)
based on Python 2.1 (#15, Apr 23 2001, 18:00:35) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> t1 = time.time();import distutils.command.build; t2 = time.time();print t2-t1
0.519999980927
>>>

C:\temp>python
ActivePython 2.1, build 210 ActiveState)
based on Python 2.1 (#15, Apr 23 2001, 18:00:35) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import my_import
>>> import time
>>> t1 = time.time();import distutils.command.build; t2 = time.time();print t2-t1
0.511000037193
>>>

C:\temp>python
ActivePython 2.1, build 210 ActiveState)
based on Python 2.1 (#15, Apr 23 2001, 18:00:35) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import my_import
>>> import time
>>> t1 = time.time();import xml.sax.saxutils; t2 = time.time();print t2-t1
1.35199999809
>>>

C:\temp>python
ActivePython 2.1, build 210 ActiveState)
based on Python 2.1 (#15, Apr 23 2001, 18:00:35) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> t1 = time.time();import xml.sax.saxutils; t2 = time.time();print t2-t1
0.381000041962
>>>

From ej@ee.duke.edu Sun Nov 11 05:40:54 2001
From: ej@ee.duke.edu (eric)
Date: Sun, 11 Nov 2001 00:40:54 -0500
Subject: [Python-Dev] Re: Proposal for a modified import mechanism.
References: <3BED6E5B.8E142057@arakne.com> <033301c16a1a$59c1c690$c300a8c0@ericlaptop> <3BED9F9A.30F77DB3@arakne.com> Message-ID: <038301c16a73$70c5f790$c300a8c0@ericlaptop> Hey Frederic, > But then, this is not an import problem. > If you use Numeric, you call Numeric. If you call something other than Numeric, > just give a different name, and all the confusion will go away. This is certainly an option, but not a good one in my opinion. The main issue is we want to force a specific version of Numeric for SciPy while allowing people to keep their old standard version of Numeric available for their production code. Single level packages provide a handy way of doing this. Multi-level packages (like SciPy) do not. I guess I just don't see why making a package multi-level should inherently make it harder to do things. > If you're worried that you've already encoded the Numeric name 50 times into > 300 files; run a python script over these 300 files; this will do the renaming > of the 15.000 occurences of the Numeric name. Sure, but this is inconvenient, and something I think should be handled by the packaging facility, not by running a renaming script. > > > That makes SciPy > > self-contained and allows people to try it out without worrying that it > > might break their current installation. There are other solutions to this > > problem, but Prabhu's fix is by far the easiest and most robust. > > And then, in maintenance/integration phase, sometimes 'Numeric' will call > Numeric, some other times it will your package ? In the integration phase, users would need to change the "from Numeric import *" to "from scipy import *" (or search and replace Numeric with scipy) in their code as Numeric is completely subsumed into scipy. So, when not using SciPy, their legacy code can continue using an old version of Numeric. When switching to SciPy, the make the replacement. As I said, its mainly a version issue (with a few minor changes). 
> > What if somebody, for some reason I know nothing of (e.g. probably some > integration) wants to call Numeric and your Numeric package in the same module ? > Wish them tough luck to sort out this poisoned gift.... > > > Prabhu's import also has some other nice benefits. Some of the sub-packages > > in SciPy are useful outside of SciPy. Also sometimes it is easier to > > develop a packages outside of the SciPy framework. It would be nice to be > > able to develop a module or package 'foo' outside of SciPy and then move it > > into SciPy at a later date. However, every SciPy sub-package that referred > > to foo prior to its inclusion in SciPy now has to be updated from 'import > > foo' to 'import scipy.foo'. These kind of issues make it very painful and > > time consuming to rearrange package structures or move modules and > > sub-packages in and out of the package. > > There are basic python scripts which do this painlessly. If you're really > working on a large project, there's a project architect which normally would > take care of such things, and for whom this should not be a too much of a > problem. Hmmm. I guess the "project architect" in this case is jointly held by Travis Oliphant and yours truely. Neither of us are packaging guru's, but do have a fair amount of experience with Python. We worked quite a while on (and are still working on) packaging issues. Incidently, I have know idea what Travis O.'s opinion is on this specific topic. > > > > Simplifying this will improves > > package development. > > > > > I'm personnally against anything that enlarges the search path uselessly; > > > > Hopefully I've explained why it is useful for complex packages. > > Python helps in many areas, but expecting it to palliate for the package design > and architecture flaws that inexorably surface anytimes something non-trivial is > developped, might be somehow at the edge. Python has not yet replaced the need > for relevant software architects. 
Them thars fightin' words. ; ) I'm biased, but don't thinking scipy's architecture is flawed. It is simply a *very* large package of integrated sub-packages that also relies heavily on a 3rd evolving group of modules (Numeric). As such, it reveals the difficult issues that arise when trying to build large packages of integrated sub-packages that rely on a 3rd evolving group of modules... > > > > > > because the obvious reason of increased name space collision, increased > > > run-time overhead etc... > > > > I'm missing something here because I don't understand why this increases > > name space collision. If the objection is to the fact that SciPy can have a > > version of Numeric in it that masks a Numeric installed in site-packages, I > > guess I consider this a feature, not a bug. > > Actually, it is normally worse than a bug: it is a source of bug tomorrow in > your application - of all the bugs you'll have when your programmer will be > confusing the two Numeric packages, as well as all the mainteance and > integration problems you'll have down the line -. I disagree and don't think that is true in this (and many other) situations. People who want to use SciPy will migrate completely to it since it includes Numeric. What the sub-package option offers is a way to test SciPy and optionally use it while keeping their standard Numeric around for their production code. > > But by then, hopefully for you, you'll be somewhere else... The sad reality of > most projects :(( > > > Afterall, this is already the > > behavior for single level packages, extending it to multi-level packages > > seems natural. If this isn't your objection, please explain. > > > > The current runtime overhead isn't so bad. Prabhu sent me a few numbers on > > the SciPy import (which contains maybe 10-15 nested packages). I attached > > them below -- the overhead is less than 10%. It should be negligible for > > standard modules as only packages are really affected (right Prabhu?). 
> > And that's how, when you cumulate of the overheads for all new features, you get > potenially +100-200% overhead on the new releases. > Albeit all the efforts of the Python team, Python 2.0 is up to 70% slower than > python 1.5.2; Python 2.1.1 is up to 30% slower than python 2.0, and so on... > So, +10% on only such a minor features is anything but negligible :((( The computational cost of additional functionality is always a question of what portion of a program is impacted. If we were talking about 10% hit on looping structures or dictionary lookups or local variable lookups, then yes it needs extreme scrutiny. Adding 10% to a rare event is not worthy of note. I expect (and see) 0% overhead for importing standard modules (by far the most common case). Adding 10% overhead to importing a very large package with 10-15 nested sub-packages is just not a big deal. The 350% cost I saw (noted in a response to Gordon) is a *huge* deal and would need to be solved (moving to C would help) before this became standard. eric ----- Original Message ----- From: "Frederic Giacometti" To: "eric" Cc: ; ; ; Sent: Saturday, November 10, 2001 4:43 PM Subject: Re: Proposal for a modified import mechanism. > > > eric wrote: > > > I have to agree with Prabhu on this one. The current behavior of import, > > while fine for standard modules and even simple packages with a single > > level, is sub-optimal for packages that contain sub-packages. The proposed > > behavior solves the problem. > > > > Handling the packaging issues in SciPy was difficult, and even resulted in a > > (not always popular) decision to build and overwrite the Numeric package on > > machines that install SciPy. Prabhu's import doesn't resolve all the issues > > (I think packages may just be difficult...), but it would have solved this > > one. The proposed import allows us to put our own version of Numeric in the > > top SciPy directory. 
Then all SciPy sub-packages would grab this one > > instead of an existing site-packages/Numeric. > > But then, this is not an import problem. > If you use Numeric, you call Numeric. If you call something other than Numeric, > just give a different name, and all the confusion will go away. > If you're worried that you've already encoded the Numeric name 50 times into > 300 files; run a python script over these 300 files; this will do the renaming > of the 15.000 occurences of the Numeric name. > > > That makes SciPy > > self-contained and allows people to try it out without worrying that it > > might break their current installation. There are other solutions to this > > problem, but Prabhu's fix is by far the easiest and most robust. > > And then, in maintenance/integration phase, sometimes 'Numeric' will call > Numeric, some other times it will your package ? > > What if somebody, for some reason I know nothing of (e.g. probably some > integration) wants to call Numeric and your Numeric package in the same module ? > Wish them tough luck to sort out this poisoned gift.... > > > Prabhu's import also has some other nice benefits. Some of the sub-packages > > in SciPy are useful outside of SciPy. Also sometimes it is easier to > > develop a packages outside of the SciPy framework. It would be nice to be > > able to develop a module or package 'foo' outside of SciPy and then move it > > into SciPy at a later date. However, every SciPy sub-package that referred > > to foo prior to its inclusion in SciPy now has to be updated from 'import > > foo' to 'import scipy.foo'. These kind of issues make it very painful and > > time consuming to rearrange package structures or move modules and > > sub-packages in and out of the package. > > There are basic python scripts which do this painlessly. If you're really > working on a large project, there's a project architect which normally would > take care of such things, and for whom this should not be a too much of a > problem. 
> > > > Simplifying this will improves > > package development. > > > > > I'm personnally against anything that enlarges the search path uselessly; > > > > Hopefully I've explained why it is useful for complex packages. > > Python helps in many areas, but expecting it to palliate for the package design > and architecture flaws that inexorably surface anytimes something non-trivial is > developped, might be somehow at the edge. Python has not yet replaced the need > for relevant software architects. > > > > > > because the obvious reason of increased name space collision, increased > > > run-time overhead etc... > > > > I'm missing something here because I don't understand why this increases > > name space collision. If the objection is to the fact that SciPy can have a > > version of Numeric in it that masks a Numeric installed in site-packages, I > > guess I consider this a feature, not a bug. > > Actually, it is normally worse than a bug: it is a source of bug tomorrow in > your application - of all the bugs you'll have when your programmer will be > confusing the two Numeric packages, as well as all the mainteance and > integration problems you'll have down the line -. > > But by then, hopefully for you, you'll be somewhere else... The sad reality of > most projects :(( > > > Afterall, this is already the > > behavior for single level packages, extending it to multi-level packages > > seems natural. If this isn't your objection, please explain. > > > > The current runtime overhead isn't so bad. Prabhu sent me a few numbers on > > the SciPy import (which contains maybe 10-15 nested packages). I attached > > them below -- the overhead is less than 10%. It should be negligible for > > standard modules as only packages are really affected (right Prabhu?). > > And that's how, when you cumulate of the overheads for all new features, you get > potenially +100-200% overhead on the new releases. 
> Albeit all the efforts of the Python team, Python 2.0 is up to 70% slower than > python 1.5.2; Python 2.1.1 is up to 30% slower than python 2.0, and so on... > So, +10% on only such a minor features is anything but negligible :((( > > FG From Prabhu Ramachandran Sun Nov 11 07:42:03 2001 From: Prabhu Ramachandran (Prabhu Ramachandran) Date: Sun, 11 Nov 2001 13:12:03 +0530 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. In-Reply-To: <037a01c16a6b$993421a0$c300a8c0@ericlaptop> References: <3BED5AD8.24350.78407262@localhost> <037a01c16a6b$993421a0$c300a8c0@ericlaptop> Message-ID: <15342.11211.377711.175571@monster.linux.in> Hi, >>>>> "ej" == "eric" writes: >> Currently, os.py in a package masks the real one from anywhere >> inside the package. This would extend that to anywhere inside >> any nested subpackage. Whether that's a "neat" or a "dirty" >> trick is pretty subjective. The wider the namespace you can >> trample on, the more it tends to be "dirty". ej> Yeah, I guess I come down on the "neat" side in this one. If ej> I have a module or package called 'common' at the top level of ej> a deep hierarchy, I'd like all sub-packages to inherit it. ej> That seems intuitive to me and inline with the concept of a ej> 'package'. Perhaps the hijacking of the Numeric example ej> strikes a nerve, but inheriting the 'common' module shouldn't ej> be so contentious. Also, if someone has the gall to hijack ej> os.py at the top of your package directory structure, it seems ej> very likely you want this new behavior everywhere within your ej> package. I agree on this. Also each package is kind of isolated. Any module like os.py inside a sub package won't affect _every_ other sub package and will only affect packages that are nested inside this particular package. So there is some kind of safety net and its not like sticking everything inside sys.path. 
:)

Also, right now, what prevents someone from sticking an os.py somewhere
in sys.path and completely ruining standard behaviour?  So it's not as
if this new approach to importing packages makes things dirty; you can
very well do 'bad' things right now.

[snip]

    ej> As for overhead, I thought I'd get a couple more data points
    ej> from distutils and xml since they are standard packages. The
    ej> distutils import is pretty much a 0% hit. However, the xml
    ej> import is *much* slower -- a factor of 3.5. Thats a huge hit
    ej> and worth complaining about. I don't know if this can be
    ej> optimized or not. If not, it may be a show stopper, even if
    ej> the philosophical argument was uncontested.

    >>>> import my_import
    >>>> import time
    >>>> t1 = time.time();import xml.sax.saxutils; t2 = time.time();print t2-t1
    1.35199999809

    >>>> import time
    >>>> t1 = time.time();import xml.sax.saxutils; t2 = time.time();print t2-t1
    0.381000041962

IMHO, this is an unfair/wrong comparison.

  (0) I suspect that you did not first clean things up by doing a
      plain import xml.sax.saxutils a few times and then start
      testing.

  (1) import itself is implemented in C.  my_import is pretty much
      completely in Python.

Here is a fairer comparison (done after a few imports).

>>> import time
>>> s = time.time (); import xml.sax.saxutils; print time.time()-s
0.0434629917145

>>> import my_import
>>> import time
>>> s = time.time (); import xml.sax.saxutils; print time.time()-s
0.0503059625626

That is still not bad at all and nothing close to a 350% slowdown.  But
to see whether the measured slowdown comes from the parent lookup or
not, we really need to compare things against the modified (to cache
failures) knee.py:

>>> import knee
>>> import time
>>> s = time.time (); import xml.sax.saxutils; print time.time()-s
0.0477709770203

>>> import my_import
>>> import time
>>> s = time.time (); import xml.sax.saxutils; print time.time()-s
0.0501489639282

Which is really not very bad since it's just a 5% slowdown.
Here are more tests for scipy:

>>> import time
>>> s = time.time (); import scipy; print time.time()-s
1.36110007763

>>> import knee, time
>>> s = time.time (); import scipy; print time.time()-s
1.48176395893

>>> import my_import, time
>>> s = time.time (); import scipy; print time.time()-s
1.5150359869

This means that doing the parent lookup in this case is really not so
bad, and the biggest slowdown is mostly thanks to knee being implemented
in Python.  And there is no question of a 350% slowdown!! :)

prabhu

From Prabhu Ramachandran Sun Nov 11 08:03:59 2001
From: Prabhu Ramachandran (Prabhu Ramachandran)
Date: Sun, 11 Nov 2001 13:33:59 +0530
Subject: [Python-Dev] Re: Proposal for a modified import mechanism.
In-Reply-To: <033301c16a1a$59c1c690$c300a8c0@ericlaptop>
References: <3BED6E5B.8E142057@arakne.com> <033301c16a1a$59c1c690$c300a8c0@ericlaptop>
Message-ID: <15342.12527.94615.833898@monster.linux.in>

>>>>> "ej" == ej writes:

[snip]

    ej> The current runtime overhead isn't so bad. Prabhu sent me a
    ej> few numbers on the SciPy import (which contains maybe 10-15
    ej> nested packages). I attached them below -- the overhead is
    ej> less than 10%. It should be negligible for standard modules
    ej> as only packages are really affected (right Prabhu?).

It depends on how you do it.  If you have a sub-package that tries to
import a standard module, it will go through all the parent packages
searching for the module, and when it doesn't find one it will check in
sys.path.  There are a few things to note:

  (1) For a module in a package, the first import will naturally be
      the slowest.

  (2) Subsequent imports will be faster since failures are cached and
      the package is already imported.

  (3) If the module in question is not inside a package there will be
      no slowdown whatsoever since there is no parent package at all.
      I've timed this with vtk and it seems to be correct.

>>> import my_import, time # NOTE: I am not inside any package.
>>> s = time.time (); import vtkpython; print time.time()-s
1.06130003929

>>> import time
>>> s = time.time (); import vtkpython; print time.time()-s
1.06413698196

It's slower with the standard import, you may say, but that might just
be my kernel's scheduling affecting things.  I think it's fair to
conclude that there is no slowdown if you are not inside a package and,
based on my earlier timings, that recursive searching thru package
parents is not too expensive either.

There is one issue.  Let's say we have two sub-packages that have
modules of the same name.  Then if we explicitly want the other
sub-package's module to be imported there is currently no way of doing
it.  In such a case maybe adding a __parent__ or using (__ as ni did)
might be a good idea too.

prabhu

From Prabhu Ramachandran Sun Nov 11 08:08:18 2001
From: Prabhu Ramachandran (Prabhu Ramachandran)
Date: Sun, 11 Nov 2001 13:38:18 +0530
Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism.
In-Reply-To: <3BED5AD8.24350.78407262@localhost>
References: <033301c16a1a$59c1c690$c300a8c0@ericlaptop> <3BED5AD8.24350.78407262@localhost>
Message-ID: <15342.12786.547708.67405@monster.linux.in>

>>>>> "GMcM" == Gordon McMillan writes:

[snipped off other issues raised]

    >> The current runtime overhead isn't so bad.

    GMcM> Under anything near normal usage, no - packages structures
    GMcM> are nearly always shallow. It wouldn't be much work to
    GMcM> construct a case where time spent in import doubled,
    GMcM> however.

But that can be said of almost anything.  A nicer question to ask is
-- for most circumstances (99%) is the import mechanism fast enough?

    GMcM> When the "try relative, then try absolute" strategy was
    GMcM> introduced with packages, it added insignificant
    GMcM> overhead. It's not so insignificant now. When (and if) the
    GMcM> standard library moves to a package structure, it's possilbe
    GMcM> it will be seen as a burden.
Yes, which is why maybe adding an 'rimport' keyword (which you
suggested) would be a more conservative option?

prabhu

From mclay@erols.com Sun Nov 11 10:24:51 2001
From: mclay@erols.com (Michael McLay)
Date: Sun, 11 Nov 2001 05:24:51 -0500
Subject: [Python-Dev] Replacing __slots__ with addmembers()
Message-ID:

I just submitted a patch that replaces the __slots__ notation with a new
syntax that is more like the property descriptor.  The old syntax looked
as follows:

>>> class B(object):
        """class B's docstring
        """
        __slots__ = ['a','b','c','d']

The following example will create the equivalent of this __slots__
example.

>>> class B(object):
        """class B's docstring
        """
        a = addmember()
        b = addmember()
        c = addmember()
        d = addmember()

The next example shows the use of the three parameters for addmember.
The doc parameter becomes the docstring for the attribute.  The types
parameter can be a single type or a tuple.  If it is present, the
member_set and member_get functions will call PyObject_IsInstance to
verify that the member is of the defined types.  The default parameter
must be of one of the defined types.  If the member is not populated
prior to being accessed, the default value will be returned as the value
of the member.

>>> class B(object):
        """class B's docstring
        """
        a = addmember(types=int, default=56, doc="a docstring")
        b = addmember(types=int, doc="b's docstring")
        c = addmember(types=(int,float), default=5.0, doc="c docstring")
        d = addmember(types=(str,float), default="ham", doc="d docstring")

>>> b = B()
>>> b.a
56
>>> B.a.__doc__
'a docstring'
>>> b.d
'ham'
>>> b.b
Traceback (most recent call last):
  File "", line 1, in ?
    b.b
TypeError: The value of B.b is of type 'type'.  This is not one of the defined types
>>> b.d = 23.3
>>> b.d = (34,)
Traceback (most recent call last):
  File "", line 1, in ?
    b.d = (34,)
TypeError: The type 'tuple' is not one of the declared types for B.d

The zip file submitted with the patch includes a more detailed
description of the patch.
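For readers without the patch applied, the behaviour described above can
be approximated with an ordinary descriptor.  This is only a sketch under
stated assumptions, not the patch itself: the patch works in C on member
descriptors, whereas `TypedMember` below is a hypothetical pure-Python
stand-in for addmember(), it needs a much newer Python than 2.2 (it
relies on `__set_name__`), and it raises AttributeError rather than
TypeError when an unpopulated member has no default.

```python
class TypedMember:
    """Hypothetical stand-in for addmember(): a type-checked attribute."""

    _MISSING = object()  # sentinel meaning "no default was supplied"

    def __init__(self, types=None, default=_MISSING, doc=None):
        self.types = types          # a single type, a tuple of types, or None
        self.default = default
        self.__doc__ = doc          # becomes the attribute's docstring
        # The default, if given, must itself satisfy the declared types.
        if types is not None and default is not self._MISSING:
            if not isinstance(default, types):
                raise TypeError("default is not one of the declared types")

    def __set_name__(self, owner, name):
        # Called automatically when the class body is executed.
        self.owner_name = owner.__name__
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self             # class access, e.g. B.a.__doc__
        value = obj.__dict__.get(self.name, self.default)
        if value is self._MISSING:
            raise AttributeError(
                "%s.%s has not been set" % (self.owner_name, self.name))
        return value

    def __set__(self, obj, value):
        if self.types is not None and not isinstance(value, self.types):
            raise TypeError(
                "The type %r is not one of the declared types for %s.%s"
                % (type(value).__name__, self.owner_name, self.name))
        obj.__dict__[self.name] = value


class B:
    """class B's docstring"""
    a = TypedMember(types=int, default=56, doc="a docstring")
    b = TypedMember(types=int, doc="b's docstring")
    c = TypedMember(types=(int, float), default=5.0, doc="c docstring")
    d = TypedMember(types=(str, float), default="ham", doc="d docstring")
```

Unlike __slots__, this sketch still stores values in the instance
__dict__, so it shows only the type-checking and default-value
behaviour, not the memory layout.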
I hope the feature freeze won't rule out the patch for 2.2. The whole patch, including all of the test cases is less than 500 lines. My concern is that if __slots__ isn't fixed prior to releasing 2.2 we'll be stuck with the rather limited and ugly syntax Guido cooked up to test the capabilities of member descriptors. The patch also takes steps to isolate the member descriptor code from type_new. The type checking code is brain dead simple and only required about 20 lines of code. It is fully contained in the new member descriptor that was added. The cost to test if a type check is required is a single C compare. I'll be at an out of town meeting until next Saturday and I'm not sure if I'll have Internet access. From skip@pobox.com (Skip Montanaro) Sun Nov 11 11:41:58 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Sun, 11 Nov 2001 12:41:58 +0100 Subject: [Python-Dev] First Draft: PEP "Switching on Multiple Values" In-Reply-To: <3BED6EF0.BA8EBFAF@lemburg.com> References: <3BED6EF0.BA8EBFAF@lemburg.com> Message-ID: Just a few odd comments... mal> The current solution to this problem lies in using a dispatch mal> table to find the case implementing method to execute depending on mal> the value of the switch variable (this can be tuned to have a mal> complexity of O(1) on average, e.g. by using perfect hash mal> tables). This works well for state machines which require complex mal> and lengthy processing in the different case methods. It does should be "does not" mal> perform well for ones which only process one or two instructions mal> per case, e.g. ... mal> The first solution has the benefit of not relying on new keywords either "not requiring new" or "not relying on adding new" mal> to the language, while the second looks cleaner. Both involve some mal> run-time overhead to assure that the switching variable is mal> immutable and hashable. ... 
mal> The new optimization should not change the current Python mal> semantics (by reducing the number of __cmp__ calls and adding mal> __hash__ calls in if-elif-else constructs which are affected mal> by the optimiztation). To assure this, switching can only mal> safely be implemented either if a "from __future__" style mal> flag is used, or the switching variable is one of the builtin mal> immutable types: int, float, string, unicode, etc. and not an instance of a subclass of any of them. (I think that's implied by your text, but should be explicitly stated.) mal> To prevent post-modifications of the jump-table dictionary mal> (which could be used to reach protected code), the jump-table mal> will have to be a read-only type (e.g. a read-only mal> dictionary). There was discussion once upon a time about adding a read-write flag to dictionaries to prevent modification during sensitive times. I gather lists already have such a flag to allow in-place sorting to work. ... mal> It should be possible for the compiler to detect an mal> if-elif-else construct which has the following signature: mal> if x == 'first':... mal> elif x == 'second':... mal> else:... mal> (ie. the left hand side alwys references the same variable, mal> the right hand side some hashable immutable builtin type) I don't think it is required that the right-hand sides all be of the same type. Perhaps it would be worthwhile to mention this and illustrate with an example. (Odd usage, perhaps, but still agrees with the desired constrains I think.) mal> The compiler could then setup a read-only (perfect) hash mal> table, store it in the constants and add an opcode SWITCH mal> which triggers the following run-time behaviour: I think it would be sufficient to use a read-only dictionary. Perfect has tables are fine, but who wants to implement one? 
mal> At runtime, SWITCH would check x for being one of the mal> well-known immutable types (strings, unicode, numbers) and mal> use the hash table for finding the right opcode snippet. And if one of the constraints isn't met, it would simply jump to the original if/elif/else code? ... mal> The compiler would have to generate code similar to this: should be "would have to compile" or "would have to convert" ... mal> Issues: mal> There have been other proposals for the syntax which reuse mal> existing keywords and avoid adding two new ones ("switch" and mal> "case"). Here's a wacky idea (probably wouldn't work syntactically, but hey, you never know): if EXPR: == CONSTANT: suite == CONSTANT: suite == CONSTANT: suite else: suite No new keywords, but I suspect the compiler would have to look too far ahead to realise that it's not compiling a regular if statement. BTW, do you mean for your "suite"s to be optional? Here are some concrete examples using each syntax variant switch EXPR: switch x: case CONSTANT: case "first": [suite] print x case CONSTANT: case "second": [suite] x = x**2 ... print x else: else: [suite] print "whoops!" case EXPR: case x: of CONSTANT: of "first": [suite] print x of CONSTANT: of "second": [suite] print x**2 else: else: [suite] print "whoops!" case EXPR: case state: if CONSTANT: if "first": [suite] state = "second" if CONSTANT: if "second": [suite] state = "third" else: else: [suite] state = "first" when EXPR: when state: in CONSTANT_TUPLE: in ("first", "second"): [suite] print state in CONSTANT_TUPLE: state = next_state(state) [suite] in ("seventh",): ... print "done" else: break # out of loop! [suite] else: print "middle state" state = next_state(state) If you allow the suites to be optional (except in the else?), I think you can avoid the need for tuples without parens in the case clauses. 
For example: when state: in "first": in "second": print state state = next_state(state) in "seventh": print "done" break else: print "middle state" state = next_state(state) Skip From mclay@erols.com Sun Nov 11 15:45:35 2001 From: mclay@erols.com (Michael McLay) Date: Sun, 11 Nov 2001 10:45:35 -0500 Subject: [Python-Dev] Re: Proposal for a modified import mechanism. Message-ID: mal> will have to be a read-only type (e.g. a read-only mal> dictionary). SM>There was discussion once upon a time about adding a read-write flag to SM>dictionaries to prevent modification during sensitive times. I gather SM> lists already have such a flag to allow in-place sorting to work. The member description returned by addmembers() allows types to be specified for an attribute. It would be trivial to add another flag to mark the member as read only. You could also add methods to the PyExtendedMemberDescrObject to allow the state to be turned on at the begining of the switch and off at the end of the switch. From mal@lemburg.com Sun Nov 11 18:32:19 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 11 Nov 2001 19:32:19 +0100 Subject: [Python-Dev] First Draft: PEP "Switching on Multiple Values" References: <3BED6EF0.BA8EBFAF@lemburg.com> Message-ID: <3BEEC433.B6FCE7EA@lemburg.com> Skip Montanaro wrote: > > Just a few odd comments... Thanks; I've added them to the next PEP revision. >... > > mal> To prevent post-modifications of the jump-table dictionary > mal> (which could be used to reach protected code), the jump-table > mal> will have to be a read-only type (e.g. a read-only > mal> dictionary). > There was discussion once upon a time about adding a read-write flag to > dictionaries to prevent modification during sensitive times. I gather lists > already have such a flag to allow in-place sorting to work. With the new subtyping mechanism I believe this should be easy enough to implement as subtype. Could be useful in other cases too. > ... 
> > mal> The compiler could then setup a read-only (perfect) hash > mal> table, store it in the constants and add an opcode SWITCH > mal> which triggers the following run-time behaviour: > I think it would be sufficient to use a read-only dictionary. Perfect has > tables are fine, but who wants to implement one? You're probably right; the compiler could check for perfectness though (e.g. the read-only dict could have a method or attribute for testing this). If its not a perfect hash, there are several possibilities of arranging for a perfect hash, e.g. double hashing, slightly increasing the table size etc. I don't think its really necessary to be picky about the perfectness unless non-perfect hash tables are common for the current combination of Python hash functions and dictionary implementation. > mal> At runtime, SWITCH would check x for being one of the > mal> well-known immutable types (strings, unicode, numbers) and > mal> use the hash table for finding the right opcode snippet. > And if one of the constraints isn't met, it would simply jump to the > original if/elif/else code? Right. > ... > > mal> Issues: > > mal> There have been other proposals for the syntax which reuse > mal> existing keywords and avoid adding two new ones ("switch" and > mal> "case"). > > Here's a wacky idea (probably wouldn't work syntactically, but hey, you > never know): > > if EXPR: > == CONSTANT: > suite > == CONSTANT: > suite > == CONSTANT: > suite > else: > suite > > No new keywords, but I suspect the compiler would have to look too far ahead > to realise that it's not compiling a regular if statement. Naa... too weird looking :-) > BTW, do you mean for your "suite"s to be optional? No. We decided against fall-through ideas... I'll change them to SUITE. > Here are some > concrete examples using each syntax variant > ... Thanks; I've added them to the PEP. 
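The run-time behaviour discussed above (look the value up in a
read-only jump table when it is one of the well-known immutable builtin
types, otherwise fall back to the original if/elif/else chain) can be
sketched in plain Python.  This is an illustration only, not the
proposed SWITCH opcode: the handler functions and the names `JUMP_TABLE`
and `FAST_TYPES` are made up for the example, and
`types.MappingProxyType` (from a much later Python) stands in for the
read-only dictionary.

```python
from types import MappingProxyType


def on_first(x):
    return "handled first"


def on_second(x):
    return "handled second"


def on_default(x):
    return "handled default"


# Read-only view: the table cannot be modified after it has been built.
JUMP_TABLE = MappingProxyType({"first": on_first, "second": on_second})

# Exact builtin types only; instances of subclasses take the slow path,
# because they may override __eq__ or __hash__.
FAST_TYPES = (int, float, str)


def switch(x):
    if type(x) in FAST_TYPES:
        # Fast path: a single hash lookup replaces the comparison chain.
        return JUMP_TABLE.get(x, on_default)(x)
    # Slow path: the original if/elif/else code, semantics unchanged.
    if x == "first":
        return on_first(x)
    elif x == "second":
        return on_second(x)
    else:
        return on_default(x)
```

Both paths must give the same answer for every input; the fast path is
only an optimization for values whose comparison and hashing behaviour
is known in advance.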
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From tim.one@home.com Sun Nov 11 19:44:11 2001 From: tim.one@home.com (Tim Peters) Date: Sun, 11 Nov 2001 14:44:11 -0500 Subject: [Python-Dev] Replacing __slots__ with addmembers() In-Reply-To: Message-ID: [Michael McLay] > ... > I hope the feature freeze won't rule out the patch for 2.2. That's up to the Release Manager (Barry this time), but I expect the introduction of a type-check gimmick alone leaves it no chance (that's a new direction for Python, so is PEP material unless Guido loves it intensely at once). > My concern is that if __slots__ isn't fixed prior to releasing 2.2 we'll > be stuck with the rather limited and ugly syntax Guido cooked up to > test the capabilities of member descriptors. I expect we will. Ditto __metatype__, long-winded super(), function-based property "declarations", and all the other new stuff. We're aiming for progress with the new features, not perfection . > ... > I'll be at an out of town meeting until next Saturday and I'm not sure > if I'll have Internet access. The last 2.2 beta should be released before your return. From jason@tishler.net Mon Nov 12 03:21:30 2001 From: jason@tishler.net (Jason Tishler) Date: Sun, 11 Nov 2001 22:21:30 -0500 Subject: [Python-Dev] PyObject_GenericGetAttr vs cygwin In-Reply-To: <2md72s4504.fsf@starship.python.net> Message-ID: <20011111222129.A1876@dothill.com> Michael, On Fri, Nov 09, 2001 at 05:23:39AM -0500, Michael Hudson wrote: > Jason Tishler writes: > > If they are just the standard "PyObject_HEAD_INIT(NULL)" style fix, then > > please just commit them. > > Done. Thanks. 
> > > BTW, _cursesmodule.c doesn't compile; you get things like: > > > > > > Warning: resolving _stdscr by linking to __imp__stdscr (auto-import) > > > [snip] > > > build/temp.cygwin-1.3.3-i686-2.2/_cursesmodule.o: In function `PyCurses_InitScr': > > > /cygdrive/c/src/python/dist/src/Modules/_cursesmodule.c:1842: undefined reference to `acs_map' > > > [snip] I'm investigating a patch from Norman Vine right now. I will try to submit it or a variation of it to the patch collector ASAP -- hopefully, in time for beta 2. > > There is still one known Cygwin pthreads hang. If interested, see the > > following for the current state of affairs: > > > > http://sources.redhat.com/ml/cygwin-developers/2001-10/msg00193.html > > makes little sense to me, I'm afraid. Haven't had test_thread die on > me yet, but then I've only run it a few times. Cygwin Python with threads seems to work for others too (without hangs). Unfortunately (or fortunately depending on your perspective), it hangs fairly often on my main Windows machine. > > > test_strftime is still bust, though: > > > > The test_strftime problem was fixed by: > > > > http://sources.redhat.com/ml/newlib/2001/msg00504.html > > > > and released in Cygwin 1.3.4. > > ... but if test_poll works, how come this doesn't? The only explanation that I can come up with is that you are using a snapshot that has the poll fix but not the strftime one. If you are still using the stock 1.3.3, then I'm at a loss to explain your observations. > How do I find out which version of cygwin I have? Just like on Unix: $ uname -a CYGWIN_NT-5.0 ALTHEA 1.3.4(0.46/3/2) 2001-10-26 21:17 i686 unknown Thanks, Jason From barry@zope.com Mon Nov 12 05:54:14 2001 From: barry@zope.com (Barry A. 
Warsaw) Date: Mon, 12 Nov 2001 00:54:14 -0500 Subject: [Python-Dev] Replacing __slots__ with addmembers() References: Message-ID: <15343.25606.211967.846455@anthem.wooz.org> >>>>> "MM" == Michael McLay writes: MM> I hope the feature freeze won't rule out the patch for 2.2. Without BDFL override, yes, it does. >>>>> "TP" == Tim Peters writes: TP> I expect we will. Ditto __metatype__, long-winded super(), TP> function-based property "declarations", and all the other new TP> stuff. We're aiming for progress with the new features, not TP> perfection . I believe Guido knows that it will be impossible to get all this stuff right the first time, and even the 2.2 beta cycle won't shake out all the problems. I think his intention was to get the basic functionality in place for Python 2.2, and to clean up and improve the syntax and semantics in future releases. -Barry From thomas.heller@ion-tof.com Mon Nov 12 07:58:24 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Mon, 12 Nov 2001 08:58:24 +0100 Subject: [Python-Dev] __class_init__ References: Message-ID: <00ff01c16b4f$e06ca710$e000a8c0@thomasnotebook> From: "Tim Peters" > [Thomas Heller] > > I'll retract this request. > > Why? I haven't had time to look into it yet, but it's definitely on my > agenda. Why don't you want it anymore? Two reasons: - It seems I can also achieve what I want with attribute descriptors - I have no idea how it would be implemented. OTOH, it would definitely be nice to have... Thomas From mal@lemburg.com Mon Nov 12 08:59:04 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 12 Nov 2001 09:59:04 +0100 Subject: [Python-Dev] Replacing __slots__ with addmembers() References: <15343.25606.211967.846455@anthem.wooz.org> Message-ID: <3BEF8F58.CF4E6166@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "MM" == Michael McLay writes: > > MM> I hope the feature freeze won't rule out the patch for 2.2. > > Without BDFL override, yes, it does. 
> > >>>>> "TP" == Tim Peters writes: > > TP> I expect we will. Ditto __metatype__, long-winded super(), > TP> function-based property "declarations", and all the other new > TP> stuff. We're aiming for progress with the new features, not > TP> perfection . > > I believe Guido knows that it will be impossible to get all this stuff > right the first time, and even the 2.2 beta cycle won't shake out all > the problems. I think his intention was to get the basic > functionality in place for Python 2.2, and to clean up and improve the > syntax and semantics in future releases. I'd suggest that Guido marks those features he considers stable as such and clearly states which other features should still be condsidered experimental and not for production use. I intend to make some of the mx-datatypes subclassable but would want to have to support n different ways of implementing the details (I'll already have to support two different ways: classic and new style... wouldn't want to do classic, new style version 2.2, new style version 2.3, etc.) Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From skip@pobox.com (Skip Montanaro) Mon Nov 12 12:48:06 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 12 Nov 2001 13:48:06 +0100 Subject: [Python-Dev] order of unittest.TestCase execution? Message-ID: <15343.50438.442326.206862@beluga.mojam.com> I'm trying to write a regression test for dumbdbm (well, officially I'm updating the current test, but it's taking a beating). I'm unclear on the order of method execution. Can I rely on multiple test methods to be run in alphabetical order? That's the case currently. If possible, I'd like to guarantee that the db creation test is run before the others. 
I couldn't find the code in test_support.py or unittest.py where a test case instance with multiple methods is run. Thx, Skip From Andreas Jung" Message-ID: <030d01c16b86$35758b10$02010a0a@suxlap> Have you been trying to put the database creation into the setUp() method that is executed before every testcase ? Andreas ----- Original Message ----- From: "Skip Montanaro" To: Sent: Monday, November 12, 2001 07:48 Subject: [Python-Dev] order of unittest.TestCase execution? > I'm trying to write a regression test for dumbdbm (well, officially I'm > updating the current test, but it's taking a beating). I'm unclear on the > order of method execution. Can I rely on multiple test methods to be run in > alphabetical order? That's the case currently. If possible, I'd like to > guarantee that the db creation test is run before the others. I couldn't > find the code in test_support.py or unittest.py where a test case instance > with multiple methods is run. > > Thx, > > Skip > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > From jim@interet.com Mon Nov 12 14:30:02 2001 From: jim@interet.com (James C. Ahlstrom) Date: Mon, 12 Nov 2001 09:30:02 -0500 Subject: [Python-Dev] Caching directory files in import.c References: <3BE30079.D6A8FB52@interet.com> Message-ID: <3BEFDCEA.720FF3C1@interet.com> "James C. Ahlstrom" wrote: > Looking at the code, I saw that I could do an os.listdir(path), > and record the directory file names into the same dictionary. > Then it would not be necessary to perform a large number of > fopen()'s. The same dictionary lookup is used instead. > > Is this a good idea??? I now have benchmarks based on 2.2a3 which compare the speed of importing 100 modules from Python ./Lib for the original 2.2a3 versus the new logic that uses os.listdir to maintain a Python dictionary of directory contents. Note that this is not related to importing from zip files. 
The bottom line is that imports are 1.3 times faster for the local drive, and 1.8 to 3.0 times faster for the network drive. Benchmarks can be confusing. Importing from the local C: takes about 3 seconds after a re-boot, but repeated imports lower this to 1 second. This must be a measure of Windows 2000's ability to cache file system data. Moving the "correct" directory, the one where the files really reside, from the beginning to the end of sys.path increases this only slightly for the local drive. I believe the times after re-boot, when the file cache is empty, are more representative of real Python imports. When importing from a network drive, things are different. Times are quite consistent, and don't show the scatter after reboot. They are also much longer, indicating that Windows 2000 with Samba is relatively ineffective in caching network file data. The new logic using os.listdir shows little change from local drive to network drive, and doesn't depend on the correct placement of the source path in sys.path. Here is the data:

                                   Original                Using os.listdir
                             ---------------------      ---------------------
Local drive, Start of path   3.2, 2.5, 3.2 -> 1.02      2.3, 2.5, 2.3 -> 0.87
Local drive, End of path     2.8, 3.9, 3.0 -> 1.32      Same as above.
Net drive, Start of path     5.7, 5.7, 5.7 -> 5.7       2.1, 2.1, 2.1 -> 1.8
Net drive, End of path       9.4, 9.4, 9.3 -> 9.35      2.1, 2.1, 2.1 -> 1.8

Benchmarks were performed on a Pentium 4 clone, 1.4 GHz, 256 MB. The machine was running Windows 2000. Times are in seconds, and are the time to import about 100 modules from Lib. "Local drive" means C:, "Net drive" means network using a Linux/Samba server. "Start of path" means sys.path had its default value. "End of path" means the correct Lib directory was moved to the end of sys.path. Initial times are after a re-boot of the system; the time after "->" is the time after repeated runs. Times to import from C: after a re-boot are rather highly variable, but are more realistic.
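
For readers who want to see the shape of the idea, here is a minimal sketch in present-day Python, not the actual patch (which lives in import.c and is not shown in this message). It keeps one os.listdir() result per directory in a dictionary and answers "is this file there?" from that dictionary instead of trying to open the file. The names DirectoryCache and find_module_file are hypothetical, and only plain "name.py" files are considered; a real importer would also have to handle packages, extension modules and compiled files.

```python
import os


class DirectoryCache:
    """Remember the file names found in each directory on the search path."""

    def __init__(self):
        # Maps a directory path to the frozenset of names os.listdir() gave.
        self._listings = {}

    def contains(self, directory, filename):
        """Return True if filename was present when directory was first listed."""
        names = self._listings.get(directory)
        if names is None:
            try:
                names = frozenset(os.listdir(directory))
            except OSError:
                # Missing or unreadable sys.path entry: treat it as empty.
                names = frozenset()
            self._listings[directory] = names
        return filename in names


def find_module_file(name, search_path, cache):
    """Return the first directory/name.py found via the cache, or None."""
    filename = name + ".py"
    for directory in search_path:
        if cache.contains(directory, filename):
            return os.path.join(directory, filename)
    return None
```

Each directory is listed at most once, so a module that lives at the end of the path costs dictionary lookups rather than failed opens, which would be consistent with the "End of path" rows above. The sketch does not address two things a real implementation would need to decide: when to refresh a stale listing, and how to compare names on case-insensitive file systems.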
JimA From skip@pobox.com (Skip Montanaro) Mon Nov 12 14:49:47 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 12 Nov 2001 15:49:47 +0100 Subject: [Python-Dev] order of unittest.TestCase execution? In-Reply-To: <030d01c16b86$35758b10$02010a0a@suxlap> References: <15343.50438.442326.206862@beluga.mojam.com> <030d01c16b86$35758b10$02010a0a@suxlap> Message-ID: <15343.57739.334927.256373@beluga.mojam.com> Andreas> Have you been trying to put the database creation into the Andreas> setUp() method that is executed before every testcase ? No, I was just trying to mimic the structure of the test_xmlrpc.py module. It put all the cases into one class, so I did to. I realized I wanted to run the creation test before the others and noticed that it was executing them in alphabetical order. I was just wondering if I could rely on that or if that was just a quirk of either the current test_support or unittest implementations and couldn't be relied on. I can probably fiddle things to work without the implicit ordering, but it's kind of hard to see how the other tests could succeed if the db creation test fails, so we might as well run it first. ;-) Skip From jim@interet.com Mon Nov 12 14:52:57 2001 From: jim@interet.com (James C. Ahlstrom) Date: Mon, 12 Nov 2001 09:52:57 -0500 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. References: <3BED5AD8.24350.78407262@localhost> Message-ID: <3BEFE249.371EA5FD@interet.com> Gordon McMillan wrote: > Currently, os.py in a package masks the real one from > anywhere inside the package. This would extend that to What??? When Python starts, it imports site.py which imports os.py. So os.py gets loaded, and won't normally get re-loaded. The os.py that gets loaded depends on sys.path. So if os.py is in package1, it won't get loaded for "import os", but it would get loaded for "import package1.os". Are you saying that "import package1.package2.os" will load package1/os.py? 
I hope that "import os" will not load package1/os.py, will it? Or am I totally confused? Jim Ahlstrom From mal@lemburg.com Mon Nov 12 09:19:12 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 12 Nov 2001 10:19:12 +0100 Subject: [Python-Dev] PEP 275: "Switching on Multiple Values", Rev 1.1 Message-ID: <3BEF9410.789D804@lemburg.com> Here's an updated version of the PEP. You will also find an HTML copy at: http://python.sourceforge.net/peps/pep-0275.html --

PEP: 0275
Title: Switching on Multiple Values
Version: $Revision: 1.1 $
Author: mal@lemburg.com (Marc-Andre Lemburg)
Status: Draft
Type: Standards Track
Python-Version: 2.3
Created: 10-Nov-2001
Post-History:

Abstract

    This PEP proposes strategies to enhance Python's performance with respect to handling switching on a single variable having one of multiple possible values.

Problem

    Up to Python 2.2, the typical way of writing multi-value switches has been to use long switch constructs of the following type:

        if x == 'first state':
            ...
        elif x == 'second state':
            ...
        elif x == 'third state':
            ...
        elif x == 'fourth state':
            ...
        else:
            # default handling
            ...

    This works fine for short switch constructs, since the overhead of repeated loading of a local (the variable x in this case) and comparing it to some constant is low (it has a complexity of O(n) on average). However, when using such a construct to write a state machine such as is needed for writing parsers the number of possible states can easily reach 10 or more cases.

    The current solution to this problem lies in using a dispatch table to find the case implementing method to execute depending on the value of the switch variable (this can be tuned to have a complexity of O(1) on average, e.g. by using perfect hash tables). This works well for state machines which require complex and lengthy processing in the different case methods. It does not perform well for ones which only process one or two instructions per case, e.g.
        def handle_data(self, data):
            self.stack.append(data)

    A nice example of this is the state machine implemented in pickle.py which is used to serialize Python objects. Other prominent cases include XML SAX parsers and Internet protocol handlers.

Proposed Solutions

    This PEP proposes two different but not necessarily conflicting solutions:

    1. Adding an optimization to the Python compiler and VM which detects the above if-elif-else construct and generates special opcodes for it which use a read-only dictionary for storing jump offsets.

    2. Adding new syntax to Python which mimics the C style switch statement.

    The first solution has the benefit of not relying on adding new keywords to the language, while the second looks cleaner. Both involve some run-time overhead to ensure that the switching variable is immutable and hashable.

Solution 1: Optimizing if-elif-else

    XXX This section currently only sketches the design.

    Issues:

    The new optimization should not change the current Python semantics (by reducing the number of __cmp__ calls and adding __hash__ calls in if-elif-else constructs which are affected by the optimization). To ensure this, switching can only safely be implemented either if a "from __future__" style flag is used, or the switching variable is one of the builtin immutable types: int, float, string, unicode, etc. (not subtypes, since it's not clear whether these are still immutable or not)

    To prevent post-modifications of the jump-table dictionary (which could be used to reach protected code), the jump-table will have to be a read-only type (e.g. a read-only dictionary).

    The optimization should only be used for if-elif-else constructs which have a minimum number of n cases (where n is a number which has yet to be defined depending on performance tests).

    Implementation:

    It should be possible for the compiler to detect an if-elif-else construct which has the following signature:

        if x == 'first':...
        elif x == 'second':...
        else:...

    i.e.
    the left hand side always references the same variable, the right hand side a hashable immutable builtin type. The right hand sides need not be all of the same type, but they should be comparable to the type of the left hand switch variable.

    The compiler could then set up a read-only (perfect) hash table, store it in the constants and add an opcode SWITCH in front of the standard if-elif-else byte code stream which triggers the following run-time behaviour:

    At runtime, SWITCH would check x for being one of the well-known immutable types (strings, unicode, numbers) and use the hash table for finding the right opcode snippet. If this condition is not met, the interpreter should revert to the standard if-elif-else processing by simply skipping the SWITCH opcode and proceeding with the usual if-elif-else byte code stream.

Solution 2: Adding a switch statement to Python

    XXX This section currently only sketches the design.

    Syntax:

        switch EXPR:
            case CONSTANT:
                SUITE
            case CONSTANT:
                SUITE
            ...
            else:
                SUITE

    (modulo indentation variations)

    The "else" part is optional. If no else part is given and none of the defined cases matches, a ValueError is raised.

    Implementation:

    The compiler would have to compile a function such as this:

        def whatis(x):
            switch(x):
                case 'one':
                    print '1'
                case 'two':
                    print '2'
                case 'three':
                    print '3'
                else:
                    print "D'oh!"

    into byte code similar to this (omitting POP_TOP's and SET_LINENO's):

           6  LOAD_FAST         0 (x)
           9  LOAD_CONST        1 (switch-table-1)
          12  SWITCH            26 (to 38)

          14  LOAD_CONST        2 ('1')
          17  PRINT_ITEM
          18  PRINT_NEWLINE
          19  JUMP 43

          22  LOAD_CONST        3 ('2')
          25  PRINT_ITEM
          26  PRINT_NEWLINE
          27  JUMP 43

          30  LOAD_CONST        4 ('3')
          33  PRINT_ITEM
          34  PRINT_NEWLINE
          35  JUMP 43

          38  LOAD_CONST        5 ("D'oh!")
          41  PRINT_ITEM
          42  PRINT_NEWLINE

        >>43  LOAD_CONST        0 (None)
          46  RETURN_VALUE

    Where the 'SWITCH' opcode would jump to 14, 22, 30 or 38 depending on 'x'.

    Issues:

    The switch statement should not implement fall-through behaviour (as does the switch statement in C).
    Each case defines a complete and independent suite, much as in an if-elif-else statement. This also enables using break in switch statements inside loops.

    If the interpreter finds that the switch variable x is not hashable, it should raise a TypeError at run-time pointing out the problem.

    There have been other proposals for the syntax which reuse existing keywords and avoid adding two new ones ("switch" and "case"). Others have argued that the keywords should use new terms to avoid confusion with the C keywords of the same name but slightly different semantics (e.g. fall-through without break). Some of the proposed variants:

        case EXPR:
            of CONSTANT:
                SUITE
            of CONSTANT:
                SUITE
            else:
                SUITE

        case EXPR:
            if CONSTANT:
                SUITE
            if CONSTANT:
                SUITE
            else:
                SUITE

        when EXPR:
            in CONSTANT_TUPLE:
                SUITE
            in CONSTANT_TUPLE:
                SUITE
            ...
            else:
                SUITE

    The switch statement could be extended to allow tuples of values for one section (e.g. case 'a', 'b', 'c': ...). Another proposed extension would allow ranges of values (e.g. case 10..14: ...). These should probably be postponed, but already kept in mind when designing and implementing a first version.

    Examples:

        switch EXPR:                   switch x:
            case CONSTANT:                 case "first":
                SUITE                          print x
            case CONSTANT:                 case "second":
                SUITE                          x = x**2
            ...                                print x
            else:                          else:
                SUITE                          print "whoops!"

        case EXPR:                     case x:
            of CONSTANT:                   of "first":
                SUITE                          print x
            of CONSTANT:                   of "second":
                SUITE                          print x**2
            else:                          else:
                SUITE                          print "whoops!"

        case EXPR:                     case state:
            if CONSTANT:                   if "first":
                SUITE                          state = "second"
            if CONSTANT:                   if "second":
                SUITE                          state = "third"
            else:                          else:
                SUITE                          state = "first"

        when EXPR:                     when state:
            in CONSTANT_TUPLE:             in ("first", "second"):
                SUITE                          print state
            in CONSTANT_TUPLE:                 state = next_state(state)
                SUITE                      in ("seventh",):
            ...                                print "done"
            else:                              break # out of loop!
                SUITE                      else:
                                               print "middle state"
                                               state = next_state(state)

Scope

    XXX Explain "from __future__ import switch"

Credits

    Martin von Löwis (issues with the optimization idea)
    Thomas Wouters (switch statement + byte code compiler example)
    Skip Montanaro (dispatching ideas, examples)
    Donald Beaudry (switch syntax)
    Greg Ewing (switch syntax)

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:

-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mwh@python.net Mon Nov 12 14:58:31 2001 From: mwh@python.net (Michael Hudson) Date: 12 Nov 2001 09:58:31 -0500 Subject: [Python-Dev] order of unittest.TestCase execution? In-Reply-To: Skip Montanaro's message of "Mon, 12 Nov 2001 15:49:47 +0100" References: <15343.50438.442326.206862@beluga.mojam.com> <030d01c16b86$35758b10$02010a0a@suxlap> <15343.57739.334927.256373@beluga.mojam.com> Message-ID: <2mn11svxwo.fsf@starship.python.net> Skip Montanaro writes: > Andreas> Have you been trying to put the database creation into the > Andreas> setUp() method that is executed before every testcase ? > > No, I was just trying to mimic the structure of the test_xmlrpc.py module. > It put all the cases into one class, so I did to. I realized I wanted to > run the creation test before the others and noticed that it was executing > them in alphabetical order. I was just wondering if I could rely on that or > if that was just a quirk of either the current test_support or unittest > implementations and couldn't be relied on.
Well, there's this:

    def getTestCaseNames(self, testCaseClass):
        """Return a sorted sequence of method names found within testCaseClass
        """
        testFnNames = filter(lambda n,p=self.testMethodPrefix: n[:len(p)] == p,
                             dir(testCaseClass))
        for baseclass in testCaseClass.__bases__:
            for testFnName in self.getTestCaseNames(baseclass):
                if testFnName not in testFnNames:  # handle overridden methods
                    testFnNames.append(testFnName)
        if self.sortTestMethodsUsing:
            testFnNames.sort(self.sortTestMethodsUsing)
        return testFnNames

in unittest.py & self.sortTestMethodsUsing seems to be cmp() by default. Don't know if this is documented or subject to change, though. It seems you could define a subclass of TestLoader, override getTestCaseNames and pass an instance as the testLoader arg to unittest.main() if you want to be sure. Cheers, M. -- ARTHUR: Why should he want to know where his towel is? FORD: Everybody should know where his towel is. ARTHUR: I think your head's come undone. -- The Hitch-Hikers Guide to the Galaxy, Episode 7 From jim@interet.com Mon Nov 12 15:02:59 2001 From: jim@interet.com (James C. Ahlstrom) Date: Mon, 12 Nov 2001 10:02:59 -0500 Subject: [Python-Dev] Re: Proposal for a modified import mechanism. References: <3BED6E5B.8E142057@arakne.com> <033301c16a1a$59c1c690$c300a8c0@ericlaptop> <3BED9F9A.30F77DB3@arakne.com> <038301c16a73$70c5f790$c300a8c0@ericlaptop> Message-ID: <3BEFE4A3.63DF1A21@interet.com> eric wrote: > (and see) 0% overhead for importing standard modules (by far the most > common case). Adding 10% overhead to importing a very large package > with 10-15 nested sub-packages is just not a big deal. The 350% cost I > saw (noted in a response to Gordon) is a *huge* deal and would need to be > solved (moving to C would help) before this became standard. I have code which caches directory contents, and related benchmarks. This might help, and could be combined with a new Python module for importing packages, say, as a new method "importer" in __init__.py.
Please see python-dev, "Caching directory files in import.c". JimA From barry@zope.com Mon Nov 12 15:04:54 2001 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 12 Nov 2001 10:04:54 -0500 Subject: [Python-Dev] order of unittest.TestCase execution? References: <15343.50438.442326.206862@beluga.mojam.com> <030d01c16b86$35758b10$02010a0a@suxlap> <15343.57739.334927.256373@beluga.mojam.com> Message-ID: <15343.58646.703692.787069@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: SM> No, I was just trying to mimic the structure of the SM> test_xmlrpc.py module. It put all the cases into one class, SM> so I did to. I realized I wanted to run the creation test SM> before the others and noticed that it was executing them in SM> alphabetical order. I was just wondering if I could rely on SM> that or if that was just a quirk of either the current SM> test_support or unittest implementations and couldn't be SM> relied on. I don't think you should count on execution order. I don't think it's guaranteed and it will make your tests much more fragile. -Barry From nas@python.ca Mon Nov 12 15:23:05 2001 From: nas@python.ca (Neil Schemenauer) Date: Mon, 12 Nov 2001 07:23:05 -0800 Subject: [Python-Dev] optimizing simple function calls [was PEP 275 (switching)] In-Reply-To: <3BEF9410.789D804@lemburg.com>; from mal@lemburg.com on Mon, Nov 12, 2001 at 10:19:12AM +0100 References: <3BEF9410.789D804@lemburg.com> Message-ID: <20011112072305.A24558@glacier.arctrix.com> M.-A. Lemburg wrote: > The current solution to this problem lies in using a dispatch > table to find the case implementing method to execute depending on > the value of the switch variable (this can be tuned to have a > complexity of O(1) on average, e.g. by using perfect hash > tables). This works well for state machines which require complex > and lengthy processing in the different case methods. It does not > perform well for ones which only process one or two instructions > per case, e.g. 
> > def handle_data(self, data): > self.stack.append(data) It would be nice if we could make simple methods like this faster. Maybe something like a special fast path for methods with no blocks and no variable or keyword arguments. I think optimizing small functions would be a greater overall benefit to the average Python program. Neil From Prabhu Ramachandran Mon Nov 12 15:53:20 2001 From: Prabhu Ramachandran (Prabhu Ramachandran) Date: Mon, 12 Nov 2001 21:23:20 +0530 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. In-Reply-To: <3BEFE249.371EA5FD@interet.com> References: <3BED5AD8.24350.78407262@localhost> <3BEFE249.371EA5FD@interet.com> Message-ID: <15343.61552.414932.44100@monster.linux.in> >>>>> "JCA" == James C Ahlstrom writes: JCA> When Python starts, it imports site.py which imports os.py. JCA> So os.py gets loaded, and won't normally get re-loaded. The JCA> os.py that gets loaded depends on sys.path. JCA> So if os.py is in package1, it won't get loaded for "import JCA> os", but it would get loaded for "import package1.os". Are JCA> you saying that "import package1.package2.os" will load JCA> package1/os.py? I hope that "import os" will not load JCA> package1/os.py, will it? Or am I totally confused. Ummm doing an 'import os' will import the package1/os.py and *not* the standard one. This will happen even though os.py was imported earlier by site.py. This is what Gordon was objecting to in the first place and why he proposes using rimport, rrimport etc. to make things more explicit. 
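
A note for readers on current Python: the implicit relative import described above (a bare "import os" inside a package picking up the package's own os.py) is behaviour of the Python of this thread and no longer applies in Python 3. The sketch below is only an illustration, in present-day Python, of the explicit spelling "import package1.os" that JCA mentions. It builds a throwaway package in a temporary directory; the package name package1, the WHO marker and the function demo_explicit_import are all hypothetical.

```python
import importlib
import os
import shutil
import sys
import tempfile


def demo_explicit_import():
    """Create a temporary package1/os.py and import it by its full name."""
    root = tempfile.mkdtemp()
    pkg_dir = os.path.join(root, "package1")
    os.mkdir(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write("")
    with open(os.path.join(pkg_dir, "os.py"), "w") as f:
        f.write("WHO = 'package1.os'\n")

    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        local_os = importlib.import_module("package1.os")
    finally:
        sys.path.remove(root)
        shutil.rmtree(root, ignore_errors=True)

    # The package-local module is registered as "package1.os"; the bare
    # name "os" in sys.modules still refers to the standard module.
    return local_os.WHO, sys.modules["os"] is os, local_os is not os
```

With the fully qualified name there is no question about which module is meant: the two modules live under different keys in sys.modules and neither replaces the other.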
prabhu From jack@oratrix.nl Mon Nov 12 15:55:11 2001 From: jack@oratrix.nl (Jack Jansen) Date: Mon, 12 Nov 2001 16:55:11 +0100 Subject: [Python-Dev] RE: test_file failing on Windows In-Reply-To: Message by "Tim Peters" , Fri, 9 Nov 2001 17:50:42 -0500 , Message-ID: <20011112155512.A18ED303181@snelboot.oratrix.nl> > > Same here with CodeWarrior on the Mac: stdio errors return NULL or -1 > > and that is it, errno isn't touched, not even for fopen() file not > > found, etc. > > MS does set errno in most cases; the failure to set it for bad fopen() mode > strings appears to be a bug in their code. > > > If the ANSI standard requires errno to be set and people can point me > > to the right section I can submit an error report... > > No such luck, Jack: errno has always been mostly folklore in the C std, and > is almost pure folklore in C99. Ah. Then, shouldn't we have an option WITHOUT_STDIO_ERRNO or somesuch, and ignore the errno values if it is defined? "Cannot open file" doesn't say much, but it's better than "Errno 0". If I'm right in guessing that configure is only used on unix (am I?) then adding support for this option to configure isn't needed, I guess. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From jeremy@zope.com Mon Nov 12 16:13:05 2001 From: jeremy@zope.com (Jeremy Hylton) Date: Mon, 12 Nov 2001 11:13:05 -0500 (EST) Subject: [Python-Dev] order of unittest.TestCase execution? In-Reply-To: <2mn11svxwo.fsf@starship.python.net> References: <15343.50438.442326.206862@beluga.mojam.com> <030d01c16b86$35758b10$02010a0a@suxlap> <15343.57739.334927.256373@beluga.mojam.com> <2mn11svxwo.fsf@starship.python.net> Message-ID: <15343.62737.346826.783285@slothrop.digicool.com> I believe the unittest philosophy is that the tests can be run in any order. 
This is usually phrased the other way round: Do not write tests that depend on being executed in a particular order. The default test runner runs all the tests before it reports any errors, so I'm not sure there's much advantage to running a particular test first. Jeremy From mclay@erols.com Mon Nov 12 16:14:02 2001 From: mclay@erols.com (Michael McLay) Date: Mon, 12 Nov 2001 11:14:02 -0500 Subject: [Python-Dev] Replacing __slots__ with addmembers() Message-ID: <3BEFF54A.CB3AA942@erols.com> Barry Warsaw writes: > Without BDFL override, yes, it does. > > >>>>> "TP" == Tim Peters writes: > > TP> I expect we will. Ditto __metatype__, long-winded super(), > TP> function-based property "declarations", and all the other new > TP> stuff. We're aiming for progress with the new features, not > TP> perfection . > > I believe Guido knows that it will be impossible to get all this stuff > right the first time, and even the 2.2 beta cycle won't shake out all > the problems. I think his intention was to get the basic > functionality in place for Python 2.2, and to clean up and improve the > syntax and semantics in future releases. I hope Python Labs doesn't rush Python 2.2 out early just to meet an artificial schedule. I share MAL's concern about cluttering Python with several variations for spelling the same kind of feature. I'm sure other long time users of Python would rather wait a few weeks to get something with fewer new warts. Not all of these changes have the same level of impact on day to day programming. The __metatype__ and super() features will be used far less than __slots__. Spelling __slots__ in a more pythonic manner now will eliminate having lots of ugly code being supported forever. Tim suggested that the type checking feature might be enough to get the patch rejected. For the moment let's assume I had not added that feature. The patch still makes adding member_descriptors more consistent with the syntax used to add properties.
There is consistency in usage with

    property(fget, fset, doc)
    addmember(default, doc)

Adding the default capability to adding a member is important because it will eliminate a very common initialization pattern. Instead of writing:

    class B(object):
        __slots__ = ['a','b']
        def __init__(self,a=1,b="cats and dogs"):
            self.a=a
            self.b=b

The addmember() feature reduces this to an easy to read:

    class B(object):
        a = addmember(default=1, doc="the ever popular a attribute")
        b = addmember(default="cats and dogs", doc="the b docstring")

This also allows docstrings to be added to the attributes, a feature that isn't possible with the current __slots__ spelling. The type checking feature can easily be removed. I hope this doesn't happen. This feature replaces another very common design pattern. While dynamic typing is very powerful, there are many occasions when data is required to be of a specific type. For instance, engineering data exchange standards are filled with type checking requirements. Spacecraft crash into Mars for a lack of good checks. This is not only embarrassing, it's expensive. Examples of type checking in Python are described in: - http://mail.python.org/pipermail/python-dev/2001-October/017852.html The approach to adding this capability using addmember() is very pythonic. It is completely unintrusive to anyone who doesn't need it and it is very easy to use for those occasions when you must use it. From mal@lemburg.com Mon Nov 12 15:42:16 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 12 Nov 2001 16:42:16 +0100 Subject: [Python-Dev] optimizing simple function calls [was PEP 275 (switching)] References: <3BEF9410.789D804@lemburg.com> <20011112072305.A24558@glacier.arctrix.com> Message-ID: <3BEFEDD8.64324AA1@lemburg.com> Neil Schemenauer wrote: > > M.-A.
Lemburg wrote: > > The current solution to this problem lies in using a dispatch > > table to find the case implementing method to execute depending on > > the value of the switch variable (this can be tuned to have a > > complexity of O(1) on average, e.g. by using perfect hash > > tables). This works well for state machines which require complex > > and lengthy processing in the different case methods. It does not > > perform well for ones which only process one or two instructions > > per case, e.g. > > > > def handle_data(self, data): > > self.stack.append(data) > > It would be nice if we could make simple methods like this faster. > Maybe something like a special fast path for methods with no blocks and > no variable or keyword arguments. I think optimizing small functions > would be a greater overall benefit to the average Python program. Sure would, it's just that some of the overhead cannot be optimized away, e.g. creation and initializing of frame objects, argument formatting etc. Perhaps there's some way to avoid all this overhead and maybe not even leave the ceval loop... e.g. by preallocating frame objects for often used code objects or improving their reuseability. Code objects could, for example, provide a prefilled frame object template which then only needs to be copied onto a fresh frame object (via memcpy()) at frame init time. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From jeremy@zope.com Mon Nov 12 16:33:19 2001 From: jeremy@zope.com (Jeremy Hylton) Date: Mon, 12 Nov 2001 11:33:19 -0500 (EST) Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. 
In-Reply-To: <15343.61552.414932.44100@monster.linux.in> References: <3BED5AD8.24350.78407262@localhost> <3BEFE249.371EA5FD@interet.com> <15343.61552.414932.44100@monster.linux.in> Message-ID: <15343.63951.137972.487852@slothrop.digicool.com> >>>>> "PR" == Prabhu Ramachandran writes: PR> Ummm doing an 'import os' will import the package1/os.py and PR> *not* the standard one. This will happen even though os.py was PR> imported earlier by site.py. This is what Gordon was objecting PR> to in the first place and why he proposes using rimport, PR> rrimport etc. to make things more explicit. Of course, you can use the existing mechanism to do this: 'from package1 import os'. The use of an explicit name seems like the clearest route when you have a package-local module that shadows a top-level module -- no need to understand details of relative imports, no question about what is intended by the code. I haven't followed this thread closely. Is there some reason that explicit names in imports is not sufficient? Jeremy From tim.one@home.com Mon Nov 12 17:12:18 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 12 Nov 2001 12:12:18 -0500 Subject: [Python-Dev] RE: test_file failing on Windows In-Reply-To: <20011112155512.A18ED303181@snelboot.oratrix.nl> Message-ID: [Jack Jansen] > Ah. Then, shouldn't we have an option WITHOUT_STDIO_ERRNO or > somesuch, and ignore the errno values if it is defined? We already have NO_FOPEN_ERRNO; the comment says "Metroworks [sic] only". > "Cannot open file" doesn't say much, but it's better than "Errno 0".

#ifdef NO_FOPEN_ERRNO
		/* Metroworks only, not testable, so unchanged */
		if (errno == 0) {
			PyErr_SetString(PyExc_IOError, "Cannot open file");
			return NULL;
		}
#endif

So it's not saying "Errno 0" in that case. > If I'm right in guessing that configure is only used on unix (am I?) Yes. > then adding support for this option to configure isn't needed, I guess. I'm not clear on what you would like beyond NO_FOPEN_ERRNO.
From tim.one@home.com Mon Nov 12 18:09:13 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 12 Nov 2001 13:09:13 -0500 Subject: [Python-Dev] order of unittest.TestCase execution? In-Reply-To: <15343.50438.442326.206862@beluga.mojam.com> Message-ID: [Skip Montanaro] > I'm trying to write a regression test for dumbdbm (well, officially I'm > updating the current test, but it's taking a beating). I'm unclear on > the order of method execution. Can I rely on multiple test methods > to be run in alphabetical order? You can but you shouldn't . You can call run_unittest() more than once, though! Stick the thing you want to test first in its own class, and run that unittest first. From Prabhu Ramachandran Mon Nov 12 18:20:33 2001 From: Prabhu Ramachandran (Prabhu Ramachandran) Date: Mon, 12 Nov 2001 23:50:33 +0530 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. In-Reply-To: <15343.63951.137972.487852@slothrop.digicool.com> References: <3BED5AD8.24350.78407262@localhost> <3BEFE249.371EA5FD@interet.com> <15343.61552.414932.44100@monster.linux.in> <15343.63951.137972.487852@slothrop.digicool.com> Message-ID: <15344.4849.71643.868791@monster.linux.in> >>>>> "JH" == Jeremy Hylton writes: >>>>> "PR" == Prabhu Ramachandran writes: PR> Ummm doing an 'import os' will import the package1/os.py and PR> *not* the standard one. This will happen even though os.py PR> was imported earlier by site.py. This is what Gordon was PR> objecting to in the first place and why he proposes using PR> rimport, rrimport etc. to make things more explicit. JH> Of course, you can use the existing mechanism to do this: JH> 'from package1 import os'. The use of an explicit name seems JH> like the clearest route when you have a package-local module JH> that shadows a top-level module -- no need to understand JH> details of relative imports, no question about what is JH> intended by the code. JH> I haven't followed this thread closely. 
Is there some reason JH> that explicit names in imports is not sufficient? Yes indeed there is. I've already explained my reasons twice. Eric also explained why this was important for Scipy. Anyway, in short, its a big pain re-nesting packages. Also for any package that has a deep enough structure its a real pain accessing packages. from pkg import subpkg is also not the best way to do imports. I personally prefer import pkg.subpkg and I believe this is the recommended way of doing imports. prabhu From arigo@ulb.ac.be Mon Nov 12 20:41:02 2001 From: arigo@ulb.ac.be (Armin Rigo) Date: Mon, 12 Nov 2001 21:41:02 +0100 Subject: [Python-Dev] Re: psyco References: <200111121346.OAA12120@tepid.osl.fast.no> Message-ID: <001301c16bba$657a7da0$85ce043e@oemcomputer> Hello Kjetil, > tried to give the latest psyco a whirl. more specifically twist pystone a > bit to use a psyco proxy to get an idea of the speedup for the benchmark. > now, the downsize is that psyco segfaults with an assertion failure ;) A few trivial and one more subtle bug prevented any seriously-sized test to complete. I fixed it. http://homepages.ulb.ac.be/~arigo/psyco The pystone benchmark is not at all typical Python code :-) psyco still performs a x2 speed-up but this is not representative of possible results (it suffers from missing knowledge about integer multiplications and divisions as well as (more importantly) methods of user classes). Armin From jack@oratrix.nl Mon Nov 12 20:50:58 2001 From: jack@oratrix.nl (Jack Jansen) Date: Mon, 12 Nov 2001 21:50:58 +0100 Subject: [Python-Dev] RE: test_file failing on Windows In-Reply-To: Message by "Tim Peters" , Mon, 12 Nov 2001 12:12:18 -0500 , Message-ID: <20011112205103.999071162D7@oratrix.oratrix.nl> Recently, "Tim Peters" said: > [Jack Jansen] > > Ah. Then, shouldn't we have an option WITHOUT_STDIO_ERRNO or > > somesuch, and ignore the errno values if it is defined? > > We already have NO_FOPEN_ERRNO; the comment says "Metroworks [sic] only". 
You're absolutely right. And I know it, I wrote it:-) I'll try to turn my brain ON before posting in the future, -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From tim.one@home.com Mon Nov 12 21:13:39 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 12 Nov 2001 16:13:39 -0500 Subject: [Python-Dev] Re: psyco In-Reply-To: <001301c16bba$657a7da0$85ce043e@oemcomputer> Message-ID: [Armin Rigo] > The pystone benchmark is not at all typical Python code :-) Indeed, pystone is the least typical Python program I've ever seen : it restricts itself to a subset of Python aping a C program aping an Ada program, constructed in turn to do a precise number of specific operations, where the operation counts were obtained from tracing a collection of real Ada integer systems programs and summing how many of this-and-that happened at runtime in aggregate. So it literally makes no sense. OTOH, pystone is the best predictor of Zope performance my employer has -- if nothing else, it does measure the speed of SET_LINENO opcodes . > psyco still performs a x2 speed-up but this is not representative of > possible results (it suffers from missing knowledge about integer > multiplications and divisions as well as (more importantly) methods of > user classes). 2X on pystone is nothing to snort at -- it's good! Note that there is only one class in pystone, used to emulate a C struct. Proc1() calls its .copy() method twice to emulate struct assignment, but that's the only use of class methods. By construction, there are no "killer hot spots" in pystone -- no single trick can speed it a lot. And speeding floats won't help it at all . 
From tim.one@home.com Mon Nov 12 21:41:35 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 12 Nov 2001 16:41:35 -0500 Subject: [Python-Dev] Replacing __slots__ with addmembers() In-Reply-To: <3BEF8F58.CF4E6166@lemburg.com> Message-ID: [M.-A. Lemburg] > I'd suggest that Guido marks those features he considers stable > as such and clearly states which other features should still > be condsidered experimental and not for production use. > > I intend to make some of the mx-datatypes subclassable but would > want to have to support n different ways of implementing the details > (I'll already have to support two different ways: classic and > new style... wouldn't want to do classic, new style version 2.2, > new style version 2.3, etc.) The clearest statement to date is probably here: http://aspn.activestate.com/ASPN/Mail/Message/827383 Note that it ends with: We should document this more clearly and in more detail. And we should. The internal details are pretty stable now; *perhaps* they'll need to be rearranged to cater to things that can't be done at all now, but that's true of every part of the language (albeit especially true of large new features). __slots__ et alia are shallow spelling details, where "shallow" doesn't mean unimportant but that changing the spelling has scant effect on the internals -- note how Michael bragged about how short and unintrusive his __slot__-respelling patch was <0.9 wink>. Adding new ways to spell things shouldn't have any effect on how you need to implement your extension types. Note that we've already gone thru the exercise of making all the basic builtin types subclassable, so have some confidence that the subclassing APIs are both usable and solid. They've also been stable (e.g., I can't think of any change to them between the last alpha and current CVS). 
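For readers who have not used the features under discussion, here is a minimal sketch of __slots__ and of subclassing a builtin type. It is written for present-day Python 3 rather than the 2.2 beta being debated, and the class names Point and Celsius are invented for illustration.

```python
class Point(object):
    # Only these two attributes may be set on instances; no per-instance
    # __dict__ is created.
    __slots__ = ("x", "y")

    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y


class Celsius(float):
    # A subclass of a builtin type; empty __slots__ keeps instances
    # as compact as plain floats.
    __slots__ = ()

    def to_fahrenheit(self):
        return self * 9 / 5 + 32


p = Point(1, 2)
try:
    p.z = 3  # not listed in __slots__, so this is rejected
except AttributeError:
    slot_rejected = True
else:
    slot_rejected = False

print(slot_rejected)
print(Celsius(100).to_fahrenheit())
```

Note that arithmetic on a Celsius instance returns a plain float, so to_fahrenheit() does not return another Celsius.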
From jeremy@alum.mit.edu Mon Nov 12 18:35:47 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 12 Nov 2001 13:35:47 -0500 (EST)
Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism.
In-Reply-To: <15344.4849.71643.868791@monster.linux.in>
References: <3BED5AD8.24350.78407262@localhost> <3BEFE249.371EA5FD@interet.com> <15343.61552.414932.44100@monster.linux.in> <15343.63951.137972.487852@slothrop.digicool.com> <15344.4849.71643.868791@monster.linux.in>
Message-ID: <15344.5763.557118.787529@walden.zope.com>

>>>>> "PR" == Prabhu Ramachandran writes:
>>>>> "JH" == Jeremy Hylton writes:

    JH> I haven't followed this thread closely. Is there some reason
    JH> that explicit names in imports is not sufficient?

    PR> Yes indeed there is. I've already explained my reasons twice.
    PR> Eric also explained why this was important for Scipy.

I've gone back through the messages on python-dev, but don't see a clear summary of the issues that led to your proposed change. The best I can come up with are: 1) packages that are re-nested are a pain and 2) complex package structures also cause problems.

Eric has a specific set of issues with SciPy that involve packages that are developed and used externally but also included in SciPy. I have had a hard time trying to figure out precisely what the problems are.

    PR> Anyway, in short, its a big pain re-nesting packages. Also for
    PR> any package that has a deep enough structure its a real pain
    PR> accessing packages.

What does "re-nesting" mean? I get the impression you mean putting one package inside another after it was developed and packaged for use as a top-level package. If so, it doesn't seem like a problem that occurs that often, right? I'd be hesitant to add features to the import mechanism to cater to an infrequent case.

I'd rather see the imports be explicit "import root.a.b.c" than "import b.c". Then re-nesting requires all the import statements to be edited.
It's more typing and might even require a simple script to do search-and-replace, but it doesn't sound like a prohibitive burden. I expect there is more to the issue than just wanting to avoid some extra typing. A short PEP that describes the specific problems being solved and discussing alternatives would help. PR> from pkg import subpkg is also not the best way to do imports. I PR> personally prefer import pkg.subpkg and I believe this is the PR> recommended way of doing imports. Why do you think this is the recommended way of doing imports? I use both in my code and haven't been able to come up with a clear rationale for doing one or the other. The from ... import form seems useful when the name of the package/module is long or when it's only one or two names I'm using. Jeremy From tim.one@home.com Mon Nov 12 21:52:32 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 12 Nov 2001 16:52:32 -0500 Subject: [Python-Dev] RE: test_file failing on Windows In-Reply-To: <20011112205103.999071162D7@oratrix.oratrix.nl> Message-ID: [Tim] > We already have NO_FOPEN_ERRNO; the comment says "Metroworks > [sic] only". [Jack Jansen] > You're absolutely right. And I know it, I wrote it:-) > > I'll try to turn my brain ON before posting in the future, See, that's the difference between you and Guido: you apologize, while Guido would imply (if not claim outright) that this was an instance of his time machine at work. So you come off looking like a klutz, where Guido would have exploited it to enhance his reputation as a Living God . admiring-the-perfidious-dutch-and-their-nefarious-schemes-ly y'rs - tim From gmcm@hypernet.com Mon Nov 12 22:17:19 2001 From: gmcm@hypernet.com (Gordon McMillan) Date: Mon, 12 Nov 2001 17:17:19 -0500 Subject: [Python-Dev] Re: Proposal for a modified import mechanism. 
In-Reply-To: <15344.5763.557118.787529@walden.zope.com> References: <15344.4849.71643.868791@monster.linux.in> Message-ID: <3BF0041F.517.82A5AD5B@localhost> Jeremy wrote: [relative and recursively-relative imports] > I'd rather see the imports be explicit "import root.a.b.c" than > "import b.c". Then re-nesting requires all the import statements > to be edited. It's more typing and might even require a simple > script to do search-and-replace, but it doesn't sound like a > prohibitive burden. As a (minor) data point, if "b.c" is resolved as a relative import, it will be faster than the absolute form ("import a.b.c"). Having re-arranged a number of packages, I have some sympathy for Prabhu's complaint. OTOH, this is a feature which only helps package authors (not package users, who are likely to have a somewhat harder time finding their way around the package). [Prabhu] > PR> from pkg import subpkg is also not the best way to do > imports. I PR> personally prefer import pkg.subpkg and I > believe this is the PR> recommended way of doing imports. [Jeremy] > Why do you think this is the recommended way of doing imports? I > use both in my code and haven't been able to come up with a clear > rationale for doing one or the other. The from ... import form > seems useful when the name of the package/module is long or when > it's only one or two names I'm using. When you have circular imports, someone must use the "import a.b.c" form. This can show up in some surprising ways, especially when the package in question desparately needs re-arranging . - Gordon From michel@zope.com Mon Nov 12 22:21:01 2001 From: michel@zope.com (Michel Pelletier) Date: Mon, 12 Nov 2001 14:21:01 -0800 Subject: [Python-Dev] Replacing __slots__ with addmembers() References: <3BEFF54A.CB3AA942@erols.com> Message-ID: <3BF04B4D.1DE451E@zope.com> Michael McLay wrote: > Tim suggested that the type checking feature might be enough to get the > patch rejected. 
For the moment let's assume I had not added that > feature. Given this assumption, i like your patch and the spelling and think it is reasonable. Without the assumption, I would suggest you throw down some details on the types-sig list. Lots of folks have ideas and its a rather hot topic for some (but the discussion has always been very civil!). I for one, would like to revive the type checking discussion because it's a sticky problem that needs to be solved. So far, there have been a few different proposed solutions. PaulP proposed and implemented a prototype that did flexible type checking. I proposed (PEP 245) an interface syntax that is somewhat orthogonal to that. Clark Evans also talks about type checking in PEP 246. > The patch does still makes adding member_descriptors more consistent > with > the syntax used to add properties. I agree. -Michel From barry@zope.com Mon Nov 12 22:32:27 2001 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 12 Nov 2001 17:32:27 -0500 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. References: <3BED5AD8.24350.78407262@localhost> <3BEFE249.371EA5FD@interet.com> <15343.61552.414932.44100@monster.linux.in> <15343.63951.137972.487852@slothrop.digicool.com> <15344.4849.71643.868791@monster.linux.in> <15344.5763.557118.787529@walden.zope.com> Message-ID: <15344.19963.181812.444906@anthem.wooz.org> >>>>> "JH" == Jeremy Hylton writes: JH> I'd rather see the imports be explicit "import root.a.b.c" JH> than "import b.c". Then re-nesting requires all the import JH> statements to be edited. It's more typing and might even JH> require a simple script to do search-and-replace, but it JH> doesn't sound like a prohibitive burden. Note that applications can achieve the same thing without editing code by doing sys.path manipulations. JH> I expect there is more to the issue than just wanting to avoid JH> some extra typing. 
A short PEP that describes the specific JH> problems being solved and discussing alternatives would help. Indeed. We've been here before (perhaps, several "befores" :). Every time this comes up I get the feeling like there are easy ways to accomplish what you want if you think of the problem differently, or I'm missing something fundamental about the problem, and/or the problem has never been specified identified, or people are trying to solve too many problems at once. Are the needs of application authors different than library authors? -Barry From michel@zope.com Mon Nov 12 23:10:18 2001 From: michel@zope.com (Michel Pelletier) Date: Mon, 12 Nov 2001 15:10:18 -0800 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. References: <3BED5AD8.24350.78407262@localhost> <3BEFE249.371EA5FD@interet.com> <15343.61552.414932.44100@monster.linux.in> <15343.63951.137972.487852@slothrop.digicool.com> <15344.4849.71643.868791@monster.linux.in> <15344.5763.557118.787529@walden.zope.com> <15344.19963.181812.444906@anthem.wooz.org> Message-ID: <3BF056DA.74724E5C@zope.com> "Barry A. Warsaw" wrote: > > Are the needs of application authors different than library authors? This is the best place to start, almost everyone on this list plays both roles to one degree or another. I've read Prabhu's emails and I understand his problem. He's explained it a couple of times, but in general users and their needs have been unclear which I think spawned most of this discussion. I'm actually sort of interested in more about the idea of 'looking up' for packages that Prabhu mentioned and think it could be very useful. In Zope we call this "acquisition" and we use this pattern many, many times to override general site policies and objects with more specific ones the farther "down" you go in the object heirarchy. 
This not only gives us a nice customization model, but it also gives us a nice delegation model, ie, those responsible on high (that's all of you) can dictate what is and is not the standard library "policy" and users below you (that's me and all the other lusers) can specificly override that mandate at a lower level without interfering with other users. -Michel From michel@zope.com Mon Nov 12 23:23:37 2001 From: michel@zope.com (Michel Pelletier) Date: Mon, 12 Nov 2001 15:23:37 -0800 Subject: [Python-Dev] Replacing __slots__ with addmembers() References: <3BEFF54A.CB3AA942@erols.com> <3BF04B4D.1DE451E@zope.com> Message-ID: <3BF059F9.FF08277@zope.com> Michel Pelletier wrote: > PaulP proposed and implemented a > prototype that did flexible type checking. I proposed (PEP 245) an > interface syntax that is somewhat orthogonal to that. Clark Evans also > talks about type checking in PEP 246. There's also Grouch: http://www.mems-exchange.org/software/grouch/ -Michel From thomas@xs4all.net Tue Nov 13 00:14:18 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Tue, 13 Nov 2001 01:14:18 +0100 Subject: [Python-Dev] switch in Python. Message-ID: <20011113011418.C466@xs4all.nl> I uploaded a proof-of-concept of the switch statement to SourceForge: https://sourceforge.net/tracker/index.php?func=detail&aid=481118&group_id=5470&atid=305470 It's a diff against the current CVS tree, taken from 'src/' -- so apply in src/ with -p0. It's a quick and almost certainly buggy hack. It bypasses the c_consts_dict in an ugly way, and might break code that tries to do stuff with code objects. It probably doesn't keep track of the stack properly, and skips SET_LINENO opcodes. It only allows string or numerical literals, though it does get the '1000 == 1000.0 == 1000L' trick for free. It doesn't contain changes to docs or dis.py, and it doesn't use a future statement. It _does_ contain a change to distutils, because it uses a variable 'switch' :-) And I haven't profiled it in any way. 
In my defense, I don't recall writing it, so I was probably sleeping at the time (or at least I should have been.) Anyway, here's how it works: """ def whatis(x): switch x: case "number one": return "The Larch." case "spam", "Spam", "SPAM": return "It's " + "SPAM "*31 case 1.1: return "one dot one" case 1000L: return "a lot." case 1j: return "Not *really* there." else: return "D'oh, donno that!" for val in (1j, "number one", 1.1, 1000, 1000L, 1000.0, 1000.0005, "SPAM", "spam", "spaM", "eggs and ham"): print "What is", repr(val), "?" print whatis(val) print """ thomas@stalker:~/python-cvs/dist/src$ ./python example.py What is 1j ? Not *really* there. What is 'number one' ? The Larch. What is 1.1000000000000001 ? one dot one What is 1000 ? a lot. What is 1000L ? a lot. What is 1000.0 ? a lot. What is 1000.0005 ? D'oh, donno that! What is 'SPAM' ? It's SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM What is 'spam' ? It's SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM What is 'spaM' ? D'oh, donno that! What is 'eggs and ham' ? D'oh, donno that! -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From tim.one@home.com Tue Nov 13 00:33:14 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 12 Nov 2001 19:33:14 -0500 Subject: [Python-Dev] Replacing __slots__ with addmembers() In-Reply-To: <3BEFF54A.CB3AA942@erols.com> Message-ID: [Michael McLay] > I hope Python Labs doesn't rush Python 2.2 out early just to meet an > artificial schedule. Given the effort that goes into making them up, there's nothing artificial about them . > MAL's concerns about cluttering Python with several variations for > spelling the same kind of feature is shared by me. 
It seemed to me MAL was worried about something else, namely not having to redo the guts of his provisions for user subclassing of sundry mx objects (coded in C). > I'm sure other long time users of Python would rather wait a few weeks > to get something with fewer new warts. I don't expect that's a realistic tradeoff. While I'm sure Guido could make excellent progress cleaning it all up given "a few weeks" with nothing else on his plate, the latter isn't going to happen -- in reality it's going to stretch over months, accompanied by much debate. Given holiday schedules and other commitments, if we don't do the December release as planned, I expect 2.2 would slip beyond the Python Conference at best. We don't want that. > Not all of these changes have the same level of impact on day to day > programming. The __metatype__ and super() feature will be used far > less than __slots__. This will vary by programmer. > Spelling __slots__ in a more pythonic manner now will eliminate having > lots of ugly code being supported forever. Where? Inside Python? > Tim suggested that the type checking feature might be enough to get the > patch rejected. For the moment let's assume I had not added that > feature. The patch does still makes adding member_descriptors more > consistent with the syntax used to add properties. There is consistency > in usage with > > property(fget, fset, doc) > addmember(default, doc) I don't accept that's a Good Thing, though -- there's nothing notably Pythonic about the spelling of properties now either, and indeed it all looks very Lispish ("everything's a function application, and everything looks like everything else"). At least "__slots__" doesn't look like "property()" -- there's a visual clue that they're not the same thing. Note too that your property example is wrong: you've passed a docstring in the fdel method's position. This is a symptom of strained spelling. 
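The slip Tim points out is easy to reproduce. The sketch below assumes present-day Python 3; the Account class and its attribute names are invented for illustration. The positional order of the builtin is property(fget, fset, fdel, doc), so a docstring passed as the third positional argument ends up in the fdel slot.

```python
class Account(object):
    def __init__(self, balance=0):
        self._balance = balance

    def _get_balance(self):
        return self._balance

    def _set_balance(self, value):
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value

    # Passing doc by keyword keeps it out of the fdel position.
    balance = property(_get_balance, _set_balance, doc="Current balance.")

    # The mistake: the third positional argument is fdel, not doc.
    mistaken = property(_get_balance, _set_balance, "Current balance.")


print(Account.balance.__doc__)
print(Account.balance.fdel)
print(repr(Account.mistaken.fdel))
```

Nothing complains when the mistaken property is created; the string is simply stored as the deleter and would only fail when someone tried to delete the attribute.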
> Adding the default capability to adding a member is important because
> it will eliminate a very common initialization pattern. Instead of
> writing:
>
> class B(object):
>     __slots__ = ['a','b']
>     def __init__(self,a=1,b="cats and dogs"):
>         self.a=a
>         self.b=b
>
> The addmember() feature reduces this to an easy to read:
>
> class B(object):
>     a = addmember(default=1, doc="the ever popular a attribute")
>     b = addmember(default="cats and dogs", doc="the b docstring")

Sorry, to my eye that's hard to read, and I bet a doughnut Guido will dream up something better when he feels the time is right. IOW, IMO the suggestion is itself another wart. I agree something's wrong with the current semantics: if you reference an uninitialized __slot__ vrbl now you don't get NameError.

> This also allows docstrings to be added to the attributes, a feature
> that isn't possible with the current __slots__ spelling.

Neither with non-slot attributes. Is not adding docstring ability to one specific flavor of attribute, but not others, also a wart?

> The type checking feature can easily be removed. I hope this doesn't
> happen. This feature replaces another very common design pattern. While
> dynamic typing is very powerful, there are many occasions when data is
> required to be of a specific type. For instance, engineering data
> exchange standards are filled with type checking requirements.
> Spacecraft crash into Mars for a lack of good checks. This is not only
> embarrassing, it's expensive.

This is the province of the Types SIG, and adding one typecheck gimmick to one isolated feature would be a wart too.

> Examples of type checking in Python are described in:
> - http://mail.python.org/pipermail/python-dev/2001-October/017852.html

Among many other places, like, say, the Types SIG .

> The approach to adding this capability using addmember() is very
> pythonic.
It is completely unintrusive to anyone who doesn't need it > and it is very easy to use for those occasions when you must use it. Write a PEP. The issues deserve consideration, and more than we can make time for before 2.2. If Guido thinks otherwise he'll speak up. From greg@cosc.canterbury.ac.nz Tue Nov 13 00:46:31 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 13 Nov 2001 13:46:31 +1300 (NZDT) Subject: [Python-Dev] switch-based programming in Python In-Reply-To: Message-ID: <200111130046.NAA12752@s454.cosc.canterbury.ac.nz> Paul Svensson : > On Fri, 9 Nov 2001, Donald Beaudry wrote: > > when EXPR: > > in CONSTANT_TUPLE: > > [suite] > > else: > > [suite] > you're absolutely right on the indentation of the "else". Really? To me, the else is just another branch, and should be on the same level as all the others. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From paul@svensson.org Tue Nov 13 01:49:05 2001 From: paul@svensson.org (Paul Svensson) Date: Mon, 12 Nov 2001 20:49:05 -0500 (EST) Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <200111130046.NAA12752@s454.cosc.canterbury.ac.nz> Message-ID: On Tue, 13 Nov 2001, Greg Ewing wrote: >Paul Svensson : > >> On Fri, 9 Nov 2001, Donald Beaudry wrote: > >> > when EXPR: >> > in CONSTANT_TUPLE: >> > [suite] >> > else: >> > [suite] > >> you're absolutely right on the indentation of the "else". > >Really? To me, the else is just another branch, and should >be on the same level as all the others. If you ignore how you get there, it's just another branch. But, how you get there is the greates distinction between the "else" branch and the other branches, and I think this should be emphasized, not ignored. 
Compare how Python uses "else" not only with "if" statements, but also with for, while, and except. Having the "else" indented with the "when" also makes it immediately obvious that there can't be more than one, and it has to go at the end. Besides, I find it visually more appealing. /Paul From Prabhu Ramachandran Tue Nov 13 05:08:34 2001 From: Prabhu Ramachandran (Prabhu Ramachandran) Date: Tue, 13 Nov 2001 10:38:34 +0530 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. In-Reply-To: <15344.5763.557118.787529@walden.zope.com> References: <3BED5AD8.24350.78407262@localhost> <3BEFE249.371EA5FD@interet.com> <15343.61552.414932.44100@monster.linux.in> <15343.63951.137972.487852@slothrop.digicool.com> <15344.4849.71643.868791@monster.linux.in> <15344.5763.557118.787529@walden.zope.com> Message-ID: <15344.43730.821724.769885@monster.linux.in> >>>>> "JH" == Jeremy Hylton writes: >>>>> "PR" == Prabhu Ramachandran writes: JH> I haven't followed this thread closely. Is there some reason JH> that explicit names in imports is not sufficient? PR> Yes indeed there is. I've already explained my reasons twice. PR> Eric also explained why this was important for Scipy. [snip] JH> I have had a hard time trying to figure out precisely what the JH> problems are. I think you got it mostly right. Let me try to elaborate on it. (1) Re-nesting a package is a pain. What I mean by re-nesting is that say I have a package, A, that is separate (and that has its own sub packages) and now I want it as part of another package, B. Lets further suppose that the module which re-nests the package, B, tracks the development of A and keeps their copy updated. In this case A is developed as a standalone package and B adds something to it that A cannot/refuses to use. With the current approach B would be forced to modify A every time A changes in some significant way simply because A was re-nested. Yes, this is contrived but such situations do occur. 
To make things clearer: my main objection is that the name of a package when one imports it depends on its parent package's name. This is IMHO absurd.

    foo/
        sub/
        sub1/

From sub1 if you had to import anything from sub you'd have to do import foo.sub.module. So if foo is now part of something else - you have to change all references to foo.

(2) If you have a complex package with more than 2-3 nested sub directories it becomes a huge pain to use clean import statements and not have to type long lines just to get to different modules.

(3) If you argue that import must always do only absolute imports then why are sibling packages allowed? i.e. if there are two modules in the same directory Python currently allows one to import them with a relative name rather than an absolute foo.sub.pkg kind of name. If this is valid, then it's natural to expect that searching also be done in the local package structure.

(4) Yes, re-factoring code is possible, but sometimes this can be a pain if you have a CVS tree and you want to re-organize your package structure. Bernhard Herzog posted a solution for my specific problem, so that really is not the issue. In my case the current cvsroot for my sources is this:

    cvs.sourceforge.net/cgi-bin/viewcvs.cgi/mayavi/mayavi/

and Bernhard's solution would create a directory structure like so:

    cvs.sourceforge.net/cgi-bin/viewcvs.cgi/mayavi/mayavi/mayavi/

Which is pretty crazy if you ask me, it's bad enough as it is. :) This will solve my particular problem but is dirty.

    JH> What does "re-nesting" mean? It get the impression you mean
    JH> putting one package inside another after it was developed and
    JH> pacakged for use as a top-level package. If so, it doesn't
    JH> seem like a problem that occurs that often, right? I'd be
    JH> hesitant to add features to the import mechanism to cater to
    JH> an infrequent case.

I've had about 4 others mailing me about their related problems. So I wouldn't classify this as a rare problem that can be safely ignored.
JH> I'd rather see the imports be explicit "import root.a.b.c" JH> than "import b.c". Then re-nesting requires all the import JH> statements to be edited. It's more typing and might even JH> require a simple script to do search-and-replace, but it JH> doesn't sound like a prohibitive burden. It all depends. I think Eric explained his position pretty clearly. I'm convinced that Python's import structure needs improvement. JH> I expect there is more to the issue than just wanting to avoid JH> some extra typing. A short PEP that describes the specific JH> problems being solved and discussing alternatives would help. Well, its all about convenience anyway - if not we'd all be talking to computers in binary! Why do we need 'high-level' programming languages? Yes, I'm digressing into te philosophy of computing but I dont think syntactic sugar is something to be ignored because its silly. PR> from pkg import subpkg is also not the best way to do PR> imports. I personally prefer import pkg.subpkg and I believe PR> this is the recommended way of doing imports. JH> Why do you think this is the recommended way of doing imports? JH> I use both in my code and haven't been able to come up with a JH> clear rationale for doing one or the other. The from JH> ... import form seems useful when the name of the JH> package/module is long or when it's only one or two names I'm JH> using. Well, the Python howto explains it much better than I could hope to: http://py-howto.sourceforge.net/doanddont/node8.html Since re-loading packages is important for me, I prefer using plain imports. prabhu From Prabhu Ramachandran Tue Nov 13 05:20:03 2001 From: Prabhu Ramachandran (Prabhu Ramachandran) Date: Tue, 13 Nov 2001 10:50:03 +0530 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. 
In-Reply-To: <15344.19963.181812.444906@anthem.wooz.org> References: <3BED5AD8.24350.78407262@localhost> <3BEFE249.371EA5FD@interet.com> <15343.61552.414932.44100@monster.linux.in> <15343.63951.137972.487852@slothrop.digicool.com> <15344.4849.71643.868791@monster.linux.in> <15344.5763.557118.787529@walden.zope.com> <15344.19963.181812.444906@anthem.wooz.org> Message-ID: <15344.44419.359241.47222@monster.linux.in> >>>>> "BAW" == Barry A Warsaw writes: JH> I'd rather see the imports be explicit "import root.a.b.c" JH> than "import b.c". Then re-nesting requires all the import JH> statements to be edited. It's more typing and might even JH> require a simple script to do search-and-replace, but it JH> doesn't sound like a prohibitive burden. BAW> Note that applications can achieve the same thing without BAW> editing code by doing sys.path manipulations. Its not the application that I'm concerned about - an application is typically a single/few file(s) and editing them to suit things is certainly not an issue. But editing 100 files inside a package each time the parent changes is nuts. There is another way to get around this by manipulating __path__ inside a sub package. But this leads to the same module being imported several times. This is what I use currently and its evil. :( JH> I expect there is more to the issue than just wanting to avoid JH> some extra typing. A short PEP that describes the specific JH> problems being solved and discussing alternatives would help. BAW> Indeed. We've been here before (perhaps, several "befores" BAW> :). Every time this comes up I get the feeling like there BAW> are easy ways to accomplish what you want if you think of the So do I need to write a PEP? Is there some special formality/format I need to keep in mind? BAW> problem differently, or I'm missing something fundamental BAW> about the problem, and/or the problem has never been BAW> specified identified, or people are trying to solve too many BAW> problems at once. 
BAW> Are the needs of application authors different than library BAW> authors? I would think so. prabhu From barry@zope.com Tue Nov 13 06:01:21 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 13 Nov 2001 01:01:21 -0500 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. References: <3BED5AD8.24350.78407262@localhost> <3BEFE249.371EA5FD@interet.com> <15343.61552.414932.44100@monster.linux.in> <15343.63951.137972.487852@slothrop.digicool.com> <15344.4849.71643.868791@monster.linux.in> <15344.5763.557118.787529@walden.zope.com> <15344.43730.821724.769885@monster.linux.in> Message-ID: <15344.46897.223437.983547@anthem.wooz.org> >>>>> "PR" == Prabhu Ramachandran writes: PR> (1) Re-nesting a package is a pain. What I mean by PR> re-nesting is that say I have a package, A, that is separate PR> (and that has its own sub packages) and now I want it as part PR> of another package, B. Why would you want to do that? Why not just keep them separate top-level packages that cooperate? Or export A's names in B's modules? I think distutils helps out here because it's now easy to install A in a way that B could just use, or add to. FWIW, we knit things together as well, e.g. with StandaloneZODB. It's got a bunch of top-level packages that are treated as a single entity via a figment of CVS's imagination. So what if it installs a bunch of separate top-level package names that aren't all treed under a single package? PR> Lets further suppose that the module which re-nests the PR> package, B, tracks the development of A and keeps their copy PR> updated. Okay. PR> In this case A is developed as a standalone package and B adds PR> something to it that A cannot/refuses to use. Okay. PR> With the current approach B would be forced to modify A every PR> time A changes in some significant way simply because A was PR> re-nested. Yes, this is contrived but such situations do PR> occur. Why does B have to add packages to A's namespace? 
Why can't the B author simply use distutils to ensure that vanilla A is installed, import the bits and pieces of A that you want to expose, overriding what you want to change, and export an interface through B that clients can use instead of A? I.e. through the use of "from foo import bar" and "from foo import bar as baz", you can present whatever public interface you want, through B's namespace, and mimic as much or as little of A's as you want. PR> Its not the application that I'm concerned about - an PR> application is typically a single/few file(s) and editing them PR> to suit things is certainly not an issue. Well, not /all/ applications! JH> I expect there is more to the issue than just wanting to avoid JH> some extra typing. A short PEP that describes the specific JH> problems being solved and discussing alternatives would help. BAW> Indeed. We've been here before (perhaps, several "befores" BAW> :). Every time this comes up I get the feeling like there BAW> are easy ways to accomplish what you want if you think of the PR> So do I need to write a PEP? Is there some special PR> formality/format I need to keep in mind? PEP 1 and PEP 9 are your guidelines to proper PEP form and procedure. BAW> Are the needs of application authors different than library BAW> authors? PR> I would think so. That would be good to outline in your PEP then . -Barry From Prabhu Ramachandran Tue Nov 13 08:38:31 2001 From: Prabhu Ramachandran (Prabhu Ramachandran) Date: Tue, 13 Nov 2001 14:08:31 +0530 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. 
In-Reply-To: <3BF056DA.74724E5C@zope.com> References: <3BED5AD8.24350.78407262@localhost> <3BEFE249.371EA5FD@interet.com> <15343.61552.414932.44100@monster.linux.in> <15343.63951.137972.487852@slothrop.digicool.com> <15344.4849.71643.868791@monster.linux.in> <15344.5763.557118.787529@walden.zope.com> <15344.19963.181812.444906@anthem.wooz.org> <3BF056DA.74724E5C@zope.com> Message-ID: <15344.56327.337652.24737@monster.linux.in> >>>>> "MP" == Michel Pelletier writes: MP> I'm actually sort of interested in more about the idea of MP> 'looking up' for packages that Prabhu mentioned and think it MP> could be very useful. In Zope we call this "acquisition" and MP> we use this pattern many, many times to override general site MP> policies and objects with more specific ones the farther MP> "down" you go in the object hierarchy. This not only gives us MP> a nice customization model, but it also gives us a nice MP> delegation model, i.e., those responsible on high (that's all of MP> you) can dictate what is and is not the standard library MP> "policy" and users below you (that's me and all the other MP> lusers) can specifically override that mandate at a lower level MP> without interfering with other users. Yes, this is a nicer and probably more convincing use/argument than I had proposed. prabhu From Prabhu Ramachandran Tue Nov 13 08:56:45 2001 From: Prabhu Ramachandran (Prabhu Ramachandran) Date: Tue, 13 Nov 2001 14:26:45 +0530 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism.
In-Reply-To: <15344.46897.223437.983547@anthem.wooz.org> References: <3BED5AD8.24350.78407262@localhost> <3BEFE249.371EA5FD@interet.com> <15343.61552.414932.44100@monster.linux.in> <15343.63951.137972.487852@slothrop.digicool.com> <15344.4849.71643.868791@monster.linux.in> <15344.5763.557118.787529@walden.zope.com> <15344.43730.821724.769885@monster.linux.in> <15344.46897.223437.983547@anthem.wooz.org> Message-ID: <15344.57421.641192.998739@monster.linux.in> >>>>> "BAW" == Barry A Warsaw writes: >>>>> "PR" == Prabhu Ramachandran writes: PR> (1) Re-nesting a package is a pain. What I mean by re-nesting [Re-nesting packages contrived example] BAW> Why would you want to do that? Why not just keep them BAW> separate top-level packages that cooperate? Or export A's BAW> names in B's modules? I think distutils helps out here BAW> because it's now easy to install A in a way that B could just BAW> use, or add to. Umm, that was a contrived example so might not be very sensible. For a more realistic one I think I'll pass the question on to Eric. I think Eric did mention his difficulty with SciPy here: http://mail.python.org/pipermail/python-list/2001-November/071794.html BAW> Why does B have to add packages to A's namespace? Why can't BAW> the B author simply use distutils to ensure that vanilla A is BAW> installed, import the bits and pieces of A that you want to BAW> expose, overriding what you want to change, and export an BAW> interface through B that clients can use instead of A? BAW> I.e. through the use of "from foo import bar" and "from foo BAW> import bar as baz", you can present whatever public interface BAW> you want, through B's namespace, and mimic as much or as BAW> little of A's as you want. True, it's possible to do things and work around situations with the current scheme. I guess I need to come up with something that definitively proves my point. Will think about it. Maybe Gordon has a better/more convincing argument?
I think Michel Pelletier also had a different point of view on this. PR> Its not the application that I'm concerned about - an PR> application is typically a single/few file(s) and editing them PR> to suit things is certainly not an issue. BAW> Well, not /all/ applications! Indeed. I guess I caused confusion here. I was talking of my particular application where I ran into problems with re-nesting and too much typing I was referring to that. I certainly don't intend changing every single application when there is no need for that. PR> So do I need to write a PEP? Is there some special PR> formality/format I need to keep in mind? BAW> PEP 1 and PEP 9 are your guidelines to proper PEP form and BAW> procedure. Thanks. Will look at them. prabhu From skip@pobox.com (Skip Montanaro) Tue Nov 13 09:26:19 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 13 Nov 2001 10:26:19 +0100 Subject: [Python-Dev] PEP 275: "Switching on Multiple Values", Rev 1.1 In-Reply-To: <3BEF9410.789D804@lemburg.com> References: <3BEF9410.789D804@lemburg.com> Message-ID: <15344.59195.231638.813325@beluga.mojam.com> mal> Syntax: mal> switch EXPR: mal> case CONSTANT: mal> SUITE mal> case CONSTANT: mal> SUITE mal> ... mal> else: mal> SUITE mal> (modulo indentation variations) mal> The "else" part is optional. If no else part is given and none mal> of the defined cases matches, a ValueError is raised. Hmmm... This doesn't jive well with current if statement semantics. I can write if x == "first": dofirst() and no ValueError is raised if x == "second". Why should switch be any different? Skip From mal@lemburg.com Tue Nov 13 09:33:46 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 13 Nov 2001 10:33:46 +0100 Subject: [Python-Dev] Replacing __slots__ with addmembers() References: Message-ID: <3BF0E8FA.65FACA5C@lemburg.com> Tim Peters wrote: > > [Michael McLay] > > I hope Python Labs doesn't rush Python 2.2 out early just to meet an > > artificial schedule. 
> > Given the effort that goes into making them up, there's nothing artificial > about them . > > > MAL's concerns about cluttering Python with several variations for > > spelling the same kind of feature is shared by me. > > It seemed to me MAL was worried about something else, namely not having to > redo the guts of his provisions for user subclassing of sundry mx objects > (coded in C). Right and as far as I understood your posting, there's nothing to worry about (which is good :-). OTOH, I would also like to use some of the new features for new code I write in Python and there things still look a lot less stable, which I find unfortunate. May be FUD and indeed I hope it is... Perhaps Guido should just write up a short list of features which he considers stable and another list of features which may still change in future Python releases. I suppose .__slots__ is one of the latter. Adding static class methods and the like probably too. While #ifdefs are easy in C, they're a nightmare in Python and I wouldn't want to start coding in an interface-driver pattern just to make Python code portable between releases (hey, I still support Python 1.5.2 for much of my stuff...). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Tue Nov 13 09:52:11 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 13 Nov 2001 10:52:11 +0100 Subject: [Python-Dev] Replacing __slots__ with addmembers() References: Message-ID: <3BF0ED4B.163CF4FF@lemburg.com> Tim Peters wrote: > > [M.-A. Lemburg] > > I'd suggest that Guido marks those features he considers stable > > as such and clearly states which other features should still > > be condsidered experimental and not for production use. 
> > > > I intend to make some of the mx-datatypes subclassable but would > > want to have to support n different ways of implementing the details > > (I'll already have to support two different ways: classic and > > new style... wouldn't want to do classic, new style version 2.2, > > new style version 2.3, etc.) > > The clearest statement to date is probably here: > > http://aspn.activestate.com/ASPN/Mail/Message/827383 > > Note that it ends with: > > We should document this more clearly and in more detail. > > And we should. With "we" I presume you mean the Python Labs Team ;-) You seem to have more insight into the workings behind this than anyone else. > The internal details are pretty stable now; *perhaps* they'll need to be > rearranged to cater to things that can't be done at all now, but that's true > of every part of the language (albeit especially true of large new > features). Good to know. Now at least I can start turning mxDateTime into a new style class :-) > __slots__ et alia are shallow spelling details, where "shallow" doesn't mean > unimportant but that changing the spelling has scant effect on the > internals -- note how Michael bragged about how short and unintrusive his > __slot__-respelling patch was <0.9 wink>. Adding new ways to spell things > shouldn't have any effect on how you need to implement your extension types. > Note that we've already gone thru the exercise of making all the basic > builtin types subclassable, so have some confidence that the subclassing > APIs are both usable and solid. They've also been stable (e.g., I can't > think of any change to them between the last alpha and current CVS). True, but I'm also thinking about writing new code in Python which uses these features and there I don't see the stability of the API just yet (but would really like them to stabilize *before* 2.2 moves out the door and even if this means waiting until after Christmas ;-). 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Tue Nov 13 09:45:19 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 13 Nov 2001 10:45:19 +0100 Subject: [Python-Dev] switch in Python. References: <20011113011418.C466@xs4all.nl> Message-ID: <3BF0EBAF.5E42BFF5@lemburg.com> Thomas Wouters wrote: > > I uploaded a proof-of-concept of the switch statement to SourceForge: > > https://sourceforge.net/tracker/index.php?func=detail&aid=481118&group_id=5470&atid=305470 > > It's a diff against the current CVS tree, taken from 'src/' -- so apply in > src/ with -p0. Cool. I'll add the pointer to the PEP 275. > It's a quick and almost certainly buggy hack. It bypasses the c_consts_dict > in an ugly way, and might break code that tries to do stuff with code > objects. It probably doesn't keep track of the stack properly, and skips > SET_LINENO opcodes. It only allows string or numerical literals, though it > does get the '1000 == 1000.0 == 1000L' trick for free. It doesn't contain > changes to docs or dis.py, and it doesn't use a future statement. It _does_ > contain a change to distutils, because it uses a variable 'switch' :-) And I > haven't profiled it in any way. In my defense, I don't recall writing it, so > I was probably sleeping at the time (or at least I should have been.) One question: What happens if the implementation finds that x is not hashable ? > Anyway, here's how it works: > > """ > def whatis(x): > switch x: > case "number one": > return "The Larch." > case "spam", "Spam", "SPAM": > return "It's " + "SPAM "*31 > case 1.1: > return "one dot one" > case 1000L: > return "a lot." > case 1j: > return "Not *really* there." > else: > return "D'oh, donno that!" 
> > for val in (1j, "number one", 1.1, 1000, 1000L, 1000.0, 1000.0005, > "SPAM", "spam", "spaM", "eggs and ham"): > print "What is", repr(val), "?" > print whatis(val) > print > """ > > thomas@stalker:~/python-cvs/dist/src$ ./python example.py > > What is 1j ? > Not *really* there. > > What is 'number one' ? > The Larch. > > What is 1.1000000000000001 ? > one dot one > > What is 1000 ? > a lot. > > What is 1000L ? > a lot. > > What is 1000.0 ? > a lot. > > What is 1000.0005 ? > D'oh, donno that! > > What is 'SPAM' ? > It's SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM > SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM > SPAM SPAM > What is 'spam' ? > It's SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM > SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM > SPAM SPAM > What is 'spaM' ? > D'oh, donno that! > > What is 'eggs and ham' ? > D'oh, donno that! Very nice indeed :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From thomas@xs4all.net Tue Nov 13 10:05:10 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Tue, 13 Nov 2001 11:05:10 +0100 Subject: [Python-Dev] switch in Python. In-Reply-To: <3BF0EBAF.5E42BFF5@lemburg.com> References: <20011113011418.C466@xs4all.nl> <3BF0EBAF.5E42BFF5@lemburg.com> Message-ID: <20011113110510.N4462@xs4all.nl> On Tue, Nov 13, 2001 at 10:45:19AM +0100, M.-A. Lemburg wrote: > One question: What happens if the implementation finds that x is > not hashable ? It _should_ propagate the error upwards, but I see now that the error gets eaten for some reason. 
The runtime part of switch is very small: case SWITCH: w = POP(); v = POP(); x = PyDict_GetItem(w, v); if (!x && PyErr_Occurred()) break; if (!x) { JUMPBY(oparg); continue; } JUMPBY(PyInt_AsLong(x)); continue; That is, PyDict_GetItem() is supposed to return NULL for missing keys, but without setting an exception. So if an exception is set, it's a regular error in the dict lookup, and the error is simply propagated upwards. Oh, wait, I think I see the problem: stack underflow. The value reaching the switch is not the one put in, when passing (e.g.) a list to 'whatis'. No time to look at this now though. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mal@lemburg.com Tue Nov 13 10:08:07 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 13 Nov 2001 11:08:07 +0100 Subject: [Python-Dev] PEP 275: "Switching on Multiple Values", Rev 1.1 References: <3BEF9410.789D804@lemburg.com> <15344.59195.231638.813325@beluga.mojam.com> Message-ID: <3BF0F107.E891A27B@lemburg.com> Skip Montanaro wrote: > > mal> Syntax: > > mal> switch EXPR: > mal> case CONSTANT: > mal> SUITE > mal> case CONSTANT: > mal> SUITE > mal> ... > mal> else: > mal> SUITE > > mal> (modulo indentation variations) > > mal> The "else" part is optional. If no else part is given and none > mal> of the defined cases matches, a ValueError is raised. > > Hmmm... This doesn't jive well with current if statement semantics. I can > write > > if x == "first": > dofirst() > > and no ValueError is raised if x == "second". Why should switch be any > different? Hmm, you may have a point there. If the programmer wants an exception to be raised in case none of the values matches, she can put that code into the else-clause... I'll have to think about this some more, but it seems that you're right. 
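As an editorial illustration of the semantics discussed in this thread, the following is a minimal sketch in present-day Python, not the patch itself. It assumes the behaviour the posters converge on: a dict lookup selects the suite, a missing else clause acts like "else: pass" instead of raising ValueError, and an unhashable value propagates the TypeError from the lookup. The names switch, cases and default are hypothetical helpers invented for this sketch.

```python
# Illustrative sketch only: emulates the proposed switch with a dict lookup.
# `switch`, `cases`, and `default` are hypothetical names, not part of any patch.

def switch(value, cases, default=None):
    """Dispatch on `value` via a jump table (a dict of constant -> handler).

    A missing key falls through to `default`, or to a no-op when no default
    is given, mirroring how a missing else clause behaves elsewhere in Python.
    An unhashable `value` raises TypeError from the dict lookup itself.
    """
    handler = cases.get(value, default)
    if handler is None:
        return None  # no matching case and no else clause: do nothing
    return handler()


def whatis(x):
    cases = {
        "number one": lambda: "The Larch.",
        1.1: lambda: "one dot one",
        1000: lambda: "a lot.",
        1j: lambda: "Not *really* there.",
    }
    # Several constants sharing one suite, as in: case "spam", "Spam", "SPAM":
    for key in ("spam", "Spam", "SPAM"):
        cases[key] = lambda: "It's " + "SPAM " * 31
    return switch(x, cases, default=lambda: "D'oh, donno that!")
```

Because equal numbers hash alike, whatis(1000) and whatis(1000.0) select the same suite, which is the "1000 == 1000.0" effect mentioned for the proof-of-concept. A real implementation would build the jump table once at compile time instead of on every call.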
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From paul-python@svensson.org Tue Nov 13 12:52:42 2001 From: paul-python@svensson.org (Paul Svensson) Date: Tue, 13 Nov 2001 07:52:42 -0500 (EST) Subject: [Python-Dev] PEP 275: "Switching on Multiple Values", Rev 1.1 In-Reply-To: <15344.59195.231638.813325@beluga.mojam.com> Message-ID: On Tue, 13 Nov 2001, Skip Montanaro wrote: > > mal> Syntax: > > mal> switch EXPR: > mal> case CONSTANT: > mal> SUITE > mal> case CONSTANT: > mal> SUITE > mal> ... > mal> else: > mal> SUITE > > mal> (modulo indentation variations) > > mal> The "else" part is optional. If no else part is given and none > mal> of the defined cases matches, a ValueError is raised. > >Hmmm... This doesn't jive well with current if statement semantics. I can >write > > if x == "first": > dofirst() > >and no ValueError is raised if x == "second". Why should switch be any >different? A switch is a different beast, and should be considered afresh, and not just as syntactic sugar for a (restricted) if-elif-else list. However, in all other places Python allows an else clause, a missing one is treated as ``else: pass'', and I don't see any compelling reason why a switch should behave otherwise. /Paul From jack@oratrix.nl Tue Nov 13 14:32:55 2001 From: jack@oratrix.nl (Jack Jansen) Date: Tue, 13 Nov 2001 15:32:55 +0100 Subject: [Python-Dev] PEP 275: "Switching on Multiple Values", Rev 1.1 In-Reply-To: Message by Paul Svensson , Tue, 13 Nov 2001 07:52:42 -0500 (EST) , Message-ID: <20011113143255.E6B02303183@snelboot.oratrix.nl> Even though I'm not sure I like the switch idea (and I won't even contemplate how Guido will react when he comes back and sees what we've been spending our time on:-) there's one very special case of switch that I would like, and that's the Algol 68 style switch on type. 
If we had something like def foo(x): switch type(x): case int: do something case string: do something else this would be a nice point to hook into for something that tries to compile Python to C or somesuch. Hmm, you would probably need a tuple-based switch as well: switch type(x), type(y): case int, int: .... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From arigo@ulb.ac.be Tue Nov 13 14:58:07 2001 From: arigo@ulb.ac.be (Armin Rigo) Date: Tue, 13 Nov 2001 15:58:07 +0100 (MET) Subject: [Python-Dev] Re: psyco In-Reply-To: Message-ID: Hello everybody, > 2X on pystone is nothing to snort at -- it's good! Thanks, but I believe that much better results can be obtained. However, I have a problem here. I am starting to run out of free time for Psyco. I would like to put efforts in helping people understand how Psyco works, so that it can be developed up to a really usable state -- it is currently more of a proof-of-concept, but it would be just too bad to lose this opportunity, I guess. Besides, I believe that with some more work we could really turn the current version into something great. That's just too much for me alone now (I should be doing a math Ph.D. these days you know ;-). So my request is, Does anyone feel like investing some time in Psyco just on the basis of the current preliminary results ? Shall I write a technical in-depth overview ? A bientot, Armin. From barry@zope.com Tue Nov 13 14:59:14 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 13 Nov 2001 09:59:14 -0500 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism.
References: <3BED5AD8.24350.78407262@localhost> <3BEFE249.371EA5FD@interet.com> <15343.61552.414932.44100@monster.linux.in> <15343.63951.137972.487852@slothrop.digicool.com> <15344.4849.71643.868791@monster.linux.in> <15344.5763.557118.787529@walden.zope.com> <15344.43730.821724.769885@monster.linux.in> <15344.46897.223437.983547@anthem.wooz.org> <15344.57421.641192.998739@monster.linux.in> Message-ID: <15345.13634.801472.695445@anthem.wooz.org> >>>>> "PR" == Prabhu Ramachandran writes: PR> Umm, that was a contrived example so might not be very PR> sensible. For a more realistic one I think I'll pass the PR> question on to Eric. I think Eric did mention his difficulty PR> with SciPy here: PR> http://mail.python.org/pipermail/python-list/2001-November/071794.html I'm not wholly unsympathetic to the problem, but I'm trying to give some pushback because of just the reason Michel gives. Even though I can think of my own cool uses of an "acquisitional import", I think Python should be really careful here. One of the deep problems with implicit acquisition is that you often don't know where something really comes from. I'm worried that building this into the import mechanism will make it harder to figure out where something comes from. Explicit is better than implicit. But having said all that, it's clear to me that we've mined this vein enough on the list. It's PEP time! -Barry From Prabhu Ramachandran Tue Nov 13 18:07:24 2001 From: Prabhu Ramachandran (Prabhu Ramachandran) Date: Tue, 13 Nov 2001 23:37:24 +0530 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. 
In-Reply-To: <15345.13634.801472.695445@anthem.wooz.org> References: <3BED5AD8.24350.78407262@localhost> <3BEFE249.371EA5FD@interet.com> <15343.61552.414932.44100@monster.linux.in> <15343.63951.137972.487852@slothrop.digicool.com> <15344.4849.71643.868791@monster.linux.in> <15344.5763.557118.787529@walden.zope.com> <15344.43730.821724.769885@monster.linux.in> <15344.46897.223437.983547@anthem.wooz.org> <15344.57421.641192.998739@monster.linux.in> <15345.13634.801472.695445@anthem.wooz.org> Message-ID: <15345.24924.396195.958560@monster.linux.in> >>>>> "BAW" == Barry A Warsaw writes: [on the suggested package import improvements] BAW> I'm not wholly unsympathetic to the problem, but I'm trying BAW> to give some pushback because of just the reason Michel BAW> gives. Even though I can think of my own cool uses of an BAW> "acquisitional import", I think Python should be really BAW> careful here. One of the deep problems with implicit BAW> acquisition is that you often don't know where something BAW> really comes from. I'm worried that building this into the BAW> import mechanism will make it harder to figure out where BAW> something comes from. Explicit is better than implicit. BAW> But having said all that, it's clear to me that we've mined BAW> this vein enough on the list. It's PEP time! Great! I'm very happy and thankful that you folks have been so patient with me and are listening to this stuff. I have a few other things to wind up this week, so I'll try starting to write the PEP this weekend. I'll first run it by Eric and maybe pester Gordon a bit before I pass it on to the experts. Thanks again! prabhu From mal@lemburg.com Tue Nov 13 18:44:43 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Tue, 13 Nov 2001 19:44:43 +0100 Subject: [Python-Dev] PEP 275: "Switching on Multiple Values", Rev 1.1 References: <20011113143255.E6B02303183@snelboot.oratrix.nl> Message-ID: <3BF16A1B.38249F48@lemburg.com> Jack Jansen wrote: > > Even though I'm not sure I like the switch idea (and I won't even contemplate > how Guido will react when he comes back and sees what we've been spending our > time on:-) there's one very special case of switch that I would like, and > that's the Algol 68 style switch on type. If we had something like > def foo(x): > switch type(x): > case int: > do something > case string: > do something else > this would be a nice point to hook into for something that tries to compile > Python to C or somesuch. > > Hmm, you would probably need a tuple-based switch as well: > switch type(x), type(y): > case int, int: > .... All this should be possible with the proposed extensions since type objects are immutable and hashable. I'll add your example to the PEP. Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From jim@interet.com Tue Nov 13 19:05:11 2001 From: jim@interet.com (James C. Ahlstrom) Date: Tue, 13 Nov 2001 14:05:11 -0500 Subject: [Python-Dev] Caching directory files in import.c References: <3BE30079.D6A8FB52@interet.com> Message-ID: <3BF16EE7.4D130A17@interet.com> I now have benchmarks for the zip import code. The network drive is faster than the local drive when using zip! I plan to update the PEP to reflect the changes.

Case  Original               Using os.listdir       Zip Uncomp.  Zip Compr.
----  --------------------   ---------------------  ----------   ----------
1     3.2, 2.5, 3.2 -> 1.02  2.3, 2.5, 2.3 -> 0.87  1.66->0.93   1.5->1.07
2     2.8, 3.9, 3.0 -> 1.32  Same as case 1.
3     5.7, 5.7, 5.7 -> 5.7   2.1, 2.1, 2.1 -> 1.8   1.25->0.99   1.19->1.13
4     9.4, 9.4, 9.3 -> 9.35  Same as case 3.

Case 1: Local drive C:, sys.path has its default value. Case 2: Local drive C:, move the correct directory to the end of sys.path. Case 3: Network drive, sys.path has its default value. Case 4: Network drive, move the correct directory to the end of sys.path. Benchmarks were performed on a Pentium 4 clone, 1.4 GHz, 256 Meg. The machine was running Windows 2000 with a Linux/Samba network server. Times are in seconds, and are the time to import about 100 modules from Lib. "Uncomp" means uncompressed zip archive, "Compr" means compressed. The Python version is 2.2a3. Initial times are after a re-boot of the system; the time after "->" is the time after repeated runs. Times to import from C: after a re-boot are rather highly variable for the "Original" case, but are more realistic. JimA From thomas@xs4all.net Tue Nov 13 19:16:56 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Tue, 13 Nov 2001 20:16:56 +0100 Subject: [Python-Dev] PEP 275: "Switching on Multiple Values", Rev 1.1 In-Reply-To: <20011113143255.E6B02303183@snelboot.oratrix.nl> References: <20011113143255.E6B02303183@snelboot.oratrix.nl> Message-ID: <20011113201656.D466@xs4all.nl> On Tue, Nov 13, 2001 at 03:32:55PM +0100, Jack Jansen wrote: > Even though I'm not sure I like the switch idea (and I won't even contemplate > how Guido will react when he comes back and sees what we've been spending our > time on:-) there's one very special case of switch that I would like, and > that's the Algol 68 style switch on type. If we had something like > def foo(x): > switch type(x): > case int: > do something > case string: > do something else > this would be a nice point to hook into for something that tries to compile > Python to C or somesuch. Unfortunately, type-names/objects aren't compile-time constants, so we can't implement this without some kind of namespace-modification-notification technique. Hmm... Or perhaps we could do the normal lookup, compare the then-current 'int' vs. 
the one we looked up, and if they aren't equal re-initialize the jump dict.... But *shudder*. > Hmm, you would probably need a tuple-based switch as well: > switch type(x), type(y): > case int, int: > .... I think you mean 'case (int, int):' there. Constant-tuples aren't really a problem to implement, though it would require either a lot of code duplication or a bit of refactoring, which is why my proof-of-concept doesn't offer them. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From marpet@linuxpl.org Tue Nov 13 20:16:09 2001 From: marpet@linuxpl.org (marpet@linuxpl.org) Date: Tue, 13 Nov 2001 21:16:09 +0100 Subject: [Python-Dev] Re: psyco In-Reply-To: References: Message-ID: <20011113211609.452195c0.marpet@linuxpl.org> On Tue, 13 Nov 2001 15:58:07 +0100 (MET) Armin Rigo wrote: > Hello everybody, > > > 2X on pystone is nothing to snort at -- it's good! > > Thanks, but I believe that much better results can be obtained. However, I > have a problem here. I am starting to run out of free time for Psyco. > > I would like to put efforts in helping people understand how Psyco works, > so that it can be developed up to a really usable state -- it is > currently more of a proof-of-concept, but it would be just too bad to > lose this opportunity, I guess. Besides, I believe that with some more > work we could really turn the current version into something great. That's > just too much for me alone now (I should be doing a math Ph.D. these days > you know ;-). So my request is, Does anyone feel like investing some time > in Psyco just on the basis of the current preliminary results ? Shall I > write a technical in-depth overview ?
Of course, it will always be worthwhile, although I am not the one to take the responsibility of dragging it along (lack of time/experience/guts/whatever). I will be very happy to see it in usable state ;-) regards -- Marek Pętlicki Linux User ID=162988 From James_Althoff@i2.com Tue Nov 13 21:31:21 2001 From: James_Althoff@i2.com (James_Althoff@i2.com) Date: Tue, 13 Nov 2001 13:31:21 -0800 Subject: [Python-Dev] PEP 276 Simple Iterator for ints Message-ID: All, Below is the first draft of PEP 276 "Simple Iterator for ints". (Available at http://python.sourceforge.net/peps/pep-0276.html) Feel free to comment (positive, negative, bipolar, or in between ;-). Please copy me on any comments that you think should be incorporated into the next revision (so that I don't unintentionally miss something on python-list). Thanks, Jim ===================================== PEP: 276 Title: Simple Iterator for ints Version: $Revision: 1.1 $ Last-Modified: $Date: 2001/11/13 20:52:37 $ Author: james_althoff@i2.com (Jim Althoff) Status: Draft Type: Standards Track Created: 12-Nov-2001 Python-Version: 2.3 Post-History: Abstract Python 2.2 added new functionality to support iterators[1]. Iterators have proven to be useful and convenient in many coding situations. It is noted that the implementation of Python's for-loop control structure uses the iterator protocol as of release 2.2. It is also noted that Python provides iterators for the following builtin types: lists, tuples, dictionaries, strings, and files. This PEP proposes the addition of an iterator for the builtin type int (types.IntType). Such an iterator would simplify the coding of certain for-loops in Python. Specification Define an iterator for types.IntType (i.e., the builtin type "int") that is returned from the builtin function "iter" when called with an instance of types.IntType as the argument.
    The returned iterator has the following behavior:

    - Assume that object i is an instance of types.IntType (the
      builtin type int) and that i > 0

    - iter(i) returns an iterator object

    - said iterator object iterates through the sequence of ints
      0,1,2,...,i-1

      Example:

          iter(5) returns an iterator object that iterates through the
          sequence of ints 0,1,2,3,4

    - if i <= 0, iter(i) returns an "empty" iterator, i.e., one that
      raises StopIteration upon the first call of its "next" method

    In other words, the conditions and semantics of said iterator are
    consistent with the conditions and semantics of the range() and
    xrange() functions.

    Note that the sequence 0,1,2,...,i-1 associated with the int i is
    considered "natural" in the context of Python programming because
    it is consistent with the builtin indexing protocol of sequences
    in Python. Python lists and tuples, for example, are indexed
    starting at 0 and ending at len(object)-1 (when using positive
    indices). In other words, such objects are indexed with the
    sequence 0,1,2,...,len(object)-1

Rationale

    A common programming idiom is to take a collection of objects and
    apply some operation to each item in the collection in some
    established sequential order. Python provides the "for in"
    looping control structure for handling this common idiom. Cases
    arise, however, where it is necessary (or more convenient) to
    access each item in an "indexed" collection by iterating through
    each index and accessing each item in the collection using the
    corresponding index.

    For example, one might have a two-dimensional "table" object where
    one requires the application of some operation to the first column
    of each row in the table. Depending on the implementation of the
    table it might not be possible to access first each row and then
    each column as individual objects. It might, rather, be possible
    to access a cell in the table using a row index and a column
    index.
In such a case it is necessary to use an idiom where one iterates through a sequence of indices (indexes) in order to access the desired items in the table. (Note that the commonly used DefaultTableModel class in Java-Swing-Jython has this very protocol). Another common example is where one needs to process two or more collections in parallel. Another example is where one needs to access, say, every second item in a collection. There are many other examples where access to items in a collection is facilitated by a computation on an index thus necessitating access to the indices rather than direct access to the items themselves. Let's call this idiom the "indexed for-loop" idiom. Some programming languages provide builtin syntax for handling this idiom. In Python the common convention for implementing the indexed for-loop idiom is to use the builtin range() or xrange() function to generate a sequence of indices as in, for example: for rowcount in range(table.getRowCount()): print table.getValueAt(rowcount, 0) or for rowcount in xrange(table.getRowCount()): print table.getValueAt(rowcount, 0) From time to time there are discussions in the Python community about the indexed for-loop idiom. It is sometimes argued that the need for using the range() or xrange() function for this design idiom is: - Not obvious (to new-to-Python programmers), - Error prone (easy to forget, even for experienced Python programmers) - Confusing and distracting for those who feel compelled to understand the differences and recommended usage of xrange() vis-a-vis range() - Unwieldy, especially when combined with the len() function, i.e., xrange(len(sequence)) - Not as convenient as equivalent mechanisms in other languages, - Annoying, a "wart", etc. And from time to time proposals are put forth for ways in which Python could provide a better mechanism for this idiom. Recent examples include PEP 204, "Range Literals", and PEP 212, "Loop Counter Iteration". 
    Most often, such proposals include changes to Python's syntax and
    other "heavyweight" changes.

    Part of the difficulty here is that advocating new syntax implies
    a comprehensive solution for "general indexing" that has to
    include aspects like:

    - starting index value
    - ending index value
    - step value
    - open intervals versus closed intervals versus half-open
      intervals

    Finding a new syntax that is comprehensive, simple, general,
    Pythonic, appealing to many, easy to implement, not in conflict
    with existing structures, not excessively overloading existing
    structures, etc. has proven to be more difficult than one might
    anticipate.

    The proposal outlined in this PEP tries to address the problem by
    suggesting a simple "lightweight" solution that helps the most
    common case by using a proven mechanism that is already available
    (as of Python 2.1): namely, iterators.

    Because for-loops already use the iterator protocol as of Python
    2.1, adding an iterator for types.IntType as proposed in this PEP
    would enable by default the following shortcut for the indexed
    for-loop idiom:

        for rowcount in table.getRowCount():
            print table.getValueAt(rowcount, 0)

    The following benefits are claimed for this approach vis-a-vis
    the current mechanism of using the range() or xrange() functions:

    - Simpler,
    - Less cluttered,
    - Focuses on the problem at hand without the need to resort to
      secondary implementation-oriented functions (range() and
      xrange())

    And compared to other proposals for change:

    - Requires no new syntax
    - Requires no new keywords
    - Takes advantage of the new and well-established iterator
      mechanism

    And generally:

    - Is consistent with iterator-based "convenience" changes already
      included (as of Python 2.1) for other builtin types such as:
      lists, tuples, dictionaries, strings, and files.

    Preliminary discussion on the Python interest mailing list
    suggests a reasonable amount of initial support for this PEP
    (along with some dissents/issues noted below).
Backwards Compatibility The proposed mechanism is generally backwards compatible as it calls for neither new syntax nor new keywords. All existing, valid Python programs should continue to work unmodified. However, this proposal is not perfectly backwards compatible in the sense that certain statements that are currently invalid would, under the current proposal, become valid. Tim Peters has pointed out two such examples: 1) The common case where one forgets to include range() or xrange(), for example: for rowcount in table.getRowCount(): print table.getValueAt(rowcount, 0) in Python 2.2 raises a TypeError exception. Under the current proposal, the above statement would be valid and would work as (presumably) intended. Presumably, this is a good thing. As noted by Tim, this is the common case of the "forgotten range" mistake (which one currently corrects by adding a call to range() or xrange()). 2) The (hopefully) very uncommon case where one makes a typing mistake when using tuple unpacking. For example: x, = 1 in Python 2.2 raises a TypeError exception. Under the current proposal, the above statement would be valid and would set x to 0. The PEP author has no data as to how common this typing error is nor how difficult it would be to catch such an error under the current proposal. He imagines that it does not occur frequently and that it would be relatively easy to correct should it happen. Issues: Based on some preliminary discussion on the Python interest mailing list, the following concerns have been voiced: - Is it obvious that iter(5) maps to the sequence 0,1,2,3,4? Response: Given, as noted above, that Python has a strong convention for indexing sequences starting at 0 and stopping at (inclusively) the index whose value is one less than the length of the sequence, it is argued that the proposed sequence is reasonably intuitive to a Python programmer while being useful and practical. 
- "in" (as in "for i in x") does not match standard English usage in this case. "up to" or something similar might be better. Response: Not everyone felt that matching standard English perfectly is a requirement. It is noted that "for:else:" doesn't match standard English very well either. And few are excited about adding a new keyword, especially just to get a somewhat better match to standard English usage. - Possible ambiguity for i in 10: print i might be mistaken for for i in (10,): print i Response: The predicted ambiguity was not readily apparent to several of the posters. - It would be better to add special new syntax such as: for i in 0..10: print i Response: There are other PEPs that take this approach[2][3]. - It would be better to reuse the ellipsis literal syntax (...) Response: Shares disadvantages of other proposals that require changes to the syntax. Needs more design to determine how it would handle the general case of start,stop,step, open/closed/half-closed intervals, etc. Needs a PEP. - It would be better to reuse the slicing literal syntax attached to the int class, e.g., int[0:10] Response: Same as previous response. In addition, design consideration needs to be given to what it would mean if one uses slicing syntax after some arbitrary class other than class int. Needs a PEP. - Might dissuade newbies from using the indexed for-loop idiom when the standard "for item in collection:" idiom is clearly better. Response: The standard idiom is so nice when "it fits" that it needs neither extra "carrot" nor "stick". On the other hand, one does notice cases of overuse/misuse of the standard idiom (due, most likely, to the awkwardness of the indexed for-loop idiom), as in: for item in sequence: print sequence.index(item) - Doesn't handle the general case of start,stop,step Response: use the existing range() or xrange() mechanisms. Or, see below. 
Extension If one wants to handle general indexing (start,stop,step) without having to resort to using the range() or xrange() functions then the following could be incorporated into the current proposal. Add an "iter" method (or use some other preferred name) to types.IntType with the following signature: def iter(start=0, step=1): This method would have the (hopefully) obvious semantics. Then one could do, for example: x = 100 for i in x.iter(start=1, step=2): print i Under this extension (for x bound to an int), for i in x: would be equivalent to for i in x.iter(): and to for i in x.iter(start=0, step=1): This extension is consistent with the generalization provided by the current mechanism for dictionaries whereby one can use: for k in d.iterkeys(): for v in d.itervalues(): for k,v in d.iteritems(): depending on one's needs, given that for i in d: has a meaning aimed at the most common and useful case (d.iterkeys()). Implementation An implementation is not available at this time and although the author is not qualified to comment on such he will, nonetheless, speculate that this might be straightforward and, hopefully, might consist of little more than setting the tp_iter slot in types.IntType to point to a simple iterator function that would be similar to -- or perhaps even a wrapper around -- the xrange() function. References [1] PEP 234, Iterators http://python.sourceforge.net/peps/pep-0234.html [2] PEP 204, Range Literals http://python.sourceforge.net/peps/pep-0204.html [3] PEP 212, Loop Counter Iteration http://python.sourceforge.net/peps/pep-0212.html Copyright This document has been placed in the public domain. 
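
The iterator semantics described in the PEP's Specification and
Extension sections above can be modelled in a few lines of present-day
Python. This is only a sketch under stated assumptions: the helper name
int_iter is hypothetical (the PEP proposes changing iter() and the int
type themselves, which cannot be done from pure Python), and reading
x.iter(start, step) as "count from start up to, but not including, x"
is an interpretation of the PEP's "(hopefully) obvious semantics", not
something the draft spells out.

```python
def int_iter(i, start=0, step=1):
    """Hypothetical stand-in for the proposed iter(i) / i.iter(start, step).

    Yields start, start+step, ... for as long as the value stays below i,
    which is the same contract as range(start, i, step).
    """
    return iter(range(start, i, step))


# The basic case from the Specification: iter(5) -> 0,1,2,3,4
first_five = list(int_iter(5))

# i <= 0 gives an "empty" iterator, exactly like range()
empty_zero = list(int_iter(0))
empty_negative = list(int_iter(-3))

# The Extension section's example shape: x.iter(start=..., step=...)
odd_tail = list(int_iter(100, start=95, step=2))

# The backwards-compatibility example "x, = 1": with an int iterator,
# tuple unpacking of 1 would silently bind x to 0.
x, = int_iter(1)
```

Seen this way, the proposal adds no new behaviour beyond range(); it
only changes where the call is written, which is also why the
"x, = 1" case stops being an error.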
From thomas@xs4all.net Wed Nov 14 11:03:32 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Wed, 14 Nov 2001 12:03:32 +0100 Subject: [Python-Dev] PyDict_GetItem & dict_subscript Message-ID: <20011114120332.E466@xs4all.nl> While trying to figure out why passing an unhashable object to switch didn't cause a TypeError: unhashable object, I found out PyDict_GetItem and dict_subscript behave differently in that respect. PyDict_GetItem does: hash = PyObject_Hash(key); if (hash == -1) { PyErr_Clear(); return NULL; } whereas dict_subscript does: hash = PyObject_Hash(key); if (hash == -1) return NULL; PyDict_SetItem and PyDict_DelItem both behave like dict_subscript (that is, they propagate the error upwards, instead of clearing the error.) Is this intentional ? Why ? :P And what is the 'right' way to do a PyDict_GetItem() that gives the proper error-message without relying on the need for that particular dict (which could be a subtype that has more or less restrictions) to have hashable keys ? -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From thomas.heller@ion-tof.com Wed Nov 14 11:20:18 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 14 Nov 2001 12:20:18 +0100 Subject: [Python-Dev] PyDict_GetItem & dict_subscript References: <20011114120332.E466@xs4all.nl> Message-ID: <0dd501c16cfe$69ccd890$e000a8c0@thomasnotebook> From: "Thomas Wouters" > > While trying to figure out why passing an unhashable object to switch didn't > cause a TypeError: unhashable object, I found out PyDict_GetItem and > dict_subscript behave differently in that respect. PyDict_GetItem does: > > hash = PyObject_Hash(key); > if (hash == -1) { > PyErr_Clear(); > return NULL; > } > > whereas dict_subscript does: > > hash = PyObject_Hash(key); > if (hash == -1) > return NULL; > > PyDict_SetItem and PyDict_DelItem both behave like dict_subscript (that is, > they propagate the error upwards, instead of clearing the error.) 
Is this
> intentional ?

It seems so, at least it's documented.

> Why ? :P

Probably because retrieving a value from a dictionary, and trying
something else (after clearing the error) if it fails is so common...

> And what is the 'right' way to do a PyDict_GetItem()
> that gives the proper error-message without relying on the need for that
> particular dict (which could be a subtype that has more or less
> restrictions) to have hashable keys ?

I'd use PyObject_GetItem(), but...

Thomas

From mwh@python.net Wed Nov 14 11:35:39 2001
From: mwh@python.net (Michael Hudson)
Date: 14 Nov 2001 06:35:39 -0500
Subject: [Python-Dev] Re: psyco
In-Reply-To: Armin Rigo's message of "Tue, 13 Nov 2001 15:58:07 +0100 (MET)"
References:
Message-ID: <2mofm5bn5g.fsf@starship.python.net>

[python.net mail unblocks again, sigh]

Armin Rigo writes:

> Hello everybody,
>
> > 2X on pystone is nothing to snort at -- it's good!
>
> Thanks, but I believe that much better results can be
> obtained. However, I have a problem here. I am starting to run out
> of free time for Psyco.

Shame!

> I would like to put efforts in helping people understand how Psyco
> works, so that it can be developed up to a really usable state --
> it is currently more of a proof-of-concept, but it would be just too
> bad to lose this opportunity, I guess. Besides, I believe that with
> some more work we could really turn the current version into
> something great.

I've been meaning to take a good look at Psyco since you first
announced it, but have never found the time/inspiration needed. It
would be a shame if it were to die.

> That's just too much for me alone now (I should be doing a math
> Ph.D. these days you know ;-).

Same here! ...googles... logic, eh? Nutter . Out of curiosity, do
you know people at vub? I was at a conference with Michel van den
Bergh and Tor Lowen in September... (small world).
> So my request is, Does anyone feel like investing some time in Psyco > just on the basis of the current preliminary results ? Shall I > write a technical in-depth overview ? I you can find the time to write it, I'll do my best to find the time to read it properly. What else is the first year of a PhD for? Cheers, M. -- [Perl] combines all the worst aspects of C and Lisp: a billion different sublanguages in one monolithic executable. It combines the power of C with the readability of PostScript. -- Jamie Zawinski From thomas@xs4all.net Wed Nov 14 11:58:10 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Wed, 14 Nov 2001 12:58:10 +0100 Subject: [Python-Dev] PyDict_GetItem & dict_subscript In-Reply-To: <0dd501c16cfe$69ccd890$e000a8c0@thomasnotebook> References: <20011114120332.E466@xs4all.nl> <0dd501c16cfe$69ccd890$e000a8c0@thomasnotebook> Message-ID: <20011114125810.F466@xs4all.nl> On Wed, Nov 14, 2001 at 12:20:18PM +0100, Thomas Heller wrote: > From: "Thomas Wouters" > > While trying to figure out why passing an unhashable object to switch didn't > > cause a TypeError: unhashable object, I found out PyDict_GetItem and > > dict_subscript behave differently in that respect. PyDict_GetItem does: > > > > hash = PyObject_Hash(key); > > if (hash == -1) { > > PyErr_Clear(); > > return NULL; > > } > > > > whereas dict_subscript does: > > > > hash = PyObject_Hash(key); > > if (hash == -1) > > return NULL; > > > > PyDict_SetItem and PyDict_DelItem both behave like dict_subscript (that is, > > they propagate the error upwards, instead of clearing the error.) Is this > > intentional ? > It seems so, at least it's documented. Is it ? Where ? The API docs (both devel and current versions) have only this to say: PyObject* PyDict_GetItem(PyObject *p, PyObject *key) Return value: Borrowed reference. Returns the object from dictionary p which has a key 'key'. Returns NULL if the key 'key' is not present, but without setting an exception. > I'd use PyObject_GetItem(), but... 
Hm, I guess. Not quite intuitive to me, but it'll do. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mal@lemburg.com Wed Nov 14 12:03:11 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 14 Nov 2001 13:03:11 +0100 Subject: [Python-Dev] PyDict_GetItem & dict_subscript References: <20011114120332.E466@xs4all.nl> <0dd501c16cfe$69ccd890$e000a8c0@thomasnotebook> Message-ID: <3BF25D7F.A03B6D52@lemburg.com> Thomas Heller wrote: > > From: "Thomas Wouters" > > > > While trying to figure out why passing an unhashable object to switch didn't > > cause a TypeError: unhashable object, I found out PyDict_GetItem and > > dict_subscript behave differently in that respect. PyDict_GetItem does: > > > > hash = PyObject_Hash(key); > > if (hash == -1) { > > PyErr_Clear(); > > return NULL; > > } > > > > whereas dict_subscript does: > > > > hash = PyObject_Hash(key); > > if (hash == -1) > > return NULL; > > > > PyDict_SetItem and PyDict_DelItem both behave like dict_subscript (that is, > > they propagate the error upwards, instead of clearing the error.) Is this > > intentional ? > > It seems so, at least it's documented. Yes, that's intentional. The reason is that historically PyDict_GetItem() was always used in this way by Python's internal code and many extensions. > > Why ? :P > > Probably because retrieving a value from a dictionary, and trying something > else (after clearing the error) if it fails is so common... Right. > > And what is the 'right' way to do a PyDict_GetItem() > > that gives the proper error-message without relying on the need for that > > particular dict (which could be a subtype that has more or less > > restrictions) to have hashable keys ? > > I'd use PyObject_GetItem(), but... Because of security concerns, the dict must be a read-only dictionary implementation. Using generic APIs is the wrong approach here. 
Of course, since we don't have a read-only dictionary type, the
argument is not relevant to your prototype ;-)

I'd suggest using this construct:

    /* "dict" stands for whatever dictionary is being searched */
    if ((value = PyDict_GetItem(dict, key)) == NULL) {
        if (PyObject_Hash(key) == -1)
            return NULL;
        else
            goto notFound;
    }

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

From gward@python.net Wed Nov 14 16:02:03 2001
From: gward@python.net (Greg Ward)
Date: Wed, 14 Nov 2001 11:02:03 -0500
Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism.
In-Reply-To: <15344.43730.821724.769885@monster.linux.in>
References: <3BED5AD8.24350.78407262@localhost>
    <3BEFE249.371EA5FD@interet.com>
    <15343.61552.414932.44100@monster.linux.in>
    <15343.63951.137972.487852@slothrop.digicool.com>
    <15344.4849.71643.868791@monster.linux.in>
    <15344.5763.557118.787529@walden.zope.com>
    <15344.43730.821724.769885@monster.linux.in>
Message-ID: <20011114110203.A22764@gerg.ca>

[Prabhu Ramachandran, claiming "from foo import bar" considered bad]
> Well, the Python howto explains it much better than I could hope to:
>
> http://py-howto.sourceforge.net/doanddont/node8.html
>
> Since re-loading packages is important for me, I prefer using plain
> imports.

Nobody else has responded to this, so I should. The above page (just a
paragraph, really) claims that "from foo import bar" is bad because it
binds the same object to two different names (or rather, to the same
name in two different namespaces). True enough, but it then claims
this is a Bad Thing because things can go wrong in the face of module
reloading, or (aack!) changes to function definitions at run-time.

Well, let's get one thing straight here: consider the two language
features, "from foo import bar" and "change function definition at
runtime". These two language features interact in unpleasant ways.
Which language feature should then be considered dangerous, dodgy, to be avoided in production code, etc? I don't think there's much controversy there. As for module reloading, this is another thing that simply cannot work in real life. It's nice that you can do it at the command-line, but real-life patterns of code and data mean module reloading in non-trivial applications simply cannot work. Here's the explanation I posted to the quixote-users list just yesterday: More importantly, it is fundamentally impossible for module reloading to work in Python. Believe me, I've tried several times, and each time I run up against the same brick wall: if module b's global namespace has an instance of class a.A, what happens when you reload module a? The a.A instance in b still has a bunch of bound methods that point to the old code in the old version of a. You lose. So you reload b and a at the same time: the a.A instance in b's global namespace has to be recreated (ie. b re-imported) *after* a is re-imported. What if there's a cyclic dependency, ie. a's namespace has a b.B instance and b's has an a.A method? (Yes, you'd be nuts to set things up this way, but it's possible.) And what about other namespaces (eg. instances floating in memory, closures, ...) that have a reference to b's a.A instance? OK, you have to scrub all those namespaces too. It boils down to doing sys.modules.clear(), which has its own problems. If anyone has a solution to this, I'm all ears, but for now I'm pretty well convinced that it cannot be done. In case it's not obvious, I think that Python how-to document needs revision in this regard. Moshe? Greg -- Greg Ward - programmer-at-large gward@python.net http://starship.python.net/~gward/ Do radioactive cats have 18 half-lives? 
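
The "brick wall" Greg describes above is easy to reproduce. The sketch
below is in present-day Python and makes one simplifying assumption:
instead of reloading a module from disk, it re-executes new source text
in the existing module namespace with exec(), which is essentially what
a reload does. The module name "a" and class name "A" follow the post;
the source strings and variable names are made up for illustration.

```python
import types

SOURCE_V1 = "class A:\n    def hello(self):\n        return 'old'\n"
SOURCE_V2 = "class A:\n    def hello(self):\n        return 'new'\n"

# Build module "a" by executing its source in a fresh module namespace.
a = types.ModuleType("a")
exec(SOURCE_V1, a.__dict__)

# Elsewhere (module "b" in the post) an instance of a.A is kept around.
kept_instance = a.A()
old_class = a.A

# "Reload": run the edited source in the very same module namespace.
exec(SOURCE_V2, a.__dict__)

fresh_instance = a.A()
results = {
    "kept": kept_instance.hello(),           # still runs the old code
    "fresh": fresh_instance.hello(),         # runs the new code
    "same_class": a.A is old_class,          # the name was rebound
    "kept_is_new_A": isinstance(kept_instance, a.A),
}
```

After the re-execution, the name a.A refers to a brand-new class object,
while kept_instance still points at the old one, so nothing short of
recreating every such instance picks up the new code.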
From Donald Beaudry Wed Nov 14 16:49:55 2001
From: Donald Beaudry (Donald Beaudry)
Date: Wed, 14 Nov 2001 11:49:55 -0500
Subject: [Python-Dev] switch-based programming in Python
References:
Message-ID: <200111141649.LAA26118@localhost.localdomain>

I was hoping someone would take the bait ;)

Paul Svensson wrote,
> On Tue, 13 Nov 2001, Greg Ewing wrote:
>
> >Paul Svensson :
> >
> >> On Fri, 9 Nov 2001, Donald Beaudry wrote:
> >
> >> > when EXPR:
> >> >     in CONSTANT_TUPLE:
> >> >         [suite]
> >> > else:
> >> >     [suite]
> >
> >> you're absolutely right on the indentation of the "else".
> >
> >Really? To me, the else is just another branch, and should
> >be on the same level as all the others.
>
> If you ignore how you get there, it's just another branch.
> But, how you get there is the greatest distinction
> between the "else" branch and the other branches,
> and I think this should be emphasized, not ignored.
>
> Compare how Python uses "else" not only with "if" statements,
> but also with for, while, and except.
>
> Having the "else" indented with the "when" also makes it
> immediately obvious that there can't be more than one,
> and it has to go at the end.
>
> Besides, I find it visually more appealing.

I like the way it looks too, but the semantics are where it gets
sticky. In a for or while, the else clause only gets executed when
the statement terminates "normally" (not due to a break). Following
this model, one might expect the else clause associated with a 'when'
statement to be executed whenever a when's 'in' clause terminates
normally. But what does "normally" mean in this context?

On the other hand, if the else clause is to be like the default clause
on a C switch statement (what most would expect) I have to agree to
indenting the 'else' to the same level as the 'in'.

The potential confusion here could be enough to argue against the use
of else to mark the default clause.
So, how about this: when EXPR: in CONSTANT-TUPLE: SUITE not in CONSTANT-TUPLE: SUITE else: SUITE With this mess, the else can be as it is with the for and while statements (though a definition of normal termination is still needed) and 'not in ():' could be used to mark the default clause. -- Donald Beaudry Ab Initio Software Corp. 201 Spring Street donb@init.com Lexington, MA 02421 ...Will hack for sushi... From barry@zope.com Wed Nov 14 16:57:36 2001 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 14 Nov 2001 11:57:36 -0500 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. References: <3BED5AD8.24350.78407262@localhost> <3BEFE249.371EA5FD@interet.com> <15343.61552.414932.44100@monster.linux.in> <15343.63951.137972.487852@slothrop.digicool.com> <15344.4849.71643.868791@monster.linux.in> <15344.5763.557118.787529@walden.zope.com> <15344.43730.821724.769885@monster.linux.in> <20011114110203.A22764@gerg.ca> Message-ID: <15346.41600.362634.797121@anthem.wooz.org> >>>>> "GW" == Greg Ward writes: GW> If anyone has a solution to this, I'm all ears, but for now GW> I'm pretty well convinced that it cannot be done. Jim Fulton has an idea called "association object": http://mail.python.org/pipermail/python-dev/2000-January/001809.html -Barry From thomas@xs4all.net Wed Nov 14 17:48:40 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Wed, 14 Nov 2001 18:48:40 +0100 Subject: [Python-Dev] switch-based programming in Python In-Reply-To: <200111141649.LAA26118@localhost.localdomain> References: <200111141649.LAA26118@localhost.localdomain> Message-ID: <20011114184840.G466@xs4all.nl> On Wed, Nov 14, 2001 at 11:49:55AM -0500, Donald Beaudry wrote: > when EXPR: > in CONSTANT-TUPLE: You guys need to quit calling it a CONSTANT-TUPLE. It isn't. Like the multiple arguments to the "print" statements, it's just 'multiple values', *not* a tuple. 
The difference is subtle, probably, but definitely there ;)

Aside from that, I prefer 'switch', 'case' and 'else' all to be on the
same indentation level. That way it's visually most like
'if/elif/else', which is also what it acts most like. I can live with
having 'case' indented relative to 'switch', but 'else' should be part
of 'switch', not 'case'.

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From jeremy@zope.com Wed Nov 14 18:14:39 2001
From: jeremy@zope.com (Jeremy Hylton)
Date: Wed, 14 Nov 2001 13:14:39 -0500 (EST)
Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism.
In-Reply-To: <20011114110203.A22764@gerg.ca>
References: <3BED5AD8.24350.78407262@localhost>
    <3BEFE249.371EA5FD@interet.com>
    <15343.61552.414932.44100@monster.linux.in>
    <15343.63951.137972.487852@slothrop.digicool.com>
    <15344.4849.71643.868791@monster.linux.in>
    <15344.5763.557118.787529@walden.zope.com>
    <15344.43730.821724.769885@monster.linux.in>
    <20011114110203.A22764@gerg.ca>
Message-ID: <15346.46223.837119.950578@slothrop.digicool.com>

>>>>> "GW" == Greg Ward writes:

  GW> Nobody else has responded to this, so I should. The above page
  GW> (just a paragraph, really) claims that "from foo import bar" is
  GW> bad because it binds the same object to two different names (or
  GW> rather, to the same name in two different namespaces).

If this were a bad thing, we'd have to recommend that people not use
assignments.

  GW> True enough, but it then claims this is a Bad Thing because
  GW> things can go wrong in the face of module reloading, or (aack!)
  GW> changes to function definitions at run-time.

[discussion of various attempts to do a useful reload()]

  GW> If anyone has a solution to this, I'm all ears, but for now I'm
  GW> pretty well convinced that it cannot be done.

The current behavior is a natural consequence of the way references
work. I don't think there's any sensible way to change it.
Even if you change import to bind a name to a reference-to-a-module,
such that the reference was checked on each use and always referred to
the most recent copy of a module, it wouldn't be sufficient. Each
instance that uses a class has a reference to the class itself, not
the class-in-the-current module. The same pattern occurs for every
possible kind of reference.

I'm strongly opposed to changing import because it would create one
special case and that special case would cause confusion more than
anything.

Jeremy

From paul@pfdubois.com Wed Nov 14 18:22:35 2001
From: paul@pfdubois.com (Paul Dubois)
Date: Wed, 14 Nov 2001 10:22:35 -0800
Subject: [Python-Dev] __slots__ came from?
References:
Message-ID: <000f01c16d39$55652520$09860cc0@CLENHAM>

I wanted to learn more about this __slots__ (proposal?)(decision?)
because of the possible relationship to the Properties package that is
part of Numeric/MA. I can't find the PEP that relates to it. Help?

From tim@zope.com Wed Nov 14 18:42:38 2001
From: tim@zope.com (Tim Peters)
Date: Wed, 14 Nov 2001 13:42:38 -0500
Subject: [Python-Dev] __slots__ came from?
In-Reply-To: <000f01c16d39$55652520$09860cc0@CLENHAM>
Message-ID:

[Paul Dubois]
> I wanted to learn more about this __slots__ (proposal?)(decision?)
> because of the possible relationship to the Properties package that is
> part of Numeric/MA. I can't find the PEP that relates to it. Help?

__slots__ belong in PEP 253, but there's just a todo placeholder for
them in there now. See Guido's 2.2 type/class tutorial for a
pragmatic intro:

    http://www.python.org/2.2/descrintro.html

Also see "Properties" there! 2.2 __slots__ are largely a
memory-saving feature, 2.2 properties largely for computed attributes
(a way to capture setting, getting and deleting of specific attributes
without needing catch-all hooks (like __getattr__ and __setattr__)).
It's mondo cool stuff.
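
For readers who have not seen the two features Tim contrasts above,
here is a small sketch of each. It is written so that it runs on a
present-day Python as well as in the 2.2 new-style-class model; the
class names Point and Celsius and their attributes are invented for
illustration and do not come from the thread or from Numeric/MA.

```python
class Point(object):
    # __slots__ fixes the set of instance attributes; instances get
    # no per-instance __dict__, which is where the memory saving
    # comes from.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


class Celsius(object):
    def __init__(self, degrees):
        self._degrees = degrees

    def _get_fahrenheit(self):
        return self._degrees * 9.0 / 5.0 + 32.0

    def _set_fahrenheit(self, value):
        self._degrees = (value - 32.0) * 5.0 / 9.0

    # A property intercepts get/set of this one attribute only, with
    # no catch-all __getattr__/__setattr__ hook.
    fahrenheit = property(_get_fahrenheit, _set_fahrenheit)


p = Point(1, 2)
try:
    p.z = 3                      # not listed in __slots__
    slot_error = False
except AttributeError:
    slot_error = True

c = Celsius(100)
boiling_f = c.fahrenheit         # computed on access
c.fahrenheit = 32.0              # routed through _set_fahrenheit
freezing_c = c._degrees
```

The two features are independent: __slots__ changes how instances are
stored, while property changes how a single attribute is accessed.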
From gmcm@hypernet.com Wed Nov 14 20:53:35 2001
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 14 Nov 2001 15:53:35 -0500
Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism.
In-Reply-To: <15346.41600.362634.797121@anthem.wooz.org>
Message-ID: <3BF2937F.10157.8CA5BD9B@localhost>

Barry wrote:

> Jim Fulton has an idea called "association object":
>
> http://mail.python.org/pipermail/python-dev/2000-January/001809.html

No, that post pretty much echoes what Greg said. Jim (IIRC) brought up
using association objects to deal with circular imports, where both
are of the "from" variety (which currently fails - one of them must be
a plain import).

Jeremy's right - it don't do squat for reload unless you make *all*
references association objects (in which case it ain't Python).

- Gordon

From martin@v.loewis.de Wed Nov 14 21:01:29 2001
From: martin@v.loewis.de (Martin v. Loewis)
Date: Wed, 14 Nov 2001 22:01:29 +0100
Subject: [Python-Dev] Experimental features and __slots__
Message-ID: <200111142101.fAEL1T402101@mira.informatik.hu-berlin.de>

Recently, there has been a concern that if __slots__ is added to the
language, we'll be stuck with it forever. I want to propose a strategy
that doesn't require things to be cast in stone right now.

PEP 5 should be enhanced to allow the notion of experimental features.
Programs using experimental features should expect to break from one
release to another, with no need to deprecate the experimental
feature.

If desired, a __future__ import could be required to activate
experimental features, with a prospect that they disappear in the next
release. However, this is not strictly necessary: Documenting
experimental features may be sufficient.

For 2.2, I propose that all __ attributes relating to new-style
classes are experimental.
Regards,
Martin

From just@letterror.com Wed Nov 14 21:07:22 2001
From: just@letterror.com (Just van Rossum)
Date: Wed, 14 Nov 2001 22:07:22 +0100
Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism.
In-Reply-To: <20011114110203.A22764@gerg.ca>
Message-ID: <20011114220726-r01010800-15c2b85e-0920-010c@10.0.0.23>

Greg Ward wrote:

> More importantly, it is fundamentally impossible for module
> reloading to work in Python. Believe me, I've tried several times,
> and each time I run up against the same brick wall: [ ..... ]
>
> If anyone has a solution to this, I'm all ears, but for now I'm pretty
> well convinced that it cannot be done.

Has anyone ever tried something like this:

    make a copy of module.__dict__
    module.__dict__.clear()
    execute source in module.__dict__
    for each function in module:
        assign new func attrs to *old* func object
            (possible for func_code, func_defaults, func_doc and func_dict)
        inject old (but updated!) func object back into module.__dict__
    for each class in module:
        for each func in class:

        # XXX something with __bases__...

?

This way, at least all functions and classes will be the *same* objects as before, but updated.

What I use a lot is reloading individual methods (possible in the MacPython IDE by selecting a method and "run" the selection). This has some serious disadvantages:

- line numbers on code below the updated method(s) won't be correct anymore
- references to bound methods still reference the *old* method.

Still, I have found this to be an enormous productivity gain: you're really hacking on live objects. Maybe a scheme like the one above can make this even more transparent?
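
The outline above can be sketched roughly as follows. This is an illustration only, not code from the thread: it targets present-day Python 3 (so `__code__`, `__defaults__`, `__doc__` and `__dict__` stand in for `func_code`, `func_defaults`, `func_doc` and `func_dict`), the names `update_in_place` and `_patch_function` are invented here, the module dict is deliberately not cleared so that `__name__` survives, and nothing is done about `__bases__`.

```python
import types


def _patch_function(old, new):
    """Copy the replaceable attributes of `new` onto the existing `old` function."""
    # Assigning __code__ fails if the two functions differ in their number
    # of free (closure) variables; this sketch does not handle that case.
    old.__code__ = new.__code__
    old.__defaults__ = new.__defaults__
    old.__kwdefaults__ = new.__kwdefaults__
    old.__doc__ = new.__doc__
    old.__dict__.update(new.__dict__)


def update_in_place(module, new_source):
    """Re-execute `new_source` in `module`, keeping old function/class objects."""
    old_namespace = dict(module.__dict__)          # copy of module.__dict__
    exec(compile(new_source, module.__name__, "exec"), module.__dict__)
    for name, new_obj in list(module.__dict__.items()):
        old_obj = old_namespace.get(name)
        if old_obj is None or old_obj is new_obj:
            continue
        if isinstance(old_obj, types.FunctionType) and isinstance(new_obj, types.FunctionType):
            _patch_function(old_obj, new_obj)
            module.__dict__[name] = old_obj        # inject old (but updated) object
        elif isinstance(old_obj, type) and isinstance(new_obj, type):
            for attr, new_attr in list(vars(new_obj).items()):
                if not isinstance(new_attr, types.FunctionType):
                    continue
                old_attr = vars(old_obj).get(attr)
                if isinstance(old_attr, types.FunctionType):
                    _patch_function(old_attr, new_attr)
                else:
                    setattr(old_obj, attr, new_attr)   # method added by the new source
            module.__dict__[name] = old_obj
```

Because the old function objects are patched rather than replaced, existing instances and already-created bound methods pick up the new code, which addresses the second disadvantage listed above. Module-level data, removed names and changed base classes are left untouched by this sketch.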
Just

From niemeyer@conectiva.com Wed Nov 14 22:07:03 2001
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Wed, 14 Nov 2001 20:07:03 -0200
Subject: [Python-Dev] Python's footprint
In-Reply-To: <20011108165105.A29947@gerg.ca>
References: <20011108142106.A2559@ibook.distro.conectiva> <20011108165105.A29947@gerg.ca>
Message-ID: <20011114200703.B14313@ibook.distro.conectiva>

> > It means that about 10% of python's executable is documentation.
[...]
> Anyways, that sounds like a useful idea. It would probably be a big
> patch that touches lots of files, so it's unlikely to get into Python
> 2.2. You might consider whipping up a patch now to get it under
> consideration early in 2.3's life-cycle.

Ok. The patch is ready (attached). It's very simple. Just introducing two new macros: Py_DOCSTR() to be used in usual doc strings, and WITH_DOC_STRINGS, for more complex ones (sys module's doc string comes into my mind).

I'd just like to know the moment when it is going to be applied, so I can change every documentation string accordingly and submit the patch. I could do this right now, for sure. But if it's going to be applied just for 2.3, the patch will certainly be broken at that time.

Thanks!

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]

Attachment: Python-2.2-docstr.patch

--- Python-2.2.orig/pyconfig.h.in Wed Nov 14 17:54:31 2001
+++ Python-2.2/pyconfig.h.in Wed Nov 14 19:08:08 2001
@@ -765,3 +765,13 @@
 #define STRICT_SYSV_CURSES /* Don't use ncurses extensions */
 #endif
 
+/* Define if you want to have inline documentation. */
+#undef WITH_DOC_STRINGS
+
+/* Define macro for inline documentation. */
+#ifdef WITH_DOC_STRINGS
+#define Py_DOCSTR(x) x
+#else
+#define Py_DOCSTR(x) ""
+#endif
+
--- Python-2.2.orig/configure.in Wed Nov 14 17:54:31 2001
+++ Python-2.2/configure.in Wed Nov 14 19:20:07 2001
@@ -1305,6 +1305,20 @@
 fi
 AC_MSG_RESULT($with_cycle_gc)
 
+# Check for --with-doc-strings
+AC_MSG_CHECKING(for --with-doc-strings)
+AC_ARG_WITH(doc-strings,
+[ --with(out)-doc-strings disable/enable documentation strings])
+
+if test -z "$with_doc_strings"
+then with_doc_strings="yes"
+fi
+if test "$with_doc_strings" != "no"
+then
+ AC_DEFINE(WITH_DOC_STRINGS)
+fi
+AC_MSG_RESULT($with_doc_strings)
+
 # Check for Python-specific malloc support
 AC_MSG_CHECKING(for --with-pymalloc)
 AC_ARG_WITH(pymalloc,

From barry@zope.com Wed Nov 14 22:55:08 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 14 Nov 2001 17:55:08 -0500
Subject: [Python-Dev] pickle/cPickle raises SystemError
Message-ID: <15346.63052.411305.98704@yyz.digicool.com>

Did anybody else know that pickle and cPickle can raise a SystemError? I'm working on a rewrite of the pickle documentation and was playing around with some of the undefined corners when I stumbled across this. SystemError is clearly the wrong exception to raise, given the way the exception is documented in the exceptions module!

I've submitted SF bug #481882 which describes the problem and contains candidate patches (currently only for pickle.py). I've no idea why pickle.py's find_class() is masking KeyError, AttributeError, and ImportError and transforming them into SystemError.
I think the exceptions should just be allowed to percolate up. An alternative (although less correct, IMO) would be to transform them into UnpicklingErrors.

The Python test suite seems to not cover this point, and it's not documented anywhere that I can tell, so I feel justified in fixing this for Python 2.2. I'm concerned about losing backwards compatibility with commonly held lore though, so I'm wondering if anybody out there counts on exactly a SystemError being raised in this situation?

-Barry

-------------------- snip snip --------------------
Python 2.2b1+ (#3, Nov 13 2001, 18:06:11)
[GCC 2.95.3 19991030 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle
>>> pickle.loads('c__builtin__\noops\np0\n.')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/home/barry/projects/python/Lib/pickle.py", line 979, in loads
    return Unpickler(file).load()
  File "/home/barry/projects/python/Lib/pickle.py", line 588, in load
    dispatch[key](self)
  File "/home/barry/projects/python/Lib/pickle.py", line 805, in load_global
    klass = self.find_class(module, name)
  File "/home/barry/projects/python/Lib/pickle.py", line 815, in find_class
    raise SystemError, \
SystemError: Failed to import class oops from module __builtin__
>>> import cPickle
>>> cPickle.loads('c__builtin__\noops\np0\n.')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
SystemError: Failed to import class oops from module __builtin__

From nas@python.ca Wed Nov 14 23:10:26 2001
From: nas@python.ca (Neil Schemenauer)
Date: Wed, 14 Nov 2001 15:10:26 -0800
Subject: [Python-Dev] pickle/cPickle raises SystemError
In-Reply-To: <15346.63052.411305.98704@yyz.digicool.com>; from barry@zope.com on Wed, Nov 14, 2001 at 05:55:08PM -0500
References: <15346.63052.411305.98704@yyz.digicool.com>
Message-ID: <20011114151026.A30259@glacier.arctrix.com>

Barry A. Warsaw wrote:
> Did anybody else know that pickle and cPickle can raise a SystemError?

Yow!
Without even reading the exceptions documentation, SystemError seems grossly wrong to me. I'm +1 on fixing it. It was undocumented and the exception is called "SystemError". People who write code that depends on catching SystemError get no sympathy from me when their code breaks.

Neil

From greg@cosc.canterbury.ac.nz Thu Nov 15 05:47:20 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 15 Nov 2001 18:47:20 +1300 (NZDT)
Subject: [Python-Dev] switch-based programming in Python
In-Reply-To: <200111141649.LAA26118@localhost.localdomain>
Message-ID: <200111150547.SAA13245@s454.cosc.canterbury.ac.nz>

Donald Beaudry :

> In a for or while, the else clause only gets executed when
> the statement terminates "normally" (not due to a break). Following
> this model, one might expect the else clause associated with a 'when'
> statement to be executed whenever a when's in clause terminates
> normally. But what does "normally" mean in this context?

No, please, don't try to make it work like the loops do! By the principle of least astonishment when coming from other languages, that would be a nightmare.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From tim.one@home.com Thu Nov 15 06:45:39 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 15 Nov 2001 01:45:39 -0500
Subject: [Python-Dev] __class_init__
In-Reply-To: <00ff01c16b4f$e06ca710$e000a8c0@thomasnotebook>
Message-ID:

[Tim]
> I haven't had time to look into it yet, but it's definitely on my
> agenda. Why don't you want it [__class_init__] anymore?

[Thomas Heller]
> Two reasons:
>
> - It seems I can also achieve what I want with attribute descriptors
> - I have no idea how it would be implemented.
>
> OTOH, it would definitely be nice to have...
I pinged Guido about this (he'll fire me if I ever do that again ), and he's really not keen on it. "The right way" is to define a custom metaclass instead (whose __init__ plays the role __class_init__ would have played if defined in the class). It makes sense to me, but I'll have to play with it to be convinced it's as usable. From thomas.heller@ion-tof.com Thu Nov 15 07:40:10 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 15 Nov 2001 08:40:10 +0100 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. References: <20011114220726-r01010800-15c2b85e-0920-010c@10.0.0.23> Message-ID: <106c01c16da8$d2d3e540$e000a8c0@thomasnotebook> From: "Just van Rossum" > Greg Ward wrote: > > > More importantly, it is fundamentally impossible for module > > reloading to work in Python. Believe me, I've tried several times, > > and each time I run up against the same brick wall: [ ..... ] > > > > If anyone has a solution to this, I'm all ears, but for now I'm pretty > > well convinced that it cannot be done. > > Has anyone ever tried something like this: > > make a copy of module.__dict__ > module.__dict__.clear() > execute source in module.__dict__ > for each function in module: > assign new func attrs to *old* func object > (possible for func_code, func_defaults, func_doc and func_dict) > inject old (but updated!) func object back into module.__dict__ > for each class in module: > for each func in class: > > # XXX something with __bases__... > > ? http://groups.google.com/groups?selm=9ss15a%2414snvc%242%40ID-59885.news.dfncis.de Thomas From just@letterror.com Thu Nov 15 08:54:16 2001 From: just@letterror.com (Just van Rossum) Date: Thu, 15 Nov 2001 09:54:16 +0100 Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism. 
In-Reply-To: <106c01c16da8$d2d3e540$e000a8c0@thomasnotebook> Message-ID: <20011115095420-r01010800-d8036cdd-0920-010c@10.0.0.23> Thomas Heller wrote: > http://groups.google.com/groups?selm=9ss15a%2414snvc%242%40ID-59885.news.dfncis.de Nice! Maybe I wouldn't have missed that if hadn't been part of the "IsPython really O-O?" thread... I think it's worth playing with this stuff more. One improvement I would like to try is to update methods just like global functions: that way existing callbacks that are bound methods will also be updated. Other nits: func_defaults and func_doc should definitely be updated, I'm not sure about func_dict. General question: why if func_globals not a writable attribute? Just From mal@lemburg.com Thu Nov 15 09:00:08 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 15 Nov 2001 10:00:08 +0100 Subject: [Python-Dev] PEP 275: "Switching on Multiple Values", Rev 1.1 References: <20011113143255.E6B02303183@snelboot.oratrix.nl> <20011113201656.D466@xs4all.nl> Message-ID: <3BF38418.1B8D997F@lemburg.com> Thomas Wouters wrote: > > On Tue, Nov 13, 2001 at 03:32:55PM +0100, Jack Jansen wrote: > > > Even though I'm not sure I like the switch idea (and I won't even contemplate > > how Guido will react when he comes back and sees what we've been spending our > > time on:-) there's one very special case of switch that I would like, and > > that's the Algol 68 style switch on type. If we had something like > > def foo(x): > > switch type(x): > > case int: > > do something > > case string: > > do something else > > this would be a nice point to hook into for something that tries to compile > > Python to C or somesuch. > > Unfortunately, type-names/objects aren't compile-time constants, so we can't > implement this without some kind of namespace-modification-notification > technique. Hmm... Or perhaps we could do the normal lookup, compare the > then-current 'int' vs. the one we looked up, and if they aren't equal > re-initialize the jump dict.... 
But *shudder*. Dang. You're right -- I overlooked that "detail". Which brings us back to the discussion about optimizing builtin and module global lookups... it would be really really nice if Python had a mechanism which would allow to mark those symbols read-only or at least as "pre-fetching these at code object creation time is allowed" (this wouldn't help us with the switch statement, but has some other nice advantages, e.g. avoiding global lookups at run-time). Oh well... -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Thu Nov 15 10:04:38 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 15 Nov 2001 11:04:38 +0100 Subject: [Python-Dev] reload() et al. (Re: [Import-sig] Re: Proposal for a modified import mechanism.) References: <20011115095420-r01010800-d8036cdd-0920-010c@10.0.0.23> Message-ID: <3BF39336.BF89B22D@lemburg.com> Just van Rossum wrote: > > Thomas Heller wrote: > > > > http://groups.google.com/groups?selm=9ss15a%2414snvc%242%40ID-59885.news.dfncis.de > > Nice! Maybe I wouldn't have missed that if hadn't been part of the "IsPython > really O-O?" thread... > > I think it's worth playing with this stuff more. One improvement I would like to > try is to update methods just like global functions: that way existing callbacks > that are bound methods will also be updated. While it seems like a nice idea to update code which is already in use, I think that this leads down the wrong track. Sooner or later you'll end up with a complete mess in memory ;-) And depending on what code you exchange, this can cause serious problems: e.g. pickled data could become unusable, parts of the system would suddenly stop working because of e.g. a name change in one of the APIs, etc. 
What I'd like much more is some generic way to cleanly *remove* modules and complete packages which are known to be no longer in use. A problem I sometimes have with long running processes is that they import all sorts of bits and pieces just to process a single request every now and then. The space taken up by these modules is never freed. It would be nice to have an API which allows unloading these modules completely. Would be even nicer if this would extend to extension modules as well :-) (probably won't work though...) > Other nits: > func_defaults and func_doc should definitely be updated, I'm not sure about > func_dict. > > General question: why if func_globals not a writable attribute? Why should it be ? Just think of the security holes this would open. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From skip@pobox.com (Skip Montanaro) Thu Nov 15 10:08:19 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 15 Nov 2001 11:08:19 +0100 Subject: [Python-Dev] Re: PEP 276 Simple Iterator for ints In-Reply-To: References: Message-ID: <15347.37907.247581.153829@beluga.mojam.com> >> Below is the first draft of PEP 276 "Simple Iterator for ints". >> (Available at http://python.sourceforge.net/peps/pep-0276.html) >> >> Feel free to comment (positive, negative, bipolar, or in between ;-). I liked "for i in 10:" at first, however, the more I see of the Haskell iterator syntax, the more I like it. Given that * it should cause no backward compatibility issues (you can't put "..." in a list now, can you? NumPy?) 
* it completely replaces range and xrange, not just one use of it * it can be made to work for floats, strings and possibly other builtin data types as well as ints * you can optimize it well I would tend to view the int-as-iterator as a wart (a neat sort of wart, but a wart nonetheless), while the Haskell syntax is elegant and seems to fit in well with existing Python usage. -1 -- Skip Montanaro (skip@pobox.com - http://www.mojam.com/) It's only a 25% solution to our problems. Of course, we only want to solve 25% of our problems, so it becomes a 100% solution. From mal@lemburg.com Thu Nov 15 10:07:40 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 15 Nov 2001 11:07:40 +0100 Subject: [Python-Dev] PEP 275: "Switching on Multiple Values", Rev 1.1 References: <20011113143255.E6B02303183@snelboot.oratrix.nl> <20011113201656.D466@xs4all.nl> <3BF38418.1B8D997F@lemburg.com> Message-ID: <3BF393EC.70DEA0BA@lemburg.com> "M.-A. Lemburg" wrote: > > Thomas Wouters wrote: > > > > On Tue, Nov 13, 2001 at 03:32:55PM +0100, Jack Jansen wrote: > > > > > Even though I'm not sure I like the switch idea (and I won't even contemplate > > > how Guido will react when he comes back and sees what we've been spending our > > > time on:-) there's one very special case of switch that I would like, and > > > that's the Algol 68 style switch on type. If we had something like > > > def foo(x): > > > switch type(x): > > > case int: > > > do something > > > case string: > > > do something else > > > this would be a nice point to hook into for something that tries to compile > > > Python to C or somesuch. > > > > Unfortunately, type-names/objects aren't compile-time constants, so we can't > > implement this without some kind of namespace-modification-notification > > technique. Hmm... Or perhaps we could do the normal lookup, compare the > > then-current 'int' vs. the one we looked up, and if they aren't equal > > re-initialize the jump dict.... But *shudder*. > > Dang. 
You're right -- I overlooked that "detail". After some more tweaking... how's this: switch type(x).__name__: case 'int': SUITE case 'string': SUITE (note the string constants) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From just@letterror.com Thu Nov 15 10:21:53 2001 From: just@letterror.com (Just van Rossum) Date: Thu, 15 Nov 2001 11:21:53 +0100 Subject: [Python-Dev] reload() et al. (Re: [Import-sig] Re: Proposal for a modified import mechanism.) In-Reply-To: <3BF39336.BF89B22D@lemburg.com> Message-ID: <20011115112201-r01010800-606dffd9-0920-010c@10.0.0.23> M.-A. Lemburg wrote: > While it seems like a nice idea to update code which is already in > use, I think that this leads down the wrong track. Sooner or > later you'll end up with a complete mess in memory ;-) And depending > on what code you exchange, this can cause serious problems: e.g. > pickled data could become unusable, parts of the system would > suddenly stop working because of e.g. a name change in one of the APIs, > etc. I don't see "enhanced reloading" as a way to modify long running processes, but a way to shorten the development cycle. Sure, things can break, but that happens while you're coding, right? ;-) The worst thing that can happen is that you have to restart your app. > > General question: why if func_globals not a writable attribute? > > Why should it be ? Just think of the security holes this would > open. Huh? You can change func_globals in place, I don't see how it's more vulnerable to replace it with another dict. Assigning to func_code, func_defaults, func_doc and func_dict is allowed, I was just wondering if there's a specific reason why it's not allowed for func_globals. Am I missing something obvious here? 
Just From Maria" AS SEEN ON NATIONAL TV: Making over Half Million Dollars every 4 to 5 Months from your Home for an investment of only $25 U.S. Dollars expense one time THANK'S TO THE COMPUTER AGE AND THE INTERNET! ================================================== BE A MILLIONAIRE LIKE OTHERS WITHIN A YEAR!!! Before you say ''Bull'', please read the following. This is the letter you have been hearing about on the news lately. Due to the popularity of this letter on the Internet, a national weekly news program recently devoted an entire show to the investigation of this program described below, to see if it really can make people money. The show also investigated whether or not the program was legal. Their findings proved once and for all that there are ''absolutely NO Laws prohibiting the participation in the program and if people can follow the simple instructions, they are bound to make some mega bucks with only $25 out of pocket cost''. DUE TO THE RECENT INCREASE OF POPULARITY & RESPECT THIS PROGRAM HAS ATTAINED, IT IS CURRENTLY WORKING BETTER THAN EVER. This is what one had to say: ''Thanks to this profitable opportunity. I was approached many times before but each time I passed on it. I am so glad I finally joined just to see what one could expect in return for the minimal effort and money required. To my astonishment, I received total $610,470.00 in 21 weeks, with money still coming in." Pam Hedland, Fort Lee, New Jersey. =================================================== Here is another testimonial: "This program has been around for a long time but I never believed in it. But one day when I received this again in the mail I decided to gamble my $25 on it. I followed the simple instructions and 3 weeks later the money started to come in. First month I only made $240.00 but the next 2 months after that I made a total of $290,000.00. So far, in the past 8 months by re-entering the program, I have made over $710,000.00 and I am playing it again. 
The key to success in this program is to follow the simple steps and NOT change anything.'' More testimonials later but first, ===== PRINT THIS NOW FOR YOUR FUTURE REFERENCE ====== $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ If you would like to make at least $500,000 every 4 to 5 months easily And comfortably, please read the following...THEN READ IT AGAIN and AGAIN!!! $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ FOLLOW THE SIMPLE INSTRUCTION BELOW AND YOUR FINANCIAL DREAMS WILL COME TRUE, GUARANTEED! INSTRUCTIONS: =====Order all 5 reports shown on the list below ===== For each report, send $5 CASH, THE NAME & NUMBER OF THE REPORT YOU ARE ORDERING and YOUR E-MAIL ADDRESS to the person whose name appears ON THAT LIST next to the report. MAKE SURE YOUR RETURN ADDRESS IS ON YOUR ENVELOPE TOP LEFT CORNER in case of any mail problems. === When you place your order, make sure you order each of the 5 reports. You will need all 5 reports so that you can save them on your computer and resell them. YOUR TOTAL COST $5 X 5=$25.00. Within a few days you will receive, vie e-mail, each of the 5 reports from these 5 different individuals. Save them on your computer so they will be accessible for you to send to the 1,000's of people who will order them from you. Also make a floppy of these reports and keep it on your desk in case something happen to your computer. IMPORTANT - DO NOT alter the names of the people who are listed next to each report, or their sequence on the list, in any way other than what is instructed below in step '' 1 through 6 '' or you will lose out on majority of your profits. Once you understand the way this works, you will also see how it does not work if you change it. Remember, this method has been tested, and if you alter, it will NOT work !!! People have tried to put their friends/relatives names on all five thinking they could get all the money. But it does not work this way. 
Believe us, we all have tried to be greedy and then nothing happened. So Do Not try to change anything other than what is instructed. Because if you do, it will not work for you. Remember, honesty reaps the reward!!! 1.... After you have ordered all 5 reports, take this advertisement and REMOVE the name & address of the person in REPORT # 5. This person has made it through the cycle and is no doubt counting their fortune. 2.... Move the name & address in REPORT # 4 down TO REPORT # 5. 3.... Move the name & address in REPORT # 3 down TO REPORT # 4. 4.... Move the name & address in REPORT # 2 down TO REPORT # 3. 5.... Move the name & address in REPORT # 1 down TO REPORT # 2 6.... Insert YOUR name & address in the REPORT # 1 Position. PLEASE MAKE SURE you copy every name & address ACCURATELY! ========================================================== **** Take this entire letter, with the modified list of names, and save it on your computer. DO NOT MAKE ANY OTHER CHANGES. Save this on a disk as well just in case you lose any data. To assist you with marketing your business on the internet, the 5 reports you purchase will provide you with invaluable marketing information which includes how to send bulk e-mails legally, where to find thousands of free classified ads and much more. There are 2 Primary methods to get this venture going: METHOD # 1: BY SENDING BULK E-MAIL LEGALLY ========================================================== Let's say that you decide to start small, just to see how it goes, and we will assume You and those involved send out only 5,000 e-mails each. Let's also assume that the mailing receive only a 0.2% response (the response could be much better but lets just say it is only 0.2%. Also many people will send out hundreds of thousands e-mails instead of only 5,000 each). Continuing with this example, you send out only 5,000 e-mails. With a 0.2% response, that is only 10 orders for report # 1. 
Those 10 people responded by sending out 5,000 e-mail each for a total of 50,000. Out of those 50,000 e-mails only 0.2% responded with orders. That's=100 people responded and ordered Report # 2. Those 100 people mail out 5,000 e-mails each for a total of 500,000 e-mails. The 0.2% response to that is 1000 orders for Report # 3. Those 1000 people send out 5,000 e-mails each for a total of 5 million e-mails sent out. The 0.2% response to that is 10,000 orders for Report # 4. Those 10,000 people send out 5,000 e-mails each for a total of 50,000,000 (50 million) e-mails. The 0.2% response to that is 100,000 orders for Report # 5 THAT'S 100,000 ORDERS TIMES $5 EACH=$500,000.00 (half million). Your total income in this example is: 1..... $50 + 2..... $500 + 3.....$5,000 + 4.... $50,000 + 5..... $500,000 ....... Grand Total=$555,550.00 NUMBERS DO NOT LIE. GET A PENCIL & PAPER AND FIGUREOUT THE WORST POSSIBLE RESPONSES AND MATTER HOW YOU CALCULATE IT, YOU WILL STILL MAKE A LOT OF MONEY ! ========================================================= REMEMBER FRIEND, THIS IS ASSUMING ONLY 10 PEOPLE ORDERING OUT OF 5,000 YOU MAILED TO. Dare to think for a moment what would happen if everyone or half or even one 4th of those people mailed 100,000e-mails each or more? There are over 150 million people on the Internet worldwide and counting. Believe me, many people will do just that, and more! METHOD # 2 : BY PLACING FREE ADS ON THE INTERNET ======================================================= Advertising on the net is very very inexpensive and there are hundreds of FREE places to advertise. Placing a lot of free ads on the Internet will easily get a larger response. We strongly suggest you start with Method # 1 and add METHOD # 2 as you go along. For every $5 you receive, all you must do is e-mail them the Report they ordered. That's it. Always provide same day service on all orders. 
From arigo@ulb.ac.be Thu Nov 15 11:42:13 2001
From: arigo@ulb.ac.be (Armin Rigo)
Date: Thu, 15 Nov 2001 12:42:13 +0100 (MET)
Subject: [Python-Dev] PEP 276 Simple Iterator for ints
In-Reply-To:
Message-ID:

Hello,

As a logician I like "for i in 10" a lot -- indeed, it is common to
identify a natural number with the set of its predecessors. Given this
mathematical meaning you can read "for i in 10" in English and
understand it correctly :-)

Armin

From paul@svensson.org Thu Nov 15 11:55:39 2001
From: paul@svensson.org (Paul Svensson)
Date: Thu, 15 Nov 2001 06:55:39 -0500 (EST)
Subject: [Python-Dev] switch-based programming in Python
In-Reply-To: <200111150547.SAA13245@s454.cosc.canterbury.ac.nz>
Message-ID:

On Thu, 15 Nov 2001, Greg Ewing wrote:

>Donald Beaudry :
>
>> In a for or while, the else clause only gets executed when
>> the statement terminates "normally" (not due to a break). Following
>> this model, one might expect the else clause associated with a 'when'
>> statement to be executed whenever a when's in clause terminates
>> normally. But what does "normally" mean in this context?
>
>No, please, don't try to make it work like the loops do! By the
>principle of least astonishment when coming from other languages, that
>would be a nightmare.

Yes, it should work exactly like the loops do; it's just Donald who's
confusing how `else' clauses on loops work. There's nothing "normal"
about running to the end of a loop. The `else' clause is entered when
the (explicit or implicit) test of the main statement fails.
This is how it works with `if', `except', and loops. Anything else
would be confusing and error-prone.

Thus, in the below code, one and only one of the three functions is
called.

    select flip_eggs:
        when 1:
            sunny_side_up()
        when 2:
            over_easy()
        else:
            scrambled()

        /Paul

From skip@pobox.com (Skip Montanaro) Thu Nov 15 12:24:03 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 15 Nov 2001 13:24:03 +0100
Subject: [Python-Dev] __class_init__
In-Reply-To:
References: <00ff01c16b4f$e06ca710$e000a8c0@thomasnotebook>
Message-ID: <15347.46051.954499.240349@beluga.mojam.com>

>> [__class_init__]

Tim> [Guido said] ... "The right way" is to define a custom metaclass
Tim> instead (whose __init__ plays the role __class_init__ would have
Tim> played if defined in the class). It makes sense to me, but I'll
Tim> have to play with it to be convinced it's as usable.

For those of us with next to no ExtensionClass experience, I think it would
be helpful to describe what __class_init__ does. I saw it there in the
PyGtk 2.x wrappers before JamesH converted to new-style classes. I had no
idea what it did (and blissfully ignored it) and James apparently didn't
need any similar functionality after the conversion.

Thx,

Skip

From jim@interet.com Thu Nov 15 14:17:26 2001
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 15 Nov 2001 09:17:26 -0500
Subject: [Python-Dev] Final Version of PEP 273, Zip Importing
Message-ID: <3BF3CE76.8A220F7A@interet.com>

Hello,

The final version (unless someone objects) of the Zip importing design
is available at http://python.sourceforge.net/peps/pep-0273.html
and I am finishing up the code.

There are several changes, especially the use of os.listdir() to cache
directory contents to speed up imports.

Most of the changes are in import.c. If anyone has an import.c wish
list, please post here. Maybe I can add your pet feature while I am at
it.
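
To illustrate the idea of caching os.listdir() results to speed up
imports, here is a minimal sketch in present-day Python. It is not the
import.c code from the PEP 273 patch: the names `cached_listdir` and
`module_may_exist`, and the suffix list, are hypothetical and chosen
only for this example.

```python
import os

# Hypothetical cache: directory path -> frozenset of the names it contains.
_listing_cache = {}


def cached_listdir(path):
    """Return the names in `path`, reading the directory at most once."""
    try:
        return _listing_cache[path]
    except KeyError:
        try:
            names = frozenset(os.listdir(path))
        except OSError:
            # Missing or unreadable directories are remembered as empty.
            names = frozenset()
        _listing_cache[path] = names
        return names


def module_may_exist(directory, module_name, suffixes=(".py", ".pyc")):
    """Cheap pre-check: could `module_name` be importable from `directory`?

    Only the cached listing is consulted, so repeated failed lookups in
    the same directory cost no further system calls.
    """
    names = cached_listdir(directory)
    if module_name in names:  # possibly a package directory
        return True
    return any(module_name + suffix in names for suffix in suffixes)
```

The trade-off is that files created after the first lookup are not seen
until the cache entry is dropped, which is acceptable for a sketch but
would need an invalidation rule in real use.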
Jim Ahlstrom

From thomas.heller@ion-tof.com Thu Nov 15 14:41:28 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 15 Nov 2001 15:41:28 +0100
Subject: [Python-Dev] Re: [Import-sig] Re: Proposal for a modified import mechanism.
References: <20011115095420-r01010800-d8036cdd-0920-010c@10.0.0.23>
Message-ID: <005801c16de3$ae433b50$e000a8c0@thomasnotebook>

From: "Just van Rossum"
> Thomas Heller wrote:
> >
> > http://groups.google.com/groups?selm=9ss15a%2414snvc%242%40ID-59885.news.dfncis.de
>
> Nice! Maybe I wouldn't have missed that if it hadn't been part of the "IsPython
> really O-O?" thread...

Well, I had problems at that time posting at all, so finally
it escaped without a changed subject ;-)

>
> I think it's worth playing with this stuff more. One improvement I would like to
> try is to update methods just like global functions: that way existing callbacks
> that are bound methods will also be updated.

Great idea!

>
> Other nits:
> func_defaults and func_doc should definitely be updated, I'm not sure about
> func_dict.
>
> General question: why is func_globals not a writable attribute?

Don't know, but it can be modified (although this is dangerous):

>>> def f(): pass
...
>>> f.func_globals
{'__builtins__': , '__name__': '__main__', '__doc__': None, 'f': }
>>> f.func_globals.clear()
>>> f
Traceback (most recent call last):
  File "", line 1, in ?
NameError: name 'f' is not defined
>>>

I intend to update this stuff stealing from your ideas, but probably
python-dev is not the correct place to discuss this (IMO).

Should I post it again (with a better subject) to python-list,
and we continue there?

Thomas

From thomas.heller@ion-tof.com Thu Nov 15 15:00:44 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 15 Nov 2001 16:00:44 +0100
Subject: [Python-Dev] reload() et al. (Re: [Import-sig] Re: Proposal for a modified import mechanism.)
References: <20011115112201-r01010800-606dffd9-0920-010c@10.0.0.23>
Message-ID: <009101c16de6$5f771570$e000a8c0@thomasnotebook>

From: "Just van Rossum"
> M.-A. Lemburg wrote:
>
> > While it seems like a nice idea to update code which is already in
> > use, I think that this leads down the wrong track. Sooner or
> > later you'll end up with a complete mess in memory ;-) And depending
> > on what code you exchange, this can cause serious problems: e.g.
> > pickled data could become unusable, parts of the system would
> > suddenly stop working because of e.g. a name change in one of the APIs,
> > etc.
>
> I don't see "enhanced reloading" as a way to modify long running processes, but
> a way to shorten the development cycle.

That's exactly what I had in mind.

More and more I hear complaints from people, that even a C++ program
can be changed while running in the debugger (under certain circumstances),
so why not Python?

Thomas

From thomas.heller@ion-tof.com Thu Nov 15 15:06:50 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 15 Nov 2001 16:06:50 +0100
Subject: [Python-Dev] __class_init__
References: <00ff01c16b4f$e06ca710$e000a8c0@thomasnotebook> <15347.46051.954499.240349@beluga.mojam.com>
Message-ID: <009c01c16de7$391cb460$e000a8c0@thomasnotebook>

From: "Skip Montanaro"
> >> [__class_init__]
>
> Tim> [Guido said] ... "The right way" is to define a custom metaclass
> Tim> instead (whose __init__ plays the role __class_init__ would have
> Tim> played if defined in the class). It makes sense to me, but I'll
> Tim> have to play with it to be convinced it's as usable.
>
> For those of us with next to no ExtensionClass experience, I think it would
> be helpful to describe what __class_init__ does. I saw it there in the
> PyGtk 2.x wrappers before JamesH converted to new-style classes. I had no
> idea what it did (and blissfully ignored it) and James apparently didn't
> need any similar functionality after the conversion.

__class_init__ (or was it called __init_class__?) is a *class* method
which is called when the *class* is created (the class statement
executed).

class X(ExtensionClass.Base):
    def __class_init__(cls):
        print cls, "created"

class Y(X):
    pass

would first print ' created' and then ' created'.

It can be used to initialize class attributes for example
depending on other class attributes (that's at least what I
have/had in mind).

Thomas

From Prabhu Ramachandran Thu Nov 15 17:03:52 2001
From: Prabhu Ramachandran (Prabhu Ramachandran)
Date: Thu, 15 Nov 2001 22:33:52 +0530
Subject: [Python-Dev] reload() et al. (Re: [Import-sig] Re: Proposal for a modified import mechanism.)
In-Reply-To: <009101c16de6$5f771570$e000a8c0@thomasnotebook>
References: <20011115112201-r01010800-606dffd9-0920-010c@10.0.0.23> <009101c16de6$5f771570$e000a8c0@thomasnotebook>
Message-ID: <15347.62840.57234.111036@monster.linux.in>

>>>>> "TH" == Thomas Heller writes:

[Greg Ward on from A import B]

[Just van Rossum on Thomas's cool autoreload]

>> I don't see "enhanced reloading" as a way to modify long
>> running processes, but a way to shorten the development cycle.

TH> That's exactly what I had in mind.

TH> More and more I hear complaints from people, that even a C++
TH> program can be changed while running in the debugger (under
TH> certain circumstances), so why not Python?

Indeed, that is pretty much the same reason I tried to keep my imports
clean. I sometimes use a reload_all function to reload modules while
I develop code. Of course, it doesn't handle already instantiated
objects and it's not really meant for long running applications but it
sure is convenient.

prabhu

From fdrake@acm.org Thu Nov 15 17:35:24 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Thu, 15 Nov 2001 12:35:24 -0500 (EST)
Subject: [Python-Dev] [development doc updates]
Message-ID: <20011115173524.741DD28697@cj42289-a.reston1.va.home.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Added a chapter on Tkinter & friends, contributed by Mike Clarkson.
There are a few additional updates as well.

From mal@lemburg.com Thu Nov 15 17:48:55 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 15 Nov 2001 18:48:55 +0100
Subject: [Python-Dev] reload() et al. (Re: [Import-sig] Re: Proposal for a modified import mechanism.)
References: <20011115112201-r01010800-606dffd9-0920-010c@10.0.0.23>
Message-ID: <3BF40007.58A6D000@lemburg.com>

Just van Rossum wrote:
>
> M.-A. Lemburg wrote:
>
> > > General question: why is func_globals not a writable attribute?
> >
> > Why should it be ? Just think of the security holes this would
> > open.
>
> Huh? You can change func_globals in place, I don't see how it's more vulnerable
> to replace it with another dict. Assigning to func_code, func_defaults, func_doc
> and func_dict is allowed, I was just wondering if there's a specific reason why
> it's not allowed for func_globals. Am I missing something obvious here?

Hmm, looking at the code it seems that assigning to various
function object attributes has always been allowed. I wonder
why... since it doesn't have any obvious use except preventing
to pretend that function objects are immutable (which is what
I would have expected).

Nevermind,
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/

From mal@lemburg.com Thu Nov 15 17:37:42 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 15 Nov 2001 18:37:42 +0100
Subject: [Python-Dev] reload() et al. (Re: [Import-sig] Re: Proposal for a modified import mechanism.)
References: <20011115112201-r01010800-606dffd9-0920-010c@10.0.0.23> <009101c16de6$5f771570$e000a8c0@thomasnotebook>
Message-ID: <3BF3FD66.B036F8A0@lemburg.com>

Thomas Heller wrote:
>
> From: "Just van Rossum"
> > M.-A. Lemburg wrote:
> >
> > > While it seems like a nice idea to update code which is already in
> > > use, I think that this leads down the wrong track. Sooner or
> > > later you'll end up with a complete mess in memory ;-) And depending
> > > on what code you exchange, this can cause serious problems: e.g.
> > > pickled data could become unusable, parts of the system would
> > > suddenly stop working because of e.g. a name change in one of the APIs,
> > > etc.
> >
> > I don't see "enhanced reloading" as a way to modify long running processes, but
> > a way to shorten the development cycle.
>
> That's exactly what I had in mind.
>
> More and more I hear complaints from people, that even a C++ program
> can be changed while running in the debugger (under certain circumstances),
> so why not Python?

The same is possible in Python's pdb, BTW. Anyway, my reply was
more targeted in the direction of: "ok, this is nice to have as a
hack during development, but doesn't solve any longstanding problems
like e.g. unloading of modules". It's still a cool module -- perhaps
you could also make it work for complete packages ?!

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/

From thomas.heller@ion-tof.com Thu Nov 15 19:22:21 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 15 Nov 2001 20:22:21 +0100
Subject: [Python-Dev] reload() et al. (Re: [Import-sig] Re: Proposal for a modified import mechanism.)
References: <20011115112201-r01010800-606dffd9-0920-010c@10.0.0.23> <009101c16de6$5f771570$e000a8c0@thomasnotebook> <3BF3FD66.B036F8A0@lemburg.com>
Message-ID: <032901c16e0a$ebdafc10$e000a8c0@thomasnotebook>

> > > I don't see "enhanced reloading" as a way to modify long running processes, but
> > > a way to shorten the development cycle.
> >
> > That's exactly what I had in mind.
> >
> > More and more I hear complaints from people, that even a C++ program
> > can be changed while running in the debugger (under certain circumstances),
> > so why not Python?
>
> The same is possible in Python's pdb, BTW. Anyway, my reply was
> more targeted in the direction of: "ok, this is nice to have as a
> hack during development, but doesn't solve any longstanding problems
> like e.g. unloading of modules". It's still a cool module -- perhaps
> you could also make it work for complete packages ?!

I didn't explicitly code something to prevent packages ;-),
but there's a bug in it (also in the version I posted to c.l.p,
where the discussion probably should continue).

Thanks,

Thomas

From nas@python.ca Thu Nov 15 20:10:33 2001
From: nas@python.ca (Neil Schemenauer)
Date: Thu, 15 Nov 2001 12:10:33 -0800
Subject: [Python-Dev] .pyc magic time bomb?
Message-ID: <20011115121033.A32207@glacier.arctrix.com>

Has the magic for .pyc files been fixed? If not, shouldn't this be done
before the 2.2 release.

  Neil

From tim@zope.com Thu Nov 15 21:50:53 2001
From: tim@zope.com (Tim Peters)
Date: Thu, 15 Nov 2001 16:50:53 -0500
Subject: [Python-Dev] .pyc magic time bomb?
In-Reply-To: <20011115121033.A32207@glacier.arctrix.com>
Message-ID:

[Neil Schemenauer]
> Has the magic for .pyc files been fixed?

No.

> If not, shouldn't this be done before the 2.2 release.

That's why 2.2 has to ship before January .

The last time this came up, people tried to turn "the magic number" into a
database.
Since Guido won't go for that, chances seem good that, in the
absence of a reasonably modest alternative scheme, we'll just pick new
numbers out of a hat.

From mal@lemburg.com Thu Nov 15 22:56:13 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 15 Nov 2001 23:56:13 +0100
Subject: [Python-Dev] .pyc magic time bomb?
References:
Message-ID: <3BF4480D.277DDAED@lemburg.com>

Tim Peters wrote:
>
> [Neil Schemenauer]
> > Has the magic for .pyc files been fixed?
>
> No.
>
> > If not, shouldn't this be done before the 2.2 release.
>
> That's why 2.2 has to ship before January .

Ah, so that's why ;-)

> The last time this came up, people tried to turn "the magic number" into a
> database. Since Guido won't go for that, chances seem good that, in the
> absence of a reasonably modest alternative scheme, we'll just pick new
> numbers out of a hat.

Just add 42 and be done with it :-)

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/

From barry@zope.com Fri Nov 16 00:21:12 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Thu, 15 Nov 2001 19:21:12 -0500
Subject: [Python-Dev] Branch created for Python 2.2 beta 2
Message-ID: <15348.23544.481388.771477@yyz.digicool.com>

Hi all,

I've just created the branch for Python 2.2 beta 2, tagged as
r22b2-branch. As usual there's also a tag on the trunk tagged as
r22b2-fork.

Just a reminder that no checkins should be committed on the branch,
without checking with me first (except for Tim and Fred, of course ;).
You can continue to make commits on the trunk, but if the change
should be merged into the branch, please include a note in your commit
log message.

I've been out of touch with my email for most of today due to network
problems, so I'll be catching up on stuff tomorrow. I still plan on
making the 2.2b2 release tomorrow.

Cheers,
-Barry

From martin@v.loewis.de Fri Nov 16 09:10:34 2001
From: martin@v.loewis.de (Martin v. Loewis)
Date: Fri, 16 Nov 2001 10:10:34 +0100
Subject: [Python-Dev] PEP 273 comments
Message-ID: <200111160910.fAG9AY201295@mira.informatik.hu-berlin.de>

Hi Jim,

Reading through PEP 273, I have a couple of comments:

Section zlib, "will fail with a message":

Is this still current? Looking at the implementation, it seems that
you'll rather get an import error, which sounds much better.

Section Booting

Looking at the implementation, it appears that PyImport_InitZip
already invokes PyImport_ImportModule. Is there a mechanism that
prevents ZIP import from being used while it is not ready yet?

Regards,
Martin

From mal@lemburg.com Fri Nov 16 12:00:15 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 16 Nov 2001 13:00:15 +0100
Subject: [Python-Dev] Re: PEP 276 Simple Iterator for ints
References: <15347.37907.247581.153829@beluga.mojam.com>
Message-ID: <3BF4FFCF.60FA6513@lemburg.com>

Skip Montanaro wrote:
>
> >> Below is the first draft of PEP 276 "Simple Iterator for ints".
> >> (Available at http://python.sourceforge.net/peps/pep-0276.html)
> >>
> >> Feel free to comment (positive, negative, bipolar, or in between ;-).
>
> I liked "for i in 10:" at first, however, the more I see of the Haskell
> iterator syntax, the more I like it. Given that
>
> * it should cause no backward compatibility issues (you can't put "..."
>   in a list now, can you? NumPy?)
>
> * it completely replaces range and xrange, not just one use of it
>
> * it can be made to work for floats, strings and possibly other builtin
>   data types as well as ints
>
> * you can optimize it well
>
> I would tend to view the int-as-iterator as a wart (a neat sort of wart, but
> a wart nonetheless), while the Haskell syntax is elegant and seems to fit in
> well with existing Python usage.

I agree with Skip.

Skip, could you tell us more about the Haskell syntax for these
things ?

Here's a slightly different approach which works now (= it
doesn't even require a PEP :-):

With the new iterator support in Python, it should be easily
possible to write a module which exposes a singleton infinite
sequence called e.g. "integers".

Then, using the slicing notion we already have in Python, you
could write:

for i in integers[0,...,10]:
    print i

for i in integers[0:11]:
    print i

for i in integers[0:10:2]:
    print i

for i in integers[0,...,10:2]:
    print i

(or other wild combinations of slice objects and ellipsises)

The same would work for any other number type for which
slices can reasonably well express a finite sub sequence
of the possibly infinite base range, e.g. floats, natural
numbers, (finite) fields, prime numbers, rational numbers,
Unicode characters, arrays, etc.

The above notation obviously looks very natural to mathematicians,
not sure about Joe User though.

Hmm, seems like a nice addition for my mxNumbers package.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/

From skip@pobox.com (Skip Montanaro) Fri Nov 16 12:37:53 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Fri, 16 Nov 2001 13:37:53 +0100
Subject: [Python-Dev] Re: PEP 276 Simple Iterator for ints
In-Reply-To: <3BF4FFCF.60FA6513@lemburg.com>
References: <15347.37907.247581.153829@beluga.mojam.com> <3BF4FFCF.60FA6513@lemburg.com>
Message-ID: <15349.2209.107415.77922@beluga.mojam.com>

Skip> I liked "for i in 10:" at first, however, the more I see of the
Skip> Haskell iterator syntax, the more I like it.
...
Skip> I would tend to view the int-as-iterator as a wart (a neat sort of
Skip> wart, but a wart nonetheless), while the Haskell syntax is elegant
Skip> and seems to fit in well with existing Python usage.
...

mal> Skip, could you tell us more about the Haskell syntax for these
mal> things ?

I'll restrict this followup to python-dev. The Haskell syntax has been
bandied about on c.l.py for the past several days, so I assume those folks
already know this or don't care. ;-)

Haskell has a list constructor that maps pretty nicely onto Python's range()
and xrange() functions:

    [ exp1 [, exp2] .. [exp3] ]

Since Haskell is lazy in its evaluation, infinite sequences can be
constructed easily:

    [ 0 .. ]            # all natural numbers

More common usage in Python would be something like

    [ 0 .. 5 ]          # [0, 1, 2, 3, 4, 5]

or

    [ 0, 3 .. 27 ]      # xrange(0, 28, 3)

I believe (Tim can correct me if I'm wrong), but when you use these
constructors to build finite sequences, they are closed at both ends, just
as Python lists, and unlike [x]range(), hence the different endpoints in the
last example. This seems to be a sticking point for people on c.l.py who
see the lack of complete equivalence with [x]range as a problem. I see them
as (effectively) list constructors, and expect them to be closed at both
ends. (Though readers of this thread on c.l.py will notice that I muffed
that bit there as well. ;-)

I see a few advantages to this list constructor over [x]range():

    * I think it would be a bit easier to explain to new users than range.
      I think most people have seen sequences in math like

          [ x , x , ... , x  ]
             1   2         n

      and would thus find the notation familiar. Admittedly, [x]range()
      isn't that difficult, however.

    * It would expose potentially significant optimizations that can't be
      made today by eliminating the attribute lookup and function call to
      range, and thus getting rid of that little bit of dynamism nobody
      ever uses anyway.

    * I think we could easily extend the notation to build sequences of
      other basic types: floats and single-character strings being the
      most obvious:

          [ "a", "c" .. "z"]

          [ 0.0, 0.1 .. 2*math.pi ]

Python already has a "..." operator, so I would propose that be used instead
of Haskell's "..".
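
As an illustration of the closed-at-both-ends behaviour described above,
here is a minimal sketch using a plain generator pipeline rather than new
syntax. The helper name `closed_range` is hypothetical (nothing in this
thread or in the standard library defines it), it handles integers only,
and it is written for present-day Python rather than 2.2.

```python
from itertools import count, islice, takewhile


def closed_range(first, second=None, last=None):
    """Rough analogue of Haskell's [first, second .. last] for integers.

    The step is second - first (1 when `second` is omitted).  The sequence
    includes `last` itself when it is reachable, and is unbounded when
    `last` is None.
    """
    step = 1 if second is None else second - first
    if step == 0:
        raise ValueError("first and second must differ")
    values = count(first, step)
    if last is None:
        return values  # lazy, infinite
    if step > 0:
        return takewhile(lambda v: v <= last, values)
    return takewhile(lambda v: v >= last, values)


if __name__ == "__main__":
    print(list(closed_range(0, last=5)))    # like [ 0 .. 5 ]
    print(list(closed_range(0, 3, 27)))     # like [ 0, 3 .. 27 ]
    print(list(islice(closed_range(0), 3)))  # first items of [ 0 .. ]
```

Because the upper bound is inclusive, `closed_range(0, 3, 27)` yields the
same values as `range(0, 28, 3)`, which is the endpoint difference noted
in the message.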

Skip

From niemeyer@conectiva.com Fri Nov 16 14:12:19 2001
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Fri, 16 Nov 2001 12:12:19 -0200
Subject: [Python-Dev] CRLF in email.Generator
Message-ID: <20011116121218.A1832@ibook.distro.conectiva>

Hello Barry!!

First, I'd like to thank you for implementing the email package. It
seems very complete and will be useful for sure.

I've been looking at the Generator module. It seems to be writing
the message with:

    print >> self._fp, "text"

This should probably be replaced by:

    self._fp.write("text"+"\r\n")

Print's end of line will depend on the system where python is being
run. If you run it in Linux, it will output just a "\n". This breaks
RFC2822:

"""
Messages are divided into lines of characters. A line is a series of
characters that is delimited with the two characters carriage-return
and line-feed; that is, the carriage return (CR) character (ASCII
value 13) followed immediately by the line feed (LF) character (ASCII
value 10). (The carriage-return/line-feed pair is usually written in
this document as "CRLF".)
"""

As a side note, I'd like to suggest the inclusion of some kind of
"raw_input". There are cases where you want to see the raw message (or
just part of it), instead of regenerating it. This happens, for
example, when you want to check a signed multipart message.

Thank you very much!

--
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]

From jim@interet.com Fri Nov 16 14:25:46 2001
From: jim@interet.com (James C. Ahlstrom)
Date: Fri, 16 Nov 2001 09:25:46 -0500
Subject: [Python-Dev] PEP 273 comments
References: <200111160910.fAG9AY201295@mira.informatik.hu-berlin.de>
Message-ID: <3BF521EA.3876345E@interet.com>

Hi Martin,

Thanks for taking the time to look at my code.

"Martin v. Loewis" wrote:

> Section zlib, "will fail with a message":
>
> Is this still current? Looking at the implementation, it seems that
> you'll rather get an import error, which sounds much better.

Yes, you are right. If a module is found in a zip file, and the
zip module is compressed, and zlib is unavailable, then get_zip_string()
fails with PyExc_ValueError (and a message), and the import fails.

> Section Booting
>
> Looking at the implementation, it appears that PyImport_InitZip
> already invokes PyImport_ImportModule. Is there a mechanism that
> prevents ZIP import from being used while it is not ready yet?

At the point that PyImport_ImportModule("os") is called, zip imports
are ready. Specifically, python22.zip has been added to sys.path, and
zlib has been imported. The attempted import of the "os" module can
be satisfied from a zip archive or from a directory.

We are attempting to import the os module to support caching of
directory contents using os.listdir(). This would cause an infinite
loop except for the use of the use_os_listdir flag. This flag is
zero if os.listdir has not yet been imported. It is 1 if os.listdir
is available, and -1 if it is unavailable.

JimA

From thomas.heller@ion-tof.com Fri Nov 16 16:56:34 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 16 Nov 2001 17:56:34 +0100
Subject: [Python-Dev] __class_init__
References:
Message-ID: <034101c16ebf$b820c9d0$e000a8c0@thomasnotebook>

From: "Tim Peters"
[__class_init__ methods]
> I pinged Guido about this (he'll fire me if I ever do that again ),
> and he's really not keen on it. "The right way" is to define a custom
> metaclass instead (whose __init__ plays the role __class_init__ would have
> played if defined in the class). It makes sense to me, but I'll have to
> play with it to be convinced it's as usable.

It seems Guido is right ;-), it is easier than ever!

C:\sf\python\dist\src\PCbuild>python
Python 2.2b1+ (#25, Nov 6 2001, 21:18:43) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> class metaclass(type):
...     def __init__(self, *args):
...         type.__init__(self, *args)
...         try:
...             cls_init = getattr(self, '__init_class__')
...         except AttributeError:
...             pass
...         else:
...             cls_init.im_func(self)
...
>>> meta = metaclass('meta', (), {})
>>> class X(meta):
...     def __init_class__(self):
...         print "__init_class__", self
...
__init_class__
>>> class Y(X):
...     pass
...
__init_class__
>>> ^Z

Now I'll have to find out how to convert this to C.

Thomas

From thomas.heller@ion-tof.com Fri Nov 16 18:03:37 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 16 Nov 2001 19:03:37 +0100
Subject: [Python-Dev] __class_init__
References: <034101c16ebf$b820c9d0$e000a8c0@thomasnotebook>
Message-ID: <041301c16ec9$16d60090$e000a8c0@thomasnotebook>

When implementing a metatype in C (a subtype of PyType_Type),
aren't the tp_basicsize and tp_itemsize fields inherited?
This question is not covered in PEP 253...

I've attached the code below.

Thanks,

Thomas

-------------------------------------------------
#include <Python.h>

static PyMethodDef methods[] = {
    { NULL, NULL },             /* Sentinel */
};

PyTypeObject MyMeta_Type = {
    PyObject_HEAD_INIT(NULL)
    0,                          /* ob_size */
    "mytype",                   /* tp_name */
    0, /*sizeof(PyTypeObject),  /* tp_basicsize */
    0,                          /* tp_itemsize */
    0,                          /* tp_dealloc */
    0,                          /* tp_print */
    0,                          /* tp_getattr */
    0,                          /* tp_setattr */
    0,                          /* tp_compare */
    0,                          /* tp_repr */
    0,                          /* tp_as_number */
    0,                          /* tp_as_sequence */
    0,                          /* tp_as_mapping */
    0,                          /* tp_hash */
    0,                          /* tp_call */
    0,                          /* tp_str */
    0,                          /* tp_getattro */
    0,                          /* tp_setattro */
    0,                          /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT,         /* tp_flags */
    "a meta type",              /* tp_doc */
    0,                          /* tp_traverse */
    0,                          /* tp_clear */
    0,                          /* tp_richcompare */
    0,                          /* tp_weaklistoffset */
    0,                          /* tp_iter */
    0,                          /* tp_iternext */
    0,                          /* tp_methods */
    0,                          /* tp_members */
    0,                          /* tp_getset */
    0,                          /* tp_base */
    0,                          /* tp_dict */
    0,                          /* tp_descr_get */
    0,                          /* tp_descr_set */
    0,                          /* tp_dictoffset */
    0,                          /* tp_init */
    0,                          /* tp_alloc */
    0,                          /* tp_new */
    0,                          /* tp_free */
};

DL_EXPORT(void)
initmetatype(void)
{
    PyObject *m, *d;

    MyMeta_Type.ob_type = &PyType_Type;
    MyMeta_Type.tp_base = &PyType_Type;
    MyMeta_Type.tp_basicsize = PyType_Type.tp_basicsize;
    MyMeta_Type.tp_itemsize = PyType_Type.tp_itemsize;
    if (PyType_Ready(&MyMeta_Type) < 0)
        return;

    m = Py_InitModule3("metatype", methods, "provides a metatype");
    d = PyModule_GetDict(m);
    PyDict_SetItemString(d, "mytype", (PyObject *)&MyMeta_Type);
}

From arigo@ulb.ac.be Fri Nov 16 18:23:08 2001
From: arigo@ulb.ac.be (Armin Rigo)
Date: Fri, 16 Nov 2001 19:23:08 +0100
Subject: [Python-Dev] Psyco documentation
Message-ID: <000901c16ecb$e7f082c0$13ce043e@oemcomputer>

Hello,

I've written some docs to get interested people started in understanding
Psyco. It's on the site

    http://homepages.ulb.ac.be/~arigo/psyco/

where you will also find an update of Psyco (now supporting finally: and
except: clauses, and with a few minor improvements that accelerated the
Mandelbrot example quite a bit).
The new and old documents are both available in HTML and Postscript. Please let me know if you tested some (reasonably small) examples and got a SIGSEGV or a failed assertion, or just plain wrong behavior. Thanks, Armin. From gward@python.net Fri Nov 16 19:24:58 2001 From: gward@python.net (Greg Ward) Date: Fri, 16 Nov 2001 14:24:58 -0500 Subject: [Python-Dev] CRLF in email.Generator In-Reply-To: <20011116121218.A1832@ibook.distro.conectiva> References: <20011116121218.A1832@ibook.distro.conectiva> Message-ID: <20011116142458.A31403@gerg.ca> On 16 November 2001, Gustavo Niemeyer said [regarding the 'email' package] > I've been looking at the Generator module. It seems to be writing > the message with: > > print >> self._fp, "text" > > This should probably be replaced by: > > self._fp.write("text"+"\r\n") This is true if, say, _fp is a socket connected to an SMTP server. But conventional usage (eg. my ~/Mailbox) dictates that lines end with "\n" in a Unix filesystem. Perhaps the Right Answer is: self._fp.write("text" + nl) where the value of nl ("\n", "\r\n", "\r", whatever) depends on the platform and the target of the message (file, SMTP server, whatever). Greg -- Greg Ward - programmer-at-large gward@python.net http://starship.python.net/~gward/ And now for something completely different. From barry@zope.com Fri Nov 16 20:52:57 2001 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 16 Nov 2001 15:52:57 -0500 Subject: [Python-Dev] RELEASED: Python 2.2b2 is out! Message-ID: <15349.31913.391080.615780@anthem.wooz.org> Today we release Python 2.2b2, the second -- and probably last -- beta release of Python 2.2, for your enervation, elocution, and effervescence. http://www.python.org/2.2/ Our thanks to everyone who is testing out these beta releases! Please continue to report any bug you find to the bug tracker: http://sourceforge.net/bugs/?group_id=5470 Highlights of what's new for this release are outlined below. 
For a more complete list, please see: http://sf.net/project/shownotes.php?release_id=61669 Andrew Kuchling is writing a gentle introduction to the most important changes, titled "What's New in Python 2.2": http://www.amk.ca/python/2.2/ Guido's written his own introduction to the type/class unification at: http://www.python.org/2.2/descrintro.html python-is-so-proud-of-little-brother-orlijn-ly y'rs, -Barry -------------------- snip snip -------------------- - Lots of bug fixes! - Multiple inheritance mixing new-style and classic classes in the list of base classes is now allowed. - The new builtin dictionary() constructor, and dictionary type, have been renamed to dict. This reflects a decade of common usage. - dict() now accepts an iterable object producing 2-sequences. For example, dict(d.items()) == d for any dictionary d. The argument, and the elements of the argument, can be any iterable objects. - New-style classes can now have a __del__ method, which is called when the instance is deleted (just like for classic classes). - Assignment to object.__dict__ is now possible, for objects that are instances of new-style classes that have a __dict__ (unless the base class forbids it). - The socket function has been converted to a type; see below. - Assignment to __debug__ raises SyntaxError at compile-time. This was promised when 2.1c1 was released as "What's New in Python 2.1c1". - mmap has a new keyword argument, "access", allowing a uniform way for both Windows and Unix users to create read-only, write-through and copy-on-write memory mappings. - By default, the gc.garbage list now contains only those instances in unreachable cycles that have __del__ methods. - The socket module defines a new method for socket objects, sendall(). - Symbolic group names in regular expressions must be unique. - We've finally confirmed that this release builds on HP-UX 11.00, *with* threads, and passes the test suite. 
- Thanks to a series of patches from Michael Muller, Python may build again under OS/2 Visual Age C++. - Updated RISCOS port by Dietmar Schwertberger. From tim.one@home.com Fri Nov 16 21:12:43 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 16 Nov 2001 16:12:43 -0500 Subject: [Python-Dev] Re: PEP 276 Simple Iterator for ints In-Reply-To: <15349.2209.107415.77922@beluga.mojam.com> Message-ID: [Skip Montanaro] > The Haskell syntax has been bandied about on c.l.py for the past > several days ... > > Haskell has a list constructor that maps pretty nicely onto > Python's range() and xrange() functions: > > [ exp1 [, exp2] .. [exp3] ] The mapping isn't so nice -- range/xrange are great when the argument is len(sequence), but stuff like range(len(n)-1, -1, -1) is error-prone. Write an array merge sort in Python and count the off-by-2**i errors . > Since Haskell is lazy in its evaluation, infinite sequences can be > constructed easily: > > [ 0 .. ] # all natural numbers In practice I think that's more often written: [0, 1 ..] but I've seen both. The [exp ..] form is more common when "exp" is an expression instead of a literal (presumably because writing [exp, exp+1 ..] then is more of a PITA). > More common usage in Python would be something like > > [ 0 .. 5 ] # [0, 1, 2, 3, 4, 5] > > or > > [ 0, 3 .. 27 ] # xrange(0, 28, 3) > > I believe (Tim can correct me if I'm wrong), but when you use these > constructors to build finite sequences, they are closed at both ends, > just as Python lists, and unlike [x]range(), Yes, finite flavors are closed at both ends; note that can include empty cases, like: [10 .. 0] and [10, 11 .. 0] > hence the different endpoints in the last example. This seems to be a > sticking point for people on c.l.py who see the lack of complete > equivalence with [x]range as a problem. If they were trivially equivalent to range/xrange, why bother <0.5 wink>? 
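[Editor's sketch] The closed-at-both-ends forms discussed above can be mapped onto range() roughly as follows. This is a minimal sketch assuming modern Python 3 rather than the 2.2 beta under discussion; the helper names enum_from_to and enum_from_then_to are hypothetical (borrowed from the Haskell method names) and are not part of Python. Only the finite forms are covered.

```python
def enum_from_then_to(first, then, last):
    """Finite Haskell-style [first, then .. last], closed at both ends."""
    step = then - first
    if step == 0:
        # Assumption of this sketch: refuse a zero step instead of
        # producing an endless sequence.
        raise ValueError("first and then must differ")
    # range() excludes its stop value, so move the stop one past `last`
    # in the direction of travel to make the far end inclusive.
    stop = last + 1 if step > 0 else last - 1
    return list(range(first, stop, step))


def enum_from_to(first, last):
    """Finite Haskell-style [first .. last], stepping by one."""
    return enum_from_then_to(first, first + 1, last)


threes = enum_from_then_to(0, 3, 27)     # same values as range(0, 28, 3)
empty_up = enum_from_to(10, 0)           # ascending step, start past the end
countdown = enum_from_then_to(10, 9, 0)  # descending, both ends included
```

The stop adjustment is the whole trick, and it is also the part that is easy to get wrong by hand.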
Note that David Eppstein kicked off this round via searching for a cleaner notation for integer sequence to use in his classroom lectures and handouts (in which Python is used as a pseudo-language, not a real one -- and that his students grasp [i, j ..] at once is the important point for him).

> I see them as (effectively) list constructors, and expect them to be
> closed at both ends.

Ya, and I think it's impossible not to see them that way. IIRC, someone once had the hideous suggestion of doing something like .. to mean semi-open and ... to mean closed.

> ...
> I see a few advantages to this list constructor over [x]range():
>
>   * I think it would be a bit easier to explain to new users than
>     range.
>
>     I think most people have seen sequences in math like
>
>         [ x_1, x_2, ..., x_n ]
>
>     and would thus find the notation familiar. Admittedly, [x]range()
>     isn't that difficult, however.

If what you're *thinking* is i, j .. k, the transformation to an equivalent range() isn't trivial; e.g., if j > i is known, range(i, k+1, j-i) does the trick except when k == sys.maxint and/or j-i overflows; but in any case it's simply not what you're *thinking*.

>   * It would expose potentially significant optimizations that
>     can't be made today by eliminating the attribute lookup and
>     function call to range, and thus getting rid of that little bit
>     of dynamism nobody ever uses anyway.

As I said last time this went around, I doubt the optimizations are significant: loop overhead is generally at worst a handful-of-percent thing in real programs, and the cost of one global lookup+call per entire loop (not per loop *trip*) isn't even measurable outside contrived examples.

>   * I think we could easily extend the notation to build sequences of
>     other basic types: floats and single-character strings being the
>     most obvious:
>
>         [ "a", "c" .. "z"]
>
>         [ 0.0, 0.1 ..
2*math.pi ] -1 on floats: unless floats are rationals, it's ill-defined, and like it or not users are better off coding float steps themselves so they explicitly control the rounding errors in ways best for their apps. (BTW, repeated addition is almost exactly the worst thing to do for fp sequences; doing starting_value + iteration_ordinal * step_value on each trip is usually much better behaved for floats.) Note that any Enum type in Haskell can use this notation. The four syntactic forms map to Enum methods like so: [e1 ..] <-> enumFrom e1 [e1, e2 ..] <-> enumFromThen e1 e2 [e1 .. e3] <-> enumFromTo e1 e3 [e1, e2 .. e3] <-> enumFromThenTo e1 e2 e3 > Python already has a "..." operator, so I would propose that be > used instead of Haskell's "..". Certainly a better idea than using "()" . From barry@zope.com Fri Nov 16 22:10:08 2001 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 16 Nov 2001 17:10:08 -0500 Subject: [Python-Dev] Re: CRLF in email.Generator References: <20011116121218.A1832@ibook.distro.conectiva> Message-ID: <15349.36544.702394.524130@anthem.wooz.org> >>>>> "GN" == Gustavo Niemeyer writes: GN> Hello Barry!! Hello Gustavo! GN> First, I'd like to thank you for implementing the email GN> package. It seems very complete and will be useful for sure. Thanks! GN> I've been looking at the Generator module. It seems to be GN> writing the message with: GN> print >> self._fp, "text" GN> This should probably be replaced by: GN> self._fp.write("text"+"\r\n") GN> Print's end of line will depend on the system where python is GN> being run. If you run it in Linux, it will output just a GN> "\n". This breaks RFC2822: Actually, because the email package was designed to be used with something like smtplib for the actual sending of the message to an SMTP server, it always uses native line endings. It's the job of smtplib to convert from native to RFC 2821 line endings (and in fact, it does just that!). 
Similarly, your mail server would be responsible for converting from RFC 2821 line endings to native line endings if it were going to store a message in a mailbox, for example, that you'd want to parse using email.Parser. I'll make sure this is clarified in the email package's documentation. GN> As a side note, I'd like to suggest the inclusion of some kind GN> of "raw_input". There are cases where you want to see the raw GN> message (or just part of it), instead of regenerating it. This GN> happens, for example, when you want to check a signed GN> multipart message. I'm not sure I understand what you mean by "raw_input". Do you mean something like the HeaderParser class? Cheers, -Barry From niemeyer@conectiva.com Sat Nov 17 03:50:14 2001 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Sat, 17 Nov 2001 01:50:14 -0200 Subject: [Python-Dev] Re: CRLF in email.Generator In-Reply-To: <15349.36544.702394.524130@anthem.wooz.org> References: <20011116121218.A1832@ibook.distro.conectiva> <15349.36544.702394.524130@anthem.wooz.org> Message-ID: <20011117015014.A896@ibook.distro.conectiva> --KsGdsel6WgEHnImy Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi Berry! Thanks for your prompt answer! > Actually, because the email package was designed to be used with > something like smtplib for the actual sending of the message to an > SMTP server, it always uses native line endings. It's the job of [...] Got it.. > smtplib to convert from native to RFC 2821 line endings (and in fact, > it does just that!). Similarly, your mail server would be responsible > for converting from RFC 2821 line endings to native line endings if it > were going to store a message in a mailbox, for example, that you'd > want to parse using email.Parser. That made me wonder how signed messages would be rebuilt to their original state so that they could be checked at the other side. The answer is obvious: they don't. 
That breaks an old precept I used to have. Signed content *may* differ in some aspects from the original content. At least EOLs are modifiable.

> I'll make sure this is clarified in the email package's documentation.

Thanks.

> GN> As a side note, I'd like to suggest the inclusion of some kind
> GN> of "raw_input". There are cases where you want to see the raw
> GN> message (or just part of it), instead of regenerating it. This
> GN> happens, for example, when you want to check a signed
> GN> multipart message.
>
> I'm not sure I understand what you mean by "raw_input". Do you mean
> something like the HeaderParser class?

Let me try again. Here's how a multipart signed message works: you have two parts, one is the whole signed message (maybe another multipart message, perhaps with more signatures), the other is the signature. In order to check if the second part is a good signature for the first part, it must contain exactly the same thing as it used to have when the signature was created (besides EOLs ;-).

Something like a Message.raw_input could be used in these cases. You'd still have the same structure as you have now, with the headers, and the parsed payload. But the message, and more important, the payload of each multipart message (which is a message, of course), would contain the original and unparsed content of that part. Since this information is redundant, this behavior could be turned on with an option.

Unfortunately, if I understood the algorithm correctly, HeaderParser wouldn't help in that case for two reasons: first, the header structure must stay intact as well; and second, you still want to walk into multiparts.

I'm thinking about using something like this for an email bot I'm working on. Users will have to authenticate themselves through gpg. Of course, maybe I'm the only human being wanting something so bizarre, so if you choose not to implement, I'll understand. :-) Creating a custom parser won't be hard as well.

Thank you!
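[Editor's sketch] The "same content, besides EOLs" comparison described above could look roughly like this. It is a minimal sketch assuming modern Python 3; canonical_eol and same_ignoring_eol are hypothetical helpers, not part of the email package, and the actual signature check (handing the canonical bytes to a tool such as gpg) is not shown.

```python
import re

# Matches any of the three common line endings; CRLF is listed first so
# that it is consumed as one unit rather than as CR followed by LF.
_ANY_EOL = re.compile(rb"\r\n|\r|\n")


def canonical_eol(data, eol=b"\r\n"):
    """Return `data` with every line ending rewritten to `eol`."""
    return _ANY_EOL.sub(eol, data)


def same_ignoring_eol(left, right):
    """Compare two byte strings, treating all line endings as equal."""
    return canonical_eol(left) == canonical_eol(right)


# The same signed part as it might sit in a Unix mailbox and on the wire.
stored = b"Content-Type: text/plain\n\nHello\n"
on_wire = b"Content-Type: text/plain\r\n\r\nHello\r\n"
matches = same_ignoring_eol(stored, on_wire)
```

Everything other than the line endings still has to be byte-for-byte identical, which is why the unparsed original of each part matters.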
--
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]

--KsGdsel6WgEHnImy
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE79d52IlOymmZkOgwRAjncAKCR8owJkIP7TuO34UL29SzDVRiXEACePZtm
tWXcMp+7NNv+89XYqWhBtP0=
=t0+8
-----END PGP SIGNATURE-----

--KsGdsel6WgEHnImy--

From tim.one@home.com Sat Nov 17 21:06:48 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 17 Nov 2001 16:06:48 -0500
Subject: [Python-Dev] gc.garbage
Message-ID:

I notice the gc docs say:

    The following variable is provided for read-only access:
                                           ^^^^^^^^^^^^^^^^
    garbage
        A list of objects which the collector found to be unreachable ...

This isn't clear to me. When I find finalizer cycles in gc.garbage, I want to clean them up by breaking the cycles. Unless I remove the instances from gc.garbage too, their appearance in that list keeps them alive despite that the cycles are broken. But "read-only access" seems to imply "don't mutate", and I don't believe that was intended. True? False?

From fdrake@acm.org Sun Nov 18 05:10:58 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Sun, 18 Nov 2001 00:10:58 -0500 (EST)
Subject: [Python-Dev] [development doc updates]
Message-ID: <20011118051058.E6F6328697@beowolf.digicool.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Update to docs beyond Python 2.2 beta 2: Clarified a couple of points in the SAX API descriptions for startElement() and startElementNS(). Better description of gc.garbage value (gc module). Cleaned up & slightly modernized some sample code in the Python/C API and Extending & Embedding manuals.
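[Editor's sketch] The clean-up described in the gc.garbage question above (break the cycles, then remove the instances from the list) can be tried out as follows. This is a minimal sketch assuming modern CPython 3, where cycles with finalizers are normally collected and gc.garbage usually stays empty; the gc.DEBUG_SAVEALL flag is used here only to force unreachable objects into the list for demonstration. Node and collect_and_clean are hypothetical names.

```python
import gc


class Node:
    """A trivial object that can take part in a reference cycle."""

    def __init__(self):
        self.partner = None


def collect_and_clean():
    """Create a two-object cycle, let gc save it, then clean it up."""
    gc.collect()                      # start from a clean slate
    old_flags = gc.get_debug()
    gc.set_debug(gc.DEBUG_SAVEALL)    # assumption: demo-only debugging flag
    try:
        a, b = Node(), Node()
        a.partner, b.partner = b, a   # a <-> b reference cycle
        del a, b                      # the cycle is now unreachable
        gc.collect()                  # unreachable objects land in gc.garbage
        saved = [obj for obj in gc.garbage if isinstance(obj, Node)]
        for node in saved:
            node.partner = None       # break the cycle
        del gc.garbage[:]             # mutate the list in place to release them
        return len(saved)
    finally:
        gc.set_debug(old_flags)


saved_count = collect_and_clean()
```

The in-place `del gc.garbage[:]` is the step that matters: while an object is still in the list, the list itself keeps it alive even after its cycle has been broken.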
From skip@pobox.com (Skip Montanaro) Sat Nov 17 08:53:28 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Sat, 17 Nov 2001 09:53:28 +0100 Subject: [Python-Dev] Re: PEP 276 Simple Iterator for ints In-Reply-To: References: <15349.2209.107415.77922@beluga.mojam.com> Message-ID: <15350.9608.656552.312766@beluga.mojam.com> >> * It would expose potentially significant optimizations that can't be >> made today by eliminating the attribute lookup and function call to >> range, and thus getting rid of that little bit of dynamism nobody >> ever uses anyway. Tim> As I said last time this went around, I doubt the optimizations are Tim> significant: loop overhead is generally at worst a Tim> handful-of-percent thing in real programs, and the cost of one Tim> global lookup+call per entire loop (not per loop *trip*) isn't even Tim> measurable outside contrived examples. I was thinking more along the lines of generating C or assembler on the fly. Knowing a priori the types of the values emitted by for i in [0, 1 .. 10] are ints means that the code generator can mostly avoid dealing with "i" as a Python int and instead use native ints. In for i in range(11): you'd have to examine the output of range() and for for i in xrange(11): it might be "too difficult" (whatever that means) to ascertain the types of the elements. ... Tim> (BTW, repeated addition is almost exactly the worst thing to do for Tim> fp sequences; doing starting_value + iteration_ordinal * step_value Tim> on each trip is usually much better behaved for floats.) Yes, that's what I had in mind. 
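[Editor's sketch] The two ways of generating a float sequence mentioned in the quoted remark above (repeated addition versus start + ordinal * step) can be compared directly. This is a minimal sketch assuming modern Python 3; by_accumulation and by_multiplication are hypothetical helper names.

```python
def by_accumulation(start, step, count):
    """Build the sequence by adding `step` repeatedly (error accumulates)."""
    values = []
    current = start
    for _ in range(count):
        values.append(current)
        current += step
    return values


def by_multiplication(start, step, count):
    """Build the sequence as start + i * step (one rounding per element)."""
    return [start + i * step for i in range(count)]


accumulated = by_accumulation(0.0, 0.1, 11)
multiplied = by_multiplication(0.0, 0.1, 11)

# Largest disagreement between the two methods over the whole sequence.
worst_gap = max(abs(a - m) for a, m in zip(accumulated, multiplied))
```

With repeated addition each element inherits the rounding error of every element before it, so the final value can miss the intended endpoint; the multiplied form rounds each element independently.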
Skip From greg@cosc.canterbury.ac.nz Mon Nov 19 05:10:03 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 19 Nov 2001 18:10:03 +1300 (NZDT) Subject: [Python-Dev] PEP 276 Simple Iterator for ints In-Reply-To: Message-ID: <200111190510.SAA13737@s454.cosc.canterbury.ac.nz> Armin Rigo : > As a logician I like "for i in 10" a lot -- indeed, it is common to > identify a natural number with the set of its predecessors. Only amongst a certain specialised breed of mathematician, I think. I'm sure most people don't think of the integers that way when actually using them in their day-to-day work. I certainly don't. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Mon Nov 19 05:14:28 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 19 Nov 2001 18:14:28 +1300 (NZDT) Subject: [Python-Dev] switch-based programming in Python In-Reply-To: Message-ID: <200111190514.SAA13740@s454.cosc.canterbury.ac.nz> Paul Svensson : > Yes, it should work exactly like the loops do; > The `else' clause is entered when the (explicit or implicit) test > of the main statement fails. Then I misunderstood you, which I think illustrates exactly why trying to make any analogy at all with loops is a bad idea -- different people will make the analogy in different ways. My way of making the analogy would have had the "else" clause always executing in addition to one of the cases, unless a "break" was executed! If and switch statements are not loops, and trying to pretend they are would be confusing and error-prone, IMO. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+

From arigo@ulb.ac.be Mon Nov 19 13:37:13 2001
From: arigo@ulb.ac.be (Armin Rigo)
Date: Mon, 19 Nov 2001 14:37:13 +0100 (MET)
Subject: [Python-Dev] Re: PEP 276 Simple Iterator for ints
In-Reply-To: <15350.9608.656552.312766@beluga.mojam.com>
Message-ID:

Hello Skip,

On Sat, 17 Nov 2001, Skip Montanaro wrote:
> I was thinking more along the lines of generating C or assembler on the
> fly.

If you have Psyco in mind the difference is not worth any lengthy debate about the syntax of Python. It already "knows" that range() always produce ints and letting it know that xrange() does as well is not a problem. From my very selfish point of view I would rather say that [0 .. 10] would probably introduce a new byte-code instruction, which is more hassle to support than a built-in function :-)

range() also has (not-yet-exploited) side benefits, e.g. we know that in the one-argument form it will always produce positive integers (tagging integers as "necessarily positive" would allow some minor optimizations in constructions like "sequence[i]").

A bientôt,

Armin.

From jim@interet.com Mon Nov 19 16:02:35 2001
From: jim@interet.com (James C. Ahlstrom)
Date: Mon, 19 Nov 2001 11:02:35 -0500
Subject: [Python-Dev] Final Version of PEP 273, Zip Importing
References: <3BF3CE76.8A220F7A@interet.com>
Message-ID: <3BF92D1B.4679048E@interet.com>

"James C. Ahlstrom" wrote:
> The final version (unless someone objects) of the Zip
> importing design is available at
>
> http://python.sourceforge.net/peps/pep-0273.html
>

And the "final" code patches are patch number 483466.

JimA

From tim.one@home.com Mon Nov 19 20:42:49 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 19 Nov 2001 15:42:49 -0500
Subject: [Python-Dev] Replacing __slots__ with addmembers()
In-Reply-To: <3BF0ED4B.163CF4FF@lemburg.com>
Message-ID:

[M.-A. Lemburg]
> ...
> True, but I'm also thinking about writing new code in Python
> which uses these features and there I don't see the stability
> of the API just yet (but would really like them to stabilize
> *before* 2.2 moves out the door and even if this means waiting
> until after Christmas ;-).

I tried to explain before that time isn't linear : "after Christmas" probably means "next spring at earliest", and we just aren't going to let that happen.
I can only suggest you pretend we didn't release the new features, then, in any aspect which appears unstable -- you're not required to use them. While new ways to spell __slots__ and property() and staticmethod() and classmethod() and super may get invented later, the 2.2 ways won't go away. So that's focusing on the wrong things.

The set of things that *may* break appears impossible to characterize. For example, exactly how conflicts in multiple inheritance get resolved, or exactly when an override of __getitem__ in a subclass of dict will get invoked (e.g., currently not for .update()), or that referencing an uninitialized __slot__ attr currently returns None instead of raising NameError, or that e.g.

>>> class myint(int):
...     def __init__(self, val):
...         int.__init__(val * 2)
...
>>> i = myint(3)
>>> i
3
>>>

without warning, etc. I don't know how to characterize these: they have nothing in common, apart from being arguable. But *all* language behavior is arguable, so that's not a clue either -- what's different in this case is that Guido is still open to arguing about the new stuff. That means it's all experimental, to some degree.

From robin@alldunn.com Mon Nov 19 22:29:30 2001
From: robin@alldunn.com (Robin Dunn)
Date: Mon, 19 Nov 2001 14:29:30 -0800
Subject: [Python-Dev] NDEBUG in Python.h
Message-ID: <031301c17149$a8a484b0$0100a8c0@Rogue>

Hi everyone (python-dev and wx-dev),

I'm wondering if any other extension writers are having problems with this code that was added to Python.h for 2.2?

#ifndef Py_DEBUG
#ifndef NDEBUG
#define NDEBUG 1
#endif
#endif

The problem is that if you want to link debug libraries with a non-debug Python then the extension module code will see the above in Python.h and it will possibly affect what it expects in the debug library it is linked with. In my case I end up with differences in the expected vs. actual vtables and so the wrong virtual methods are called followed shortly by a core dump.
If I have to I can make changes in my code or maybe in wxWindows to get around this, but it was my understanding that NDEBUG should always be set in the compile options and never in header files to avoid this very problem.

Comments?

--
Robin Dunn
Software Craftsman
robin@AllDunn.com       Java give you jitters?
http://wxPython.org     Relax with wxPython!

From tim.one@home.com Mon Nov 19 23:10:56 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 19 Nov 2001 18:10:56 -0500
Subject: [Python-Dev] NDEBUG in Python.h
In-Reply-To: <031301c17149$a8a484b0$0100a8c0@Rogue>
Message-ID:

[Robin Dunn]
> I'm wondering if any other extension writers are having problems with
> this code that was added to Python.h for 2.2?
>
> #ifndef Py_DEBUG
> #ifndef NDEBUG
> #define NDEBUG 1
> #endif
> #endif
> ...
> In my case I end up with differences in the expected vs. actual
> vtables and so the wrong virtual methods are called followed shortly
> by a core dump.
>
> If I have to I can make changes in my code or maybe in wxWindows to get
> around this, but it was my understanding that NDEBUG should
> always be set in the compile options and never in header files to avoid
> this very problem.

The Windows build has always defined NDEBUG in release builds via compile options. The Linux build never did (via compile option or anything else), so release builds on Linux (and presumably all other non-Windows boxes) contained code for assert() calls. The number of assert()s in the Python codebase has zoomed since I got commit privileges , so this was getting to be a significant expense. Guido added the #ifdef business to give the non-Windows platforms a release-build speed boost. I agree it's better done via command-line option; that's more work, though, and would require fiddling the build process for every non-Windows platform.
fine-by-me-so-long-as-i-don't-have-to-do-it-ly y'rs - tim

From robin@alldunn.com Tue Nov 20 00:37:32 2001
From: robin@alldunn.com (Robin Dunn)
Date: Mon, 19 Nov 2001 16:37:32 -0800
Subject: [wx-dev] RE: [Python-Dev] NDEBUG in Python.h
References:
Message-ID: <034d01c1715b$8b8f6e00$0100a8c0@Rogue>

> [Robin Dunn]
> > I'm wondering if any other extension writers are having problems with
> > this code that was added to Python.h for 2.2?
> >
> > #ifndef Py_DEBUG
> > #ifndef NDEBUG
> > #define NDEBUG 1
> > #endif
> > #endif
> > ...
> > In my case I end up with differences in the expected vs. actual
> > vtables and so the wrong virtual methods are called followed shortly
> > by a core dump.
> >
> > If I have to I can make changes in my code or maybe in wxWindows to get
> > around this, but it was my understanding that NDEBUG should
> > always be set in the compile options and never in header files to avoid
> > this very problem.
>
> The Windows build has always defined NDEBUG in release builds via compile
> options. The Linux build never did (via compile option or anything else),
> so release builds on Linux (and presumably all other non-Windows boxes)
> contained code for assert() calls. The number of assert()s in the Python
> codebase has zoomed since I got commit privileges , so this was
> getting to be a significant expense. Guido added the #ifdef business to
> give the non-Windows platforms a release-build speed boost. I agree it's
> better done via command-line option; that's more work, though, and would
> require fiddling the build process for every non-Windows platform.
>

Okay, since I don't know enough about configure and etc. to submit a patch I'll try to work around it on my end. OTOH, it seems like leaving the #define there in Python.h would just be a problem waiting around to bite somebody else in the butt down the road...
This would be an ugly (but very easy) fix, and would save the next guy from spending two days pulling his hair out:

#ifndef Py_DEBUG
#ifndef NDEBUG
#define NDEBUG 1
#define Py_UNDO_NDEBUG 1
#endif
#endif
#include <assert.h>
#ifdef Py_UNDO_NDEBUG
#undef NDEBUG
#undef Py_UNDO_NDEBUG
#endif

Or perhaps the #define and the #include should just be moved to a private header that is only included by Python's .c files and is not meant for public consumption.

Sorry I don't submit it as an official patch, I don't follow python-dev enough to know what would be the Right Way to do it that would keep the most people happy...

--
Robin Dunn
Software Craftsman
robin@AllDunn.com       Java give you jitters?
http://wxPython.org     Relax with wxPython!

From robin@alldunn.com Tue Nov 20 00:53:48 2001
From: robin@alldunn.com (Robin Dunn)
Date: Mon, 19 Nov 2001 16:53:48 -0800
Subject: [Python-Dev] PyMethodObject::im_class
Message-ID: <035501c1715d$d0d14f40$0100a8c0@Rogue>

In wxPython the code I use for reflecting virtual C++ method calls to Python methods checks first that the method is implemented in a derived class otherwise it just calls the C++ base class version. Here is a bit of the code:

    PyObject* method;
    method = PyObject_GetAttrString(m_self, (char*)name);
    if (PyMethod_Check(method) &&
        ((PyMethod_GET_CLASS(method) == m_class) ||
         PyClass_IsSubclass(PyMethod_GET_CLASS(method), m_class))) {

This worked fine prior to Python 2.2 (the earliest I tried was 2.2b1) and the PyMethod_GET_CLASS macro would give me the actual class that the method was defined in. Unfortunately in 2.2 it appears that the class object returned is now the class of m_self which in my test case is the derived class which does not define the method.
This issue is further illustrated by the following Python code: class A: def spam(self): print "spam" class B(A): pass import sys print sys.version b = B() print b.spam Which, when run with Python 2.1.1 and Python 2.2b2 gives this output: storm:robind> p21 /tmp/method.py 2.1.1 (#1, Aug 30 2001, 17:36:05) [GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.61mdk)] storm:robind> p22 /tmp/method.py 2.2b2 (#1, Nov 19 2001, 13:36:25) [GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)] > Is there something different I can do in 2.2 to give me the class that the method is actually defined in? BTW, the comments in classobject.h still say "The class that defined the method". -- Robin Dunn Software Craftsman robin@AllDunn.com Java give you jitters? http://wxPython.org Relax with wxPython! From loewis@informatik.hu-berlin.de Tue Nov 20 16:50:17 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Tue, 20 Nov 2001 17:50:17 +0100 (MET) Subject: [Python-Dev] gc.get_referents Message-ID: <200111201650.fAKGoHl04460@paros.informatik.hu-berlin.de> In http://sourceforge.net/tracker/index.php?func=detail&aid=483815&group_id=5470&atid=105470 Zooko complains that "referent" can either mean "referrer" or "referree", and proposes to rename it to get_referrers. Sounds good to me, but 1. EINMNL (English is not my native language), and 2. We are past 2.2beta. Still, I'm willing to change the code and documentation, if others think it is a good idea to change it now. It probably isn't a good idea to change it later, since that would produce incompatibilities. Regards, Martin From barry@zope.com Tue Nov 20 17:32:01 2001 From: barry@zope.com (Barry A. 
Warsaw) Date: Tue, 20 Nov 2001 12:32:01 -0500 Subject: [Python-Dev] gc.get_referents References: <200111201650.fAKGoHl04460@paros.informatik.hu-berlin.de> Message-ID: <15354.37777.132607.809968@anthem.wooz.org> >>>>> "MvL" == Martin von Loewis writes: MvL> In MvL> http://sourceforge.net/tracker/index.php?func=detail&aid=483815&group_id=5470&atid=105470 MvL> Zooko complains that "referent" can either mean "referrer" or MvL> "referree", and proposes to rename it to get_referrers. | Sounds good to me, but | 1. EINMNL (English is not my native language), and | 2. We are past 2.2beta. MvL> Still, I'm willing to change the code and documentation, if MvL> others think it is a good idea to change it now. It probably MvL> isn't a good idea to change it later, since that would MvL> produce incompatibilities. I'd consider this a bug then and approve changing it for 2.2rc1. ...Just looked at the SF bug and Guido agrees. Neil or Fred, please make it so! -Barry From nas@python.ca Tue Nov 20 17:52:34 2001 From: nas@python.ca (Neil Schemenauer) Date: Tue, 20 Nov 2001 09:52:34 -0800 Subject: [Python-Dev] gc.get_referents In-Reply-To: <15354.37777.132607.809968@anthem.wooz.org>; from barry@zope.com on Tue, Nov 20, 2001 at 12:32:01PM -0500 References: <200111201650.fAKGoHl04460@paros.informatik.hu-berlin.de> <15354.37777.132607.809968@anthem.wooz.org> Message-ID: <20011120095234.A12526@glacier.arctrix.com> Barry A. Warsaw wrote: > I'd consider this a bug then and approve changing it for 2.2rc1. > ...Just looked at the SF bug and Guido agrees. I see no reason why this can't wait for post 2.2. Neil From oren-py-d@hishome.net Tue Nov 20 17:52:26 2001 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 20 Nov 2001 12:52:26 -0500 Subject: [Python-Dev] Speeding up name lookups Message-ID: <20011120125226.A32574@hishome.net> Step 1. Define name type The type 'name' looks and behaves just like a string. 
Internally, it contains a reference to a string, a pointer to a dictionary (nm_dict) and an integer (nm_index). The field nm_dict points to a dictionary where this name is used as a key. nm_index is the index into the hash table of the specific dictentry where this name is a key. The validity of this cached data is verified before using it. Step 2. Convert co_names to names When initializing a code object the co_names tuple is converted from strings to names. A new name object is always allocated for each item in co_names even if the string it contains was already interned somewhere else. Step 3. PyDict_GetItem_ByName This function should be used instead of PyDict_GetItem where lookups are guaranteed to be performed with name objects as keys and performance is critical: LOAD_NAME and LOAD_GLOBAL in eval_frame. It may be inlined. PyObject * PyDict_GetItem_ByName(PyObject *op, PyObject *key) { dictobject *mp = (dictobject *)op; nameobject *nm = (nameobject *)key; dictentry *ep; if (nm->nm_dict == mp) { ep = &(mp->ma_table[nm->nm_index & mp->ma_mask]); if (ep->me_key == key) return ep->me_value; } return PyDict_GetItem(op,key); } Step 4. modify PyDict_GetItem The above function takes care of cache hits. When a cache miss occurs the regular PyDict_GetItem should check whether the key is a name and initialize nm_dict and nm_index for subsequent lookups. Step 5. PyDict_SetItem_ByName Similar to PyDict_GetItem_ByName. A cache hit means that the key was already in the dictionary and only the associated value is modified. Creating new keys and other cache misses are handled by PyDict_SetItem. Should be used in STORE_NAME and STORE_GLOBAL. Step 6. modify PyDict_SetItem Initialize cache inside name objects for fast lookups by name. This appears to be a low-risk modification because it affects only a small part of the code and the cache validity is always verified.
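The steps above can be sketched in pure Python. This is only an illustration under stated assumptions, not the proposed C patch: Name, SlotTable and get_item_by_name are hypothetical names, the table is a toy open-addressing table that never resizes, and the code uses current Python syntax rather than 2.2. What it shows is the verification idea: the cached slot index is trusted only if the slot still holds the very same name object, otherwise the ordinary lookup runs and primes the cache again.

```python
class Name:
    """Hypothetical stand-in for the proposed 'name' type."""

    def __init__(self, text):
        self.text = text
        self.nm_dict = None   # table where this name was last found
        self.nm_index = 0     # slot index inside that table

    def __hash__(self):
        return hash(self.text)

    def __eq__(self, other):
        return isinstance(other, Name) and self.text == other.text


class SlotTable:
    """Toy open-addressing table; assumes it never fills up (no resizing)."""

    def __init__(self, size=8):
        self.mask = size - 1           # size must be a power of two
        self.slots = [None] * size     # each slot is None or [key, value]

    def _find(self, key):
        i = hash(key) & self.mask
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) & self.mask    # linear probing
        return i

    def set_item(self, key, value):
        i = self._find(key)
        if self.slots[i] is None:
            self.slots[i] = [key, value]
        else:
            self.slots[i][1] = value

    def get_item(self, key):
        """Slow path; primes the cache inside Name keys (Step 4)."""
        i = self._find(key)
        if self.slots[i] is None:
            raise KeyError(key)
        if isinstance(key, Name):
            key.nm_dict, key.nm_index = self, i
        return self.slots[i][1]


def get_item_by_name(table, name):
    """Fast path (Step 3): use the cached slot only after verifying it."""
    if name.nm_dict is table:
        entry = table.slots[name.nm_index & table.mask]
        # Identity test mirrors the pointer comparison in the C sketch.
        if entry is not None and entry[0] is name:
            return entry[1]
    return table.get_item(name)
```

Because the fast path compares the stored key by identity, a name object that is equal to the stored key but not the same object always falls through to the slow path in this sketch.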
It has a slight performance impact on the speed of the regular GetItem and SetItem because they need to check if their key is a name and initialize the cache inside the name object. This should be more than offset by the improvement in name lookup speed. This may be corrected by adding a third lookup method in addition to lookdict and lookdict_string. It is possible to speed up PyDict_GetItem_ByName even further by caching the value instead of the hash table index: PyObject * PyDict_GetItem_ByName(PyObject *op, PyObject *key) { nameobject *nm = (nameobject *)key; if (nm->nm_dict == op) return nm->nm_value; else return PyDict_GetItem(op,key); } The penalty is that GetItem and SetItem will need to carefully invalidate or update the cached values inside name objects because the fast lookup function cannot verify validity. This also doesn't handle SetItem so the table index will need to be stored inside the name object as well. I'd like to hear what you think about this proposal. I believe it should address many of the issues that PEP 266 and PEP 267 were trying to solve. In which cases will the cached values be thrashed by two dictionaries that use the same name as a key? Oren Tirosh From barry@zope.com Tue Nov 20 18:19:52 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 20 Nov 2001 13:19:52 -0500 Subject: [Python-Dev] gc.get_referents References: <200111201650.fAKGoHl04460@paros.informatik.hu-berlin.de> <15354.37777.132607.809968@anthem.wooz.org> <20011120095234.A12526@glacier.arctrix.com> Message-ID: <15354.40648.12060.316702@anthem.wooz.org> >>>>> "NS" == Neil Schemenauer writes: NS> Barry A. Warsaw wrote: >> I'd consider this a bug then and approve changing it for >> 2.2rc1. ...Just looked at the SF bug and Guido agrees. NS> I see no reason why this can't wait for post 2.2. Except that you'll have to support both names for backwards compatibility. 
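What "supporting both names" usually amounts to can be sketched as a deprecated alias. This is a generic illustration in current Python, not the gc module's code: deprecated_alias, find_referrers and find_referents are hypothetical names chosen for the example, and the referrer search is a trivial stand-in.

```python
import functools
import warnings


def deprecated_alias(func, old_name):
    """Return a wrapper that forwards to func and warns about old_name."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name}() is deprecated; use {func.__name__}() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return func(*args, **kwargs)
    wrapper.__name__ = old_name
    return wrapper


def find_referrers(obj, candidates):
    """Hypothetical stand-in: the containers in candidates that hold obj."""
    return [c for c in candidates if any(item is obj for item in c)]


# The old spelling keeps working but tells callers to move on.
find_referents = deprecated_alias(find_referrers, "find_referents")
```

The cost Barry points at is visible here: once a release ships with the old name, the alias and its warning have to be carried for some time.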
-Barry From nas@python.ca Tue Nov 20 18:48:22 2001 From: nas@python.ca (Neil Schemenauer) Date: Tue, 20 Nov 2001 10:48:22 -0800 Subject: [Python-Dev] gc.get_referents In-Reply-To: <15354.40648.12060.316702@anthem.wooz.org>; from barry@zope.com on Tue, Nov 20, 2001 at 01:19:52PM -0500 References: <200111201650.fAKGoHl04460@paros.informatik.hu-berlin.de> <15354.37777.132607.809968@anthem.wooz.org> <20011120095234.A12526@glacier.arctrix.com> <15354.40648.12060.316702@anthem.wooz.org> Message-ID: <20011120104822.A12688@glacier.arctrix.com> Barry A. Warsaw wrote: > Except that you'll have to support both names for backwards > compatibility. Most of the stuff in the gc module is low-level, highly dependent on the GC implementation and may change significantly between releases. The only APIs that you can really count on are gc.enable(), gc.disable(), gc.isenabled(), and gc.collect(). People should not be using gc.get_referents() in normal code. On a related note, the gc.DEBUG_LEAK flag should probably be fixed. It's my understanding that the interpreter can create reference cycles if certain new features (like nested scopes) are used. If that's true then some of the GC debugging options will need to be rationalized. Neil From barry@zope.com Tue Nov 20 18:51:51 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 20 Nov 2001 13:51:51 -0500 Subject: [Python-Dev] gc.get_referents References: <200111201650.fAKGoHl04460@paros.informatik.hu-berlin.de> <15354.37777.132607.809968@anthem.wooz.org> <20011120095234.A12526@glacier.arctrix.com> <15354.40648.12060.316702@anthem.wooz.org> <20011120104822.A12688@glacier.arctrix.com> Message-ID: <15354.42567.436394.576799@anthem.wooz.org> >>>>> "NS" == Neil Schemenauer writes: >> Except that you'll have to support both names for backwards >> compatibility. NS> Most of the stuff in the gc module is low-level, highly NS> dependent on the GC implementation and may change NS> significantly between releases. 
The only APIs that you can NS> really count on are gc.enable(), gc.disable(), gc.isenabled(), NS> and gc.collect(). People should not be using NS> gc.get_referents() in normal code. Fair enough. It's your call. It doesn't appear to be documented, so you have an out. :) -Barry From oren-py-l@hishome.net Tue Nov 20 21:01:46 2001 From: oren-py-l@hishome.net (Oren Tirosh) Date: Tue, 20 Nov 2001 16:01:46 -0500 Subject: [Python-Dev] Speeding up name lookups In-Reply-To: ; from jason@crash.org on Tue, Nov 20, 2001 at 12:32:05PM -0600 References: <20011120125226.A32574@hishome.net> Message-ID: <20011120160146.A50087@hishome.net> On Tue, Nov 20, 2001 at 12:32:05PM -0600, Jason L. Asbahr wrote: > Oren, > > This sounds like a really clever idea. Have you tried it yet? > Benchmarking would be cool. Nothing would make the argument for > this change like some blazing performance numbers. :-) > > Jason No, I haven't tried it yet. I am not familiar enough with the internal Python data structures to do something like write a wrapper for the string type. Anyone wants to help? Oren From mal@lemburg.com Wed Nov 21 12:15:36 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 21 Nov 2001 13:15:36 +0100 Subject: [Python-Dev] New class/type features (Replacing __slots__ with addmembers()) References: Message-ID: <3BFB9AE8.AAE549BF@lemburg.com> Tim Peters wrote: > > [M.-A. Lemburg] > > ... > > True, but I'm also thinking about writing new code in Python > > which uses these features and there I don't see the stability > > of the API just yet (but would really like them to stabilize > > *before* 2.2 moves out the door and even if this means waiting > > until after Christmas ;-). > > I tried to explain before that time isn't linear : "after Christmas" > probably means "next spring at earliest", and we just aren't going to let > that happen. Why not ? Who's pushing us ? 
AFAIK, there's no marketing team with a hammer out there forcing us to publish on a fixed date :-) Hey, these new features were once proclaimed as the Grand Unification to be announced in Python 3k... this step is a very important one in Python's development. Self-imposed deadlines should not get in the way of the design process. Well, IMHO, at least. > I can only suggest you pretend we didn't release the new features, then, in > any aspect which appears unstable -- you're not required to use them. I know... but you know as well as I do that people will start using these features right away -- "experimental" features have a tradition in Python of not being experimental in the usual sense of being subject to change in future releases. People basically start experimenting and then assume that their results will remain valid for years to come and shout out loud if they don't. > While > new ways to spell __slots__ and property() and staticmethod() and > classmethod() and super may get invented later, the 2.2 ways won't go away. That's what I meant with the above note about "experimental" features in Python. > So that's focusing on the wrong things. The set of things that *may* break > appears impossible to characterize. For example, exactly how conflicts in > multiple inheritance get resolved, or exactly when an override of > __getitem__ in a subclass of dict will get invoked (e.g., currently not for > .update()), or that referencing an uninitialized __slot__ attr currently > returns None instead of raising NameError, or that e.g. > > >>> class myint(int): > ... def __init__(self, val): > ... int.__init__(val * 2) > ... > >>> i = myint(3) > >>> i > 3 > >>> > > without warning, etc. I don't know how to characterize these: they have > nothing in common, apart from being arguable. But *all* language behavior > is arguable, so that's not a clue either -- what's different in this case is > that Guido is still open to arguing about the new stuff.
That means it's > all experimental, to some degree. True, and that's why I would like to see a simple list of features which will be safe to use in Python 2.2 and which do expose a stable API. Something like: * you can subclass from types and extend them, but not override existing methods or attributes * __specialmethods__ except __init__ are not guaranteed to work the same way as they do for instances etc. Or add a disclaimer like this to the prominent announcements, web-pages, etc.: """ Extending types in Python 2.2 is possible to some extent, but currently only provided for experimentation purposes and to get more experience with the overall design. The design will become stable in Python 2.3. """ -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Wed Nov 21 15:01:04 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 21 Nov 2001 16:01:04 +0100 Subject: [Python-Dev] Speeding up name lookups References: <20011120125226.A32574@hishome.net> Message-ID: <3BFBC1B0.C093E38A@lemburg.com> Oren Tirosh wrote: > > [Define name type for lookup purposes] Just as note: Guido once proposed to cache (almost) all results of global lookups in the frame object. This causes some incompatibilities for e.g. global symbols that change their value after the first lookup. I'm not sure whether your approach goes in the same direction, but I think that we might be better off using some form of descriptor for this than using a new type. The descriptor could basically work much like what you propose, i.e. cache the container object where the symbol was found and maybe even its last value provided that the container and the found object meet some criteria of constantness. I'd suggest to write this up as a PEP. 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From jim@interet.com Wed Nov 21 15:23:57 2001 From: jim@interet.com (James C. Ahlstrom) Date: Wed, 21 Nov 2001 10:23:57 -0500 Subject: [Python-Dev] Change to sys.path[0] for Python 2.3 Message-ID: <3BFBC70D.BF911F5@interet.com> I have been working on import.c and thinking about imports generally. Currently, the directory of the Python script is inserted into sys.path[0]. For example, "python /A/B/myscript.py" creates sys.path[0] = "/A/B", and "python myscript.py" creates sys.path[0] = "". But there are three problems. This insertion occurs after a number of imports have already occurred. Specifically, it occurs after the import of site, os, and sitecustomize. This is confusing. It is clear that sys.path should not change unless the user changes it. If no path component is given, the zero length string is inserted. But if the current working directory later changes, this is no longer valid. If we want the directory of the script to be sys.path[0], then an absolute path should be inserted. If a command is entered using "-c", I don't think any insertion to sys.path should be made, as there is no indicated directory. Alternatively, the absolute path getcwd() should be inserted. If everyone agrees, I will create a patch. Jim Ahlstrom From oren-py-d@hishome.net Wed Nov 21 16:24:15 2001 From: oren-py-d@hishome.net (Oren Tirosh) Date: Wed, 21 Nov 2001 11:24:15 -0500 Subject: [Python-Dev] Speeding up name lookups In-Reply-To: <3BFBC1B0.C093E38A@lemburg.com>; from mal@lemburg.com on Wed, Nov 21, 2001 at 04:01:04PM +0100 References: <20011120125226.A32574@hishome.net> <3BFBC1B0.C093E38A@lemburg.com> Message-ID: <20011121112415.A66168@hishome.net> On Wed, Nov 21, 2001 at 04:01:04PM +0100, M.-A. 
Lemburg wrote: > Oren Tirosh wrote: > > > > [Define name type for lookup purposes] > > Just as note: Guido once proposed to cache (almost) all results > of global lookups in the frame object. This causes some > incompatibilities for e.g. global symbols that change their > value after the first lookup. My idea is different because cache coherence is guaranteed. The item cached inside the name object is a shortcut "cookie" into the entry of the real dictionary and its validity can be verified with just a few instructions. > I'm not sure whether your approach goes in the same direction, > but I think that we might be better off using some form of descriptor > for this than using a new type. The thing I like about using name objects is that it touches so few places in the code. Any other approach I have tried resulted in chasing down lots of places where cache coherency might be broken, and in serious changes to the bytecode, compiler, etc. > I'd suggest to write this up as a PEP. The idea is still evolving too fast, soon. Oren From aahz@rahul.net Wed Nov 21 16:39:29 2001 From: aahz@rahul.net (Aahz Maruch) Date: Wed, 21 Nov 2001 08:39:29 -0800 (PST) Subject: [Python-Dev] Speeding up name lookups In-Reply-To: <20011121112415.A66168@hishome.net> from "Oren Tirosh" at Nov 21, 2001 11:24:15 AM Message-ID: <20011121163929.BBF01E8C9@waltz.rahul.net> Oren Tirosh wrote: > On Wed, Nov 21, 2001 at 04:01:04PM +0100, M.-A. Lemburg wrote: >> >> I'd suggest to write this up as a PEP. > > The idea is still evolving too fast, soon. That makes it an even better candidate for a PEP. Seriously. Part of the point of the PEP process is to document the *reasons* for making decisions, so if you start recording early, people asking why you don't do things a certain way can just be referred to the PEP.
-- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From jack@oratrix.nl Wed Nov 21 16:53:55 2001 From: jack@oratrix.nl (Jack Jansen) Date: Wed, 21 Nov 2001 17:53:55 +0100 Subject: [Python-Dev] Change to sys.path[0] for Python 2.3 In-Reply-To: Message by "James C. Ahlstrom" , Wed, 21 Nov 2001 10:23:57 -0500 , <3BFBC70D.BF911F5@interet.com> Message-ID: <20011121165355.96537303183@snelboot.oratrix.nl> I'm not too thrilled about this, especially not in the 2.2 release cycle. The sys.path logic is difficult, and every time a change was made on one platform this change immediately broke things on all other platforms. For instance: on the Mac full pathnames are not necessarily unique. Or another example: on Unix you could have your working directory in /foo/bar/bletch and access ./blurgh.py there, but not access /foo/bar/bletch/blurgh.py: /foo/bar could be unsearchable for you. And the "python -c" mod I really disagree with. 'python -c "import foo"' in the directory containing foo.py is the standard way of using this, which you would break. > I have been working on import.c and thinking about > imports generally. Currently, the directory of the > Python script is inserted into sys.path[0]. For example, > "python /A/B/myscript.py" creates sys.path[0] = "/A/B", and > "python myscript.py" creates sys.path[0] = "". But there > are three problems. > > This insertion occurs after a number of imports have already > occurred. Specifically, it occurs after the import of site, > os, and sitecustomize. This is confusing. It is clear that > sys.path should not change unless the user changes it. > > If no path component is given, the zero length string is > inserted. But if the current working directory later > changes, this is no longer valid.
If we want the directory > of the script to be sys.path[0], then an absolute path should > be inserted. > > If a command is entered using "-c", I don't think any insertion > to sys.path should be made, as there is no indicated directory. > Alternatively, the absolute path getcwd() should be inserted. > > If everyone agrees, I will create a patch. > > Jim Ahlstrom > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From barry@zope.com Wed Nov 21 17:29:15 2001 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 21 Nov 2001 12:29:15 -0500 Subject: [Python-Dev] Change to sys.path[0] for Python 2.3 References: <3BFBC70D.BF911F5@interet.com> <20011121165355.96537303183@snelboot.oratrix.nl> Message-ID: <15355.58475.101463.586758@anthem.wooz.org> >>>>> "JJ" == Jack Jansen writes: JJ> I'm not too thrilled about this, especially not in the 2.2 JJ> release cycle. I'm sure Jim wasn't talking about this for 2.2, unless he'd gone temporarily insane. :) -Barry From barry@zope.com Wed Nov 21 17:31:33 2001 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 21 Nov 2001 12:31:33 -0500 Subject: [Python-Dev] New class/type features (Replacing __slots__ with addmembers()) References: <3BFB9AE8.AAE549BF@lemburg.com> Message-ID: <15355.58613.731418.776866@anthem.wooz.org> >>>>> "M" == M writes: M> Why not ? Who's pushing us ? AFAIK, there's no marketing team M> with a hammer out there forcing us to publish on a fixed date M> :-) No, but at some point you do have to put a fork in things and move on. AFAIC, only Guido can slow the release juggernaut and he's having too much fun changing diapers to weigh in. 
:) -Barry From tim_one@email.msn.com Wed Nov 21 19:18:27 2001 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 21 Nov 2001 14:18:27 -0500 Subject: [Python-Dev] Speeding up name lookups In-Reply-To: <3BFBC1B0.C093E38A@lemburg.com> Message-ID: Keep in mind that the normal path for an interned-string-key dict lookup in 2.2 is just this, where PyString_CheckExact succeeds, and assuming the key is present: register unsigned int mask = mp->ma_mask; dictentry *ep0 = mp->ma_table; register dictentry *ep; if (!PyString_CheckExact(key)) { mp->ma_lookup = lookdict; return lookdict(mp, key, hash); } i = hash & mask; ep = &ep0[i]; if (ep->me_key == NULL || ep->me_key == key) return ep; (Hmm -- the tests in that last "if" should check the second part first.) So there's usually not much to be saved by caching the index. The main benefit from Python's local-vrbl optimization has more to do with saving function calls than dict lookups (the latter are usually very cheap for interned string keys that are present). From jim@interet.com Wed Nov 21 20:27:12 2001 From: jim@interet.com (James C. Ahlstrom) Date: Wed, 21 Nov 2001 15:27:12 -0500 Subject: [Python-Dev] Change to sys.path[0] for Python 2.3 References: <3BFBC70D.BF911F5@interet.com> <20011121165355.96537303183@snelboot.oratrix.nl> <15355.58475.101463.586758@anthem.wooz.org> Message-ID: <3BFC0E20.CAA8B0EE@interet.com> "Barry A. Warsaw" wrote: > I'm sure Jim wasn't talking about this for 2.2, unless he'd gone > temporarily insane. :) As part of my job, I have been writing a lot of Fortran lately, and I notice that Fortran has a lot of great features that are missing from Python. For example, Fortran's three-way IF statement is obviously better than Python's two way "if". And the Fortran common block has the ability to modify any item by using out-of-bounds array references, without having to worry about pesky type mismatches. 
And have I mentioned the versatile GOTO statement, and its convenient alternative spellings "GO TO" and "GO TO"? In fact, I plan to write up another PEP on features which need to be added to Python based on the only real [wo]mans programming language, namely Fortran. It should be done any day as soon as I come up for air and have t*/e to t>^~k of al& the f@#tu!es I #? <3BFB9AE8.AAE549BF@lemburg.com> <15355.58613.731418.776866@anthem.wooz.org> Message-ID: <3BFC144A.F96D483B@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "M" == M writes: > > M> Why not ? Who's pushing us ? AFAIK, there's no marketing team > M> with a hammer out there forcing us to publish on a fixed date > M> :-) > > No, but at some point you do have to put a fork in things and move > on. AFAIC, only Guido can slow the release juggernaut and he's having > too much fun changing diapers to weigh in. :) If Guido feels OK about all this, fine with me -- I still think that the PythonLabs team ought to make the state of affairs as crystal clear as possible and put up the warning signs in the right places, e.g. by having usage of unstable features issue a warning to the programmer. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From barry@zope.com Wed Nov 21 20:59:03 2001 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 21 Nov 2001 15:59:03 -0500 Subject: [Python-Dev] Change to sys.path[0] for Python 2.3 References: <3BFBC70D.BF911F5@interet.com> <20011121165355.96537303183@snelboot.oratrix.nl> <15355.58475.101463.586758@anthem.wooz.org> <3BFC0E20.CAA8B0EE@interet.com> Message-ID: <15356.5527.865190.70032@anthem.wooz.org> >>>>> "JCA" == James C Ahlstrom writes: JCA> In fact, I plan to write up another PEP on features which JCA> need to be added to Python based on the only real [wo]mans JCA> programming language, namely Fortran. 
------------------------------------------^^^ so close and yet so far... s/ran/h/-ly y'rs, -Barry From gmcm@hypernet.com Wed Nov 21 21:48:01 2001 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 21 Nov 2001 16:48:01 -0500 Subject: [Python-Dev] Change to sys.path[0] for Python 2.3 In-Reply-To: <3BFBC70D.BF911F5@interet.com> Message-ID: <3BFBDAC1.21745.B0E415A1@localhost> Jim "Fortran" Ahlstrom wrote: > I have been working on import.c and thinking about > imports generally. Currently, the directory of the > Python script is inserted into sys.path[0]. For example, > "python /A/B/myscript.py" creates sys.path[0] = "/A/B", and > "python myscript.py" creates sys.path[0] = "". But there > are three problems. > > This insertion occurs after a number of imports have already > occurred. Specifically, it occurs after the import of site, os, > and sitecustomize. This is confusing. Why? site happens before Python even thinks about sys.argv[0]. By its very name it's about "how this installation should behave", not "how a script in this directory behaves". > It is clear that sys.path > should not change unless the user changes it. I am +1 (for very large values of 1) on clarifying the rules of import, but while hidden manipulations of sys.path qualify as a sneaky trick, I don't think they can be outlawed. > If no path component is given, the zero length string is > inserted. But if the current working directory later > changes, this is no longer valid. If we want the directory > of the script to be sys.path[0], then an absolute path should be > inserted. Some people os.chdir() just for this effect. I don't think I mind if they experience some pain .
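For readers following the sys.path[0] argument, the difference between the empty-string entry and an absolute one can be sketched as below. This is a user-level illustration in current Python, not the interpreter's startup code: absolutize_path_entries is a hypothetical helper, and it assumes the current directory can be resolved to a usable absolute path, which the thread notes is not a safe assumption on every platform.

```python
import os


def absolutize_path_entries(path_entries, cwd=None):
    """Return a copy of path_entries with '' and relative entries made absolute.

    An entry of '' means "whatever the current directory is at import time",
    so its meaning changes after os.chdir(); the absolute form does not.
    """
    base = os.getcwd() if cwd is None else cwd
    # os.path.join leaves entries that are already absolute untouched.
    return [os.path.abspath(os.path.join(base, entry)) for entry in path_entries]
```

A script that wants the frozen behaviour could apply this to sys.path before calling os.chdir(); a script that relies on '' tracking the current directory, as Gordon describes, would not.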
- Gordon From greg@cosc.canterbury.ac.nz Wed Nov 21 23:24:27 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 22 Nov 2001 12:24:27 +1300 (NZDT) Subject: [Python-Dev] Change to sys.path[0] for Python 2.3 In-Reply-To: <3BFBC70D.BF911F5@interet.com> Message-ID: <200111212324.MAA14334@s454.cosc.canterbury.ac.nz> "James C. Ahlstrom" : > If we want the directory of the script to be sys.path[0], then an > absolute path should be inserted. +1 Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From martin@v.loewis.de Wed Nov 21 21:02:21 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Wed, 21 Nov 2001 22:02:21 +0100 Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 In-Reply-To: (python-dev-request@python.org) References: Message-ID: <200111212102.fALL2Lc01966@mira.informatik.hu-berlin.de> > This insertion occurs after a number of imports have already > occurred. Specifically, it occurs after the import of site, os, and > sitecustomize. This is confusing. It is clear that sys.path should > not change unless the user changes it. I disagree. It is clear that sys.path will change during "bootstrap", e.g. as the result of processing .pth files. > If we want the directory of the script to be sys.path[0], then an > absolute path should be inserted. I think this would be unimplementable in the general case. > If everyone agrees, I will create a patch. It's not clear to me what this patch would do, so I disagree. Regards, Martin From mal@lemburg.com Thu Nov 22 09:51:29 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 22 Nov 2001 10:51:29 +0100 Subject: [Python-Dev] Change to sys.path[0] for Python 2.3 References: <3BFBC70D.BF911F5@interet.com> Message-ID: <3BFCCAA1.27FAB14A@lemburg.com> "James C. 
Ahlstrom" wrote: > > I have been working on import.c and thinking about > imports generally. Currently, the directory of the > Python script is inserted into sys.path[0]. For example, > "python /A/B/myscript.py" creates sys.path[0] = "/A/B", and > "python myscript.py" creates sys.path[0] = "". But there > are three problems. > > This insertion occurs after a number of imports have already > occurred. Specifically, it occurs after the import of site, > os, and sitecustomize. This is confusing. It is clear that > sys.path should not change unless the user changes it. I hope you mean user == programmer. Changing sys.path is perfectly legal and I wouldn't like to see that become illegal. > If no path component is given, the zero length string is > inserted. But if the current working directory later > changes, this is no longer valid. If we want the directory > of the script to be sys.path[0], then an absolute path should > be inserted. True. This causes quite a bit of confusion sometimes, esp. when people run scripts using relative paths and then find that things don't work the way they expected. I'm not sure if adding the absolute path would break anything, though -- could be that some path fiddling code explicitly looks for the '' in sys.path and then takes some action based on the fact that the script was started from the CWD. > If a command is entered using "-c", I don't think any insertion > to sys.path should be made, as there is no indicated directory. > Alternatively, the absolute path getcwd() should be inserted. Same problem here: "-c" can be used as indicator... 
and probably is by code looking for the absolute path of the script ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Fri Nov 23 10:17:30 2001 From: mal@lemburg.com (M.-A.
Lemburg) Date: Fri, 23 Nov 2001 11:17:30 +0100 Subject: [Python-Dev] PEP 275: Switching on Multiple Values, Rev 1.2 Message-ID: <3BFE223A.5E26D940@lemburg.com> Here is a new revision of the switch PEP. After this round and after Guido returns to python-dev, I intend to do one or two more rounds on python-list. -- PEP: 0275 Title: Switching on Multiple Values Version: $Revision: 1.2 $ Author: mal@lemburg.com (Marc-André Lemburg) Status: Draft Type: Standards Track Python-Version: 2.3 Created: 10-Nov-2001 Post-History: Abstract This PEP proposes strategies to enhance Python's performance with respect to handling switching on a single variable having one of multiple possible values. Problem Up to Python 2.2, the typical way of writing multi-value switches has been to use long switch constructs of the following type: if x == 'first state': ... elif x == 'second state': ... elif x == 'third state': ... elif x == 'fourth state': ... else: # default handling ... This works fine for short switch constructs, since the overhead of repeated loading of a local (the variable x in this case) and comparing it to some constant is low (it has a complexity of O(n) on average). However, when using such a construct to write a state machine such as is needed for writing parsers the number of possible states can easily reach 10 or more cases. The current solution to this problem lies in using a dispatch table to find the case implementing method to execute depending on the value of the switch variable (this can be tuned to have a complexity of O(1) on average, e.g. by using perfect hash tables). This works well for state machines which require complex and lengthy processing in the different case methods. It does not perform well for ones which only process one or two instructions per case, e.g. def handle_data(self, data): self.stack.append(data) A nice example of this is the state machine implemented in pickle.py which is used to serialize Python objects.
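The dispatch-table idiom described above can be sketched as follows. It is a minimal illustration in current Python, not code taken from pickle.py: the Parser class, its state names and handle_default are assumptions made for the example. It shows both the O(1) dictionary lookup and the per-case method call that the PEP regards as too expensive for one-line cases.

```python
class Parser:
    """Toy state machine that picks a handler through a dispatch table."""

    def __init__(self):
        self.stack = []
        # One dictionary lookup replaces the chain of == comparisons.
        self.dispatch = {
            "data": self.handle_data,
            "reset": self.handle_reset,
        }

    def handle_data(self, data):
        self.stack.append(data)

    def handle_reset(self, data):
        del self.stack[:]

    def handle_default(self, data):
        raise ValueError("unknown state")

    def feed(self, state, data=None):
        # The lookup is cheap; the extra method call is the overhead
        # that dominates when a case does only one or two operations.
        handler = self.dispatch.get(state, self.handle_default)
        handler(data)
```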
    Other prominent cases include XML SAX parsers and Internet
    protocol handlers.

Proposed Solutions

    This PEP proposes two different but not necessarily conflicting
    solutions:

    1. Adding an optimization to the Python compiler and VM
       which detects the above if-elif-else construct and
       generates special opcodes for it which use a read-only
       dictionary for storing jump offsets.

    2. Adding new syntax to Python which mimics the C style
       switch statement.

    The first solution has the benefit of not relying on adding new
    keywords to the language, while the second looks cleaner. Both
    involve some run-time overhead to assure that the switching
    variable is immutable and hashable.

    Both solutions use a dictionary lookup to find the right
    jump location, so they both share the same problem space in
    terms of requiring that both the switch variable and the
    constants need to be compatible with the dictionary implementation
    (hashable, comparable, a==b => hash(a)==hash(b)).

    Solution 1: Optimizing if-elif-else

        Implementation:

            It should be possible for the compiler to detect an
            if-elif-else construct which has the following signature:

                if x == 'first':...
                elif x == 'second':...
                else:...

            i.e. the left hand side always references the same
            variable, the right hand side a hashable immutable builtin
            type. The right hand sides need not be all of the same
            type, but they should be comparable to the type of the
            left hand switch variable.

            The compiler could then set up a read-only (perfect) hash
            table, store it in the constants and add an opcode SWITCH
            in front of the standard if-elif-else byte code stream
            which triggers the following run-time behaviour:

            At runtime, SWITCH would check x for being one of the
            well-known immutable types (strings, unicode, numbers) and
            use the hash table for finding the right opcode snippet.
            If this condition is not met, the interpreter should
            revert to the standard if-elif-else processing by simply
            skipping the SWITCH opcode and proceeding with the usual
            if-elif-else byte code stream.

        Issues:

            The new optimization should not change the current Python
            semantics (by reducing the number of __cmp__ calls and
            adding __hash__ calls in if-elif-else constructs which are
            affected by the optimization). To assure this, switching
            can only safely be implemented either if a "from
            __future__" style flag is used, or the switching variable
            is one of the builtin immutable types: int, float, string,
            unicode, etc. (not subtypes, since it's not clear whether
            these are still immutable or not).

            To prevent post-modifications of the jump-table dictionary
            (which could be used to reach protected code), the
            jump-table will have to be a read-only type (e.g. a
            read-only dictionary).

            The optimization should only be used for if-elif-else
            constructs which have a minimum number of n cases (where n
            is a number which has yet to be defined depending on
            performance tests).

    Solution 2: Adding a switch statement to Python

        New Syntax:

            switch EXPR:
                case CONSTANT:
                    SUITE
                case CONSTANT:
                    SUITE
                ...
                else:
                    SUITE

            (modulo indentation variations)

            The "else" part is optional. If no else part is given and
            none of the defined cases matches, no action is taken and
            the switch statement is ignored. This is in line with the
            current if-behaviour. A user who wants to signal this
            situation using an exception can define an else-branch
            which then implements the intended action.

            Note that the constants need not be all of the same type,
            but they should be comparable to the type of the switch
            variable.

        Implementation:

            The compiler would have to compile this into byte code
            similar to this:

                def whatis(x):
                    switch(x):
                        case 'one':
                            print '1'
                        case 'two':
                            print '2'
                        case 'three':
                            print '3'
                        else:
                            print "D'oh!"
            into (omitting POP_TOP's and SET_LINENO's):

                   6  LOAD_FAST         0 (x)
                   9  LOAD_CONST        1 (switch-table-1)
                  12  SWITCH            26 (to 38)

                  14  LOAD_CONST        2 ('1')
                  17  PRINT_ITEM
                  18  PRINT_NEWLINE
                  19  JUMP 43

                  22  LOAD_CONST        3 ('2')
                  25  PRINT_ITEM
                  26  PRINT_NEWLINE
                  27  JUMP 43

                  30  LOAD_CONST        4 ('3')
                  33  PRINT_ITEM
                  34  PRINT_NEWLINE
                  35  JUMP 43

                  38  LOAD_CONST        5 ("D'oh!")
                  41  PRINT_ITEM
                  42  PRINT_NEWLINE

                >>43  LOAD_CONST        0 (None)
                  46  RETURN_VALUE

            Where the 'SWITCH' opcode would jump to 14, 22, 30 or 38
            depending on 'x'.

            Thomas Wouters has written a patch which demonstrates the
            above. You can download it from [1].

        Issues:

            The switch statement should not implement fall-through
            behaviour (as does the switch statement in C). Each case
            defines a complete and independent suite; much like in an
            if-elif-else statement. This also enables using break in
            switch statements inside loops.

            If the interpreter finds that the switch variable x is
            not hashable, it should raise a TypeError at run-time
            pointing out the problem.

            There have been other proposals for the syntax which reuse
            existing keywords and avoid adding two new ones ("switch"
            and "case"). Others have argued that the keywords should
            use new terms to avoid confusion with the C keywords
            of the same name but slightly different semantics (e.g.
            fall-through without break). Some of the proposed variants:

                case EXPR:
                    of CONSTANT:
                        SUITE
                    of CONSTANT:
                        SUITE
                    else:
                        SUITE

                case EXPR:
                    if CONSTANT:
                        SUITE
                    if CONSTANT:
                        SUITE
                    else:
                        SUITE

                when EXPR:
                    in CONSTANT_TUPLE:
                        SUITE
                    in CONSTANT_TUPLE:
                        SUITE
                    ...
                    else:
                        SUITE

            The switch statement could be extended to allow multiple
            values for one section (e.g. case 'a', 'b', 'c': ...).
            Another proposed extension would allow ranges of values
            (e.g. case 10..14: ...). These should probably be
            postponed, but already kept in mind when designing and
            implementing a first version.

        Examples:

            The following examples all use a new syntax as proposed by
            solution 2.
            However, all of these examples would work with solution 1
            as well.

                switch EXPR:                  switch x:
                    case CONSTANT:                case "first":
                        SUITE                         print x
                    case CONSTANT:                case "second":
                        SUITE                         x = x**2
                    ...                               print x
                    else:                         else:
                        SUITE                         print "whoops!"

                case EXPR:                    case x:
                    of CONSTANT:                  of "first":
                        SUITE                         print x
                    of CONSTANT:                  of "second":
                        SUITE                         print x**2
                    else:                         else:
                        SUITE                         print "whoops!"

                case EXPR:                    case state:
                    if CONSTANT:                  if "first":
                        SUITE                         state = "second"
                    if CONSTANT:                  if "second":
                        SUITE                         state = "third"
                    else:                         else:
                        SUITE                         state = "first"

                when EXPR:                    when state:
                    in CONSTANT_TUPLE:            in ("first", "second"):
                        SUITE                         print state
                    in CONSTANT_TUPLE:                state = next_state(state)
                        SUITE                     in ("seventh",):
                    ...                               print "done"
                    else:                             break # out of loop!
                        SUITE                     else:
                                                      print "middle state"
                                                      state = next_state(state)

            Here's another nice application found by Jack Jansen
            (switching on argument types):

                switch type(x).__name__:
                    case 'int':
                        SUITE
                    case 'string':
                        SUITE

Scope

    XXX Explain "from __future__ import switch"

Credits

    Martin von Löwis (issues with the optimization idea)
    Thomas Wouters (switch statement + byte code compiler example)
    Skip Montanaro (dispatching ideas, examples)
    Donald Beaudry (switch syntax)
    Greg Ewing (switch syntax)
    Jack Jansen (type switching examples)

References

    [1] https://sourceforge.net/tracker/index.php?func=detail&aid=481118&group_id=5470&atid=305470

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

From aahz@rahul.net Fri Nov 23 10:55:45 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Fri, 23 Nov 2001 02:55:45 -0800 (PST)
Subject: [Python-Dev] PEP 275: Switching on Multiple Values, Rev 1.2
In-Reply-To: <3BFE223A.5E26D940@lemburg.com> from "M.-A.
Lemburg" at Nov 23, 2001 11:17:30 AM Message-ID: <20011123105545.E4CCEE8C4@waltz.rahul.net> [Yes, this should probably go to python-list, but after surgery I haven't been on c.l.py for more than a month.] M.-A. Lemburg wrote: > > Here is a new revision of the switch PEP. After this round and > after Guido returns to python-dev, I intend to do one or two > more rounds on python-list. Here's a weird idea: Given that the main problem we're trying to solve is the slowness of dictionary dispatching to functions, how about adding an "inline" keyword or function? For example (assuming it's a function): def handle_data(): self.append(data) dispatch_dict["foo"] = handle_data class C: def read_input(self): data = "foo" value = handle_data inline(dispatch_dict[value]) Note carefully that handle_data() has no parameters. This solution would be somewhat more expensive computationally than the switch construct, but it ought to be cheaper than doing an actual function call. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From mal@lemburg.com Fri Nov 23 13:47:10 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 23 Nov 2001 14:47:10 +0100 Subject: [Python-Dev] PEP 275: Switching on Multiple Values, Rev 1.2 References: <20011123105545.E4CCEE8C4@waltz.rahul.net> Message-ID: <3BFE535E.35D441E3@lemburg.com> Aahz Maruch wrote: > > Here's a weird idea: > > Given that the main problem we're trying to solve is the slowness of > dictionary dispatching to functions, how about adding an "inline" > keyword or function? Because AFAIK inlining in Python is just as expensive as doing a Python function call: you have to merge all constants, locals, etc. Depending on the function signature, it may not even be possible (e.g. if parameters don't match the inlined function's signature).
> For example (assuming it's a function): > > def handle_data(): > self.append(data) > > dispatch_dict["foo"] = handle_data > > class C: > def read_input(self): > data = "foo" > value = handle_data > inline(dispatch_dict[value]) > > Note carefully that handle_data() has no paramaters. This solution > would be somewhat more expensive computationally than the switch > construct, but it ought to be cheaper than doing an actual function > call. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From bslesins@best.com Sat Nov 24 01:12:23 2001 From: bslesins@best.com (Brian Slesinsky) Date: Fri, 23 Nov 2001 17:12:23 -0800 (PST) Subject: [Python-Dev] re: PEP 275: Switching on Multiple Values, Rev 1.2 Message-ID: <20011123161656.B6099-100000@shell7.ba.best.com> Hi, I'm new to the list and apologies for going off-topic a bit (this has nothing to do with performance): If Python implements switch statements it would be a shame not to have pattern-matching in switch statements too. This is a feature that has long been used in functional languages like ML. For example, here's pattern matching on a tuple: def foo(a,b): switch (f(a), g(b)): case (c,1): something(c) # if g(b)==1, assigns c = f(a) case (1,d): something(d) # if f(a)==1, assigns d = g(b) case (c,_): something(c) # any tuple: assigns c = f(a), _ is wildcard Some syntactic sugar related to this would be a way to pattern-match on arbitrary objects as is now done with tuples: def foo(pair, number): Pair(x,y) = pair # assert isinstance(pair,Pair), assigns x and y int(i) = number # assert type(x)==IntType, assign i (To implement this, classes would have a __tuple__ method that returns the tuple for matching. By convention it should return the arguments to its __init__ method.) 
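A rough sketch of how the proposed "Pair(x,y) = pair" assignment might desugar, written in present-day Python. The __tuple__ method is the hypothetical protocol suggested above; the Pair class and the destructure() helper are invented for this illustration and are not part of any Python release.

```python
class Pair:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __tuple__(self):
        # Hypothetical protocol: by the suggested convention, return
        # the arguments that were given to __init__.
        return (self.x, self.y)


def destructure(cls, obj):
    """Emulate `cls(a, b, ...) = obj`: type-check first, then unpack."""
    if not isinstance(obj, cls):
        raise TypeError('expected %s, got %s'
                        % (cls.__name__, type(obj).__name__))
    return obj.__tuple__()


# Stands in for the proposed statement:  Pair(x,y) = pair
x, y = destructure(Pair, Pair(1, 2))
```

The type check and the unpacking happen in one step, which is the property the proposal relies on; a failed match surfaces as a TypeError here, where a switch statement would simply move on to the next case.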
Note how type-checking happens in a very natural way: a = int(a) # convert a int(a) = a # type-check a Combining the two: def sum(pairOrInt): switch pairOrInt: case int(a): return a case Pair(x,y): return sum(x)+sum(y) Here's some documentation on how it's done in Cyclone, a C variant from AT&T that seems to have a strong ML influence: http://www.research.att.com/projects/cyclone/online-manual/main-screen005.html ---------------------------------------------------------------------- Brian Slesinsky From nas@python.ca Sun Nov 25 19:31:38 2001 From: nas@python.ca (Neil Schemenauer) Date: Sun, 25 Nov 2001 11:31:38 -0800 Subject: [Python-Dev] mysnprintf broken Message-ID: <20011125113138.A24320@glacier.arctrix.com> The code uses vsprintf with a buffer that is 512 bytes larger than ``n''. Obviously that is easy to overflow. Is there some reason why we can't incorporate a free snprintf implementation? There is a list available at http://www.ijs.si/software/snprintf/. Neil From martin@v.loewis.de Sun Nov 25 23:20:09 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Mon, 26 Nov 2001 00:20:09 +0100 Subject: [Python-Dev] Re: mysnprintf broken Message-ID: <200111252320.fAPNK9m01747@mira.informatik.hu-berlin.de> > Is there some reason why we can't incorporate a free snprintf > implementation? There is a list available at > http://www.ijs.si/software/snprintf/. Looks like the time machine is at work again: the version we use *is* a free snprintf implementation. If you want to replace it with a different one, you should indicate specifically which one you'd like to use instead. I think Mark Martinec's implementation (the top one on the URL you give) is unacceptable, because the license is too restrictive: we must incorporate the package in its entirety, i.e. redistribution of portions seems not to be licensed by the Frontier Artistic License. I don't have the time to review 10 other implementations for their suitability both in terms of licensing and correctness.
Instead, I'd rather review the three occurrences of PyOS_snprintf, to determine quickly that you will have a hard time to overflow that buffer; *it is not at all easy*. Even if it does overflow, you will get a fatal error, rather than silent memory corruption. That is good enough for me. Regards, Martin From nas@python.ca Mon Nov 26 02:00:07 2001 From: nas@python.ca (Neil Schemenauer) Date: Sun, 25 Nov 2001 18:00:07 -0800 Subject: [Python-Dev] Re: mysnprintf broken In-Reply-To: <200111252320.fAPNK9m01747@mira.informatik.hu-berlin.de>; from martin@v.loewis.de on Mon, Nov 26, 2001 at 12:20:09AM +0100 References: <200111252320.fAPNK9m01747@mira.informatik.hu-berlin.de> Message-ID: <20011125180007.A24745@glacier.arctrix.com> Martin v. Loewis wrote: > Looks like the time machine is at work again: the version we use *is* > a free snprintf implementation. Are we looking at the same mysnprintf? The one I have starts off: static int myvsnprintf(char *str, size_t size, const char *format, va_list va) { char *buffer = PyMem_Malloc(size + 512); int len; if (buffer == NULL) return -1; len = vsprintf(buffer, format, va); ... That doesn't look safe to me. Is there another snprintf implementation in the Python source tree? I can't find it. If there is then why is mysnprintf around? Neil From martin@v.loewis.de Mon Nov 26 07:09:05 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Mon, 26 Nov 2001 08:09:05 +0100 Subject: [Python-Dev] Re: [Python-Dev]mysnprintf broken Message-ID: <200111260709.fAQ795t01294@mira.informatik.hu-berlin.de> > > Looks like the time machine is at work again: the version we use *is* > > a free snprintf implementation. > Are we looking at the same mysnprintf? ... That doesn't look safe to me Definitely, on both counts. I was not claiming that it was safe; I was only claiming it was free, and that it was an snprintf implementation.
To re-iterate my points: - if you think it is bad enough to deserve attention, propose a specific replacement; that will then need careful inspection - Given that there are three callers of this snprintf, and Given that two of them are guaranteed to never overrun the buffer, and Given that the third one will do so only under obscure circumstances, and Given that Python will terminate under these circumstances, rather than silently operating with wrong data, I conclude that this doesn't deserve attention. Regards, Martin From mal@lemburg.com Mon Nov 26 09:27:41 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 26 Nov 2001 10:27:41 +0100 Subject: [Python-Dev] Re: mysnprintf broken References: <200111252320.fAPNK9m01747@mira.informatik.hu-berlin.de> Message-ID: <3C020B0D.FCE1FCAF@lemburg.com> "Martin v. Loewis" wrote: > > > Is there some reason why we can't incorporate a free snprintf > > implementation? There is a list available at > > http://www.ijs.si/software/snprintf/. > > Looks like the time machine is at work again: the version we use *is* > a free snprintf implementation. Well, let's say it's a free snprintf emulation ;-) > If you want to replace it with a different one, you should indicate > specifically which one you'd like to use instead. I think Mark > Martinec's implementation (the top one on the URL you give) is > unacceptable, because the license is too restrictive: we must > incoporate the package in its entirety, i.e. redistribution of > portions seems not to be licensed by the Frontier Artistic License. > > I don't have the time to review 10 other implementations for their > suitability both in terms of licensing and correctness. > > Instead, I'd rather review the three occurrences of PyOS_snprintf, to > determine quickly that you will have a hard time to overflow that > buffer; *it is not at all easy*. Even if it does overflow, you will > get a fatal error, rather than silent memory corruption. That is good > enough for me. 
Note that the version in Python does not result in *stack* overflows which are the type of buffer overflow usually used in exploits. PyOS_snprintf() allocates a buffer on the heap and then lets sprintf() write there -- it then checks for an overflow and causes a fatal error if it finds that sprintf() failed to manage with the size + 512 bytes it had for formatting the string. The only attack on this kind of emulation is a denial of service attack. In the 3 cases where this API is used in Python, an overflow is not possible (unless the native sprintf() implementation is broken). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Mon Nov 26 10:06:33 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 26 Nov 2001 11:06:33 +0100 Subject: [Python-Dev] re: PEP 275: Switching on Multiple Values, Rev 1.2 References: <20011123161656.B6099-100000@shell7.ba.best.com> Message-ID: <3C021429.CB65142E@lemburg.com> Brian Slesinsky wrote: > > Hi, > > I'm new to the list and apologies for going off-topic a bit (this has > nothing to do with performance): > > If Python implements switch statements it would be a shame not to have > pattern-matching in switch statements too. This is a feature that has > long been used in functional languages like ML. For example, here's
For example, here's > pattern matching on a tuple: > > def foo(a,b): > switch (f(a), g(b)): > case (c,1): something(c) # if g(b)==1, assigns c = f(a) > case (1,d): something(d) # if f(a)==1, assigns d = g(b) > case (c,_): something(c) # any tuple: assigns c = f(a), _ is wildcard > > Some syntactic sugar related to this would be a way to pattern-match on > arbitrary objects as is now done with tuples: > > def foo(pair, number): > Pair(x,y) = pair # assert isinstance(pair,Pair), assigns x and y > int(i) = number # assert type(x)==IntType, assign i > > (To implement this, classes would have a __tuple__ method that returns the > tuple for matching. By convention it should return the arguments to its > __init__ method.) > > Note how type-checking happens in a very natural way: > a = int(a) # convert a > int(a) = a # type-check a > > Combining the two: > > def sum(pairOrInt): > switch pairOrInt: > case int(a): return a > case Pair(x,y): return sum(x)+sum(y) > > Here's some documentation on how it's done in Cyclone, a C variant from > AT&T that seems to have a strong ML influence: > > http://www.research.att.com/projects/cyclone/online-manual/main-screen005.html Note that the switch implementations proposed in the PEP both use Python dictionaries as basis for the jump location lookup. The aim is speed up switches on constants which currently are linear in number of cases. I'm not sure how your proposal fits in here, but it looks like the current if-elif-else syntax is better suited to it than some trying to use a switch-dictionary with some special objects to implement wild-card matching, e.g. 
if type(pairOrInt) == type(1): return pairOrInt elif type(pairOrInt) == type(()) and len(pairOrInt) == 2: x,y = pairOrInt return sum(x) + sum(y) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From nas@python.ca Mon Nov 26 14:14:07 2001 From: nas@python.ca (Neil Schemenauer) Date: Mon, 26 Nov 2001 06:14:07 -0800 Subject: [Python-Dev] Re: mysnprintf broken In-Reply-To: <3C020B0D.FCE1FCAF@lemburg.com>; from mal@lemburg.com on Mon, Nov 26, 2001 at 10:27:41AM +0100 References: <200111252320.fAPNK9m01747@mira.informatik.hu-berlin.de> <3C020B0D.FCE1FCAF@lemburg.com> Message-ID: <20011126061407.A25396@glacier.arctrix.com> M.-A. Lemburg wrote: > Note that the version in Python does not result in *stack* overflows > which are the type of buffer overflow usually used in exploits. ... > The only attack on this kind of emulation is a denial of service > attack. That is a bold statement to make. It is also not true. Heap overflows _can_ be exploited to execute arbitrary code. I believe there was a phrack article a few years ago on the subject. > In the 3 cases where this API is used in Python, an overflow > is not possible (unless the native sprintf() implementation > is broken). That may be the case today but I'm sure that snprintf will start getting more use now that it is available. We really should have a better implementation than mysnprintf. Neil From jim@interet.com Mon Nov 26 14:17:43 2001 From: jim@interet.com (James C. Ahlstrom) Date: Mon, 26 Nov 2001 09:17:43 -0500 Subject: [Python-Dev] Change to sys.path[0] for Python 2.3 References: <20011121165355.96537303183@snelboot.oratrix.nl> Message-ID: <3C024F07.E548553D@interet.com> Jack Jansen wrote: > > I'm not too thrilled about this, especially not in the 2.2 release cycle. 
The > sys.path logic is difficult, and everytime a change was made on one platform > this change immedeately broke things on all other platforms. For instance: on I agree, and I am very worried about breaking imports on platforms which I do not have, and which I can not test. But I am working on zip imports, which is a Good Thing. While working on this I can fix other things which don't seem quite right. If things break, I currently have time to fix them due to reduced pressure at work. And of course it is for 2.3, not 2.2. > And the "python -c" mod I really disagree with. 'python -c "import foo"' in > the directory containing foo.py is the standard way of using this, which you > would break. OK, I think you are right here. The "-c" option should insert the current directory. Currently it inserts "". JimA From jim@interet.com Mon Nov 26 15:03:59 2001 From: jim@interet.com (James C. Ahlstrom) Date: Mon, 26 Nov 2001 10:03:59 -0500 Subject: [Python-Dev] Change to sys.path[0] for Python 2.3 References: <3BFBDAC1.21745.B0E415A1@localhost> Message-ID: <3C0259DF.F8EE0D0B@interet.com> Gordon McMillan wrote: > > Jim "Fortran" Ahlstrom wrote: > > This insertion occurs after a number of imports have already > > occurred. Specifically, it occurs after the import of site, os, > > and sitecustomize. This is confusing. > > Why? site happens before Python even thinks about > sys.argv[0]. By it's very name it's about "how this installation > should behave", not "how a script in this directory behaves". Adding the directory of the Python script occurs after a number of imports have already happened. This is not necessary. The directory of the script is known. It is confusing because the programmer sees that the script directory is the first item of sys.path, and so concludes that [s]he can put scripts there and have them imported. This sometimes works, but fails for os, site, sitecustomize and a few others. 
There is no reason for this other than accidental details of the implementation. Think of documenting imports. We would need to explain that sys.path[0] is Special, and different from other items. Yuk. The programmer may indeed manipulate sys.path, but at least Python's default path should be simple. BTW, is there any documentation on the details of imports, even a description of sys.path? We need an "invocation and imports" manual (which I guess I just volunteered to write). > > If no path component is given, the zero length string is > > inserted. But if the current working directory later > > changes, this is no longer valid. If we want the directory > > of the script to be sys.path[0], then an absolute path should be > > inserted. > > Some people os.chdir() just for this effect. I don't think I mind if > they experience some pain . This is a problem for zip imports and directory caching. If an item of sys.path is a relative path, and getcwd() changes, then it is difficult (as in slow, not as in impossible) to get caching to work. Think of sys.path item "./archive.zip". There is a dictionary full of items starting with "./", and then CWD changes. I would have to recognize a relative path for any item of sys.path, and call getcwd() for each one, all for each supported OS. Not fun. Using my current code, zip imports fail if a relative path is given and getcwd() changes. This is a problem especially on Windows, as normally the CWD changes to the directory of an opened file. I think most of the practical problems go away if sys.path[0] is an absolute path. Then I can either make relative paths work with the penalty that imports will be slower, or write documentation that sys.path[] must only contain absolute paths. In either case, I think sys.path[0] should be an absolute path. Again, think about documenting imports. What is sys.path[0]? It is the directory of the script. This is a fixed directory that doesn't change.
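A small sketch of the normalization argued for above, in present-day Python: every entry of a path list is made absolute once, so that a later os.chdir() cannot change what it refers to. The helper name absolutize() is made up for this example, and this is not a description of what the interpreter does at startup.

```python
import os


def absolutize(entries):
    """Return the entries as absolute paths.

    os.path.abspath() resolves relative entries against the current
    directory, so the empty string becomes the current directory.
    """
    return [os.path.abspath(entry) for entry in entries]


# "" and a relative archive name, as in the examples discussed above.
path = absolutize(['', 'archive.zip'])
```

Once the list holds only absolute paths, a cache keyed on those strings stays valid across directory changes, at the cost of fixing the meaning of "" at the moment the list was built.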
JimA From jim@interet.com Mon Nov 26 15:10:22 2001 From: jim@interet.com (James C. Ahlstrom) Date: Mon, 26 Nov 2001 10:10:22 -0500 Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 References: <200111212102.fALL2Lc01966@mira.informatik.hu-berlin.de> Message-ID: <3C025B5E.5BB9CD1B@interet.com> "Martin v. Loewis" wrote: > > > This insertion occurs after a number of imports have already > > occurred. Specifically, it occurs after the import of site, os, and > > sitecustomize. This is confusing. It is clear that sys.path should > > not change unless the user changes it. > > I disagree. It is clear that sys.path will change during "bootstrap", > e.g. as the result of processing .pth files. I wasn't clear. I didn't mean that sys.path shouldn't change, I meant that Python's starting sys.path[0] shouldn't be different from Python's starting sys.path[1:]. These are set up by Python. The programmer is free to alter sys.path later. > > If we want the directory of the script to be sys.path[0], then an > > absolute path should be inserted. > > I think this would be unimplementable in the general case. At least it is easy to replace "" with getcwd(), and this is valuable. JimA From mal@lemburg.com Mon Nov 26 15:20:00 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 26 Nov 2001 16:20:00 +0100 Subject: [Python-Dev] Re: mysnprintf broken References: <200111252320.fAPNK9m01747@mira.informatik.hu-berlin.de> <3C020B0D.FCE1FCAF@lemburg.com> <20011126061407.A25396@glacier.arctrix.com> Message-ID: <3C025DA0.73C81122@lemburg.com> Neil Schemenauer wrote: > > M.-A. Lemburg wrote: > > Note that the version in Python does not result in *stack* overflows > > which are the type of buffer overflow usually used in exploits. > ... > > The only attack on this kind of emulation is a denial of service > > attack. > > That is a bold statement to make. It is also not true. Heap overflows > _can_ be exploited to execute arbitrary code. 
I believe there was a > phrack article a few years ago on the subject. I know that they can be exploited (should have phrased the reply more carefully), but I don't think that the exploits described in phrack apply to Python's use of the memory buffer. In case sprintf() overflows, Python will detect this and immediately dump core. I don't see how this could be used by an attacker, except for killing off processes (the DOS attack); the exploit described in Phrack 57 (http://www.phrack.org/) only works on systems which use Doug Lea's malloc implementation, don't define snprintf() in their C lib and have sudo installed. Should be a rather small share of installed OSes ;-) > > In the 3 cases where this API is used in Python, an overflow > > is not possible (unless the native sprintf() implementation > > is broken). > > That may be the case today but I'm sure that snprintf will start getting > more use now that it is available. We really should have a better > implementation than mysnprintf. No objection at all -- I wrote the emulation simply to add at least some level of protection against buffer overflows for platforms which don't provide snprintf() in their own C lib. Before that Python used sprintf(). I suppose we could use the code from stringobject.c:PyString_FromFormatV() as starting point for our own little snprintf() implementation... -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From jim@interet.com Mon Nov 26 15:43:04 2001 From: jim@interet.com (James C. Ahlstrom) Date: Mon, 26 Nov 2001 10:43:04 -0500 Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 References: <200111212102.fALL2Lc01966@mira.informatik.hu-berlin.de> Message-ID: <3C026308.D583AEFF@interet.com> "Martin v. Loewis" wrote: > > It's not clear to me what this patch would do, so I disagree. You are right. 
I wasn't clear enough. Everyone please vote on the following: 1) JimA writes import documentation and adds it to the current docs (where are they?) or in a new section (where should it go?). 2) The addition of sys.path[0] is changed to an earlier time so it occurs before any imports; so sys.path[0] works the same as sys.path[1:]. Currently it is added after some imports have occurred. 3) sys.path[0] is documented as the directory of the Python script and is an absolute path. In particular, "" is replaced with getcwd() (or equivalent for all supported OS's). So far, I have "+1", "True" and "Unimplementable in general" in the voting for item 3. 4) For the "-c" option, sys.path[0] is getcwd() instead of "". JimA From fdrake@acm.org Mon Nov 26 21:38:59 2001 From: fdrake@acm.org (Fred L. Drake) Date: Mon, 26 Nov 2001 16:38:59 -0500 (EST) Subject: [Python-Dev] [development doc updates] Message-ID: <20011126213859.1A7AE28696@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Various small corrections and additions. From martin@v.loewis.de Mon Nov 26 21:54:22 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Mon, 26 Nov 2001 22:54:22 +0100 Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 In-Reply-To: <3C026308.D583AEFF@interet.com> (jim@interet.com) References: <200111212102.fALL2Lc01966@mira.informatik.hu-berlin.de> <3C026308.D583AEFF@interet.com> Message-ID: <200111262154.fAQLsMY01509@mira.informatik.hu-berlin.de> > 2) The addition of sys.path[0] is changed to an earlier > time so it occurs before any imports; so sys.path[0] > works the same as sys.path[1:]. Currently it is added > after some imports have occurred. I am still trying to find the same mental picture for this as you apparently have. I understand "changed to an earlier time". What I don't understand is the effect that you associate with it: sys.path[0] is a string, sys.path[1:] is a list.
In what sense do they "work the same"? Regards, Martin From mal@lemburg.com Mon Nov 26 20:08:17 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 26 Nov 2001 21:08:17 +0100 Subject: [Python-Dev] What's a PyStructSequence ? Message-ID: <3C02A131.373BE8A4@lemburg.com> A bug report on SF made me aware of an apparently new type in Python called PyStructSequence. There are no docs on the type (at least not in the usual places). Is it official yet ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From nas@python.ca Mon Nov 26 22:21:26 2001 From: nas@python.ca (Neil Schemenauer) Date: Mon, 26 Nov 2001 14:21:26 -0800 Subject: [Python-Dev] gc.garbage In-Reply-To: ; from tim.one@home.com on Sat, Nov 17, 2001 at 04:06:48PM -0500 References: Message-ID: <20011126142126.A26497@glacier.arctrix.com> Tim Peters wrote: > I notice the gc docs say: > > The following variable is provided for read-only access: > ^^^^^^^^^^^^^^^^ > garbage > A list of objects which the collector found to be unreachable ... > > This isn't clear to me. It's not clear because it's nonsense. I think I meant to say something about the gc.garbage binding. If you do something like: gc.garbage = "ha ha" then the garbage list is forever inaccessible from within Python. Is there some way to prevent people from assigning to certain module variables? That would be the correct way to fix it, IMHO.
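(Illustration, not part of the original message: a minimal sketch of the difference between mutating gc.garbage in place and rebinding the name, assuming only the standard gc module. The variable names are made up for the example.)

```python
import gc

# Keep our own reference to the list the collector appends to.
original = gc.garbage

# In-place mutation: gc.garbage still names the same list object.
del gc.garbage[:]
same_after_clear = gc.garbage is original

# Rebinding: the module attribute now names an unrelated string, so the
# list can no longer be reached through gc.garbage.
gc.garbage = "ha ha"
rebound = gc.garbage is not original

# Restore the binding so later code sees the real list again.
gc.garbage = original
restored = gc.garbage is original
```

The sketch only recovers because it saved a reference first; without one, the rebinding leaves nothing to restore from.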
Neil From bslesins@best.com Mon Nov 26 22:27:26 2001 From: bslesins@best.com (Brian Slesinsky) Date: Mon, 26 Nov 2001 14:27:26 -0800 (PST) Subject: [Python-Dev] re: PEP 275: Switching on Multiple Values, Rev 1.2 In-Reply-To: <3C021429.CB65142E@lemburg.com> Message-ID: <20011126134057.P12925-100000@shell7.ba.best.com> > I'm not sure how your proposal fits in here, but it looks like > the current if-elif-else syntax is better suited to it than > some trying to use a switch-dictionary with some special objects > to implement wild-card matching, e.g. The wildcard object isn't very important; the part that matters is being able to take structures apart using case clauses. Also, this is mostly about syntactical convenience - for certain problems, pattern-matched case statements read better and are less error-prone than if-then-elif syntax. But I need to come up with a better example. Anyway, my main point is just to argue in favor of switch syntax rather than looking for special cases of if-then-elif to optimize. Some languages do some very elegant things with switches that Python might want to implement someday. (I do appreciate that the dictionary optimization only works for the simple case though, so comparison to constants is the only part that's needed to start.) - Brian From martin@v.loewis.de Mon Nov 26 22:31:13 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Mon, 26 Nov 2001 23:31:13 +0100 Subject: [Python-Dev] What's a PyStructSequence ? In-Reply-To: <3C02A131.373BE8A4@lemburg.com> (mal@lemburg.com) References: <3C02A131.373BE8A4@lemburg.com> Message-ID: <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> > A bug report on SF made me aware of an apparently new type in Python > called PyStructSequence. There are no docs on the type (at least > not in the usual places). > > Is it official yet ? It will ship as part of Python 2.2, if that is what you are asking. os.stat is documented to return one of these (if you read it carefully).
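(Illustration, not part of the original message: a small sketch of what a struct sequence looks like from Python, using the os.stat result mentioned above. It assumes only the standard os module; the variable names are made up for the example.)

```python
import os

# os.stat returns a struct sequence: a tuple whose fields also have names.
st = os.stat(os.curdir)

# The same field can be read by position or by attribute name.
size_by_index = st[6]        # position 6 in the classic 10-item stat tuple
size_by_name = st.st_size

# Old code that treats the result as a plain tuple keeps working.
is_tuple = isinstance(st, tuple)
field_count = len(st)
```

Code that unpacks or indexes the result as a tuple is unaffected, while newer code can use the attribute names instead of remembering positions.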
Regards, Martin From skip@pobox.com (Skip Montanaro) Mon Nov 26 22:37:12 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 26 Nov 2001 16:37:12 -0600 Subject: [Python-Dev] Interesting paper on exception handling in C Message-ID: <15362.50200.162621.176767@beluga.mojam.com> I saw a reference to MLib today on one of the Gtk lists. Its doc page, in turn, referenced a paper by David Turner on "cleanup stack exception handling" that looked mighty interesting. It looks like he manages a lot of exception handling and memory free activities automatically with judiciously defined TRY and CATCH macros. I thought others here might find it interesting as well: http://www.freetype.org/david/reliable-c.html If something like this was implemented in the Python source, it might make it possible to write cleaner more leak-resistant extension modules. Skip From martin@v.loewis.de Mon Nov 26 22:58:05 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Mon, 26 Nov 2001 23:58:05 +0100 Subject: [Python-Dev] re: PEP 275: Switching on Multiple Values, Rev 1.2 In-Reply-To: <20011126134057.P12925-100000@shell7.ba.best.com> (message from Brian Slesinsky on Mon, 26 Nov 2001 14:27:26 -0800 (PST)) References: <20011126134057.P12925-100000@shell7.ba.best.com> Message-ID: <200111262258.fAQMw5e01777@mira.informatik.hu-berlin.de> > Anyway, my main point is just to argue in favor of switch syntax rather > than looking for special cases of if-then-elif to optimize. Some > languages do some very elegant things with switches that Python might want > to implement someday. I would argue that all these beautiful properties of the other languages do not carry over to Python. E.g. in Prolog, you have only two data types: structure, and list, and neither is opaque: since they are not objects, their state is all they have. In Python, the same can be said just about lists and tuples, perhaps dictionaries.
Classes don't participate that easily in pattern matching: If you have x = httplib.HTTP() x.connect("foo.com") would you then expect that x matches httplib.HTTP(), or httplib.HTTP("foo.com")? In languages with pattern matching, you find that they use it to emulate late binding: depending on the structure of a thing, you perform different code. In Python, this is more easily done using methods of the object, which naturally dispatch based on the type of the object. Regards, Martin From martin@v.loewis.de Mon Nov 26 23:11:42 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 27 Nov 2001 00:11:42 +0100 Subject: [Python-Dev] Interesting paper on exception handling in C In-Reply-To: <15362.50200.162621.176767@beluga.mojam.com> (message from Skip Montanaro on Mon, 26 Nov 2001 16:37:12 -0600) References: <15362.50200.162621.176767@beluga.mojam.com> Message-ID: <200111262311.fAQNBge01995@mira.informatik.hu-berlin.de> > If something like this was implemented in the Python source, it might make > it possible to write cleaner more leak-resistant extension modules. Isn't P3K written in C++, anyway :-) I don't think using setjmp and longjmp that heavily is a smart idea. There is a significant cost to it, and the benchmarks he had are somewhat cheating, since it performs many malloc calls for a single setjmp call. For Python, we'd need a setjmp in every function, and perhaps multiple setjmps (one per try-catch-block). Of course, if we assume that there are only two true compilers (MSVC and gcc), then we could use the built-in facilities of these compilers for exception handling in C :-) Regards, Martin From greg@cosc.canterbury.ac.nz Tue Nov 27 00:12:18 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 27 Nov 2001 13:12:18 +1300 (NZDT) Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 In-Reply-To: <3C026308.D583AEFF@interet.com> Message-ID: <200111270012.NAA15029@s454.cosc.canterbury.ac.nz> "James C. 
Ahlstrom" : > Everyone please > vote on the following: > > 1) JimA writes import documentation and adds it to the > current docs (where are they?) or in a new section > (where should it go?). +1 > 2) The addition of sys.path[0] is changed to an earlier > time so it occurs before any imports +1 > 3) sys.path[0] is documented as the directory of the Python > script and is an absolute path. +1 Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Tue Nov 27 00:13:13 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 27 Nov 2001 13:13:13 +1300 (NZDT) Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 In-Reply-To: <3C026308.D583AEFF@interet.com> Message-ID: <200111270013.NAA15032@s454.cosc.canterbury.ac.nz> Oops, missed one: > 4) For the "-c" option, sys.argv[0] is getcwd() instead of "". +1 Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From barry@zope.com Tue Nov 27 01:01:04 2001 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 26 Nov 2001 20:01:04 -0500 Subject: [Python-Dev] Re: mysnprintf broken References: <200111252320.fAPNK9m01747@mira.informatik.hu-berlin.de> <3C020B0D.FCE1FCAF@lemburg.com> <20011126061407.A25396@glacier.arctrix.com> <3C025DA0.73C81122@lemburg.com> Message-ID: <15362.58832.421376.513070@anthem.wooz.org> >>>>> "M" == M writes: M> I suppose we could use the code from M> stringobject.c:PyString_FromFormatV() as starting point for our M> own little snprintf() implementation... Aside: PyString_FromFormatV() started life out as PyErr_Format(). 
mailman-has-its-own-vsnprintf()-but-its-GPL'd-ly y'rs, -Barry From bslesins@best.com Tue Nov 27 02:34:25 2001 From: bslesins@best.com (Brian Slesinsky) Date: Mon, 26 Nov 2001 18:34:25 -0800 (PST) Subject: [Python-Dev] re: PEP 275: Switching on Multiple Values, Rev 1.2 In-Reply-To: <200111262258.fAQMw5e01777@mira.informatik.hu-berlin.de> Message-ID: <20011126162613.E12925-100000@shell7.ba.best.com> > In languages with pattern matching, you find that they use it to > emulate late binding: depending on the structure of a thing, you > perform different code. In Python, this is more easily done using > methods of the object, which naturally dispatch based on the type of > the object. Method dispatch is a great technique and I was once in the "all case statements are insufficiently object-oriented" camp. But sometimes you need to work with base types or other people's classes without subclassing them, so I find myself writing long if-then-elif statements (most recently for a memory profiler), and case statements would make this code clearer. (Though not nice enough to switch languages, in particular to Prolog.) re: "Classes don't participate that easily in pattern matching" True for many classes and I'll admit my proposal was quarter-baked. Even matching on base types like tuple, list, and dictionary would be plenty interesting. (Especially since they're subclassable.) - Brian From greg@cosc.canterbury.ac.nz Tue Nov 27 03:12:17 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 27 Nov 2001 16:12:17 +1300 (NZDT) Subject: [Python-Dev] re: PEP 275: Switching on Multiple Values, Rev 1.2 In-Reply-To: <20011126162613.E12925-100000@shell7.ba.best.com> Message-ID: <200111270312.QAA15047@s454.cosc.canterbury.ac.nz> Brian Slesinsky : > But sometimes you need to work with base types or other people's > classes without subclassing them I think there are situations where it's legitimate to prefer a case-statement approach even when dealing with your own classes. 
I once wrote a compiler in HUGS (a Haskell dialect), and I found it very convenient to be able to write a self-contained function which took a parse tree node and did something for each possible subtype of that node. It kept all the processing related to each phase of the compiler together in one place. Using a pure OO approach, I would have had to scatter it all among hundreds of one or two-line methods of many classes, and probably would have gone insane before finishing the project. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From mal@lemburg.com Tue Nov 27 10:26:10 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 27 Nov 2001 11:26:10 +0100 Subject: [Python-Dev] What's a PyStructSequence ? References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> Message-ID: <3C036A42.CCD62BB3@lemburg.com> "Martin v. Loewis" wrote: > > > A bug report on SF made me aware of an apparently new type in Python > > called PyStructSequence. There are no docs on the type (at least > > not in the usual places). > > > > Is it official yet ? > > It will ship as part of Python 2.2, if that is what you are > asking. os.stat is documented to return one of these (if you read it > carefully). Wouldn't it make sense to expose this object in Python, e.g. by constructing it from a dictionary of string mappings ? (The type constructor is not made available in bltinmodule.c.) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Tue Nov 27 10:07:20 2001 From: mal@lemburg.com (M.-A.
Lemburg) Date: Tue, 27 Nov 2001 11:07:20 +0100 Subject: [Python-Dev] re: PEP 275: Switching on Multiple Values, Rev 1.2 References: <200111270312.QAA15047@s454.cosc.canterbury.ac.nz> Message-ID: <3C0365D8.45C8B49F@lemburg.com> Greg Ewing wrote: > > Brian Slesinsky : > > > But sometimes you need to work with base types or other people's > > classes without subclassing them > > I think there are situations where it's legitimate to prefer a > case-statement approach even when dealing with your own classes. > > I once wrote a compiler in HUGS (a Haskell dialect), and I found it > very convenient to be able to write a self-contained function which > took a parse tree node and did something for each possible subtype of > that node. It kept all the processing related to each phase of the > compiler together in one place. > > Using a pure OO approach, I would have had to scatter it all among > hundreds of one or two-line methods of many classes, and probably > would have gone insane before finishing the project. Right. IMO, the method callback approach is nice if you need an extendable architecture, e.g. to write generic parser frameworks. It doesn't pay off in situations where you have a set of fixed requirements. In these cases, the performance gain is more important than being able to extend by subclassing. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Tue Nov 27 10:53:14 2001 From: mal@lemburg.com (M.-A.
Lemburg) Date: Tue, 27 Nov 2001 11:53:14 +0100 Subject: [Python-Dev] sprintf() usage (Re: mysnprintf broken) References: <200111252320.fAPNK9m01747@mira.informatik.hu-berlin.de> <3C020B0D.FCE1FCAF@lemburg.com> <20011126061407.A25396@glacier.arctrix.com> <3C025DA0.73C81122@lemburg.com> <15362.58832.421376.513070@anthem.wooz.org> Message-ID: <3C03709A.CD7D53F7@lemburg.com> Grepping through the Python source code there are 191 usages of sprintf() -- shouldn't these be modified to use PyOS_snprintf() instead ? Python/getargs.c would be a particularly important case to fix, since the sprintf()s in there are not protected against buffer overflows -- it seems that long function names could be used to exploit this, e.g. in multi-user environments like Zope to obtain admin privileges. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From jack@oratrix.nl Tue Nov 27 12:04:39 2001 From: jack@oratrix.nl (Jack Jansen) Date: Tue, 27 Nov 2001 13:04:39 +0100 Subject: [Python-Dev] PyArg_ParseTuple and unicode Message-ID: <20011127120440.7D5E9303183@snelboot.oratrix.nl> I had expected the PyArg_ParseTuple() "u" specifier to automatically convert string objects to unicode strings with the default encoding (just as the reverse is true for "s"), but to my surprise it doesn't. Is there a deep reason for this, or am I the first person to want this functionality? Or do I miss something and should I use a different format specifier? -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | ++++ see http://www.xs4all.nl/~tank/ ++++ From jim@interet.com Tue Nov 27 13:18:37 2001 From: jim@interet.com (James C.
Ahlstrom) Date: Tue, 27 Nov 2001 08:18:37 -0500 Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 References: <200111212102.fALL2Lc01966@mira.informatik.hu-berlin.de> <3C026308.D583AEFF@interet.com> <200111262154.fAQLsMY01509@mira.informatik.hu-berlin.de> Message-ID: <3C0392AD.BBABCA99@interet.com> "Martin v. Loewis" wrote: > > > 2) The addition of sys.path[0] is changed to an earlier > > time so it occurs before any imports; so sys.path[0] > > works the same as sys.path[1:]. Currently it is added > > after some imports have occurred. > > I am still trying to find the same mental picture for this as you > apparently have. I understand "changed to an earlier time". > > What I don't understand is the effect that you associate with it: > sys.path[0] is a string, sys.path[1:] is a list. In what sense do they > "work the same"? They work the same in that imports are satisfied from the items. sys.path[0] is the first directory string, sys.path[1:] are the remaining strings. Imports are satisfied from sys.path. The timing is currently as follows: 1) Create an initial sys.path but without its first item. 2) Import site, os, sitecustomize, etc. 3) Insert a new item as sys.path[0]. Therefore the new sys.path[0] will not be used to satisfy an import of the os module because it has already been imported. The new timing would be: 1) Create all items in sys.path. 2) Import site, os, sitecustomize, etc. Therefore the new sys.path[0] will be available to satisfy an import of the os module. JimA From jack@oratrix.nl Tue Nov 27 13:49:33 2001 From: jack@oratrix.nl (Jack Jansen) Date: Tue, 27 Nov 2001 14:49:33 +0100 Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 In-Reply-To: Message by "James C. Ahlstrom" , Tue, 27 Nov 2001 08:18:37 -0500 , <3C0392AD.BBABCA99@interet.com> Message-ID: <20011127134934.8D1F9303183@snelboot.oratrix.nl> > 1) Create an initial sys.path but without its first item. > 2) Import site, os, sitecustomize, etc.
> 3) Insert a new item as sys.path[0]. This will not work for a frozen MacPython program. In such a program the sys.path initializer is ["$(APPLICATION)"]. So, there's no empty current directory entry in sys.path[0], the only thing in sys.path is the magic cookie "$(APPLICATION)". Early in startup, before site.py is imported, the cookie is replaced by the full pathname of the application. The import code knows that if it sees a file on sys.path (as opposed to a directory) it will look in that file for frozen PYC resources. And, before you ask, I'm aware that in the future this code will have to test whether or not the file has a .zip extension, and go down the zip-import leg if it has. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From jack@oratrix.nl Tue Nov 27 13:52:43 2001 From: jack@oratrix.nl (Jack Jansen) Date: Tue, 27 Nov 2001 14:52:43 +0100 Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 In-Reply-To: Message by "James C. Ahlstrom" , Mon, 26 Nov 2001 10:43:04 -0500 , <3C026308.D583AEFF@interet.com> Message-ID: <20011127135243.645A2303183@snelboot.oratrix.nl> > 4) For the "-c" option, sys.argv[0] is getcwd() instead of "". A related issue: what is sys.path[0] for an interactive interpreter? I customarily use Python as a shell, and os.chdir() from one place to another, importing things along the way. Would an interactive interpreter keep the "" entry in sys.path[0]? -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From jim@interet.com Tue Nov 27 14:14:29 2001 From: jim@interet.com (James C. 
Ahlstrom) Date: Tue, 27 Nov 2001 09:14:29 -0500 Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 References: <20011127135243.645A2303183@snelboot.oratrix.nl> Message-ID: <3C039FC5.B20068D1@interet.com> Jack Jansen wrote: > > > 4) For the "-c" option, sys.argv[0] is getcwd() instead of "". > > A related issue: what is sys.path[0] for an interactive interpreter? I > customarily use Python as a shell, and os.chdir() from one place to another, > importing things along the way. Would an interactive interpreter keep the "" > entry in sys.path[0]? The proposal is to put getcwd() into sys.path[0] for an interactive interpreter. So after os.chdir(), imports would NOT be made from the new current directory, but rather from the original current directory. If this is a problem (I'm guessing it is) then the new rule would be: sys.path[0] is the absolute path to the script if there is one; else for an interactive interpreter or for the "-c" option, sys.path[0] is the special entry "" (or perhaps ".") that means look in the current directory (which is allowed to change). In either case, sys.path[0] would be added before any imports. I am not happy with changing current directories because I am trying to speed up imports, and it is harder to cache directory contents on multiple operating systems. But I can do it if we need to. A special case for "." is not too bad, but accommodating arbitrary relative paths is more of a problem. What do others think? JimA From jack@oratrix.nl Tue Nov 27 14:27:40 2001 From: jack@oratrix.nl (Jack Jansen) Date: Tue, 27 Nov 2001 15:27:40 +0100 Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 In-Reply-To: Message by "James C. 
Ahlstrom" , Tue, 27 Nov 2001 09:14:29 -0500 , <3C039FC5.B20068D1@interet.com> Message-ID: <20011127142740.A91A0303183@snelboot.oratrix.nl> > I am not happy with changing current directories because I am trying > to speed up imports, and it is harder to cache directory contents > on multiple operating systems. But I can do it if we need to. Ah, at least now I understand what you're trying to do. BUT: have you done any measurements that show that this caching is actually beneficial? For many years we've used a special caching importer in a very big Python project here, and when we finally did some real measurements it turned out that it had slowed down imports all that time instead of speeding them up. You have to be especially aware of NFS mounted filesystems. Hmm, why not cache on a sys.path entry-by-entry basis? Then, if the entry is a zipfile we always cache, if the entry is a relative pathname we never cache, if the entry is an absolute pathname we cache on the basis of a preference. Use the sys.path entry as a key in a dictionary, the result is either None (don't cache) or the cache for this sys.path entry. If the key isn't found this is the first time we come across this sys.path entry so we decide whether to cache or not. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From jim@interet.com Tue Nov 27 14:33:39 2001 From: jim@interet.com (James C. Ahlstrom) Date: Tue, 27 Nov 2001 09:33:39 -0500 Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 References: <20011127134934.8D1F9303183@snelboot.oratrix.nl> Message-ID: <3C03A443.944A4FE9@interet.com> Jack Jansen wrote: > This will not work for a frozen MacPython program. In such a program the > sys.path initializer is ["$(APPLICATION)"]. > [Explanation....] Thanks for the explanation. I will be careful.
I think this should still work in patch #483466. And if I follow the existing #ifdef's and code in PySys_SetArgv(), I should be able to keep it working. So I don't think this is a problem. But unfortunately I don't have a Mac so I can't test it.... JimA From jim@interet.com Tue Nov 27 15:15:29 2001 From: jim@interet.com (James C. Ahlstrom) Date: Tue, 27 Nov 2001 10:15:29 -0500 Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 References: <20011127142740.A91A0303183@snelboot.oratrix.nl> Message-ID: <3C03AE11.93E0BC2E@interet.com> Jack Jansen wrote: > > > I am not happy with changing current directories because I am trying > > to speed up imports, and it is harder to cache directory contents > > on multiple operating systems. But I can do it if we need to. > > Ah, at least now I understand what you're trying to do. BUT: have you done any > measurements that show that this caching is actually beneficial? For many > years we've used a special caching importer in a very big Python project here, > and when we finally did some real measurements it turned out that it had > slowed down imports all that time instead of speeding them up. Yes, I have many benchmarks on both Linux and Windows. They are hard to do because the OS will cache directory contents itself. So the comparison is not between cached/uncached, but rather between Python-dictionary-cached versus file-system-cached. In particular, successive runs of a benchmark can speed up by 2X because the OS now has a fresh cache of directory contents. I am focusing on benchmarks made with a "cold" OS directory cache. I reboot to empty the cache. Roughly, my directory cache cuts import time by ~40% for a Windows local drive or an NFS drive. My impression is that OS directory caching is pretty good for Windows 2000 and Linux-NFS. Improvements are possible but not dramatic. For Windows 2000 and a Linux/Samba network server, things are different. Improvements of 5X or more are easily achieved.
Apparently, this combination has good OS caching of file data blocks, and poor caching of directory blocks. Apparently the blizzard of fopen() calls Python currently makes can be a problem for some network file systems. It would also be a problem for heavily loaded servers. I also see that import times show a lot of scatter for OS caching, but much less scatter for my caching. Perhaps file server load or just evil cache spirits are to blame. Nevertheless, I do believe that Python imports need to be speeded up. > Hmm, why not cache on a sys.path entry-by-entry basis? Then, if the entry is a > zipfile we always cache, if the entry is a relative pathname we never cache, > if the entry is an absolute pathname we cache on the basis of a preference. > Use the sys.path entry as a key in a dictionary, the result is either None > (don't cache) or the cache for this sys.path entry. If the key isn't found > this is the first time we come across this sys.path entry so we decide whether > to cache or not. This is almost exactly what my code does. But I don't test for relative paths because that is a porting headache (but a test for "" or "." could be an exception). So the rule is: zip files are always cached, otherwise if os.listdir() exists and has been imported we always cache, else use current logic. Each entry in sys.path is checked once. My problem is that this breaks the current feature of importing from a relative directory path using the current getcwd(). I can fix this, but (1) is it worth it except perhaps for ".", (2) I don't want to support it for zip files because I must cache these, (3) the fix is a portability and speed problem because I must recognize a relative path (like os.path.abspath) and either call os.getcwd or fall back to fopen() searching. JimA From mal@lemburg.com Tue Nov 27 15:34:36 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Tue, 27 Nov 2001 16:34:36 +0100 Subject: [Python-Dev] PyArg_ParseTuple and unicode References: <20011127120440.7D5E9303183@snelboot.oratrix.nl> Message-ID: <3C03B28C.67F5A861@lemburg.com> Jack Jansen wrote: > > I had expected the PyArg_ParseTuple() "u" specifier to automatically convert > string objects to unicode strings with the default encoding (just as the > reverse is true for "s"), but to my surprise it doesn't. > > Is there a deep reason for this, or am I the first person to want this > functionality? Or do I miss something and should I use a different format > specifier? Due to the problems around auto-conversion of objects to Unicode, the current pattern to use is: 1. fetch the object using the "O" parameter marker and 2. convert it to Unicode using PyObject_Unicode() I could imagine extending the "u" parser marker to do the same using a temporary Unicode object the contents of which are then copied into a user provided buffer (much like what "es#" does). Alternatively, we could extend the "e" parser marker with a "u" target... this has the added benefit of being able to define an encoding to use for dealing with non-Unicode string data. If you think this is needed, please upload a feature request to SF and assign it to me. I'll look into this after the feature freeze then. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From tommy@ilm.com Tue Nov 27 17:43:42 2001 From: tommy@ilm.com (Tommy 'Too Many Stances' Burnette) Date: Tue, 27 Nov 2001 09:43:42 -0800 (PST) Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 In-Reply-To: <3C039FC5.B20068D1@interet.com> References: <20011127135243.645A2303183@snelboot.oratrix.nl> <3C039FC5.B20068D1@interet.com> Message-ID: <15363.53259.713562.294401@mace.lucasdigital.com> James C.
Ahlstrom writes: | | The proposal is to put getcwd() into sys.path[0] for an interactive | interpreter. So after os.chdir(), imports would NOT be made from | the new current directory, but rather from the original current | directory. | | If this is a problem (I'm guessing it is) then the new rule would be: | | sys.path[0] is the absolute path to the script if there is one; else | for an interactive interpreter or for the "-c" option, sys.path[0] | is the special entry "" (or perhaps ".") that means look in the | current directory (which is allowed to change). | | In either case, sys.path[0] would be added before any imports. | | I am not happy with changing current directories because I am trying | to speed up imports, and it is harder to cache directory contents | on multiple operating systems. But I can do it if we need to. | | A special case for "." is not too bad, but accommodating arbitrary | relative paths is more of a problem. | | What do others think? | As little as I like it, we have lots and lots of code here that depends on chdir to find modules for importing. If you were to make this change (and I suspect you won't because it might break lots of code, but for argument's sake...) would having an explicit '.' somewhere in sys.path still work after a chdir? From jim@interet.com Tue Nov 27 18:57:15 2001 From: jim@interet.com (James C. Ahlstrom) Date: Tue, 27 Nov 2001 13:57:15 -0500 Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 References: <20011127135243.645A2303183@snelboot.oratrix.nl> <3C039FC5.B20068D1@interet.com> <15363.53259.713562.294401@mace.lucasdigital.com> Message-ID: <3C03E20B.D4EF1DA6@interet.com> Tommy 'Too Many Stances' Burnette wrote: > As little as I like it, we have lots and lots of code here that > depends on chdir to find modules for importing. If you were to make > this change (and I suspect you won't because it might break lots of > code, but for argument's sake...) would having an explicit '.' 
> somewhere in sys.path still work after a chdir? My current thinking is that "" and "." on sys.path need to work after a chdir. So you could still chdir and import from the new current directory. Harder is the question of whether 'Path/To/Lib' should work after a chdir. It is a relative path, but not simply "" nor ".". Don't panic, I am not changing anything until we all decide what we want. JimA From martin@v.loewis.de Tue Nov 27 20:05:44 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 27 Nov 2001 21:05:44 +0100 Subject: [Python-Dev] What's a PyStructSequence ? In-Reply-To: <3C036A42.CCD62BB3@lemburg.com> (mal@lemburg.com) References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> Message-ID: <200111272005.fARK5if01440@mira.informatik.hu-berlin.de> > Wouldn't it make sense to expose this object in Python, > e.g. by contructing it from a dictionary of string mappings ? > > (The type constructor is not made available in bltinmodule.c.) No. AFAIR, this idea was explicitly rejected at the time the patch was designed (see the comments on the stat patch for the exact history). The rationale was that it is easy enough to create a class that doubles as tuple in Python yourself (perhaps through inheritance from tuple), so there would be no need to expose this type. Regards, Martin From martin@v.loewis.de Tue Nov 27 20:18:05 2001 From: martin@v.loewis.de (Martin v. 
Loewis)
Date: Tue, 27 Nov 2001 21:18:05 +0100
Subject: [Python-Dev] sprintf() usage (Re: mysnprintf broken)
In-Reply-To: <3C03709A.CD7D53F7@lemburg.com> (mal@lemburg.com)
References: <200111252320.fAPNK9m01747@mira.informatik.hu-berlin.de> <3C020B0D.FCE1FCAF@lemburg.com> <20011126061407.A25396@glacier.arctrix.com> <3C025DA0.73C81122@lemburg.com> <15362.58832.421376.513070@anthem.wooz.org> <3C03709A.CD7D53F7@lemburg.com>
Message-ID: <200111272018.fARKI5c01506@mira.informatik.hu-berlin.de>

> Grepping through the Python source code there are 191
> usages of sprintf() -- shouldn't these be modified to
> use PyOS_snprintf() instead ?

Not necessarily. sprintf is perfectly ok if used correctly (i.e. if
you can guarantee an upper bound on the resulting string length, and
compute this bound either statically or dynamically).

> Python/getargs.c would be a particularly important case
> to fix, since the sprintf()s in there are not protected
> against buffer overflows -- it seems that long function
> names could be used to exploit this, e.g. in multi-user
> environments like Zope to obtain admin priviledges.

That indeed appears to be the case. However, given

  char buf[256];
  sprintf(p, "%s() ", fname);

I think the correct reformulation should be

  char buf[256];
  sprintf(p, "%.100s() ", fname);

In seterror, you add then a number of strings containing each a %d
(adding 20 bytes worst-case each), where the loop should terminate if
there are only, say, 140 bytes left; the final printf could then use
%.100s.

Alternatively, you could use "%.*s" through-out, operating with the
lengths of the strings themselves.

Regards,
Martin

From martin@v.loewis.de Tue Nov 27 20:24:46 2001
From: martin@v.loewis.de (Martin v.
Loewis) Date: Tue, 27 Nov 2001 21:24:46 +0100 Subject: [Python-Dev] Re: Change to sys.path[0] for Python 2.3 In-Reply-To: <3C0392AD.BBABCA99@interet.com> (jim@interet.com) References: <200111212102.fALL2Lc01966@mira.informatik.hu-berlin.de> <3C026308.D583AEFF@interet.com> <200111262154.fAQLsMY01509@mira.informatik.hu-berlin.de> <3C0392AD.BBABCA99@interet.com> Message-ID: <200111272024.fARKOkY01537@mira.informatik.hu-berlin.de> > They work the same in that imports are satisfied from the items. > sys.path[0] is the first directory string, sys.path[1:] are the > remaining strings. Imports are satisfied from sys.path. > The timing is currently as follows: Now I got it. +1. Martin From mal@lemburg.com Tue Nov 27 20:28:34 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 27 Nov 2001 21:28:34 +0100 Subject: [Python-Dev] What's a PyStructSequence ? References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> <200111272005.fARK5if01440@mira.informatik.hu-berlin.de> Message-ID: <3C03F772.5B0D13@lemburg.com> "Martin v. Loewis" wrote: > > > Wouldn't it make sense to expose this object in Python, > > e.g. by contructing it from a dictionary of string mappings ? > > > > (The type constructor is not made available in bltinmodule.c.) > > No. AFAIR, this idea was explicitly rejected at the time the patch was > designed (see the comments on the stat patch for the exact history). > > The rationale was that it is easy enough to create a class that > doubles as tuple in Python yourself (perhaps through inheritance from > tuple), so there would be no need to expose this type. Hmm, isn't the trick with this type that you can access the various elements as attributes *and* using index notation ? Also, why should we hide something useful from the Python programmer if it's there anyway ? (One thing I've always wondered about is why Python doesn't expose Py_True and Py_False through the builtin module...) 
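To illustrate the dual access asked about above, here is a small sketch (an editorial illustration, not part of the original message). It assumes a current CPython, where the struct-sequence results of time.gmtime() and os.stat() can be read both by index and by attribute name. The variable names are illustrative only.

```python
import os
import time

# time.gmtime() returns a struct sequence (time.struct_time).
epoch = time.gmtime(0)

# The same field is reachable by position and by attribute name.
year_by_index = epoch[0]
year_by_name = epoch.tm_year

# os.stat() returns another struct sequence (os.stat_result).
info = os.stat(os.curdir)
mode_by_index = info[0]
mode_by_name = info.st_mode

print(year_by_index, year_by_name)
print(mode_by_index == mode_by_name)
```

Older code that unpacks or indexes the result keeps working, while newer code can use the field names.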
I guess, I'll add a constructor to mx.Tools, my repository for missing builtins ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From michel@zope.com Tue Nov 27 20:38:45 2001 From: michel@zope.com (Michel Pelletier) Date: Tue, 27 Nov 2001 12:38:45 -0800 Subject: [Python-Dev] What's a PyStructSequence ? References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> <200111272005.fARK5if01440@mira.informatik.hu-berlin.de> <3C03F772.5B0D13@lemburg.com> Message-ID: <3C03F9D5.3148C076@zope.com> "M.-A. Lemburg" wrote: > Also, why should we hide something useful from the Python > programmer if it's there anyway ? (One thing I've always wondered > about is why Python doesn't expose Py_True and Py_False through > the builtin module...) I have no idea why they are not exposed, of course. but my guess would be because there is no boolean type, there is no need for them. I myself have never needed a boolean type, "zero" or "empty" have always worked for me as a boolean false. What are they used at the C level for? > I guess, I'll add a constructor to mx.Tools, my repository > for missing builtins ;-) Does mx.Tools offer a boolean type? -Michel From mal@lemburg.com Tue Nov 27 20:39:03 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 27 Nov 2001 21:39:03 +0100 Subject: [Python-Dev] sprintf() usage (Re: mysnprintf broken) References: <200111252320.fAPNK9m01747@mira.informatik.hu-berlin.de> <3C020B0D.FCE1FCAF@lemburg.com> <20011126061407.A25396@glacier.arctrix.com> <3C025DA0.73C81122@lemburg.com> <15362.58832.421376.513070@anthem.wooz.org> <3C03709A.CD7D53F7@lemburg.com> <200111272018.fARKI5c01506@mira.informatik.hu-berlin.de> Message-ID: <3C03F9E7.833551F@lemburg.com> "Martin v. 
Loewis" wrote: > > > Grepping through the Python source code there are 191 > > usages of sprintf() -- shouldn't these be modified to > > use PyOS_snprintf() instead ? > > Not necessarily. sprintf is perfectly ok if used correctly (i.e. if > you can guarantee an upper bound on the resulting string length, and > compute this bound either statically or dynamically). This is done in most cases, indeed. Still I think we need some auditing here and having all audited sprintf() uses renamed to say PyOS_snprintf() would make auditing future Python releases a lot easier. > > Python/getargs.c would be a particularly important case > > to fix, since the sprintf()s in there are not protected > > against buffer overflows -- it seems that long function > > names could be used to exploit this, e.g. in multi-user > > environments like Zope to obtain admin priviledges. > > That indeed appears to be the case. However, given > > char buf[256]; > sprintf(p, "%s() ", fname); > > I think the correct reformulation should be > > char buf[256]; > sprintf(p, "%.100s() ", fname); Right. > In seterror, you add then a number of strings containing each a %d > (adding 20 bytes worst-case each), where the loop should terminate if > there are only, say, 140 bytes left; the final printf could then use > %.100s. > > Alternatively, you could use "%.*s" through-out, operating with the > lengths of the strings themselves. I think that would make the code much more complicated. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Tue Nov 27 20:54:23 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 27 Nov 2001 21:54:23 +0100 Subject: [Python-Dev] What's a PyStructSequence ? 
References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> <200111272005.fARK5if01440@mira.informatik.hu-berlin.de> <3C03F772.5B0D13@lemburg.com> <3C03F9D5.3148C076@zope.com> Message-ID: <3C03FD7F.CA3C81A0@lemburg.com> Michel Pelletier wrote: > > "M.-A. Lemburg" wrote: > > > Also, why should we hide something useful from the Python > > programmer if it's there anyway ? (One thing I've always wondered > > about is why Python doesn't expose Py_True and Py_False through > > the builtin module...) > > I have no idea why they are not exposed, of course. but my guess would > be because there is no boolean type, there is no need for them. I > myself have never needed a boolean type, "zero" or "empty" have always > worked for me as a boolean false. > > What are they used at the C level for? All simple compares return either Py_True or Py_False (e.g. 1==1 returns a reference to Py_True). > > I guess, I'll add a constructor to mx.Tools, my repository > > for missing builtins ;-) > > Does mx.Tools offer a boolean type? No, but I'm thinking about adding a Boolean number type to mxNumber. I'll also need some form of a binary type to make the set complete for XML-RPC. Currently, I can work around this by using True and False (which mx.Tools adds) and using buffer objects as wrappers to mean "this is a binary type". -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From skip@pobox.com (Skip Montanaro) Tue Nov 27 20:57:03 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 27 Nov 2001 14:57:03 -0600 Subject: [Python-Dev] What's a PyStructSequence ? 
In-Reply-To: <3C03F9D5.3148C076@zope.com> References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> <200111272005.fARK5if01440@mira.informatik.hu-berlin.de> <3C03F772.5B0D13@lemburg.com> <3C03F9D5.3148C076@zope.com> Message-ID: <15363.65055.413046.674164@beluga.mojam.com> Michel> I have no idea why they are not exposed, of course. but my guess Michel> would be because there is no boolean type, there is no need for Michel> them. I myself have never needed a boolean type, "zero" or Michel> "empty" have always worked for me as a boolean false. XML-RPC defines a Boolean data type, so xmlrpclib defines a Boolean class with True and False instances. Programmers wanting to send boolean values must pass one of them (and expect to receive them when data arrives). Given that Py_True and Py_False are sitting there just below the surface, it seems a (small) shame that /F had to do that, more so now that xmlrpclib is part of the core distribution. Interestingly (or oddly, not sure which) enough, Lib/test/test_iter.py also defines a small Boolean class. -- Skip Montanaro (skip@pobox.com - http://www.mojam.com/) From barry@zope.com Tue Nov 27 20:54:37 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 27 Nov 2001 15:54:37 -0500 Subject: [Python-Dev] What's a PyStructSequence ? References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> <200111272005.fARK5if01440@mira.informatik.hu-berlin.de> <3C03F772.5B0D13@lemburg.com> <3C03F9D5.3148C076@zope.com> Message-ID: <15363.64909.919533.5992@anthem.wooz.org> >>>>> "MP" == Michel Pelletier writes: MP> I have no idea why they are not exposed, of course. but my MP> guess would be because there is no boolean type, there is no MP> need for them. I myself have never needed a boolean type, MP> "zero" or "empty" have always worked for me as a boolean MP> false. 
There have been many implementations of a boolean type; I know 'cause
I've done at least 3. For grins I might even try a fourth for Python
2.2. But you're right, in practice there isn't much of a need.

-Barry

From martin@v.loewis.de Tue Nov 27 21:21:33 2001
From: martin@v.loewis.de (Martin v. Loewis)
Date: Tue, 27 Nov 2001 22:21:33 +0100
Subject: [Python-Dev] What's a PyStructSequence ?
In-Reply-To: <3C03F9D5.3148C076@zope.com> (message from Michel Pelletier on Tue, 27 Nov 2001 12:38:45 -0800)
References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> <200111272005.fARK5if01440@mira.informatik.hu-berlin.de> <3C03F772.5B0D13@lemburg.com> <3C03F9D5.3148C076@zope.com>
Message-ID: <200111272121.fARLLXx01929@mira.informatik.hu-berlin.de>

> What are they used at the C level for?

To return them to Python. You write

  Py_INCREF(Py_True);
  return Py_True;

or

  result = c ? Py_True : Py_False;
  Py_INCREF(result);
  return result;

just as you return None. That saves at least one function call.

Py_True, of course, *is* 1. There is no proper boolean type.

Regards,
Martin

From martin@v.loewis.de Tue Nov 27 21:09:38 2001
From: martin@v.loewis.de (Martin v. Loewis)
Date: Tue, 27 Nov 2001 22:09:38 +0100
Subject: [Python-Dev] What's a PyStructSequence ?
In-Reply-To: <3C03F772.5B0D13@lemburg.com> (mal@lemburg.com)
References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> <200111272005.fARK5if01440@mira.informatik.hu-berlin.de> <3C03F772.5B0D13@lemburg.com>
Message-ID: <200111272109.fARL9c201822@mira.informatik.hu-berlin.de>

> Hmm, isn't the trick with this type that you can access
> the various elements as attributes *and* using index
> notation ?

Indeed.
In Py 2.2, you can do that two ways:

A)
indexmap = {'st_dev':0, 'st_ino':1} # etc
class StatResult(tuple):
    def __getattr__(self,name):
        return self[indexmap[name]]

B)
fields = ['st_dev', 'st_ino'] #etc
class StatResult(UserList.UserList):
    def __init__(self, dev, ino):
        self.st_dev = dev
        self.st_ino = ino
    def __getattr__(self, name):
        if name=="data":
            return [getattr(self,fname) for fname in fields]
        raise AttributeError, name
    def __setattr__(self, name, value):
        if name=="data":
            raise AttributeError, "data is read-only"
        self.__dict__[name] = value

> Also, why should we hide something useful from the Python programmer
> if it's there anyway ?

Because it has unknown limitations (at least, they are unknown to me at
the moment; I probably could report them if I searched long enough).

Regards,
Martin

From MarkH@ActiveState.com Wed Nov 28 02:20:43 2001
From: MarkH@ActiveState.com (Mark Hammond)
Date: Wed, 28 Nov 2001 13:20:43 +1100
Subject: [Python-Dev] What's a PyStructSequence ?
In-Reply-To: <200111272121.fARLLXx01929@mira.informatik.hu-berlin.de>
Message-ID:

> Py_True, of course, *is* 1. There is no proper boolean type.

Seeing you added emphasis on the *is*, I assume you meant *is* :)

>>> true=(1==1)
>>> true is 1
0
>>>

Py_True == 1, but is *not* 1

Pedantically,

Mark.

From martin@v.loewis.de Wed Nov 28 06:22:26 2001
From: martin@v.loewis.de (Martin v. Loewis)
Date: Wed, 28 Nov 2001 07:22:26 +0100
Subject: [Python-Dev] What's a PyStructSequence ?
In-Reply-To:
References:
Message-ID: <200111280622.fAS6MQu01529@mira.informatik.hu-berlin.de>

> > Py_True, of course, *is* 1. There is no proper boolean type.
>
> Seeing you added emphasis on the *is*, I assume you meant *is* :)

Indeed, that's what I really meant.

> Py_True == 1, but is *not* 1

Thanks for clarifying it; that was surprising since I assumed 1 was a
proper singleton under all circumstances.

Regards,
Martin

From fdrake@acm.org Wed Nov 28 07:57:33 2001
From: fdrake@acm.org (Fred L.
Drake) Date: Wed, 28 Nov 2001 02:57:33 -0500 (EST) Subject: [Python-Dev] [development doc updates] Message-ID: <20011128075733.9558928696@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Minor updates; mostly changed organization of the "Internet Data Handling" chapter. From mal@lemburg.com Wed Nov 28 09:15:53 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 28 Nov 2001 10:15:53 +0100 Subject: [Python-Dev] What's a PyStructSequence ? References: Message-ID: <3C04AB49.7B3B9665@lemburg.com> Mark Hammond wrote: > > > Py_True, of course, *is* 1. There is no proper boolean type. > > Seeing you added emphasis on the *is*, I assume you meant *is* :) > > >>> true=(1==1) > >>> true is 1 > 0 > >>> > > Py_True == 1, but is *not* 1 Py_True is a singleton and all 1 integers in Python are cached and shared, so we end up having exactly two different objects for the number 1 in Python: Py_True and 1 (or 3-2 or 4/4 or ...). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From jack@oratrix.nl Wed Nov 28 16:02:02 2001 From: jack@oratrix.nl (Jack Jansen) Date: Wed, 28 Nov 2001 17:02:02 +0100 Subject: [Python-Dev] PyArg_ParseTuple and unicode In-Reply-To: Message by "M.-A. Lemburg" , Tue, 27 Nov 2001 16:34:36 +0100 , <3C03B28C.67F5A861@lemburg.com> Message-ID: <20011128160203.C734C303183@snelboot.oratrix.nl> MAL wrote: > Jack Jansen wrote: > > > > I had expected the PyArg_ParseTuple() "u" specifier to automatically convert > > string objects to unicode strings with the default encoding (just as the > > reverse is true for "s"), but to my surprise it doesn't. > [...] 
> > I could imagine extending the "u" parser marker to do the same > using an temporary Unicode object the contents of which are then > copied into a user provided buffer (much like what "es#" does). > Alternatively, we could extend the "e" parser marker with a > "u" target... this has the added benefit of being able > to define an encoding to use for dealing with non-Unicode > string data. I like this second alternative, because it also provides a possible solution for a more complex problem I have. I'm porting the PythonWin MFC modules to Windows CE, and on WinCE the OS API's are completely unicode-based. In C you can write code that is portable between 8-bit systems and Unicode systems by using a set of macros and types that get expanded to char/strlen/sprintf/etc or wchar_t/wslen/etc. Currently I have every PyArg_ParseTuple #ifdeffed, but using an "e" marker would be better. > If you think this is needed, please upload a feature request to > SF and assign it to me. I'll look into this after the feature > freeze then. As I'm doing this project with a very old Python I might as well do it myself (unless I can't figure out how:-), and then I'll put the patch in sourceforge, hand it over to you and you can turn my gibberish into real C:-) -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From mal@lemburg.com Wed Nov 28 16:11:27 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 28 Nov 2001 17:11:27 +0100 Subject: [Python-Dev] -DINET6 in Makefile Message-ID: <3C050CAF.2A5CEDC5@lemburg.com> What's the reasoning behind putting -DINET6 into the default compiler options of the generic Makefile ? I'm just asking because such a define will be inherited by all extensions being compiled with distutils and the Makefile.pre.in setup process... sounds like trouble if you ask me. 
Shouldn't the define be placed into the pyconfig.h file or only in those extensions which need it ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Wed Nov 28 16:16:34 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 28 Nov 2001 17:16:34 +0100 Subject: [Python-Dev] PyArg_ParseTuple and unicode References: <20011128160203.C734C303183@snelboot.oratrix.nl> Message-ID: <3C050DE2.1BAD18E1@lemburg.com> Jack Jansen wrote: > > MAL wrote: > > Jack Jansen wrote: > > > > > > I had expected the PyArg_ParseTuple() "u" specifier to automatically convert > > > string objects to unicode strings with the default encoding (just as the > > > reverse is true for "s"), but to my surprise it doesn't. > > [...] > ... > > If you think this is needed, please upload a feature request to > > SF and assign it to me. I'll look into this after the feature > > freeze then. > > As I'm doing this project with a very old Python I might as well do it myself > (unless I can't figure out how:-), and then I'll put the patch in sourceforge, > hand it over to you and you can turn my gibberish into real C:-) Ok. You could use the existing "e" parser marker code as template -- I'd suggest to use "eu" and "eu#" as new "e" marker options. This won't go into 2.2 though... -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From fredrik@pythonware.com Wed Nov 28 23:51:47 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 29 Nov 2001 00:51:47 +0100 Subject: [Python-Dev] python-dev vs. 
the spammers Message-ID: <032901c17867$a794b570$ced241d5@hagrid> just noticed that I received 60 check-in messages during the last 24 hours (yesterday, in local time), and only 51 spams. and as far as I can tell, that's the first time that happened since I changed mail system (my old setup didn't generate statistics, so this could be the first time ever ;-) From nickm@alum.mit.edu Thu Nov 29 01:27:36 2001 From: nickm@alum.mit.edu (nickm@alum.mit.edu) Date: Wed, 28 Nov 2001 20:27:36 -0500 Subject: [Python-Dev] What's a PyStructSequence ? In-Reply-To: <3C036A42.CCD62BB3@lemburg.com> References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> Message-ID: <200111290127.fAT1Rae02898@blinking-device.reputation.com> Marc-Andre Lemburg wrote: > "Martin v. Loewis" wrote: > > > > A bug report on SF made me aware of an apparently new type in Python > > > > called PyStructSequence. There are no docs on the type (at least > > > not in the usual places). > > > > > > Is it official yet ? > > > It will ship as part of Python 2.2, if that is what you are > > asking. os.stat is documented to return one of these (if you read it > > carefully). > >Wouldn't it make sense to expose this object in Python, >e.g. by contructing it from a dictionary of string mappings ? > >(The type constructor is not made available in bltinmodule.c.) Hi, all. I'm not subscribed to python-dev, but I'm the author of the original patch, and I thought I should comment. If you look closely, you'll find that PyStructSequence is not a type itself, but rather a tool used to construct new tuple/struct hybrid types, like the results of os.stat and time.gmtime. In reality, PyStructSequence is only a set of common implementation logic for a set of other types, including os.stat_result, os.statvfs_result, and time.struct_time. There are a few possible objections to this scheme: Q. Nick, why didn't you make it a _real_ metatype? A. 
Writing a real metatype in C was beyond my Python abilities. If anybody wants to, I'd be thrilled.a Q. Okay, so why not expose it to python? A. Because it isn't a real metatype. Every type that uses it _is_ exposed to python. I think this isn't a problem, because it's way easier to re-implement PyStructSequence in Python than it is to turn it into a metatype. Q. If it's so easy to write in Python, why not do it that way? A. Because there are fringe benefits to doing it in C. For example, on some Unix machines (such as Linux), struct stat has some attributes that don't correspond to any elements of the old tuple view. To expose (say) st_rdev to Python code at all, you'd need to change the result of posix.stat... but this would break code that used posix.stat directly. But because PyStructSequence is written in C, posix.stat can return an augmented tuple/struct hybrid that (when accessed as a tuple) still has 10 elements, but also exposes st_rdev as an attribute. HTH, -- Nick Mathewson From fdrake@acm.org Thu Nov 29 08:48:05 2001 From: fdrake@acm.org (Fred L. Drake) Date: Thu, 29 Nov 2001 03:48:05 -0500 (EST) Subject: [Python-Dev] [development doc updates] Message-ID: <20011129084805.0885328696@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Added section of example regular expressions to the "re" module docs. Various small clarifications. From mal@lemburg.com Thu Nov 29 09:14:43 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 29 Nov 2001 10:14:43 +0100 Subject: [Python-Dev] What's a PyStructSequence ? References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> <200111290127.fAT1Rae02898@blinking-device.reputation.com> Message-ID: <3C05FC83.5F87556D@lemburg.com> nickm@alum.mit.edu wrote: > > Marc-Andre Lemburg wrote: > > "Martin v. 
Loewis" wrote: > > > > > A bug report on SF made me aware of an apparently new type in Python > > > > > called PyStructSequence. There are no docs on the type (at least > > > > not in the usual places). > > > > > > > > Is it official yet ? > > > > It will ship as part of Python 2.2, if that is what you are > > > asking. os.stat is documented to return one of these (if you read it > > > carefully). > > > >Wouldn't it make sense to expose this object in Python, > >e.g. by contructing it from a dictionary of string mappings ? > > > >(The type constructor is not made available in bltinmodule.c.) > > Hi, all. I'm not subscribed to python-dev, but I'm the author of the > original patch, and I thought I should comment. > > If you look closely, you'll find that PyStructSequence is not a type > itself, but rather a tool used to construct new tuple/struct hybrid > types, like the results of os.stat and time.gmtime. Indeed -- and I have a question there: why did you have to implement this as meta-type ? It seems that the same thing could have been done using a normal type which then gets initialized after instantiation. Or was it to get used to the new type system :-? > In reality, PyStructSequence is only a set of common implementation > logic for a set of other types, including os.stat_result, > os.statvfs_result, and time.struct_time. > > There are a few possible objections to this scheme: > > Q. Nick, why didn't you make it a _real_ metatype? > > A. Writing a real metatype in C was beyond my Python > abilities. If anybody wants to, I'd be thrilled.a > > Q. Okay, so why not expose it to python? > > A. Because it isn't a real metatype. Every type that uses it _is_ > exposed to python. > > I think this isn't a problem, because it's way easier to > re-implement PyStructSequence in Python than it is to turn it > into a metatype. > > Q. If it's so easy to write in Python, why not do it that way? > > A. Because there are fringe benefits to doing it in C. 
> > For example, on some Unix machines (such as Linux), struct stat > has some attributes that don't correspond to any elements of > the old tuple view. To expose (say) st_rdev to Python code at > all, you'd need to change the result of posix.stat... but this > would break code that used posix.stat directly. > > But because PyStructSequence is written in C, posix.stat can > return an augmented tuple/struct hybrid that (when accessed as > a tuple) still has 10 elements, but also exposes st_rdev as an > attribute. This would have also been possible using the "normal" approach; I'm still not convinced -- it looks too much like an academic experiment ;-). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From martin@v.loewis.de Thu Nov 29 10:22:56 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 29 Nov 2001 11:22:56 +0100 Subject: [Python-Dev] What's a PyStructSequence ? In-Reply-To: <3C05FC83.5F87556D@lemburg.com> (mal@lemburg.com) References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> <200111290127.fAT1Rae02898@blinking-device.reputation.com> <3C05FC83.5F87556D@lemburg.com> Message-ID: <200111291022.fATAMuF01381@mira.informatik.hu-berlin.de> > Indeed -- and I have a question there: why did you have to implement > this as meta-type ? MAL, please do read the patch discussion first, at http://sourceforge.net/tracker/?group_id=5470&atid=305470&func=detail&aid=462296 Regards, Martin From mal@lemburg.com Thu Nov 29 11:56:53 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 29 Nov 2001 12:56:53 +0100 Subject: [Python-Dev] What's a PyStructSequence ? 
References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> <200111290127.fAT1Rae02898@blinking-device.reputation.com> <3C05FC83.5F87556D@lemburg.com> <200111291022.fATAMuF01381@mira.informatik.hu-berlin.de> Message-ID: <3C062285.84305AB@lemburg.com> "Martin v. Loewis" wrote: > > > Indeed -- and I have a question there: why did you have to implement > > this as meta-type ? > > MAL, please do read the patch discussion first, at > > http://sourceforge.net/tracker/?group_id=5470&atid=305470&func=detail&aid=462296 The discussion on SF doesn't really answer my question. What Nick did is fascinating: he reused the type object implementation to mimic a sequence ! That's cool, but looks like an awfully tricky way of doing something straight forward such as sub-classing the tuple type to extend it with an additional dictionary. So the question remains: why did Nick *have* to implement this as meta-type ? BTW, Nick's stuff is a nice intro to the more complicated capabilities of the new type system and I think people can learn a lot from it. I certainly want to learn from it, because I haven't really interned the details behind all the new C type slots yet. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From thomas.heller@ion-tof.com Thu Nov 29 17:23:19 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 29 Nov 2001 18:23:19 +0100 Subject: [Python-Dev] What's a PyStructSequence ? 
References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> <200111290127.fAT1Rae02898@blinking-device.reputation.com> <3C05FC83.5F87556D@lemburg.com> <200111291022.fATAMuF01381@mira.informatik.hu-berlin.de> <3C062285.84305AB@lemburg.com> Message-ID: <013c01c178fa$9c46fe60$e000a8c0@thomasnotebook> From: "M.-A. Lemburg" > "Martin v. Loewis" wrote: > > > > > Indeed -- and I have a question there: why did you have to implement > > > this as meta-type ? > > > > MAL, please do read the patch discussion first, at > > > > http://sourceforge.net/tracker/?group_id=5470&atid=305470&func=detail&aid=462296 > > The discussion on SF doesn't really answer my question. What > Nick did is fascinating: he reused the type object implementation > to mimic a sequence ! That's cool, but looks like an awfully > tricky way of doing something straight forward such as sub-classing > the tuple type to extend it with an additional dictionary. > So the question remains: why did Nick *have* to implement this > as meta-type ? As I understand it, PyStructSequence_InitType() is a factory for types aka metaclasses. The above statment 'he reused the type object to mimic a sequence' is IMO wrong. *My* question would be (maybe this is what MAL meant): why aren't the created types subclasses of PyTupleType? 
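As a hedged sketch of the sub-classing approach raised in this thread (an editorial illustration, not the C implementation from the patch): a tuple subclass can expose its items under field names through read-only properties. The helper make_struct_sequence and the type StatLike are hypothetical names made up for this example, and the code assumes a current Python 3.

```python
from operator import itemgetter


def make_struct_sequence(name, fields):
    """Hypothetical helper: build a tuple subclass whose items
    can also be read through the given field names."""
    namespace = {"__slots__": (), "_fields": tuple(fields)}
    for index, field in enumerate(fields):
        # property(itemgetter(i)) reads self[i]; with no setter the
        # attribute is read-only, like the tuple item behind it.
        namespace[field] = property(itemgetter(index))
    return type(name, (tuple,), namespace)


# Illustrative type with three made-up fields.
StatLike = make_struct_sequence("StatLike", ["st_mode", "st_ino", "st_dev"])
s = StatLike((0o644, 42, 7))

print(s[1], s.st_ino)          # same item, by index and by name
print(isinstance(s, tuple))    # still an ordinary tuple to old code
```

Because the result is an ordinary tuple subclass, code that unpacks or indexes it keeps working unchanged.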
Thomas

From niemeyer@conectiva.com Thu Nov 29 19:54:06 2001
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Thu, 29 Nov 2001 17:54:06 -0200
Subject: [Python-Dev] Re: _PyTuple_Resize() in 2.2
In-Reply-To: <20011126131611.B758@ute.mems-exchange.org>
References: <20011121154833.A3423@ibook.distro.conectiva> <20011126131611.B758@ute.mems-exchange.org>
Message-ID: <20011129175406.A6254@ibook.distro.conectiva>

> >You may want to include in your Python 2.2 change log the fact
> >that _PyTuple_Resize() has now 2 arguments instead of 3. The last
>
> Good suggestion; I've added it to the CVS version.

Thanks!

One more: readline() method (and probably others) now raise an exception if called for a directory. It used to return an empty string. I haven't found this in Python's sourceforge changelog either.

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]

From thomas.heller@ion-tof.com Thu Nov 29 20:32:31 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 29 Nov 2001 21:32:31 +0100
Subject: [Python-Dev] __metaclass__
Message-ID: <034001c17915$0d72dc70$e000a8c0@thomasnotebook>

Maybe I'm missing something, but why doesn't the following raise errors:

class X(object):
    __metaclass__ = type

X()
X(1)
X(1, 2, 3, a="x", b="y")

I would have expected 'this constructor takes no arguments' errors on the last two lines. Or is this expected behaviour?
Thomas From tim@zope.com Thu Nov 29 20:56:53 2001 From: tim@zope.com (Tim Peters) Date: Thu, 29 Nov 2001 15:56:53 -0500 Subject: [Python-Dev] Re: _PyTuple_Resize() in 2.2 In-Reply-To: <20011129175406.A6254@ibook.distro.conectiva> Message-ID: > You may want to include in your Python 2.2 change log the fact > that _PyTuple_Resize() has now 2 arguments instead of 3. The last As the NEWS file for 2.2a1 said, """ C API - Removed the unused last_is_sticky argument from the internal _PyTuple_Resize(). If this affects you, you were cheating. """ Nobody outside the core should be using any private API functions (hence, "cheating"). From guido@python.org Thu Nov 29 21:48:13 2001 From: guido@python.org (Guido van Rossum) Date: Thu, 29 Nov 2001 16:48:13 -0500 Subject: [Python-Dev] Re: _PyTuple_Resize() in 2.2 In-Reply-To: Your message of "Thu, 29 Nov 2001 17:54:06 -0200." <20011129175406.A6254@ibook.distro.conectiva> References: <20011121154833.A3423@ibook.distro.conectiva> <20011126131611.B758@ute.mems-exchange.org> <20011129175406.A6254@ibook.distro.conectiva> Message-ID: <200111292148.QAA01837@cj20424-a.reston1.va.home.com> > One more: readline() method (and probably others) now raise an > exception if called for a directory. It used to return an empty > string. I haven't found this in Python's sourceforge changelog either. Can you add this to the SF bug manager with a piece of sample code that shows what you did, what you expected, and what happened instead? I can't parse "call readline() for a directory." --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Nov 29 21:50:09 2001 From: guido@python.org (Guido van Rossum) Date: Thu, 29 Nov 2001 16:50:09 -0500 Subject: [Python-Dev] __metaclass__ In-Reply-To: Your message of "Thu, 29 Nov 2001 21:32:31 +0100." 
<034001c17915$0d72dc70$e000a8c0@thomasnotebook> References: <034001c17915$0d72dc70$e000a8c0@thomasnotebook> Message-ID: <200111292150.QAA01860@cj20424-a.reston1.va.home.com> > Maybe I'm missing something, but why doesn't the following > raise errors: > > class X(object): > __metaclass__ = type > > X() > X(1) > X(1, 2, 3, a="x", b="y") > > I would have expected 'this constructor takes no arguments' > errors on the last two lines. Or is this expected behaviour? Neither object.__init__ nor object.__new__ pays any attention to its argument list. If they did, subclassing would be more difficult. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Thu Nov 29 21:51:46 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 29 Nov 2001 22:51:46 +0100 Subject: [Python-Dev] What's a PyStructSequence ? In-Reply-To: <3C062285.84305AB@lemburg.com> (mal@lemburg.com) References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> <200111290127.fAT1Rae02898@blinking-device.reputation.com> <3C05FC83.5F87556D@lemburg.com> <200111291022.fATAMuF01381@mira.informatik.hu-berlin.de> <3C062285.84305AB@lemburg.com> Message-ID: <200111292151.fATLpkp01348@mira.informatik.hu-berlin.de> > The discussion on SF doesn't really answer my question. What > Nick did is fascinating: he reused the type object implementation > to mimic a sequence ! That's cool, but looks like an awfully > tricky way of doing something straight forward such as sub-classing > the tuple type to extend it with an additional dictionary. > So the question remains: why did Nick *have* to implement this > as meta-type ? For one thing, you'll see from the discussion that extending the tuple type with an additional dict is non-trivial: You cannot define a C data type that does this. You'll also see that there was a version that did it, and that it was rejected precisely because of this problem. 
Regards,
Martin

From martin@v.loewis.de Thu Nov 29 22:22:15 2001
From: martin@v.loewis.de (Martin v. Loewis)
Date: Thu, 29 Nov 2001 23:22:15 +0100
Subject: [Python-Dev] What's a PyStructSequence ?
In-Reply-To: <013c01c178fa$9c46fe60$e000a8c0@thomasnotebook> (thomas.heller@ion-tof.com)
References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> <200111290127.fAT1Rae02898@blinking-device.reputation.com> <3C05FC83.5F87556D@lemburg.com> <200111291022.fATAMuF01381@mira.informatik.hu-berlin.de> <3C062285.84305AB@lemburg.com> <013c01c178fa$9c46fe60$e000a8c0@thomasnotebook>
Message-ID: <200111292222.fATMMF001410@mira.informatik.hu-berlin.de>

> *My* question would be (maybe this is what MAL meant):
> why aren't the created types subclasses of PyTupleType?

How would you inherit from PyTupleType in C? E.g. by doing

    struct PyStatType{
      PyTupleType foo;
      PyObject *additional_field;
    };

That reads well, but it is wrong: PyTupleType ends in a flexible array member, so it cannot be used as the member of another struct.

Regards,
Martin

From tim@zope.com Thu Nov 29 22:25:12 2001
From: tim@zope.com (Tim Peters)
Date: Thu, 29 Nov 2001 17:25:12 -0500
Subject: [Python-Dev] gc.garbage
In-Reply-To: <20011126142126.A26497@glacier.arctrix.com>
Message-ID:

[Neil Schemenauer, on the gc.garbage docs as of about a week ago]
> It's not clear because it's nonsense. I think I mean to say something
> about the gc.garbage binding. If you do something like:
>
>     gc.garbage = "ha ha"
>
> then the list is garbage is forever inaccessible from within Python.

I've since tried to repair the docs, to point out that rebinding gc.garbage is a Bad Idea but mutating it may be a Good one. BTW, I expect it's more likely people will get in trouble via:

    gc.garbage = []

I expect that because I did it once.

> Is there some way to prevent people from assigning to certain module
> variables?

Not that I know of.
If you're terribly concerned, gc could look up "garbage" in its dict on each access. That's what, e.g., PRINT_ITEM does with sys.stdout. Then it would also have to check that it's a list, etc.

But I'd be keener to see new words spelling out which parts of the gc interface are and aren't intended "to work" across releases ...

From niemeyer@conectiva.com Thu Nov 29 22:45:12 2001
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Thu, 29 Nov 2001 20:45:12 -0200
Subject: [Python-Dev] Re: _PyTuple_Resize() in 2.2
In-Reply-To: <200111292148.QAA01837@cj20424-a.reston1.va.home.com>
References: <20011121154833.A3423@ibook.distro.conectiva> <20011126131611.B758@ute.mems-exchange.org> <20011129175406.A6254@ibook.distro.conectiva> <200111292148.QAA01837@cj20424-a.reston1.va.home.com>
Message-ID: <20011129204512.B12806@ibook.distro.conectiva>

> > One more: readline() method (and probably others) now raise an
> > exception if called for a directory. It used to return an empty
> > string. I haven't found this in Python's sourceforge changelog either.
>
> Can you add this to the SF bug manager with a piece of sample code
> that shows what you did, what you expected, and what happened instead?
> I can't parse "call readline() for a directory."

I thought this has been changed on purpose. Nevertheless, I've filed a bug at sourceforge (#487277). Here is what I meant:

Python 2.1 (#1, Jun 22 2001, 17:13:13)
[GCC 2.95.3 20010315 (release) (conectiva)] on linux-i386
Type "copyright", "credits" or "license" for more information.
>>> open("/etc").readline()
''

Python 2.2b2+ (#1, Nov 27 2001, 21:39:35)
[GCC 2.95.3 20010315 (release) (conectiva)] on linux-ppc
Type "help", "copyright", "credits" or "license" for more information.
>>> open("/etc").readline()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IOError: [Errno 21] Is a directory

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]

From nas@python.ca Thu Nov 29 22:58:35 2001
From: nas@python.ca (Neil Schemenauer)
Date: Thu, 29 Nov 2001 14:58:35 -0800
Subject: [Python-Dev] gc.garbage
In-Reply-To: ; from tim@zope.com on Thu, Nov 29, 2001 at 05:25:12PM -0500
References: <20011126142126.A26497@glacier.arctrix.com>
Message-ID: <20011129145835.A2215@glacier.arctrix.com>

Tim Peters wrote:
> Neil Schemenauer:
> > Is there some way to prevent people from assigning to certain module
> > variables?
>
> Not that I know of. If you're terribly concerned, gc could look up
> "garbage" in its dict on each access. That's what, e.g., PRINT_ITEM does
> with sys.stdout. Then it would also have to check that it's a list, etc.

What would happen if it's not a list? PRINT_ITEM raises RuntimeError. I suppose the collector could do the same.

> But I'd be keener to see new words spelling out which parts of the gc
> interface are and aren't intended "to work" across releases ...

All of them? :-) Seriously, there could come a time when GC can no longer be disabled. The debugging and threshold stuff is fairly implementation dependent. get_referrers() and get_objects() are highly implementation dependent. I suppose gc.collect() should always be available. Anything else is fair game, IMHO.

Incidentally, I can't say I'm happy with GC as it stands. It uses too much memory now that so many objects are tracked. I had worked on the idea of a separate heap for GC objects for a while but couldn't figure out how to make generational collection to work.
As Don Beaudry's sig says: "so much code, so little time". :-) Neil From nas@python.ca Thu Nov 29 23:05:10 2001 From: nas@python.ca (Neil Schemenauer) Date: Thu, 29 Nov 2001 15:05:10 -0800 Subject: [Python-Dev] Re: _PyTuple_Resize() in 2.2 In-Reply-To: <20011129204512.B12806@ibook.distro.conectiva>; from niemeyer@conectiva.com on Thu, Nov 29, 2001 at 08:45:12PM -0200 References: <20011121154833.A3423@ibook.distro.conectiva> <20011126131611.B758@ute.mems-exchange.org> <20011129175406.A6254@ibook.distro.conectiva> <200111292148.QAA01837@cj20424-a.reston1.va.home.com> <20011129204512.B12806@ibook.distro.conectiva> Message-ID: <20011129150510.B2215@glacier.arctrix.com> Gustavo Niemeyer wrote: > Python 2.1 (#1, Jun 22 2001, 17:13:13) > [GCC 2.95.3 20010315 (release) (conectiva)] on linux-i386 > Type "copyright", "credits" or "license" for more information. > >>> open("/etc").readline() > '' I'm pretty sure patch 2.117 to fileobject.c caused this change in behavior. Here is the log message: date: 2001/08/09 18:14:59; author: gvanrossum; state: Exp; lines: +6 -0 Apply anonymous SF patch #441229. Previously, f.read() and f.readlines() checked for errors on their file object and possibly raised an IOError, but f.readline() didn't. This patch makes f.readline() behave like the others. Note that I've added a call to clearerr() since the other calls to ferror() include that too. I have no way to test this code. :-) Try open("/etc").read() with 2.1. I get an IOError exception. Neil From martin@v.loewis.de Thu Nov 29 23:23:00 2001 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Fri, 30 Nov 2001 00:23:00 +0100 Subject: [Python-Dev] gc.garbage In-Reply-To: <20011129145835.A2215@glacier.arctrix.com> (message from Neil Schemenauer on Thu, 29 Nov 2001 14:58:35 -0800) References: <20011126142126.A26497@glacier.arctrix.com> <20011129145835.A2215@glacier.arctrix.com> Message-ID: <200111292323.fATNN0U02096@mira.informatik.hu-berlin.de> > > But I'd be keener to see new words spelling out which parts of the gc > > interface are and aren't intended "to work" across releases ... > > All of them? :-) I wish the C API hadn't been changed for 2.2, rendering useless all code that had been created to support GC in 2.0 and 2.1. Regards, Martin From skip@pobox.com (Skip Montanaro) Fri Nov 30 01:35:05 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 29 Nov 2001 19:35:05 -0600 Subject: [Python-Dev] Re: problems with DBM nonuniformity In-Reply-To: References: Message-ID: <15366.57929.747462.677279@beluga.mojam.com> Jason> On Linux, "whitelist.db" = whitelist.db when the default module is Jason> dbhash, so there is no problem: ... Jason> However on Solaris for example (default module of dbm): Jason> % python -c "import anydbm;wl = anydbm.open('whitelist.db','c')" Jason> % ls -l whitelist* Jason> -rw------- 1 jason users 0 Nov 29 19:13 whitelist.db.dir Jason> -rw------- 1 jason users 0 Nov 29 19:13 whitelist.db.pag Jason> Under Linux, the dbm module acts differently, adding a '.db' Jason> suffix to the given filename: Jason> % python -c "import dbm;wl = dbm.open('whitelist.db','c')" Jason> % ls -l whitelist* Jason> -rw------- 1 jasonrm acl 16384 Nov 29 17:21 whitelist.db.db Jason> So, I can't rely on comparing "whitelist" with "whitelist.db" Jason> filename to filename since the latter might not exist. ... Jason> Does anyone have some suggestions on how I might support this Jason> feature in a cross-platform, and generic fashion? 
Seems to me the natural thing to do would be to add "get_data_filename" and "get_index_filename" methods (or something similar) to the underlying modules (dbhash, bsddb, dbm, etc) and expose them through anydbm.

It's too late for 2.2, but I suspect if you implemented something and method name(s) could be settled on it would make it into CVS early in the 2.3 cycle. This seems like a small enough change that you just file a bug report on SourceForge with the proposal and add an implementation when you have something workable.

-- 
Skip Montanaro (skip@pobox.com - http://www.mojam.com/)

From fdrake@acm.org Fri Nov 30 06:10:49 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Fri, 30 Nov 2001 01:10:49 -0500 (EST)
Subject: [Python-Dev] [development doc updates]
Message-ID: <20011130061049.B6ACB28696@beowolf.digicool.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Updated the httplib docs to cover the API provided in recent versions of Python.

Please review and comment on this and all other new material! (In case you have forgotten, other large chunks of new material include the sections covering modules in the email package, the compiler package, and the Tkinter/Tix chapter.)

From mal@lemburg.com Fri Nov 30 09:31:03 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 30 Nov 2001 10:31:03 +0100
Subject: [Python-Dev] Subclassing varying length types (What's a PyStructSequence ?)
References: <3C02A131.373BE8A4@lemburg.com> <200111262231.fAQMVDJ01693@mira.informatik.hu-berlin.de> <3C036A42.CCD62BB3@lemburg.com> <200111290127.fAT1Rae02898@blinking-device.reputation.com> <3C05FC83.5F87556D@lemburg.com> <200111291022.fATAMuF01381@mira.informatik.hu-berlin.de> <3C062285.84305AB@lemburg.com> <200111292151.fATLpkp01348@mira.informatik.hu-berlin.de>
Message-ID: <3C0751D7.C17A7FC@lemburg.com>

"Martin v. Loewis" wrote:
>
> > The discussion on SF doesn't really answer my question.
What > > Nick did is fascinating: he reused the type object implementation > > to mimic a sequence ! That's cool, but looks like an awfully > > tricky way of doing something straight forward such as sub-classing > > the tuple type to extend it with an additional dictionary. > > So the question remains: why did Nick *have* to implement this > > as meta-type ? > > For one thing, you'll see from the discussion that extending the tuple > type with an additional dict is non-trivial: You cannot define a C > data type that does this. You'll also see that there was a version > that did it, and that it was rejected precisely because of this problem. Ok, now we're getting somewhere... you're saying that Python types using the PyObject struct itself to store variable size data cannot be subclassed in C. Even though it's not trivial as you indicate, this should well be possible by appending new object data to the end of the allocated data field -- not very elegant, but still a way to cope with the problem. However, Nick's approach of creating a new type from a template by using the fact that the list of known name-to-index mappings is not going to change for instances of the type makes things a lot cleaner, since now the mapping can live in the type definition rather than the instance (which may very well be a varying length PyObject type). Hmm, this makes me wonder: perhaps we should start thinking about phasing out varying length PyObjects in the interpreter... esp. the inability to subclass strings looks like a bummer for future extensions of this particular type. Unicode doesn't have this problem, BTW. Or we need to come up with a fairly nice way of making subclassing varying length types a lot easier, e.g. by adding a special pointer ob_ext to PyObject_VAR_HEAD which then allows declaring type extensions in an malloced buffer. Thoughts ? 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Fri Nov 30 09:13:37 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 30 Nov 2001 10:13:37 +0100 Subject: [Python-Dev] gc.garbage References: <20011126142126.A26497@glacier.arctrix.com> <20011129145835.A2215@glacier.arctrix.com> Message-ID: <3C074DC1.3AEDD630@lemburg.com> Neil Schemenauer wrote: > > Tim Peters wrote: > > But I'd be keener to see new words spelling out which parts of the gc > > interface are and aren't intended "to work" across releases ... > > All of them? :-) Seriously, there could come a time when GC can no > longer be disabled. Please don't remove this option ! Writing code which does not introduce cycles or knows how to break them is fairly easy -- removing the option would make everyone pay for careless programming of a few. > The debugging and threshold stuff is fairly > implementation dependent. get_referrers() and get_objects() are highly > implementation dependent. I suppose gc.collect() should always be > available. Anything else is fair game, IMHO. > > Incidentally, I can't say I'm happy with GC as it stands. It uses too > much memory now that so many objects are tracked. I had worked on the > idea of a separate heap for GC objects for a while but couldn't figure > out how to make generational collection to work. As Don Beaudry's sig > says: "so much code, so little time". :-) Another reason not to remove the option. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From jim@interet.com Fri Nov 30 17:07:03 2001 From: jim@interet.com (James C. 
Ahlstrom) Date: Fri, 30 Nov 2001 12:07:03 -0500 Subject: [Python-Dev] ArtifactFile: I can't change my patch files Message-ID: <3C07BCB7.D9043F1E@interet.com> I am trying to replace files in my patch #483466, but I am getting File delete: ArtifactFile: Permission Denied on SourceForge. If I need to be a member of the Python project to do this, could someone please add me? Or tell me what's wrong? The new files implement the changes to sys.path[0] that we have been discussing. I still need to make further changes to import.c so that "" and relative paths are handled. So don't install this patch yet, but look it over for problems if you want. Jim From jason-dated-1007831845.6c6a51@mastaler.com Fri Nov 30 17:17:24 2001 From: jason-dated-1007831845.6c6a51@mastaler.com (Jason R. Mastaler) Date: Fri, 30 Nov 2001 10:17:24 -0700 Subject: [Python-Dev] Re: problems with DBM nonuniformity In-Reply-To: <15366.57929.747462.677279@beluga.mojam.com> (Skip Montanaro's message of "Thu, 29 Nov 2001 19:35:05 -0600") References: <15366.57929.747462.677279@beluga.mojam.com> Message-ID: Skip Montanaro writes: > Seems to me the natural thing to do would be to add > "get_data_filename" and "get_index_filename" methods (or something > similar) to the underlying modules (dbhash, bsddb, dbm, etc) and > expose them through anydbm. I see. So you agree that with the current implementation, there isn't a reliable way to do what I'm trying to do with DBM? > It's too late for 2.2, but I suspect if you implemented something > and method name(s) could be settled on it would make it into CVS > early in the 2.3 cycle. This seems like a small enough change that > you just file a bug report on SourceForge with the proposal and add > an implementation when you have something workable. I'm not sure when I'll be able to get to this, but I'll put it on my TODO list. 
In the meantime, I think I'll just support the auto regeneration feature I mentioned with CDB[1] instead of DBM since its file interface is consistent across platforms. Footnotes: 1. python-cdb extension module (http://pilcrow.madison.wi.us/) -- (TMDA - http://tmda.sourceforge.net) (Python-based SPAM reduction system) From martin@v.loewis.de Fri Nov 30 20:22:24 2001 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 30 Nov 2001 21:22:24 +0100 Subject: [Python-Dev] -DINET6 in Makefile In-Reply-To: <3C050CAF.2A5CEDC5@lemburg.com> (mal@lemburg.com) References: <3C050CAF.2A5CEDC5@lemburg.com> Message-ID: <200111302022.fAUKMO002089@mira.informatik.hu-berlin.de> > What's the reasoning behind putting -DINET6 into the default > compiler options of the generic Makefile ? I believe the sole reason is that the author of the patch didn't know how to get it into pyconfig.h. itojun recently confirmed that all uses of the INET6 can be replaced with ENABLE_IPV6, and that the define may go away. I hesitate to change that, though, since some of the IPv6 implementations *may* require that INET6 is defined when processing the "system" headers (not all IPv6 implementations we support actually come with the operating system). > I'm just asking because such a define will be inherited by > all extensions being compiled with distutils and the Makefile.pre.in > setup process... sounds like trouble if you ask me. What kind of trouble? > Shouldn't the define be placed into the pyconfig.h file or > only in those extensions which need it ? Wouldn't it cause the same trouble there? Regards, Martin From mal@lemburg.com Fri Nov 30 20:46:14 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 30 Nov 2001 21:46:14 +0100 Subject: [Python-Dev] -DINET6 in Makefile References: <3C050CAF.2A5CEDC5@lemburg.com> <200111302022.fAUKMO002089@mira.informatik.hu-berlin.de> Message-ID: <3C07F016.AEC20158@lemburg.com> "Martin v. 
Loewis" wrote: > > > What's the reasoning behind putting -DINET6 into the default > > compiler options of the generic Makefile ? > > I believe the sole reason is that the author of the patch didn't know > how to get it into pyconfig.h. itojun recently confirmed that all uses > of the INET6 can be replaced with ENABLE_IPV6, and that the define may > go away. > > I hesitate to change that, though, since some of the IPv6 > implementations *may* require that INET6 is defined when processing > the "system" headers (not all IPv6 implementations we support actually > come with the operating system). For Python's own use, it should suffice defining the symbol in pyconfig.h. > > I'm just asking because such a define will be inherited by > > all extensions being compiled with distutils and the Makefile.pre.in > > setup process... sounds like trouble if you ask me. > > What kind of trouble? The symbol could enable some logic which may not be desired by the application, e.g. cause system includes to change, socket semantics of wrapped libs could also be affected etc. > > Shouldn't the define be placed into the pyconfig.h file or > > only in those extensions which need it ? > > Wouldn't it cause the same trouble there? No, because the pyconfig.h import is under extension control (e.g. you can first include the system or lib header files and only then import pyconfig.h). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From tim.one@home.com Fri Nov 30 23:30:43 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 30 Nov 2001 18:30:43 -0500 Subject: [Python-Dev] gc.garbage In-Reply-To: <20011129145835.A2215@glacier.arctrix.com> Message-ID: [Neil Schemenauer] > Is there some way to prevent people from assigning to certain module > variables? [Tim] >> Not that I know of. 
If you're terribly concerned, gc could look up
>> "garbage" in its dict on each access. That's what, e.g.,
>> PRINT_ITEM does with sys.stdout. ...

[Neil]
> What would happen if it's not a list? PRINT_ITEM raises RuntimeError.
> I suppose the collector could do the same.

Sure, that would be fine.

>> But I'd be keener to see new words spelling out which parts of the gc
>> interface are and aren't intended "to work" across releases ...

> All of them? :-) Seriously, there could come a time when GC can no
> longer be disabled. The debugging and threshold stuff is fairly
> implementation dependent. get_referrers() and get_objects() are highly
> implementation dependent. I suppose gc.collect() should always be
> available. Anything else is fair game, IMHO.

I meant "new words" in the docs, not on Python-Dev.

> Incidentally, I can't say I'm happy with GC as it stands.

Well, you're young and hopeful -- you'll get over both. I have, and am indeed happy with GC as it stands.

> It uses too much memory now that so many objects are tracked.

There I disagree, but subtly: it always used "too much" memory. The marginal memory cost in adding a gazillion new tracked types was minor, as very few programs have a gazillion frame objects or traceback objects or generator-iterator objects (etc) sitting around. The vast bulk of the damage was done the instant lists, tuples, dicts and instances got tracked. So it goes.

> I had worked on the idea of a separate heap for GC objects for a while
> but couldn't figure out how to make generational collection to work.

Generational gimmicks are rare in non-copying collectors for this very reason, right?

> As Don Beaudry's sig says: "so much code, so little time". :-)

Time for Don to change his sig -- his young and hopeful days should be long gone by now too.
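
A minimal sketch of the rebinding problem discussed in the gc.garbage thread above, and of the look-it-up-on-each-access idea. It deliberately does not touch the real gc module: `fakegc`, `_collector_list`, `report_uncollectable` and `report_uncollectable_checked` are all hypothetical stand-ins invented for illustration, and the real collector is implemented in C and may behave differently.

```python
import types

# Hypothetical stand-in for the gc module: the "collector" keeps a private
# reference to its list and also publishes it as a module attribute.
fakegc = types.ModuleType("fakegc")
_collector_list = []
fakegc.garbage = _collector_list


def report_uncollectable(obj):
    """Simulate a collector that appends to the list it holds internally."""
    _collector_list.append(obj)


def report_uncollectable_checked(obj):
    """Variant that looks the name up in the module dict on each access
    and complains if it is missing or is no longer a list."""
    current = fakegc.__dict__.get("garbage")
    if not isinstance(current, list):
        raise RuntimeError("fakegc.garbage is missing or not a list")
    current.append(obj)


# Mutating in place keeps both names pointing at the same list object.
report_uncollectable("cycle-1")
del fakegc.garbage[:]
in_place_ok = fakegc.garbage is _collector_list

# Rebinding severs the link: later reports land in the old, hidden list.
fakegc.garbage = []
report_uncollectable("cycle-2")
lost = (fakegc.garbage == [] and _collector_list == ["cycle-2"])

# With the per-access lookup, the rebound list is the one that gets used ...
report_uncollectable_checked("cycle-3")
seen = (fakegc.garbage == ["cycle-3"])

# ... and a non-list binding is reported instead of being silently ignored.
fakegc.garbage = "ha ha"
try:
    report_uncollectable_checked("cycle-4")
    rejected = False
except RuntimeError:
    rejected = True
```

The per-access lookup costs a dictionary lookup and a type check on every report, which is the trade-off raised in the thread.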
From tim.one@home.com Fri Nov 30 23:30:44 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 30 Nov 2001 18:30:44 -0500
Subject: [Python-Dev] gc.garbage
In-Reply-To: <200111292323.fATNN0U02096@mira.informatik.hu-berlin.de>
Message-ID:

[Martin v. Loewis]
> I wish the C API hadn't been changed for 2.2, rendering useless all
> code that had been created to support GC in 2.0 and 2.1.

Would we really need more than one hand to count all that code <0.9 wink>?

not-aware-of-any-myself-ly y'rs - tim

From tim.one@home.com Fri Nov 30 23:35:23 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 30 Nov 2001 18:35:23 -0500
Subject: [Python-Dev] Subclassing varying length types (What's a PyStructSequence ?)
In-Reply-To: <3C0751D7.C17A7FC@lemburg.com>
Message-ID:

[M.-A. Lemburg]
> ...
> Hmm, this makes me wonder: perhaps we should start thinking
> about phasing out varying length PyObjects in the interpreter...

No chance, IMO: the memory savings is too great.

> esp. the inability to subclass strings looks like a bummer for
> future extensions of this particular type. Unicode doesn't have
> this problem, BTW.

OTOH, I know someone at Zope Corp who could testify with force about the memory burden of switching to Unicode strings -- if you've got gobs of 'em, it's much worse than a factor of 2 blowup. Moving to obmalloc.c should help that a lot (two malloc overheads per Unicode string, and obmalloc overheads are much lower).

> Or we need to come up with a fairly nice way of making
> subclassing varying length types a lot easier, e.g. by
> adding a special pointer ob_ext to PyObject_VAR_HEAD
> which then allows declaring type extensions in an malloced
> buffer.
>
> Thoughts ?

Not a one.
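
For readers following the PyStructSequence thread, the point that the name-to-index mapping can live in the type rather than in each instance can be sketched at the Python level. This is only an illustration under stated assumptions: `make_struct_sequence` and `StatResult` are hypothetical names, the code is not the C implementation in the patch, and writing it in Python sidesteps the C struct-layout problem described earlier in the thread.

```python
def make_struct_sequence(type_name, field_names):
    """Hypothetical factory: build a tuple subclass whose field-name-to-index
    mapping is stored once, alongside the type, not in each instance."""
    index_of = {name: i for i, name in enumerate(field_names)}

    def __new__(cls, values):
        values = tuple(values)
        if len(values) != len(field_names):
            raise TypeError("%s expects %d fields, got %d"
                            % (type_name, len(field_names), len(values)))
        return tuple.__new__(cls, values)

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        try:
            return self[index_of[name]]
        except KeyError:
            raise AttributeError(name)

    namespace = {
        "__slots__": (),              # no per-instance dictionary
        "__new__": __new__,
        "__getattr__": __getattr__,
        "_fields": tuple(field_names),
    }
    return type(type_name, (tuple,), namespace)


# Usage: instances behave as plain tuples and also answer to field names.
StatResult = make_struct_sequence("StatResult", ["st_mode", "st_size"])
st = StatResult([0o644, 1024])
mode, size = st
```

Because the mapping is shared by every instance of the generated type, each instance carries only its tuple items, which is the property the thread credits to the type-factory approach.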