From greg@cosc.canterbury.ac.nz Mon Jul 1 02:00:58 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 01 Jul 2002 13:00:58 +1200 (NZST)
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <01f201c21f69$f7600f10$ced241d5@hagrid>
Message-ID: <200207010100.g6110wJ28436@oma.cosc.canterbury.ac.nz>

Fredrik Lundh :

> Tim warned me that the mere attempt to read sources for
> existing RE implementations was a sure way to destroy my
> brain, so I avoided that.

Oh, no! Does that mean I should attach a health warning
to the source of Plex???

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From xscottg@yahoo.com Mon Jul 1 03:02:55 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Sun, 30 Jun 2002 19:02:55 -0700 (PDT)
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
In-Reply-To: <006801c22072$00c890a0$88e97ad1@othello>
Message-ID: <20020701020255.17441.qmail@web40110.mail.yahoo.com>

--- Raymond Hettinger wrote:
> RH> > The change was based on the advice I got.
> TP > Wasn't that an empty set?
>
> Not unless Scott Gilbert is a null:
>
> SG > > > "... So the best bet would be to have it just always return a
> string..."
>

I'm pretty close to a null. :-)

Besides I don't think my comments to you made the list. At least I don't
remember CC'ing them to python-dev... I'd be happy to forward those messages
to the list if there is interest, but I don't think there is <0.5 grin>.

I personally don't have much stake in the buffer object. It looked like
something that would be useful for several things that I'm interested in,
but when I looked closer I realized it just isn't. If it's politically the
correct thing to leave it broken, then that gets my unneeded blessing.
It would be nice if _that_ decision was documented somewhere instead of
everything just getting quiet when the topic is brought up. Tim has said
before that this is one of those yearly pointless discussions. I would have
read Guido's essay on the topic if I knew how to find it...

As I've said before though, a mutable byte array object that pickled
efficiently, could be constructed from a pointer & destructor, and promised
not to invalidate your pointer when the GIL is released would be useful. And
it looks like Guido's long lost essay seems to concur with this in a few
places. Asynchronous file I/O, concurrent calculation on numeric arrays,
page aligned memory for DMA transfers, all sorts of other goodies could use
something like this. Of course the buffer object can't be used for any of
these.

Guido's essay seems to indicate that one of the reasons not to add something
like this is because there is no equivalent in Java, and therefore Jython.
I don't find that motivating. Let Jython be portable in the Java sense of
the word, and let Python be powerful everywhere there is a working C
compiler...

From bsder@mail.allcaps.org Mon Jul 1 03:23:29 2002
From: bsder@mail.allcaps.org (Andrew P. Lentvorski)
Date: Sun, 30 Jun 2002 19:23:29 -0700 (PDT)
Subject: [Python-Dev] XML module causes profiler to throw
Message-ID: <20020630191302.J5810-100000@mail.allcaps.org>

This was reported in bug 534864, but seems to have been left to rot.
At the very least, I'd like to bump its importance up in case there is a
later bugfix version for 2.2 (aka 2.2.2 or something).

What is its status? Is there a workaround? What is the diagnosis?

I find it hard to believe that others haven't tripped across this
(especially somebody in the Zope team).
I can at least add my voice such that it is not specific to one type of
installation (his is Red Hat 7.1--mine is FreeBSD 4.6)

Thanks,
-a

From tim.one@comcast.net Mon Jul 1 05:04:37 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 01 Jul 2002 00:04:37 -0400
Subject: [Python-Dev] XML module causes profiler to throw
In-Reply-To: <20020630191302.J5810-100000@mail.allcaps.org>
Message-ID:

[Andrew P. Lentvorski, about ]
> This was reported in bug 534864, but seems to have been left to rot.
> At the very least, I'd like to bump its importance up in case there is a
> later bugfix version for 2.2 (aka 2.2.2 or something).
>
> What is its status? Is there a workaround? What is the diagnosis?

Everything known about it is in the bug report.

> I find it hard to believe that others haven't tripped across this
> (especially somebody in the Zope team). I can at least add my voice such
> that it is not specific to one type of installation (his is Red Hat
> 7.1--mine is FreeBSD 4.6)

Posting this info to Python-Dev doesn't do any good. Add it to the bug
report! Be sure to say which version of Python you were using (the report
only mentioned 2.2; there's not even any info there about 2.2.1).

The last activity that report saw was Martin saying he couldn't reproduce it
in 2.3a0, and to date nobody has added a comment saying they could reproduce
it. Based on what's there (an unconfirmed report against 2.2, and a failure
to reproduce under CVS), I wouldn't give it high priority either.

From tim.one@comcast.net Mon Jul 1 06:45:20 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 01 Jul 2002 01:45:20 -0400
Subject: [Python-Dev] Some dull gc stats
Message-ID:

I checked in a surprisingly large patch to change the way we collect a
generation.

Before: Things directly reachable from outside the young generation were
moved into a 'reachable' set in one pass. Things indirectly reachable from
outside the young generation were moved into 'reachable' in a second pass.
The 'young' list then contained the unreachable objects.

After: Things unreachable from outside the young generation are moved into
an 'unreachable' set in one pass. The 'young' list contains the reachable
objects when that's done.

The point was that almost everything is reachable in the end, and moving an
object between lists costs six pointer stores (updating prev and next
pointers in the object, and in each of the two lists). So if most stuff is
doomed to be reachable in the end, better to move the unreachable stuff than
to move the reachable stuff. This seems to be a nice little win. If you
want to know more, read the comments.

An instrumented version showed this over a run of the Python test suite:

    scanned   7437363
    moved     36854
    movedback 34389

where

    scanned
        # of times the loop in move_unreachable() went around ==
        # of objects

    moved
        # of times "if (gc->gc.gc_refs == 0)" in move_unreachable()
        triggered == # of objects moved into an unreachable set

    movedback
        # of times "if (gc_refs == GC_TENTATIVELY_UNREACHABLE)" in
        visit_reachable() triggered == the number of times
        move_unreachable() guessed wrong and an object had to be moved
        back into a reachable set

So the change saved about 7e6 object moves here, for 6x as many pointer
stores, and gc is finding very little that's unreachable in the end.

Surprisingly, the worst (for some technical meaning of "worst" I'll leave to
your imagination) stats I've seen came out of running Zope3's test suite:

    scanned   649444
    moved     56124
    movedback 43576

It's surprising for lots of reasons, including how relatively little work gc
is doing in total, and how relatively much was found to be unreachable
(about 12 thousand objects). The latter is surprising because Zope code has
traditionally tried like heck not to create cycles.

The Python test suite *almost* gave "the best" (most favorable to the
change) stats I've seen.
Only a variant of Kevin Jacobs's little test case looked better so far:

    scanned   12322200
    moved     244
    movedback 244

It would be nicer if we could drive scanned there down to 0 .

From martin@v.loewis.de Mon Jul 1 06:56:22 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 01 Jul 2002 07:56:22 +0200
Subject: [Python-Dev] XML module causes profiler to throw
In-Reply-To: <20020630191302.J5810-100000@mail.allcaps.org>
References: <20020630191302.J5810-100000@mail.allcaps.org>
Message-ID:

"Andrew P. Lentvorski" writes:

> What is its status? Is there a workaround? What is the diagnosis?

The status is that it is unreproducible. I just tried with the Python
2.2. Unless there is some independent verification of the problem, I'm
going to close it as unreproducible.

Regards,
Martin

From martin@v.loewis.de Mon Jul 1 07:30:38 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 01 Jul 2002 08:30:38 +0200
Subject: [Python-Dev] Some dull gc stats
In-Reply-To:
References:
Message-ID:

Tim Peters writes:

> I checked in a surprisingly large patch to change the way we collect a
> generation.

Do you think this should be backported to 2.2.2 as well?

Regards,
Martin

From oren-py-d@hishome.net Mon Jul 1 07:31:20 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 1 Jul 2002 09:31:20 +0300
Subject: [Python-Dev] String interning
Message-ID: <20020701093120.A3499@hishome.net>

I was looking into the string interning code to see how it works. Here are
my observations:

A Python string can be in one of three states:

1. Not interned: s->ob_sinterned == NULL
2. Directly-interned: s->ob_sinterned == s
3. Indirectly interned. ob_sinterned points to another string:
   s->ob_sinterned != s && s->ob_sinterned != NULL

Indirectly interned strings are quite rare. Creating one requires that a
string already exist in the interned dictionary and that an equal string
with multiple references be interned. The reference used for interning
is replaced with the previously interned string.
The other references to the same string will become indirectly interned.

References to the ob_sinterned field are found in stringobject.c and
dictobject.c and, inexplicably, in Mac/Python/macimport.c

In stringobject.c most references to ob_sinterned are to initialize it. The
only place that uses it is string_hash: if ob_sinterned is not NULL it uses
the hash of the string it points to instead of the current string object.
If the string is directly interned this is just a longer way of doing the
same thing. If the string is indirectly interned this is merely redundant
because the hash of the two strings should be equal (I hope so!). The only
thing this test could have saved is recalculating the hash from the string
if the cached hash is zero. This doesn't happen because if the string is
indirectly interned it has been used as a key during interning which
initializes its cached hash.

In dictobject.c the only reference to ob_sinterned is in PyDict_SetItem: if
the string is interned it uses the ob_sinterned pointer as the key instead
of the argument. This could only make a difference if the string is
indirectly interned. It turns out that this never happens. I couldn't find
one occurrence of SetItem with an indirectly interned string as key in the
regression tests or any other Python code I have tested. Even if this did
happen it wouldn't cause any problems to ignore this case - the lookup would
still function correctly.

Summary: As far as I can tell, indirectly interned strings are redundant.
Without them the ob_sinterned field is effectively a boolean flag. The size
of all string objects can be reduced by 3 bytes.

Can anyone explain why interning is implemented the way it is? Can anyone
explain why Mac/Python/macimport.c is messing with ob_sinterned?

	Oren

From martin@v.loewis.de Mon Jul 1 08:03:22 2002
From: martin@v.loewis.de (Martin v.
Loewis)
Date: 01 Jul 2002 09:03:22 +0200
Subject: [Python-Dev] String interning
In-Reply-To: <20020701093120.A3499@hishome.net>
References: <20020701093120.A3499@hishome.net>
Message-ID:

Oren Tirosh writes:

> In stringobject.c most references to ob_sinterned are to initialize it. The
> only place that uses it is string_hash: if ob_sinterned is not NULL it uses
> the hash of the string it points to instead of the current string object.

This is not true: PyString_InternInPlace has

	if ((t = s->ob_sinterned) != NULL) {

which checks whether the string being interned had been interned
before.

> Summary: As far as I can tell, indirectly interned strings are redundant.
> Without them the ob_sinterned field is effectively a boolean flag.
>
> Can anyone explain why interning is implemented the way it is? Can anyone
> explain why Mac/Python/macimport.c is messing with ob_sinterned?

I'm not sure what meaning you would associate with the boolean
flag. If this is meant to denote "this is an interned string", then

	if ((t = s->ob_sinterned) != NULL) {
		if (t == (PyObject *)s)
			return;

would become

	if (s->ob_isinterned)
		return;

To see the difference, I added

	if ((t = s->ob_sinterned) != NULL) {
		if (t == (PyObject *)s)
			return;
		fprintf(stderr, "reinterning\n");

If that code prints "reinterning", it can efficiently intern the
argument, but couldn't with your change.

I agree that this is very rare, but in the test suite, it triggers 5
times in test_descr.

> The size of all string objects can be reduced by 3 bytes.

That is not true. Taking a 32-bit architecture, and considering that
each string has 16 bytes minimum storage (without ob_sinterned), and
taking into account the 8-byte clustering of pymalloc, we get

stringsize  current-storage  new-storage  savings
0           24               24           0
1           24               24           0
2           24               24           0
3           24               24           0
4           32               24           8
5           32               24           8
6           32               24           8
7           32               32           0

So the size reduction depends on the actual length of the strings;
it's 3 bytes only on average, assuming a uniform distribution of
string sizes.
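
A minimal sketch, not part of the original message, of how the table above can be reproduced. The layout constants are assumptions picked so that the numbers match Martin's table: a 16-byte fixed part, a 4-byte ob_sinterned pointer today versus a hypothetical 1-byte flag, one trailing NUL byte, and rounding up to 8-byte pymalloc size classes.

```python
def round_up(n, align=8):
    """Round n up to the next multiple of align (pymalloc's 8-byte classes)."""
    return (n + align - 1) // align * align

HEADER = 16   # assumed fixed part of a string object on a 32-bit build
POINTER = 4   # assumed size of the ob_sinterned pointer
FLAG = 1      # assumed size of a hypothetical boolean replacement
NUL = 1       # assumed trailing NUL byte after the characters

def storage(length, extra):
    """Bytes allocated for a string of `length` chars with `extra` bookkeeping bytes."""
    return round_up(HEADER + extra + length + NUL)

# One row per string length: (length, current storage, storage with a flag).
rows = [(n, storage(n, POINTER), storage(n, FLAG)) for n in range(8)]
for n, current, new in rows:
    print(n, current, new, current - new)

# Mean saving over the eight lengths, i.e. a uniform length distribution.
average_saving = sum(current - new for _, current, new in rows) / len(rows)
print("average saving:", average_saving)
```

Under these assumptions the saving is either 0 or 8 bytes for any single string, and only the mean over all lengths comes out at 3 bytes.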
Regards,
Martin

From oren-py-d@hishome.net Mon Jul 1 09:00:21 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 1 Jul 2002 04:00:21 -0400
Subject: [Python-Dev] String interning
In-Reply-To:
References: <20020701093120.A3499@hishome.net>
Message-ID: <20020701080020.GA62710@hishome.net>

On Mon, Jul 01, 2002 at 09:03:22AM +0200, Martin v. Loewis wrote:
> Oren Tirosh writes:
>
> > In stringobject.c most references to ob_sinterned are to initialize it. The
> > only place that uses it is string_hash: if ob_sinterned is not NULL it uses
> > the hash of the string it points to instead of the current string object.
>
> This is not true: PyString_InternInPlace has

I meant references to ob_sinterned outside the actual implementation of
PyString_InternInPlace.

> > Summary: As far as I can tell, indirectly interned strings are redundant.
> > Without them the ob_sinterned field is effectively a boolean flag.
> >
> > Can anyone explain why interning is implemented the way it is? Can anyone
> > explain why Mac/Python/macimport.c is messing with ob_sinterned?
>
> I'm not sure what meaning you would associate with the boolean
> flag.

"This string is interned. It is equal to another interned string iff they
are the same object"

...
> If that code prints "reinterning", it can efficiently intern the
> argument, but couldn't with your change.
>
> I agree that this is very rare, but in the test suite, it triggers 5
> times in test_descr.

test_descr is not exactly typical Python code...

What bothers me is that of the two places that check if a string is
interned one is a no-op and the other never happens.

	Oren

From bsder@mail.allcaps.org Mon Jul 1 09:08:48 2002
From: bsder@mail.allcaps.org (Andrew P. Lentvorski)
Date: Mon, 1 Jul 2002 01:08:48 -0700 (PDT)
Subject: [Python-Dev] XML module causes profiler to throw
In-Reply-To:
Message-ID: <20020701010104.L6236-100000@mail.allcaps.org>

Thanks, that was the info I needed.
I'll work on trying to create a small program that people can run.

It is not *completely* unreproducible. I have found three different
filings about this floating around in various places. I will collect the
references tomorrow and put them into the main bug report.

-a

On 1 Jul 2002, Martin v. Loewis wrote:

> "Andrew P. Lentvorski" writes:
>
> > What is its status? Is there a workaround? What is the diagnosis?
>
> The status is that it is unreproducible. I just tried with the Python
> 2.2. Unless there is some independent verification of the problem, I'm
> going to close it as unreproducible.
>
> Regards,
> Martin
>

From mal@lemburg.com Mon Jul 1 09:21:50 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 01 Jul 2002 10:21:50 +0200
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
References: <001f01c21ed9$873f3c00$06ea7ad1@othello> <004c01c21f5c$6dcbf2d0$ced241d5@hagrid>
Message-ID: <3D20111E.2090203@lemburg.com>

Fredrik Lundh wrote:
> raymond wrote:
>
>
>
>>As far as I can tell, buffer() is one of the least used or known about
>>Python tools. What do you guys think about this as a candidate for silent
>>deprecation (moving out of the primary documentation)?
>
>
> +1, in theory.
>
> does anyone have any real-life use cases? I've never been
> able to use it for anything, and cannot recall ever seeing it
> being used by anyone else...
>
> (it sure doesn't work for the use cases I thought of when
> first learning about the API...)

-1.

I use it in real-life applications to wrap binary data.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From Oleg Broytmann Mon Jul 1 11:39:14 2002
From: Oleg Broytmann (Oleg Broytmann)
Date: Mon, 1 Jul 2002 14:39:14 +0400
Subject: [Python-Dev] Infinie recursion in Pickle
In-Reply-To: ; from tim.one@comcast.net on Sun, Jun 30, 2002 at 04:32:35PM -0400
References: <20020701002535.A1510@phd.pp.ru>
Message-ID: <20020701143913.A6446@phd.pp.ru>

On Sun, Jun 30, 2002 at 04:32:35PM -0400, Tim Peters wrote:
> [Oleg Broytmann]
> > I think I can reduce this, but I am afraid the data structure
> > still will be large,
>
> That doesn't matter. It's the amount of *code* we don't understand and have
> to learn that matters. If you could reduce this to a gigabyte of pickle
> input that we only need to feed into pickle, that would be great.

   Ok. From today I have a lot of spare time and very good almost free
Internet connection, so I can investigate things. I can post the results
of my investigation to the developers list or to the c.l.py, if anyone is
interested.

> > That what I don't want to do - file a mysterious bug report.
>
> That's what bug reports are best for!

   Hmm, I thought they are not, as those mysterious bug reports take up
space and time - someone has to read them, at least; but they do not help
in any way.

> Now you've got comments about your
> bug scattered across comp.lang.python and python-dev, and nobody will be
> able to find them again. Attaching new info to a shared bug report is much
> more effective.

   Ah, I see now. I am strictly attached to email and email archives, and I
am always hating web-based collaboration tools, but you made a good point.
   Still, life is too short to spend it in the SF slooow interface :(

Oleg.
--
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From jacobs@penguin.theopalgroup.com Mon Jul 1 11:44:52 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 1 Jul 2002 06:44:52 -0400 (EDT)
Subject: [Python-Dev] Some dull gc stats
In-Reply-To:
Message-ID:

On Mon, 1 Jul 2002, Tim Peters wrote:
> I checked in a surprisingly large patch to change the way we collect a
> generation.
>[...]
> The point was that almost everything is reachable in the end, and moving an
> object between lists costs six pointer stores (updating prev and next
> pointers in the object, and in each of the two lists). So if most stuff is
> doomed to be reachable in the end, better to move the unreachable stuff than
> to move the reachable stuff.
>[...]
> It would be nicer if we could drive scanned there down to 0 .

This change may be a short-term win if I can get Jeremy's idea working. It
involves temporarily untracking objects with known external roots, so many
more objects become unreachable. These include objects stored on the c-eval
stack, local variables in the current frame, and possibly other select
places. I have no idea if this approach will make enough of a difference to
be worthwhile, but it seems like a worthy experiment.

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com

From jepler@unpythonic.net Mon Jul 1 12:46:44 2002
From: jepler@unpythonic.net (jepler@unpythonic.net)
Date: Mon, 1 Jul 2002 06:46:44 -0500
Subject: [Python-Dev] Infinie recursion in Pickle
In-Reply-To: <20020630234820.A1006@phd.pp.ru>
References: <20020630234820.A1006@phd.pp.ru>
Message-ID: <20020701064639.A1003@unpythonic.net>

Is this posted as an SF bug report yet?

I have a small program which can hit the recursion limit in pickle and
cause sig11 in cPickle. It's a very deep nested tuple in this case.
If the first 'pickle.dump()' call is not commented out, the program dies
with "RuntimeError: maximum recursion depth exceeded". If the second
bit of code is executed, it dies with segmentation violation.

On my system, redhat 7.2, the stack in the main thread is very large, but
the stack in other threads is very small. On systems where the main stack
is smaller, just running
    cPickle.dump(x, open("/dev/null", "w"))
should show the problem, no threads needed.

Probably some sort of stack check should be present in cPickle, but there's
nothing much that can be done about data structures that are so deeply
recursive that they fill the stack. Well, pickle could be rewritten to be
iterative, but that's a tall order.

(traceback from cPickle:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1026 (LWP 1198)]
0x4004912c in __new_sem_post (sem=0x8154358) at semaphore.c:137
137     semaphore.c: No such file or directory.
        in semaphore.c
(gdb) where
#0  0x4004912c in __new_sem_post (sem=0x8154358) at semaphore.c:137
#1  0x08095ec2 in PyThread_release_lock (lock=0x8154358)
    at Python/thread_pthread.h:412
#2  0x08079a4d in PyEval_SaveThread () at Python/ceval.c:329
#3  0x4025b09d in write_file (self=0x80ff3e8, s=0x4025db54 "(", n=1)
    at /home/jepler/cvs/python/dist/src/Modules/cPickle.c:414
#4  0x4025396a in save_tuple (self=0x80ff3e8, args=0x4062a66c)
    at /home/jepler/cvs/python/dist/src/Modules/cPickle.c:1340
#5  0x40254f18 in save (self=0x80ff3e8, args=0x4062a66c, pers_save=0)
    at /home/jepler/cvs/python/dist/src/Modules/cPickle.c:1944
#6  0x402539b1 in save_tuple (self=0x80ff3e8, args=0x4062a68c)
    at /home/jepler/cvs/python/dist/src/Modules/cPickle.c:1350
...)

import pickle, cPickle
x = ()
for i in range(100000):
    x = (x,)

#pickle.dump(x, open("/dev/null", "w"))

import thread, time
thread.start_new_thread(cPickle.dump, (x, open("/dev/null", "w")))
time.sleep(1000)

From Oleg Broytmann Mon Jul 1 12:54:31 2002
From: Oleg Broytmann (Oleg Broytmann)
Date: Mon, 1 Jul 2002 15:54:31 +0400
Subject: [Python-Dev] Infinie recursion in Pickle
In-Reply-To: <20020701064639.A1003@unpythonic.net>; from jepler@unpythonic.net on Mon, Jul 01, 2002 at 06:46:44AM -0500
References: <20020630234820.A1006@phd.pp.ru> <20020701064639.A1003@unpythonic.net>
Message-ID: <20020701155431.B6446@phd.pp.ru>

On Mon, Jul 01, 2002 at 06:46:44AM -0500, jepler@unpythonic.net wrote:
> Is this posted as an SF bug report yet?

   Not yet.

> I have a small program which can hit the recursion limit in pickle and
> cause sig11 in cPickle. It's a very deep nested tuple in this case.

   In my case the data structure is more complex, but less deep. It is a
tree of objects (about 3000 objects). I tried to create lesser trees, but
the bug disappeared.

   One interesting thing to note is that when I changed builtin list back
to UserList the problem disappeared. That is, my problem is related to
pickling new classes.

   I narrowed the code to just 180 lines. The problem manifests itself
after loading initial tree, running inverse linker, and saving data back.
Inverse linker runs over all objects in the tree and adds a link to its
parent to every object. So I think the bug is in the pickling a data
structure with loops.

Oleg.
--
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.
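
A small sketch, not part of the thread, contrasting the two shapes discussed above in present-day Python 3 (where cPickle and the thread module no longer exist under those names). It assumes the pure-Python pickler, reached through the private name pickle._Pickler, so that the outcome does not depend on the C accelerator. The dict-based tree is a hypothetical stand-in for Oleg's data, which is not shown in the thread.

```python
import io
import pickle
import sys

# Jepler's shape: a tuple nested far deeper than the recursion limit.
depth = sys.getrecursionlimit() * 5
nested = ()
for _ in range(depth):
    nested = (nested,)

# The pure-Python pickler recurses once per nesting level, so it is
# expected to run into the interpreter's recursion limit here.
try:
    pickle._Pickler(io.BytesIO()).dump(nested)
    deep_result = "pickled"
except RecursionError:
    deep_result = "RecursionError"

# Oleg's shape (as described): a shallow tree where every child links
# back to its parent, so the structure contains reference loops.
root = {"name": "root", "parent": None, "children": []}
for i in range(3):
    child = {"name": "child%d" % i, "parent": root, "children": []}
    root["children"].append(child)

# The pickler's memo records objects already written, so the loops do
# not cause unbounded recursion and are restored on load.
copy = pickle.loads(pickle.dumps(root))
cycle_preserved = all(c["parent"] is copy for c in copy["children"])

print(deep_result, cycle_preserved)
```

The point of the contrast: depth, not the mere presence of loops, is what exhausts the stack in the first case. The sketch says nothing about what went wrong in the 2.2-era new-style class case Oleg reports.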

From pinard@iro.umontreal.ca Mon Jul 1 13:39:41 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 01 Jul 2002 08:39:41 -0400
Subject: [Python-Dev] Re: Infinie recursion in Pickle
In-Reply-To: <20020701143913.A6446@phd.pp.ru>
References: <20020701002535.A1510@phd.pp.ru> <20020701143913.A6446@phd.pp.ru>
Message-ID:

[Oleg Broytmann]

> On Sun, Jun 30, 2002 at 04:32:35PM -0400, Tim Peters wrote:

> > Attaching new info to a shared bug report is much more effective.

You know, it is only effective when it works! The SF tracker did not work
for me. I filed a bug, and checked it was correctly saved (through tedious
paging all over). Then, much later, I received a message from a maintainer
saying that my report was empty. It surely was not after I filed it.
I guess the SF tracker works only for those using it very often :-).

> Ah, I see now. I am strictly attached to email and email archives, and I
> am always hating web-based collaboration tools, but you made a good point.
> Still, life is too short to spend it in the SF slooow interface :(

Slow, hardly usable, and not even dependable. Email might be less
black-holish, after all. Moreover, most people (maintainers included) know
how to read and file an email. I've a hard time believing people who tell
me that maintainers are unable to sort emails without losing them, or that
I can really sort their own email better than they can. I usually praise
maintainers as intelligent people. :-)

--
François Pinard   http://www.iro.umontreal.ca/~pinard

From fredrik@pythonware.com Mon Jul 1 14:15:49 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 1 Jul 2002 15:15:49 +0200
Subject: [Python-Dev] Re: Infinie recursion in Pickle
References: <20020701002535.A1510@phd.pp.ru><20020701143913.A6446@phd.pp.ru>
Message-ID: <05a601c22101$741ac2f0$0900a8c0@spiff>

François Pinard wrote:

> Slow, hardly usable, and not even dependable. Email might be less
> black-holish, after all.
Moreover, most people (maintainers included)
> know how to read and file an email.

might so be, but such archives are not shared.

> I usually praise maintainers as intelligent people. :-)

so why not do as we tell you, and post bug reports on SF?

or better, join the roundup team, and make sure it's good enough
to replace the SF tracker.

(as far as I know, it already is -- but someone still needs to
set it up, write a script that pulls all data out of the old
system, etc).

From Jack.Jansen@cwi.nl Mon Jul 1 14:19:15 2002
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Mon, 1 Jul 2002 15:19:15 +0200
Subject: [Python-Dev] String interning
In-Reply-To: <20020701093120.A3499@hishome.net>
Message-ID: <241819E8-8CF5-11D6-94DE-0030655234CE@cwi.nl>

On Monday, July 1, 2002, at 08:31 , Oren Tirosh wrote:
> Can anyone
> explain why Mac/Python/macimport.c is messing with ob_sinterned?

It's all explained in the comment a few lines above where ob_sinterned
is used:
	/*
	** If we have interning find_module takes care of interning all
	** sys.path components. We then keep a record of all sys.path
	** components for which GetFInfo has failed (usually because the
	** component in question is a folder), and we don't try opening these
	** as resource files again.
	*/

This code gives a considerable speedup for module searches. The reason
it's mac-specific is that MacPython allows files on sys.path as well as
directories (and these files are searched for PYC resources).
--
- Jack Jansen        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -

From oren-py-d@hishome.net Mon Jul 1 14:57:25 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 1 Jul 2002 16:57:25 +0300
Subject: [Python-Dev] String interning
In-Reply-To: <241819E8-8CF5-11D6-94DE-0030655234CE@cwi.nl>; from Jack.Jansen@cwi.nl on Mon, Jul 01, 2002 at 03:19:15PM +0200
References: <20020701093120.A3499@hishome.net> <241819E8-8CF5-11D6-94DE-0030655234CE@cwi.nl>
Message-ID: <20020701165725.A10327@hishome.net>

On Mon, Jul 01, 2002 at 03:19:15PM +0200, Jack Jansen wrote:
>
> On Monday, July 1, 2002, at 08:31 , Oren Tirosh wrote:
> > Can anyone
> > explain why Mac/Python/macimport.c is messing with ob_sinterned?
>
> It's all explained in the comment a few lines above where ob_sinterned
> is used:
> 	/*
> 	** If we have interning find_module takes care of interning all
> 	** sys.path components. We then keep a record of all sys.path
> 	** components for which GetFInfo has failed (usually because the
> 	** component in question is a folder), and we don't try opening these
> 	** as resource files again.
> 	*/
>
> This code gives a considerable speedup for module searches. The reason
> it's mac-specific is that MacPython allows files on sys.path as well as
> directories (and these files are searched for PYC resources).

I guess the clean solution would be to add a PyString_CheckInterened macro.

	Oren

From tismer@tismer.com Mon Jul 1 15:02:22 2002
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 01 Jul 2002 16:02:22 +0200
Subject: [Python-Dev] Ann: Stackless 2.2.1 on PowerPC
Message-ID: <3D2060EE.7090101@tismer.com>

Announcement:

Stackless Python Works on PowerPC.

The PPC support was much simpler to implement than expected.
It was helpful to look into the PPC switch implementation
of the ICON language.

Thanks to Just van Rossum for giving me access to his machine.
Thanks to Armin Rigo for showing me the tricks for x86-unix.

There is still no installer available, this is at alpha level.

In case you want to build your own Stackless, check out the
module stackless from :pserver:anonymous@tismer.com:/home/cvs

Updated news can be found at http://www.stackless.com/

have fun - chris

--
Christian Tismer             :^)
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF  1619 305B C09C 5A3B  57F3 BF04
     whom do you want to sponsor today?   http://www.stackless.com/

From fredrik@pythonware.com Mon Jul 1 15:57:00 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 1 Jul 2002 16:57:00 +0200
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
References: <001f01c21ed9$873f3c00$06ea7ad1@othello> <004c01c21f5c$6dcbf2d0$ced241d5@hagrid> <3D20111E.2090203@lemburg.com>
Message-ID: <02e001c2210f$922ca2a0$ced241d5@hagrid>

mal wrote:

> > does anyone have any real-life use cases? I've never been
> > able to use it for anything, and cannot recall ever seeing it
> > being used by anyone else...

> I use it in real-life applications to wrap binary data.

can you elaborate?

how do you use it? could it be replaced by something simpler,
and still work in your application?

would something like this work?

    class buffer(object):
        def __len__(...)
        def __getitem__(...)
        def __getslice__(...)

    class basestring(buffer):
        ...

    class string(basestring):
        ...

    class unicode(basestring):
        ...
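
For illustration only, and not something proposed in the thread: one way the minimal interface in Fredrik's outline (length, indexing, slicing) might look as runnable Python 3, wrapping binary data without copying it up front. The class name Buffer and its offset/size parameters are hypothetical. Python 3 has no __getslice__, so slices arrive through __getitem__.

```python
class Buffer:
    """Read-only window onto a bytes-like object (hypothetical sketch)."""

    def __init__(self, data, offset=0, size=None):
        view = memoryview(data)
        end = len(view) if size is None else offset + size
        # Slicing a memoryview shares the underlying memory; no copy is made.
        self._view = view[offset:end]

    def __len__(self):
        return len(self._view)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Return an immutable copy of just the requested range.
            return bytes(self._view[index])
        return self._view[index]

payload = bytearray(b"HEADERbinary-data")
body = Buffer(payload, offset=6)
print(len(body), body[:6], body[0])
```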

From pinard@iro.umontreal.ca Mon Jul 1 16:25:14 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 01 Jul 2002 11:25:14 -0400
Subject: [Python-Dev] Re: Infinie recursion in Pickle
In-Reply-To: <05a601c22101$741ac2f0$0900a8c0@spiff>
References: <20020701002535.A1510@phd.pp.ru> <20020701143913.A6446@phd.pp.ru> <05a601c22101$741ac2f0$0900a8c0@spiff>
Message-ID:

[Fredrik Lundh]

> > I usually praise maintainers as intelligent people. :-)

> so why not do as we tell you, and post bug reports on SF?

Because this is not an efficient way to proceed, and does not always work.
I do not have much spare time, and would hate seeing it spoiled, fighting
with artificial problems coming from tools doomed to be replaced anyway.

> or better, join the roundup team, and make sure it's good enough
> to replace the SF tracker.

I found the SF tracker so unattractive that I've been tempted to do that
indeed. Yet, thinking more about it, it is non-sense for me to invest vast
amount of energies merely to acquire the capability of submitting numerous
little things (as documentation nits, for example).

A long while ago, I witnessed that we had to pay real money to machine
constructors, yearly, for having the right of submitting reports to them in
such a way that we could later use their consequent works. The free
software movement turned the values around, and refreshingly underlined that
reporting a problem is a contribution from the user to the software
maintainer and indirectly, to the community.

For many years, we are living a progressive swing-back, in which expenditure
of money has been replaced by all the stunts and sufferings induced by
inadequate communication tools, like bug trackers. The price to pay is
rather high. If users' contributions were really welcome, maintainers would
not try to force users into this.

It is much easier and comfortable for me to be a mere user, and let others
pay the price.
However, my principles and education strongly tell me that when something is given to me (like Python), it is only normal and natural trying to give something back. As I contributed many thousands of hours for other projects, I did my share overall, and my own principles are satisfied. Enough for me to refuse a high price ticket, in free time and irritation, before I could offer my work or dedication. Oh, I may come to like bug trackers. But surely, I find extremely distasteful being forced into them. -- François Pinard http://www.iro.umontreal.ca/~pinard From gmcm@hypernet.com Mon Jul 1 16:33:39 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Mon, 1 Jul 2002 11:33:39 -0400 Subject: Bug tracking (was: [Python-Dev] Re: Infinie recursion in Pickle) In-Reply-To: <05a601c22101$741ac2f0$0900a8c0@spiff> Message-ID: <3D203E13.22484.1789FB93@localhost> On 1 Jul 2002 at 15:15, Fredrik Lundh wrote: [complaints about SF bug tracker] > or better, join the roundup team, and make sure > it's good enough to replace the SF tracker. (as far > as I know, it already is -- but someone still needs > to set it up, write a script that pulls all data out > of the old system, etc). Someone is setting it up, and has written those scripts. A working demo (populated with PythonLabs history) should soon be available. I have to say that my "good enough" has been focussed on functionality more than usability. I've never noticed any particular[1] difficulty in using the SF tracker to enter or post info on a bug, so François probably has a different "good enough" :-). -- Gordon http://www.mcmillan-inc.com/ [1] Writing an app as a set of cgi's is a fast way to write a mediocre GUI. Writing a *good* GUI in this environment is extremely difficult and ugly, because you end up with lots of just-barely-portable- by-any-definition-thereof javascript, and Aahz is left out in the cold. Oh well, at least no one had a requirement that it be usable through their cell phone... 
From fredrik@pythonware.com Mon Jul 1 16:59:07 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 1 Jul 2002 17:59:07 +0200 Subject: [Python-Dev] Re: Bug tracking References: <3D203E13.22484.1789FB93@localhost> Message-ID: <03e601c22118$3ddbec70$ced241d5@hagrid> gordon wrote: > Someone is setting it up, and has written those > scripts. A working demo (populated with PythonLabs > history) should soon be available. +1 on adding gordon to the standard library, and +1 on deprecating the whinewhinewhine module. From aahz@pythoncraft.com Mon Jul 1 17:02:58 2002 From: aahz@pythoncraft.com (Aahz) Date: Mon, 1 Jul 2002 12:02:58 -0400 Subject: [Python-Dev] Re: Bug tracking (was: Re: Infinie recursion in Pickle) In-Reply-To: <3D203E13.22484.1789FB93@localhost> References: <05a601c22101$741ac2f0$0900a8c0@spiff> <3D203E13.22484.1789FB93@localhost> Message-ID: <20020701160258.GA26325@panix.com> On Mon, Jul 01, 2002, Gordon McMillan wrote: > > I have to say that my "good enough" has been focussed on functionality > more than usability. I've never noticed any particular[1] difficulty > in using the SF tracker to enter or post info on a bug, so François > probably has a different "good enough" :-). > > [1] Writing an app as a set of cgi's is a fast way to write a mediocre > GUI. Writing a *good* GUI in this environment is extremely difficult > and ugly, because you end up with lots of just-barely-portable- > by-any-definition-thereof javascript, and Aahz is left out in the > cold. Oh well, at least no one had a requirement that it be usable > through their cell phone... If it's usable in Lynx, it should be usable on a cell phone. Anyway, I find it difficult to believe that you're having a lot of trouble writing a good GUI with plain HTML, at least by the standards of people here -- a clean, accessible, and functional interface *is* a good GUI. If you've got an URL for testing, I'd be glad to give feedback (and I'll even be willing to fire up Konqueror to cross-check). 
I'm curious whether you think that Google Groups Advanced Search is a good GUI. What about the Google Groups thread view? Finally, it takes some effort, but it's not *that* hard to use JavaScript that degrades gracefully when JavaScript isn't available. For more info, see http://www.rahul.net/aahz/javascript.html -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From tim.one@comcast.net Mon Jul 1 18:14:09 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 01 Jul 2002 13:14:09 -0400 Subject: [Python-Dev] Infinie recursion in Pickle In-Reply-To: <20020701064639.A1003@unpythonic.net> Message-ID: Bug reports are off-topic on Python-Dev, unless the Python developers need this list to collaborate on an implementation change. Please keep bug reports on SourceForge. Like it or not, putting information in an SF bug report is only way your bug has a chance to get addressed. From tim.one@comcast.net Mon Jul 1 18:33:20 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 01 Jul 2002 13:33:20 -0400 Subject: [Python-Dev] Re: Infinie recursion in Pickle In-Reply-To: Message-ID: François, I swear you spend more time complaining about SF than it would take to just use it. You're not going to prevail on this, so please save everyone's time (yours and ours) by skipping the repetition. > ... > The free software movement turned the values around, and refreshingly > underlined that reporting a problem is a contribution from the user to > the software maintainer and indirectly, to the community. Problem reports are certainly appreciated. Problem reports via one-to-one email works great for a new open source project with users numbering in the dozens, but it doesn't scale. Python has hundreds of thousands of users now, and more reports than the sum total of developers can handle. Any project of this size has to change how it works. 
Guido held on to his Guido's-inbox bug reporting system for a year after it totally broke down, and it took a lot of extra work to recover from the chaos it fell into. Despite its flaws, the SF-based trackers work at least a thousand times better than that did in the end. We can't go back. From tim.one@comcast.net Mon Jul 1 18:40:16 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 01 Jul 2002 13:40:16 -0400 Subject: [Python-Dev] Infinie recursion in Pickle In-Reply-To: <20020701143913.A6446@phd.pp.ru> Message-ID: [Oleg Broytmann] > Ok. From today I have a lot of spare time and very good almost free > Internet connection, so I can investigate things. Great! > I can post the results of my investigation to the developers list or to > the c.l.py, if anyone is interested. It's off-topic on Python-Dev, and it will get ignored on c.l.py (you already tried that -- what changed since the last time you got ignored ?). Attach info to a bug report -- that's where it belongs. > ... > Still, life is too short to spend it in the SF slooow interface :( That's a feature: it encourages people to add only focused comments of real value <0.9 wink>. From pinard@iro.umontreal.ca Mon Jul 1 19:13:41 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 01 Jul 2002 14:13:41 -0400 Subject: [Python-Dev] Re: Infinie recursion in Pickle In-Reply-To: References: Message-ID: [Tim Peters] > François, I swear you spend more time complaining about SF than it would > take to just use it. So far, its kif-kif. (I'm not sure it is English: I mean that I spent about the same time complaining that I spent trying to use the beast). The balance would probably break if I was trying to use SF more often! :-) > You're not going to prevail on this, so please save everyone's time > (yours and ours) by skipping the repetition. I'm not at all trying to prevail, I do not have such needs. 
However, it is worth underlining that there are other ways to communicate, and that the Python community might disserve itself by asserting there is only one. > Despite its flaws, the SF-based trackers work at least a thousand times > better than that did in the end. We can't go back. There is only one choice left, then, and that's going forward! I intend to give `roundup' an honest try, while understanding it is still in the works. -- François Pinard http://www.iro.umontreal.ca/~pinard From mal@lemburg.com Mon Jul 1 20:59:29 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 01 Jul 2002 21:59:29 +0200 Subject: [Python-Dev] Silent Deprecation Candidate -- buffer() References: <001f01c21ed9$873f3c00$06ea7ad1@othello> <004c01c21f5c$6dcbf2d0$ced241d5@hagrid> <3D20111E.2090203@lemburg.com> <02e001c2210f$922ca2a0$ced241d5@hagrid> Message-ID: <3D20B4A1.6000702@lemburg.com> Fredrik Lundh wrote: > mal wrote: > > >>>does anyone have any real-life use cases? I've never been >>>able to use it for anything, and cannot recall ever seeing it >>>being used by anyone else... >> > > >>I use it in real-life applications to wrap binary data. > > > can you elaborate? how do you use it? As I said, I wrap binary data in buffer objects; these can be memory-mapped files, strings containing binary data or any other Python object implementing the buffer interface. IMHO, buffer() is the only way to signify non-string data while maintaining a string-like interface. > could it be replaced > by something simpler, and still work in your application? > > would something like this work? > > class buffer(object): > def __len__(...) > def __getitem__(...) > def __getslice__(...) Provided these return buffer objects, yes. > class basestring(buffer): > ... > > class string(basestring): > ... > > class unicode(basestring): > ... 
I don't see the simplification, though ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From oren-py-d@hishome.net Mon Jul 1 21:18:41 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 1 Jul 2002 16:18:41 -0400 Subject: [Python-Dev] Alternative implementation of string interning Message-ID: <20020701201841.GA52320@hishome.net> http://python.org/sf/576101 Interning is done using a flag instead of a pointer (3 bytes less). The ob_sinterned pointer was most of the time either NULL or pointing to the same object. Cases where it pointed to another object were rare and the code that was checking for this case was not effective. Interned strings are no longer immortal. They die when their refcnt reaches 0 just like any other object. The reference from the interned dict will not keep them alive longer than necessary. Can anyone explain why they were implemented with a pointer in the first place? Barry? Oren From tim.one@comcast.net Mon Jul 1 22:12:31 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 01 Jul 2002 17:12:31 -0400 Subject: [Python-Dev] Alternative implementation of string interning In-Reply-To: <20020701201841.GA52320@hishome.net> Message-ID: [Oren Tirosh, on ] > ... > Interned strings are no longer immortal. They die when their refcnt > reaches 0 just like any other object. This may be a problem. Code now can rely on that id(some_interned_string) stays the same across the life of a run. > ... > Can anyone explain why they were implemented with a pointer in the first > place? Barry? It will have to be Guido. He made a plausible case to me once about why the indirection is there, but it may be an optimization that's no longer important. 
At the time interned strings were introduced, extension modules had mountains of code of the form: /* at module init time, in one or more modules */ static PyObject *spam_str = PyString_FromString("spam"); /* in various module routines */ PyObject_SetAttr(someobject, spam_str, user_supplied_value); and PyObject_SetAttr() was changed to make spam_str what you called an "indirectly interned" string by magic. This was (or at least Guido thought it was ) an important optimization at the time. Extension modules written after interned strings were introduced can exploit interning directly, a la /* at module init time, in one or more modules */ static PyObject *spam_str = PyString_InternFromString("spam"); and the core was reworked to do that too (note that this optimization wasn't directed at the core -- it could well be that core code never creates an indirectly interned string). I don't know how many extension modules still implicitly rely on indirect interning for a speed boost. Zope doesn't, and that's all that really matters . From gmcm@hypernet.com Mon Jul 1 23:16:30 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Mon, 1 Jul 2002 18:16:30 -0400 Subject: [Python-Dev] Alternative implementation of string interning In-Reply-To: References: <20020701201841.GA52320@hishome.net> Message-ID: <3D209C7E.2597.18FACFEA@localhost> On 1 Jul 2002 at 17:12, Tim Peters wrote: > ... I don't know how many > extension modules still implicitly rely on indirect > interning for a speed boost. I bet most extension authors have been completely ignorant of it, which makes the answer "most of them" . -- Gordon http://www.mcmillan-inc.com/ From bsder@mail.allcaps.org Tue Jul 2 01:20:07 2002 From: bsder@mail.allcaps.org (Andrew P. 
Lentvorski) Date: Mon, 1 Jul 2002 17:20:07 -0700 (PDT) Subject: [Python-Dev] XML module causes profiler to throw In-Reply-To: <20020701010104.L6236-100000@mail.allcaps.org> Message-ID: <20020701170702.H290-200000@mail.allcaps.org> This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. --0-1139718406-1025569154=:290 Content-Type: TEXT/PLAIN; CHARSET=US-ASCII Content-ID: <20020701171925.N290@mail.allcaps.org> Okay, I created a tiny, self-sufficient file which demonstrates the problem without requiring external files or anything else silly like that. I tried to attach it to SourceForge bug 534864, but I'm apparently too dumb to figure out how to do it. -a Here's the log, the source file is attached: Python 2.2.1 (#1, May 27 2002, 16:42:22) [GCC 2.95.3 20010315 (release) [FreeBSD]] on freebsd4 Type "help", "copyright", "credits" or "license" for more information. >>> import profile >>> import xmltest >>> profile.run('xmltest.main()') Traceback (most recent call last): File "", line 1, in ? File "/usr/local/lib/python2.2/profile.py", line 71, in run prof = prof.run(statement) File "/usr/local/lib/python2.2/profile.py", line 404, in run return self.runctx(cmd, dict, dict) File "/usr/local/lib/python2.2/profile.py", line 410, in runctx exec cmd in globals, locals File "", line 1, in ? 
File "xmltest.py", line 24, in main xml.sax.parseString(testxml, chand) File "/usr/local/lib/python2.2/xml/sax/__init__.py", line 49, in parseString parser.parse(inpsrc) File "/usr/local/lib/python2.2/xml/sax/expatreader.py", line 90, in parse xmlreader.IncrementalParser.parse(self, source) File "/usr/local/lib/python2.2/xml/sax/xmlreader.py", line 123, in parse self.feed(buffer) File "/usr/local/lib/python2.2/xml/sax/expatreader.py", line 143, in feed self._parser.Parse(data, isFinal) File "/usr/local/lib/python2.2/xml/sax/expatreader.py", line 216, in start_element def start_element(self, name, attrs): File "/usr/local/lib/python2.2/profile.py", line 214, in trace_dispatch_i if self.dispatch[event](self, frame,t): File "/usr/local/lib/python2.2/profile.py", line 260, in trace_dispatch_call assert rframe.f_back is frame.f_back, ("Bad call", rfn, AssertionError: ('Bad call', ('/usr/local/lib/python2.2/xml/sax/expatreader.py', 132, 'feed'), , , , ) >>> xmltest.main() start element end element --0-1139718406-1025569154=:290 Content-Type: TEXT/PLAIN; CHARSET=US-ASCII; NAME="xmltest.py" Content-Transfer-Encoding: BASE64 Content-ID: <20020701171914.A290@mail.allcaps.org> Content-Description: Content-Disposition: ATTACHMENT; FILENAME="xmltest.py" IyEvdXNyL2Jpbi9lbnYgcHl0aG9uDQoNCmltcG9ydCB4bWwuc2F4DQoNCnRl c3R4bWwgPSBcDQoiIiINCjxodG1sIC8+DQoiIiINCg0KY2xhc3MgQ29udGVu dEhhbmRsZXIoeG1sLnNheC5Db250ZW50SGFuZGxlcik6DQogICAgIiIiIEhh bmRsZSBjYWxsYmFja3MgZnJvbSB0aGUgU0FYIFhNTCBwYXJzZXIuICIiIg0K DQogICAgZGVmIF9faW5pdF9fKHNlbGYpOg0KICAgICAgICBwYXNzDQoNCiAg ICBkZWYgc3RhcnRFbGVtZW50KHNlbGYsIG5hbWUsIGF0dHJzKToNCiAgICAg ICAgcHJpbnQgInN0YXJ0IGVsZW1lbnQiDQoNCiAgICBkZWYgZW5kRWxlbWVu dChzZWxmLCBuYW1lKToNCiAgICAgICAgcHJpbnQgImVuZCBlbGVtZW50Ig0K DQpkZWYgbWFpbigpOg0KICAgIGNoYW5kID0gQ29udGVudEhhbmRsZXIoKQ0K ICAgIHhtbC5zYXgucGFyc2VTdHJpbmcodGVzdHhtbCwgY2hhbmQpDQoNCmlm IF9fbmFtZV9fID09ICJfX21haW5fXyI6DQogICAgbWFpbigpDQo= --0-1139718406-1025569154=:290-- From tim.one@comcast.net Tue Jul 2 
02:23:15 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 01 Jul 2002 21:23:15 -0400 Subject: [Python-Dev] Alternative implementation of string interning In-Reply-To: <3D209C7E.2597.18FACFEA@localhost> Message-ID: [Gordon, on extension modules implicitly relying on indirect interning] > I bet most extension authors have been completely > ignorant of it, which makes the answer "most of > them" . Could be! I don't know how much of a speed boost they get, though. While the magical interning is done for PyObject_SetAttr(), it's not done for the has-to-be-more-frequently-called PyObject_GetAttr(), as people call that with all sorts of garbage strings. For some reason interning is done for PyObject_GetAttrString(), although the caller of that can't profit from indirect interning (it takes a char*, not a PyObject*). Like I said, maybe this all makes sense to Guido <0.9 wink>. at-least-we're-not-fighting-over-what-the-comments-mean-ly y'rs - tim From tim.one@comcast.net Tue Jul 2 02:31:09 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 01 Jul 2002 21:31:09 -0400 Subject: [Python-Dev] Some dull gc stats In-Reply-To: Message-ID: [MvL] > Do you think this should be backported to 2.2.2 as well? I already backported the part that's arguably "a bugfix" (the part that would have solved Kevin's severe speed problem, had 2.2.1 not had a different bug that prevented the dramatic slowdown he saw in CVS Python). All that remains is new optimizations, and those can't be sold as a bugfix. Large (in the sense of lines of code) changes to gc really need to go thru lots of testing too. The optimizations involve some new algorithms, not just tweaking the former ones. So, no. If it were-- or evolves into --a dramatic speedup, maybe. From bsder@mail.allcaps.org Tue Jul 2 03:50:19 2002 From: bsder@mail.allcaps.org (Andrew P. 
Lentvorski) Date: Mon, 1 Jul 2002 19:50:19 -0700 (PDT) Subject: [Python-Dev] Performance question about math operations Message-ID: <20020701193609.O547-100000@mail.allcaps.org> I have a VLSI layout editor written in Python. At its core, it has to redraw a lot of polygons. This requires a lot of coordinate conversion mathematics. Essentially the following loop:

#! /usr/bin/env python

def main():
    i = 0
    while i < 1000000:
        i = i + 1
        (1-678)*3.589
        -((1-456)*3.589)

if __name__=="__main__":
    main()

Now, I understand that looping in Python has overhead. It turns out that the loop without the math operations takes about .5 seconds. Fine. However, each line of math operations adds .75 seconds to the total loop time for a total run time of about 2 seconds. This is with -O enabled (even though it doesn't seem to have any effect). This same loop in C++ (with classes, indirection, copy construction, etc) takes about .05 seconds. That's about a factor of 30 (1.5 / .05) difference even if I cancel out the loop overhead. I could handle a factor of 2 or 4, but 30 seems a bit high. What is eating all that time? And can I do anything about it? -a From aahz@pythoncraft.com Tue Jul 2 04:17:57 2002 From: aahz@pythoncraft.com (Aahz) Date: Mon, 1 Jul 2002 23:17:57 -0400 Subject: [Python-Dev] Performance question about math operations In-Reply-To: <20020701193609.O547-100000@mail.allcaps.org> References: <20020701193609.O547-100000@mail.allcaps.org> Message-ID: <20020702031757.GA15825@panix.com> On Mon, Jul 01, 2002, Andrew P. Lentvorski wrote: > > I have a VLSI layout editor written in Python. At its core, it has to > redraw a lot of polygons. This requires a lot of coordinate conversion > mathematics. Please post this question to comp.lang.python; it is not appropriate for python-dev. 
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From oren-py-d@hishome.net Tue Jul 2 06:27:38 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 2 Jul 2002 08:27:38 +0300 Subject: [Python-Dev] Alternative implementation of string interning In-Reply-To: ; from tim.one@comcast.net on Mon, Jul 01, 2002 at 05:12:31PM -0400 References: <20020701201841.GA52320@hishome.net> Message-ID: <20020702082738.A26155@hishome.net> On Mon, Jul 01, 2002 at 05:12:31PM -0400, Tim Peters wrote: > [Oren Tirosh, on ] > > ... > > Interned strings are no longer immortal. They die when their refcnt > > reaches 0 just like any other object. > > This may be a problem. Code now can rely on that id(some_interned_string) > stays the same across the life of a run. This requires code that stores the id of an object without keeping a reference to the actual object. It also requires that no other piece of Python or C code keep a reference to that object and yet for its identity to be somehow still significant. I find that extremely hard to imagine. > > Can anyone explain why they were implemented with a pointer in the first > > place? Barry? ... > and PyObject_SetAttr() was changed to make spam_str what you called an > "indirectly interned" string by magic. This was (or at least Guido thought > it was ) an important optimization at the time. I see. As far as I can tell, it isn't any more. Now for something a bit more radical: Why not make interned strings a type? could be an un-subclassable subclass of string. intern would just be an alias for this type. No two istr instances are equal unless they are identical. I guess PyString_CheckExact would need to be changed to accept either String or InternedString. Oren From martin@v.loewis.de Tue Jul 2 08:10:31 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 02 Jul 2002 09:10:31 +0200 Subject: [Python-Dev] Some dull gc stats In-Reply-To: References: Message-ID: Tim Peters writes: > All that remains is new optimizations, and those can't be sold as a bugfix. I understand that it is not a requirement anymore that changes to Python 2.2 are "pure bugfixes". Instead, people expect that Python 2.2 evolves and continues to grow new features, as long as they are "strictly backwards compatible". For any user-visible feature, it is normally debatable whether it is "strictly backwards compatible", since it is, by nature, a change in observable behaviour. This specific case is not in that category (i.e. has no user-observable behaviour change), so I think it qualifies for 2.2 - provided there is enough trust in its correctness. > Large (in the sense of lines of code) changes to gc really need to go thru > lots of testing too. The optimizations involve some new algorithms, not > just tweaking the former ones. I'm concerned that backporting more changes to Python 2.2 will become difficult in that area, if the GC implementations vary significantly. Maybe this can be reconsidered when there actually is another change to backport. Regards, Martin From Jack.Jansen@cwi.nl Tue Jul 2 10:17:15 2002 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Tue, 2 Jul 2002 11:17:15 +0200 Subject: [Python-Dev] Alternative implementation of string interning In-Reply-To: Message-ID: <7FD2EA20-8D9C-11D6-A0C5-0030655234CE@cwi.nl> On Monday, July 1, 2002, at 11:12 , Tim Peters wrote: > [Oren Tirosh, on ] >> ... >> Interned strings are no longer immortal. They die when their refcnt >> reaches 0 just like any other object. > > This may be a problem. Code now can rely on that > id(some_interned_string) > stays the same across the life of a run. The macimport code relies on the ids remaining the same. But it is easy to fix (just add an incref). I'll also change it to use PyString_CheckInterned. 
-- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From oren-py-d@hishome.net Tue Jul 2 11:37:48 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 2 Jul 2002 06:37:48 -0400 Subject: [Python-Dev] Alternative implementation of string interning In-Reply-To: <7FD2EA20-8D9C-11D6-A0C5-0030655234CE@cwi.nl> References: <7FD2EA20-8D9C-11D6-A0C5-0030655234CE@cwi.nl> Message-ID: <20020702103748.GA68536@hishome.net> On Tue, Jul 02, 2002 at 11:17:15AM +0200, Jack Jansen wrote: > > On Monday, July 1, 2002, at 11:12 , Tim Peters wrote: > > >[Oren Tirosh, on ] > >>... > >>Interned strings are no longer immortal. They die when their refcnt > >>reaches 0 just like any other object. > > > >This may be a problem. Code now can rely on that > >id(some_interned_string) > >stays the same across the life of a run. > > The macimport code relies on the ids remaining the same. But it is easy > to fix (just add an incref). I'll also change it to use > PyString_CheckInterned. No, an incref there would leak references. Nothing needs to be changed. Any code with correct reference counting will not notice any difference with this patch. The only problem that could occur is if Python code uses the id function, stores the integer result but doesn't keep an actual reference to the string object and no other code does, either. Even this is not a problem yet unless the code also expects that if the same string is ever interned again it will get the same integer id and breaks if it doesn't. I can't believe anyone is stupid enough to do that. Using the id function this way is equivalent to an uncounted reference. 
BTW, my patch already takes care of PyString_CheckInterned in macimport.c Oren From fredrik@pythonware.com Tue Jul 2 12:18:31 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 2 Jul 2002 13:18:31 +0200 Subject: [Python-Dev] Alternative implementation of string interning References: <7FD2EA20-8D9C-11D6-A0C5-0030655234CE@cwi.nl> <20020702103748.GA68536@hishome.net> Message-ID: <012501c221ba$34014f90$0900a8c0@spiff> Oren Tirosh wrote: > Even this is not a problem yet unless the code also expects that if the > same string is ever interned again it will get the same integer id and breaks > if it doesn't. I can't believe anyone is stupid enough to do that. do what? trust the documentation? intern(string) Enter string in the table of ``interned'' strings and return the interned string - which is string itself or a copy. /.../ Interned strings are immortal (never get garbage collected). From Jack.Jansen@cwi.nl Tue Jul 2 14:25:20 2002 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Tue, 2 Jul 2002 15:25:20 +0200 Subject: [Python-Dev] Alternative implementation of string interning In-Reply-To: <20020702103748.GA68536@hishome.net> Message-ID: <280FD4D5-8DBF-11D6-AE49-0030655234CE@cwi.nl> On Tuesday, Jul 2, 2002, at 12:37 Europe/Amsterdam, Oren Tirosh wrote: >> The macimport code relies on the ids remaining the same. But it is easy >> to fix (just add an incref). I'll also change it to use >> PyString_CheckInterned. > > No, an incref there would leak references. Nothing needs to be changed. > Uhm... I'm confused: macimport stores a pointer to the object if it's interned (the object in question is one of the strings in sys.path). It didn't INCREF the object, and that wasn't needed up until now because interned objects can never go away. However, if they can go away I would think that storing a pointer would definitely call for an INCREF... 
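The identity question argued over in this thread can be illustrated at the Python level. This is only a sketch, assuming a current CPython 3 where the builtin `intern()` quoted above is spelled `sys.intern()`. The helper `build` and the variable names are hypothetical, and nothing here shows how the 2.2 C implementation or the proposed patch works.

```python
import sys


def build(parts):
    # Join at run time so the string is created fresh, not taken from
    # a compile-time constant.
    return "".join(parts)


# Two separately built strings with the same value...
first = sys.intern(build(["sp", "am_", "key"]))
second = sys.intern(build(["spam", "_key"]))

# ...come back from interning as one and the same object.
same_object = first is second

# Storing the object next to its id keeps a real (counted) reference,
# so the id keeps denoting this string for as long as the dict lives.
# Storing only the integer would be the "uncounted reference" case.
kept = {id(first): first}
```

The last two lines show the safe variant of relying on `id()`: the integer is only meaningful while something still holds a reference to the string itself.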
From gward@python.net Tue Jul 2 14:53:25 2002 From: gward@python.net (Greg Ward) Date: Tue, 2 Jul 2002 09:53:25 -0400 Subject: Bug tracking (was: [Python-Dev] Re: Infinie recursion in Pickle) In-Reply-To: <3D203E13.22484.1789FB93@localhost> References: <05a601c22101$741ac2f0$0900a8c0@spiff> <3D203E13.22484.1789FB93@localhost> Message-ID: <20020702135325.GA5085@gerg.ca> On 01 July 2002, Gordon McMillan said: > [1] Writing an app as a set of cgi's is a fast way > to write a mediocre GUI. Rumour has it that there are several fine web application frameworks available for Python. I'm partial to Quixote [1] myself, but I've also heard good things about WebWare. Greg [1] http://www.mems-exchange.org/software/quixote/ -- Greg Ward - Unix weenie gward@python.net http://starship.python.net/~gward/ All programmers are playwrights and all computers are lousy actors. From oren-py-d@hishome.net Tue Jul 2 14:57:56 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 2 Jul 2002 09:57:56 -0400 Subject: [Python-Dev] Alternative implementation of string interning In-Reply-To: <280FD4D5-8DBF-11D6-AE49-0030655234CE@cwi.nl> References: <20020702103748.GA68536@hishome.net> <280FD4D5-8DBF-11D6-AE49-0030655234CE@cwi.nl> Message-ID: <20020702135756.GA95955@hishome.net> On Tue, Jul 02, 2002 at 03:25:20PM +0200, Jack Jansen wrote: > > On Tuesday, Jul 2, 2002, at 12:37 Europe/Amsterdam, Oren Tirosh wrote: > >>The macimport code relies on the ids remaining the same. But it is easy > >>to fix (just add an incref). I'll also change it to use > >>PyString_CheckInterned. > > > >No, an incref there would leak references. Nothing needs to be changed. > > > Uhm... I'm confused: macimport stores a pointer to the object if it's > interned (the object in question is one of the strings in sys.path). It > didn't INCREF the object, and that wasn't needed up until now because > interned objects can never go away. 
However, if they can go away I would > think that storing a pointer would definitely call for an INCREF... Are you saying that this code is not following reference counting rules and got away with it only because interned strings are immortal? I don't see how adding only an incref could be correct - there must be a corresponding decref somewhere. Oren From jacobs@penguin.theopalgroup.com Tue Jul 2 15:12:30 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Tue, 2 Jul 2002 10:12:30 -0400 (EDT) Subject: Bug tracking (was: [Python-Dev] Re: Infinie recursion in Pickle) In-Reply-To: <20020702135325.GA5085@gerg.ca> Message-ID: On Tue, 2 Jul 2002, Greg Ward wrote: > On 01 July 2002, Gordon McMillan said: > > [1] Writing an app as a set of cgi's is a fast way > > to write a mediocre GUI. > > Rumour has it that there are several fine web application frameworks > available for Python. I'm partial to Quixote [1] myself, but I've also > heard good things about WebWare. I think the point was that web-based GUIs tend to be rather mediocre, regardless of which toolkit is used. To some degree I have to agree -- you typically end up with a very clunky GUI with lots of high latency hits to a server for updates, or a very complex frontend implemented with a large and difficult to maintain body of Javascript. Some progress has been made to improve the situation, although the state-of-the-art is far from ideal. We can take this discussion off python-dev if anyone wants to know more about my thoughts on this matter. 
-Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From Jack.Jansen@cwi.nl Tue Jul 2 15:28:49 2002 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Tue, 2 Jul 2002 16:28:49 +0200 Subject: [Python-Dev] HUGE_VAL and INFINITY Message-ID: <0651D5E2-8DC8-11D6-9F20-0030655234CE@cwi.nl> I think there is a problem with the way pyport.h treats HUGE_VAL and INFINITY. But as this whole area is a great can of worms I'd like someone with more knowledge of C standards and floating point and such to ponder it, please. If both INFINITY and HUGE_VAL are defined then INFINITY takes precedence. However, all references I've seen to INFINITY seem to indicate that this is a float value, not a double value, according to the C99 standard. And I've now come across a platform where HUGE_VAL==1e500 and INFINITY==HUGE_VALF==1e50, and these latter values are not infinite for doubles (I assume they are infinite for floats, but I haven't checked). I have a patch that will fix this problem for my specific case, but I have the feeling that it may be the pyport.h logic that is at fault here. If no-one jumps in I'll commit my fix in a few days time. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From Jack.Jansen@cwi.nl Tue Jul 2 15:37:42 2002 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Tue, 2 Jul 2002 16:37:42 +0200 Subject: [Python-Dev] Alternative implementation of string interning In-Reply-To: <20020702135756.GA95955@hishome.net> Message-ID: <4401F110-8DC9-11D6-9F20-0030655234CE@cwi.nl> On Tuesday, July 2, 2002, at 03:57 , Oren Tirosh wrote: >> Uhm... I'm confused: macimport stores a pointer to the object if it's >> interned (the object in question is one of the strings in sys.path). It >> didn't INCREF the object, and that wasn't needed up until now because >> interned objects can never go away. 
However, if they can go away I >> would >> think that storing a pointer would definitely call for an INCREF... > > Are you saying that this code is not following reference counting rules > and got away with it only because interned strings are immortal? I'm afraid so. Or, actually, "afraid so" sounds too apologetic:-): interned strings were specifically defined to be immortal. > I don't see how adding only an incref could be correct - there must be a > corresponding decref somewhere. No, there isn't, because this list of pointers is never cleared. Which was never needed, because they were borrowed references. Again, it isn't rocket science to fix this: _PyImport_Fini() will need to call out to a new routine _PyMacImport_Fini() that DECREFs the stored pointers. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From marklists@mceahern.com Tue Jul 2 15:39:16 2002 From: marklists@mceahern.com (Mark McEahern) Date: Tue, 2 Jul 2002 09:39:16 -0500 Subject: Bug tracking (was: [Python-Dev] Re: Infinie recursion in Pickle) In-Reply-To: Message-ID: [Kevin Jacobs] > I think the point was that web-based GUIs tend to be rather mediocre, > regardless of which toolkit is used. To some degree I have to > agree -- you > typically end up with a very clunky GUI with lots of high latency > hits to a > server for updates, or a very complex frontend implemented with a > large and > difficult to maintain body of Javascript. > > Some progress has been made to improve the situation, although the > state-of-the-art is far from ideal. > > We can take this discussion off python-dev if anyone wants to know more > about my thoughts on this matter. I'd be interested to hear more. There's a current discussion on comp.lang.python where you may want to post your thoughts: Here are two different pointers to the beginning of today's installments: http://groups.google.com/groups?selm=3d21259f%240%2428006%24afc38c87%40news. 
optusnet.com.au
http://mail.python.org/pipermail/python-list/2002-July/111256.html

Cheers,

// mark
-

From tim.one@comcast.net Tue Jul 2 16:28:08 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 02 Jul 2002 11:28:08 -0400
Subject: [Python-Dev] HUGE_VAL and INFINITY
In-Reply-To: <0651D5E2-8DC8-11D6-9F20-0030655234CE@cwi.nl>
Message-ID:

[Jack Jansen]
> I think there is a problem with the way pyport.h treats HUGE_VAL and
> INFINITY.

I'm afraid there are necessarily problems there, since this stuff is
insufficiently standardized.

> But as this whole area is a great can of worms I'd like someone with
> more knowledge of C standards and floating point and such to ponder it,
> please.

Let's look at the code:

"""
/* According to
 * http://www.cray.com/swpubs/manuals/SN-2194_2.0/html-SN-2194_2.0/x3138.htm
 * on some Cray systems HUGE_VAL is incorrectly (according to the C std)
 * defined to be the largest positive finite rather than infinity. We
 * need the std-conforming infinity meaning (provided the platform has
 * one!).
 *
 * Then, according to a bug report on SourceForge, defining Py_HUGE_VAL as
 * INFINITY caused internal compiler errors under BeOS using some version
 * of gcc. Explicitly casting INFINITY to double made that problem go
 * away.
 */
#ifdef INFINITY
#define Py_HUGE_VAL ((double)INFINITY)
#else
#define Py_HUGE_VAL HUGE_VAL
#endif
"""

> If both INFINITY and HUGE_VAL are defined then INFINITY takes
> precedence.

Right, and the comment explains why (a broken Cray system).

> However, all references I've seen to INFINITY seem to indicate that
> this is a float value, not a double value, according to the C99 standard.

It is a float value, but is explicitly cast to double in the above.

> And I've now come across a platform where HUGE_VAL==1e500 and
> INFINITY==HUGE_VALF==1e50, and these latter values are not infinite for
> doubles (I assume they are infinite for floats, but I haven't checked).

The platform's header files are braindead.
That doesn't mean we shouldn't try to survive despite them, but you should file a bug report with whoever supplies this C. If (double)INFINITY isn't a double-precision infinity, their definition of INFINITY is hosed (the C89 std doesn't say anything useful about this, it's a matter of respecting the spirit of IEEE-754 and that C didn't bother to define a double-precision version of the INFINITY macro -- that means a *useful* float INFINITY has to be defined in such a way that it can do double-duty). > I have a patch that will fix this problem for my specific case, but I > have the feeling that it may be the pyport.h logic that is at fault > here. If no-one jumps in I'll commit my fix in a few days time. Don't check in a change here without review. Why are you keeping "the fix" secret? At this point, I'd be happy to drop the hack-around for the broken Cray, and reduce the whole mess to: #ifndef Py_HUGE_VAL #define Py_HUGE_VAL HUGE_VAL #endif Then someone on a broken box can #define their own Py_HUGE_VAL in their own stinkin' config file. From oren-py-d@hishome.net Tue Jul 2 18:55:07 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 2 Jul 2002 20:55:07 +0300 Subject: [Python-Dev] Alternative implementation of string interning In-Reply-To: <4401F110-8DC9-11D6-9F20-0030655234CE@cwi.nl>; from Jack.Jansen@cwi.nl on Tue, Jul 02, 2002 at 04:37:42PM +0200 References: <20020702135756.GA95955@hishome.net> <4401F110-8DC9-11D6-9F20-0030655234CE@cwi.nl> Message-ID: <20020702205507.A32734@hishome.net> On Tue, Jul 02, 2002 at 04:37:42PM +0200, Jack Jansen wrote: > > On Tuesday, July 2, 2002, at 03:57 , Oren Tirosh wrote: > >> Uhm... I'm confused: macimport stores a pointer to the object if it's > >> interned (the object in question is one of the strings in sys.path). It > >> didn't INCREF the object, and that wasn't needed up until now because > >> interned objects can never go away. 
However, if they can go away I
> >> would
> >> think that storing a pointer would definitely call for an INCREF...
> >
> > Are you saying that this code is not following reference counting rules
> > and got away with it only because interned strings are immortal?
>
> I'm afraid so. Or, actually, "afraid so" sounds too apologetic:-):
> interned
> strings were specifically defined to be immortal.

I know it says so in the doc, but I always tended to look at it as an
implementation limitation rather than a feature...

	Oren

From David Abrahams"

>>> help(weakref.ref)
Help on built-in function ref:

ref(...)
    new(object[, callback]) -- create a weak reference to 'object';
    when 'object' is finalized, 'callback' will be called and passed
    a reference to 'object'.
    ^^^^^^^^^^^^^^^^^^^^^^^

This appears to be a lie, or at least misleadingly phrased, in Python 2.2.1:

>>> class Z: pass
...
>>> def dying(x): print x, 'is dying'
...
>>> z = Z()
>>> r = weakref.ref(z, dying)
>>> z = 1
 is dying

It appears that it's a reference to the weakref object that's passed, not
the dying object itself. What's the intention?

TIA,
Dave

+---------------------------------------------------------------+
                  David Abrahams
      C++ Booster (http://www.boost.org)               O__  ==
      Pythonista (http://www.python.org)              c/ /'_ ==
  resume: http://users.rcn.com/abrahams/resume.html  (*) \(*) ==
          email: david.abrahams@rcn.com
+---------------------------------------------------------------+

From tim@zope.com Tue Jul 2 20:06:03 2002
From: tim@zope.com (Tim Peters)
Date: Tue, 2 Jul 2002 15:06:03 -0400
Subject: [Python-Dev] Some dull gc stats
In-Reply-To:
Message-ID:

[martin@v.loewis.de]
> I understand that it is not a requirement anymore that changes to
> Python 2.2 are "pure bugfixes". Instead, people expect that Python 2.2
> evolves and continues to grow new features, as long as they are
> "strictly backwards compatible".

Alex made a case here for "new features", but the Python Business Forum
hasn't shown interest in that.
Like most businessfolk, I expect they'll ignore such issues until someone discovers that the lack of a new feature is putting them out of business <0.8 wink>. > For any user-visible feature, it is normally debatable whether it is > "strictly backwards compatible", since it is, by nature, a change in > observable behaviour. > > This specific case is not in that category (i.e. has no > user-observable behaviour change), so I think it qualifies for 2.2 - > provided there is enough trust in its correctness. The "bugfix part" of these changes certainly had user-visible aspects, in that before it was possible for objects in older generations to get yanked back into younger generations. This can affect when objects get collected, and so throw off over-tuned programs slinging gc.enable() and disable() "at exactly the best time(s)". > ... > I'm concerned that backporting more changes to Python 2.2 will become > difficult in that area, if the GC implementations vary significantly. Maintaining multiple branches is always a PITA. > Maybe this can be reconsidered when there actually is another change > to backport. Anyone who is so inclined is welcome to reconsider it non-stop . From tim.one@comcast.net Tue Jul 2 20:31:03 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 02 Jul 2002 15:31:03 -0400 Subject: [Python-Dev] Alternative implementation of string interning In-Reply-To: <20020702082738.A26155@hishome.net> Message-ID: [Tim] > This may be a problem. Code now can rely on that > id(some_interned_string) stays the same across the life of a run. [Oren Tirosh] > This requires code that stores the id of an object without keeping a > reference to the actual object. It also requires that no other piece of > Python or C code keep a reference to that object and yet for its > identity to be somehow still significant. If find that extremely hard > to imagine. I would have guessed you had a more vivid imagination . 
It's precisely because the id has been guaranteed that a program may not
care to save a reference to an interned string. For example,

"""
_ids = map(id, map(intern, "if then elif else".split()))
TOKEN_IF, TOKEN_THEN, TOKEN_ELIF, TOKEN_ELSE, TOKEN_NAME = range(5)
id2token = dict(zip(_ids, range(4)))
del _ids

def tokenvector(s):
    return [id2token.get(id(intern(word)), TOKEN_NAME)
            for word in s.split()]

print tokenvector("if this is the example, then what's the question?")
"""

This works reliably today to classify tokens. I'm not certain I'd care if
it broke, but we have to consider that it hasn't been difficult to write
code that would break.

>> This was (or at least Guido thought it was ) an important
>> optimization at the time.

> I see. As far as I can tell, it isn't any more.

Which extension modules have you investigated? The claim is too vague to
carry weight. Zope's C code uses the interned-string C API directly, so it
doesn't matter to Zope code. That's all I've looked at. Making a case that
the optimization is no longer important requires investigating code.

> Now for something a bit more radical:
>
> Why not make interned strings a type? could be an
> un-subclassable subclass of string. intern would just be an
> alias for this type. No two istr instances are equal unless they are
> identical. I guess PyString_CheckExact would need to be changed to
> accept either String or InternedString.

What would the point be? That is, instead of "why not?", why? As to "why
not?", there's something about elevating what's basically an optimization
hack to a type that makes me squirm.

From niemeyer@conectiva.com Tue Jul 2 20:54:00 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Tue, 2 Jul 2002 16:54:00 -0300
Subject: [Python-Dev] weakref (or doc) bug?
In-Reply-To: <14c401c221f4$51c97810$6501a8c0@boostconsulting.com>
References: <14c401c221f4$51c97810$6501a8c0@boostconsulting.com>
Message-ID: <20020702165400.B25194@ibook.distro.conectiva>

> >>> help(weakref.ref)
> Help on built-in function ref:
>
> ref(...)
>     new(object[, callback]) -- create a weak reference to 'object';
>     when 'object' is finalized, 'callback' will be called and passed
>     a reference to 'object'.
>     ^^^^^^^^^^^^^^^^^^^^^^^
[...]
> It appears that it's a reference to the weakref object that's passed, not
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I'm not sure if that's what was intended, but the documentation seems
compliant with the current behavior.

-- 
Gustavo Niemeyer
[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]

From David Abrahams" <20020702165400.B25194@ibook.distro.conectiva>
Message-ID: <155201c22203$39f132f0$6501a8c0@boostconsulting.com>

From: "Gustavo Niemeyer"

> > >>> help(weakref.ref)
> > Help on built-in function ref:
> >
> > ref(...)
> >     new(object[, callback]) -- create a weak reference to 'object';
> >     when 'object' is finalized, 'callback' will be called and passed
> >     a reference to 'object'.
> >     ^^^^^^^^^^^^^^^^^^^^^^^
> [...]
> > It appears that it's a reference to the weakref object that's passed, not
>                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> I'm not sure if that's what was intended, but the documentation seems
> compliant with the current behavior.

Only if you take the unqualified term "reference" to mean a weakref.ref
object. I read it as being a regular reference. You might think I have
Java-on-the-brain, but I've never programmed a line of that foul black
sludge in my life. I'm sure other people will read it the way I do.

-Dave

From niemeyer@conectiva.com Tue Jul 2 21:14:17 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Tue, 2 Jul 2002 17:14:17 -0300
Subject: [Python-Dev] weakref (or doc) bug?
In-Reply-To: <155201c22203$39f132f0$6501a8c0@boostconsulting.com> References: <14c401c221f4$51c97810$6501a8c0@boostconsulting.com> <20020702165400.B25194@ibook.distro.conectiva> <155201c22203$39f132f0$6501a8c0@boostconsulting.com> Message-ID: <20020702171416.A25592@ibook.distro.conectiva> > Only if you take the unqualified term "reference" to mean a weakref.ref > object. I read it as being a regular reference. You might think I have > Java-on-the-brain, but I've never programmed a line of that foul black > sludge in my life. I'm sure other people will read it the way I do. Maybe the documentation should be clarified then. Giving a usage example for the callback would help as well. -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From mal@lemburg.com Tue Jul 2 21:18:14 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 02 Jul 2002 22:18:14 +0200 Subject: [Python-Dev] Some dull gc stats References: Message-ID: <3D220A86.5070003@lemburg.com> Tim Peters wrote: > [martin@v.loewis.de] > >>I understand that it is not a requirement anymore that changes to >>Python 2.2 are "pure bugfixes". Instead, people expect that Python 2.2 >>evolves and continues to grow new features, as long as they are >>"strictly backwards compatible". > > > Alex made a case here for "new features", but the Python Business Forum > hasn't shown interest in that. Like most businessfolk, I expect they'll > ignore such issues until someone discovers that the lack of a new feature is > putting them out of business <0.8 wink>. Patch level releases should *never* include new features (unless these are essential to fix a serious bug or a simple byproduct of a fix). I don't know where you got the impression that Python should move back to the 1.5 branch development process where patch levels added new features. W/r to the PBF: at EuroPython we did a poll to see which version to base the PBF's activities on. The result was that a majority voted for Python 2.2 as first target. 
Patch levels are there to stabilize a release, not make it more powerful. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From Jack.Jansen@oratrix.com Tue Jul 2 21:32:33 2002 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Tue, 2 Jul 2002 22:32:33 +0200 Subject: [Python-Dev] HUGE_VAL and INFINITY In-Reply-To: Message-ID: On dinsdag, juli 2, 2002, at 05:28 , Tim Peters wrote: > [Jack Jansen] >> I think there is a problem with the way pyport.h treats HUGE_VAL and >> INFINITY. > > I'm afraid there are necessarily problems there, since this stuff is > insufficiently standardized. I found a couple of references to INFINITY being a float. Here's one (no idea as to it's status, though, that's why I asked for help of a standards guru): http://www.opengroup.org/onlinepubs/007904975/basedefs/math.h.html (googling for "INFINITY math.h" will find many more). > #define Py_HUGE_VAL ((double)INFINITY) Is the intention of this define that it would first convert the constant "1e50" to an IEEE float "Infinity", and that this float would then be promoted to a double "Infinity"? If it is indeed stated somewhere in the C standard that this is the course of action to take then the compiler is wrong, because what it actually seems to be doing is parsing the "1e50" as a double because of the cast (speculating here, but this is consistent with the results). If you happen to have a reference then I can post a bug report. >> I have a patch that will fix this problem for my specific case, but I >> have the feeling that it may be the pyport.h logic that is at fault >> here. If no-one jumps in I'll commit my fix in a few days time. > > Don't check in a change here without review. 
The patch is a simple

#ifdef __APPLE__
#undef INFINITY
#endif

I'll post a sourceforge bug tomorrow and assign it to you. Feel free to
completely ignore it and do the config magic to handle the Cray case
specially, though.

--
- Jack Jansen                     http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -

From David Abrahams" <20020702165400.B25194@ibook.distro.conectiva> <155201c22203$39f132f0$6501a8c0@boostconsulting.com>
Message-ID: <159b01c2220c$5b7b62c0$6501a8c0@boostconsulting.com>

From: "David Abrahams"

> Java-on-the-brain, but I've never programmed a line of that foul black
> sludge in my life.

Sorry everyone, that remark was in poor taste.

-Dave

From tim@zope.com Tue Jul 2 22:13:08 2002
From: tim@zope.com (Tim Peters)
Date: Tue, 2 Jul 2002 17:13:08 -0400
Subject: [Python-Dev] HUGE_VAL and INFINITY
In-Reply-To:
Message-ID:

[Jack Jansen]
> I found a couple of references to INFINITY being a float. ...

Yes, INFINITY must expand to a constant expression of type float, although
your compiler isn't doing that (see below). The header files you're using
are still braindead for the reasons I explained last time regardless.

>> #define Py_HUGE_VAL ((double)INFINITY)

> Is the intention of this define that it would first convert the
> constant "1e50" to an IEEE float "Infinity", and that this float
> would then be promoted to a double "Infinity"?

No. As the comments before this code said, the explicit cast to double was
for the benefit of some other broken compiler. The literal "1e50" has type
double in C, so if they're really #define'ing INFINITY as 1e50 then they're
violating that INFINITY must expand to an expression of type float. They
could have made it a float literal by appending "f" or "F", but then it
wouldn't be a legal float literal. They're screwed either way -- they're
doing this part incorrectly no matter how you cut it. They can look at any
other compiler for a correct way to do it .
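
The float-versus-double point above can be checked numerically. What
follows is only a sketch in Python 3, not the C from pyport.h and not
anyone's actual patch. It assumes CPython floats are IEEE-754 doubles, and
FLT_MAX is computed by hand from the IEEE single-precision layout rather
than read from any platform header:

```python
import math

# Largest finite IEEE-754 single-precision value (what C calls FLT_MAX).
# Assumption: computed from the format definition, not from a real header.
FLT_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127

# The two constants reported for the platform in this thread.
reported_infinity = 1e50               # reported value of INFINITY / HUGE_VALF
reported_huge_val = float("1e500")     # reported value of HUGE_VAL

# 1e50 is beyond the range of a single-precision float...
print(reported_infinity > FLT_MAX)
# ...but as a double it is an ordinary finite number, not an infinity.
print(math.isinf(reported_infinity))
# 1e500 is beyond the range of a double as well, so it becomes infinity.
print(math.isinf(reported_huge_val))
```

Under those assumptions, 1e50 can only act as an infinity if it is first
squeezed into a float, which is consistent with the observation that a
double-typed 1e50 stays finite.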
> If it is indeed stated somewhere in the C standard that this is the
> course of action to take then the compiler is wrong,

The compiler is wrong, but for other reasons.

> because what it actually seems to be doing is parsing the "1e50" as a
> double because of the cast (speculating here, but this is consistent
> with the results).

1e50 is a double with or without the cast.

> The patch is a simple
> #ifdef __APPLE__
> #undef INFINITY
> #endif

Bleech. I'm going to remove all this crap. If some Crays still have broken
HUGE_VAL definitions, tough -- someone on a Cray can fix it. Putting this
junk in the core just ensures it will always stay broken.

From barry@zope.com Tue Jul 2 22:36:49 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 2 Jul 2002 17:36:49 -0400
Subject: [Python-Dev] weakref (or doc) bug?
References: <14c401c221f4$51c97810$6501a8c0@boostconsulting.com> <20020702165400.B25194@ibook.distro.conectiva> <155201c22203$39f132f0$6501a8c0@boostconsulting.com> <159b01c2220c$5b7b62c0$6501a8c0@boostconsulting.com>
Message-ID: <15650.7409.971862.910898@anthem.wooz.org>

>>>>> "DA" == David Abrahams writes:

    >> Java-on-the-brain, but I've never programmed a line of that
    >> foul black sludge in my life.

    | Sorry everyone, that remark was in poor taste.
-----------------------------------------^^^^^^^^^^

Oh, now I get it!

-Barry

From tim@zope.com Wed Jul 3 00:06:06 2002
From: tim@zope.com (Tim Peters)
Date: Tue, 2 Jul 2002 19:06:06 -0400
Subject: [Python-Dev] Some dull gc stats
In-Reply-To: <3D220A86.5070003@lemburg.com>
Message-ID:

[MaL, replying to me, but presumably bonding with Martin again ]
> Patch level releases should *never* include new features (unless
> these are essential to fix a serious bug or a simple byproduct
> of a fix). I don't know where you got the impression that Python
> should move back to the 1.5 branch development process where patch
> levels added new features.

The pre-PBF Patch Czars generally took a hard "no new features!"
stance, but it seems to be up in the air now. > W/r to the PBF: at EuroPython we did a poll to see which version > to base the PBF's activities on. The result was that a majority > voted for Python 2.2 as first target. Cool! Good choice. > Patch levels are there to stabilize a release, not make it > more powerful. This is one popular view, although there's plenty of wiggle room in what "stabilize" means (e.g., is it "stabilizing" to port Python to a new platform? to speed a bottleneck? to add a new encoding? etc). From lalo@laranja.org Wed Jul 3 00:17:15 2002 From: lalo@laranja.org (Lalo Martins) Date: Tue, 2 Jul 2002 20:17:15 -0300 Subject: [Python-Dev] [development doc updates] In-Reply-To: <20020702222813.8990118EC22@grendel.zope.com> References: <20020702222813.8990118EC22@grendel.zope.com> Message-ID: <20020702231715.GG25927@laranja.org> On Tue, Jul 02, 2002 at 06:28:13PM -0400, Fred L. Drake wrote: > The development version of the documentation has been updated: > > http://www.python.org/dev/doc/devel/ > > Many updates and corrections to the documentation, including docs for the > new textwrap module. Re: textwrap.TextWrapper.fix_sentence_endings } ... Furthermore, since it relies on string.lowercase ... it is specific to } English-language texts. Well, actually the convention of separating sentences by two spaces is also specific to the English language, so I don't see that as a problem. []s, |alo +---- -- It doesn't bother me that people say things like "you'll never get anywhere with this attitude". In a few decades, it will make a good paragraph in my biography. You know, for a laugh. -- http://www.laranja.org/ mailto:lalo@laranja.org pgp key: http://www.laranja.org/pessoal/pgp Eu jogo RPG! 
(I play RPG) http://www.eujogorpg.com.br/ Python Foundry Guide http://www.sf.net/foundry/python-foundry/ From tim.one@comcast.net Wed Jul 3 03:10:36 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 02 Jul 2002 22:10:36 -0400 Subject: [Python-Dev] Death to WITH_CYCLE_GC Message-ID: I don't consider cyclic gc to be an experiment anymore. It's proved to be very solid code, and it hasn't become orphaned either . What say ye to nuking the #ifdefs conditionalizing it in the core for 2.3? They're irritating, the code base without cyclic gc is never tested, the touchy trashcan mechanism works in a radically different way when cyclic gc isn't compiled in, and if cyclic gc is compiled in it's easy to turn it off at will (gc.disable()). It does cost memory for the gc header on containers, but since we never test without it the ability to compile it out isn't much of "a feature". +1 from me . From tim.one@comcast.net Wed Jul 3 03:53:44 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 02 Jul 2002 22:53:44 -0400 Subject: [Python-Dev] Alternative implementation of string interning In-Reply-To: <20020702205507.A32734@hishome.net> Message-ID: [Jack Jansen] > I'm afraid so. Or, actually, "afraid so" sounds too apologetic:-): > interned strings were specifically defined to be immortal. [Oren Tirosh] > I know it says so in the doc, but I always tended to look at it as an > implementation limitation rather than a feature... Me too: I always read it as a warning not to use interning "too much". However, you can see how far common sense goes once users get ahold of a thing . 
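
Whether interned strings are immortal matters for code like the id-based
token table posted earlier in this thread. As a hedged sketch only, here is
one way that table could be written in Python 3 so that it does not depend
on immortality: it keeps a reference to each interned keyword. It assumes
intern is available as sys.intern, and the names _KEYWORDS and id2token are
illustrative, not taken from any real module:

```python
import sys

# Hold on to the interned keyword strings. An id() is only guaranteed to be
# unique among objects that are alive, so the table must keep them alive.
_KEYWORDS = [sys.intern(w) for w in "if then elif else".split()]
TOKEN_IF, TOKEN_THEN, TOKEN_ELIF, TOKEN_ELSE, TOKEN_NAME = range(5)

# Map the id of each live keyword object to its token number.
id2token = {id(w): tok for tok, w in enumerate(_KEYWORDS)}

def tokenvector(s):
    """Classify each whitespace-separated word as a keyword token or a name."""
    return [id2token.get(id(sys.intern(word)), TOKEN_NAME)
            for word in s.split()]

print(tokenvector("if this is the example, then what's the question?"))
```

Because _KEYWORDS is never deleted, no other live object can share an id
with a keyword, so the lookup stays correct even if unreferenced interned
strings are allowed to be collected. Using the strings themselves as
dictionary keys would avoid id() altogether.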
From oren-py-d@hishome.net Wed Jul 3 05:52:11 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Wed, 3 Jul 2002 00:52:11 -0400 Subject: [Python-Dev] Alternative implementation of string interning In-Reply-To: References: <20020702082738.A26155@hishome.net> Message-ID: <20020703045211.GA7978@hishome.net> On Tue, Jul 02, 2002 at 03:31:03PM -0400, Tim Peters wrote: > I would have guessed you had a more vivid imagination . It's > precisely because the id has been guaranteed that a program may not care to > save a reference to an interned string. For example, > > """ > _ids = map(id, map(intern, "if then elif else".split())) > TOKEN_IF, TOKEN_THEN, TOKEN_ELIF, TOKEN_ELSE, TOKEN_NAME = range(5) > id2token = dict(zip(_ids, range(4))) > del _ids > > def tokenvector(s): > return [id2token.get(id(intern(word)), TOKEN_NAME) > for word in s.split()] > > print tokenvector("if this is the example, then what's the question?") > """ > > This works reliably today to classify tokens. I'm not certain I'd care if > it broke, but we have to consider that it hasn't been difficult to write > code that would break. Ironically, this code is actually slower than using the strings themselves as keys (interned or not). But I get the point. > > Now for something a bit more radical: > > > > Why not make interned strings a type? could be an > > un-subclassable subclass of string. intern would just be an > > alias for this type. No two istr instances are equal unless they are > > identical. I guess PyString_CheckExact would need to be changed to > > accept either String or InternedString. > > What would the point be? That is, instead of "why not?", why? As to "why > not?", there's something about elevating what's basically an optimization > hack to a type that makes me squirm. Change the name from 'istr' to 'symbol' and add a mild case of language envy and you'll see why ;-) Oren From fdrake@acm.org Wed Jul 3 06:10:00 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Wed, 3 Jul 2002 01:10:00 -0400 Subject: [Doc-SIG] Re: [Python-Dev] [development doc updates] In-Reply-To: <20020702231715.GG25927@laranja.org> References: <20020702222813.8990118EC22@grendel.zope.com> <20020702231715.GG25927@laranja.org> Message-ID: <15650.34600.410233.510315@grendel.zope.com> Lalo Martins writes: > Re: textwrap.TextWrapper.fix_sentence_endings ... > Well, actually the convention of separating sentences by two spaces is also > specific to the English language, so I don't see that as a problem. Insidious, isn't it? I've tried to clarify the matter further in the documentation; please let me know if you think more is needed. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From greg@cosc.canterbury.ac.nz Wed Jul 3 06:23:23 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 03 Jul 2002 17:23:23 +1200 (NZST) Subject: [Python-Dev] Alternative implementation of string interning In-Reply-To: <20020703045211.GA7978@hishome.net> Message-ID: <200207030523.g635NNC18943@oma.cosc.canterbury.ac.nz> Oren Tirosh : > Tim Peters: > > > What would the point be? That is, instead of "why not?", why? As to "why > > not?", there's something about elevating what's basically an optimization > > hack to a type that makes me squirm. > > Change the name from 'istr' to 'symbol' and add a mild case of language envy > and you'll see why ;-) But in Lisp, symbols and strings really are completely separate types. That's not the case in Python, and you still haven't really given a reason why they should be. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From lalo@laranja.org Wed Jul 3 06:28:32 2002 From: lalo@laranja.org (Lalo Martins) Date: Wed, 3 Jul 2002 02:28:32 -0300 Subject: [Doc-SIG] Re: [Python-Dev] [development doc updates] In-Reply-To: <15650.34600.410233.510315@grendel.zope.com> References: <20020702222813.8990118EC22@grendel.zope.com> <20020702231715.GG25927@laranja.org> <15650.34600.410233.510315@grendel.zope.com> Message-ID: <20020703052832.GA5023@laranja.org> On Wed, Jul 03, 2002 at 01:10:00AM -0400, Fred L. Drake, Jr. wrote: > > Lalo Martins writes: > > Re: textwrap.TextWrapper.fix_sentence_endings > ... > > Well, actually the convention of separating sentences by two spaces is also > > specific to the English language, so I don't see that as a problem. > > Insidious, isn't it? I've tried to clarify the matter further in the > documentation; please let me know if you think more is needed. Seems fine for my particular taste now. thanks, |alo +---- -- It doesn't bother me that people say things like "you'll never get anywhere with this attitude". In a few decades, it will make a good paragraph in my biography. You know, for a laugh. -- http://www.laranja.org/ mailto:lalo@laranja.org pgp key: http://www.laranja.org/pessoal/pgp Eu jogo RPG! (I play RPG) http://www.eujogorpg.com.br/ Python Foundry Guide http://www.sf.net/foundry/python-foundry/ From ping@zesty.ca Wed Jul 3 07:07:14 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Tue, 2 Jul 2002 23:07:14 -0700 (PDT) Subject: [Python-Dev] Re: Alternative implementation of string interning In-Reply-To: Message-ID: Oren Tirosh wrote: > Why not make interned strings a type? could be an > un-subclassable subclass of string. intern would just be an > alias for this type. No two istr instances are equal unless they are > identical. I guess PyString_CheckExact would need to be changed to > accept either String or InternedString. 
The possibility of people starting to write code that depended on whether strings were 'string' or 'istr', and all the breakage and incompatibility that would result, seems much too ugly to contemplate. Pass an 'istr' into a routine that expects strings, and it would appear to be a string right up until someone tried to == it, whereupon all hell would break loose. The acid test for subtyping is substitutability: type 'istr' would not fulfill the contract of 'string', and neither would 'string' fulfill the contract of 'istr'. Therefore, if you really wanted to do this, your new type (let's call it 'symbol') would have to be completely independent from both strings *and* interned strings. There's no subclass relationship. -- ?!ng From martin@v.loewis.de Wed Jul 3 07:22:08 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 03 Jul 2002 08:22:08 +0200 Subject: [Python-Dev] Patch level versions and new features (Was: Some dull gc stats) In-Reply-To: <3D220A86.5070003@lemburg.com> References: <3D220A86.5070003@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > Patch level releases should *never* include new features (unless > these are essential to fix a serious bug or a simple byproduct > of a fix). I don't know where you got the impression that Python > should move back to the 1.5 branch development process where patch > levels added new features. >From discussions on python-dev... > Patch levels are there to stabilize a release, not make it > more powerful. What precisely does that mean? Specific case in question: xml.dom.minidom.toxml does not support the specification of an encoding of the resulting XML document. Instead, if there are non-ASCII characters in the output document, it returns a Unicode object that starts with u"". People cannot write this to a file as-is, and they cannot encode it in anything but UTF-8 (because the document would then be incorrect). So I added an optional encoding= argument to .toxml, for 2.3. 
The question now is: should that argument also be made available for 2.2.2? Regards, Martin From martin@v.loewis.de Wed Jul 3 07:23:46 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 03 Jul 2002 08:23:46 +0200 Subject: [Python-Dev] Death to WITH_CYCLE_GC In-Reply-To: References: Message-ID: Tim Peters writes: > What say ye to nuking the #ifdefs conditionalizing it in the core for 2.3? Good idea. Martin From oren-py-d@hishome.net Wed Jul 3 08:06:17 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Wed, 3 Jul 2002 03:06:17 -0400 Subject: [Python-Dev] Re: Alternative implementation of string interning In-Reply-To: References: Message-ID: <20020703070617.GA25449@hishome.net> On Tue, Jul 02, 2002 at 11:07:14PM -0700, Ka-Ping Yee wrote: > Oren Tirosh wrote: > > Why not make interned strings a type? could be an > > un-subclassable subclass of string. intern would just be an > > alias for this type. No two istr instances are equal unless they are > > identical. I guess PyString_CheckExact would need to be changed to > > accept either String or InternedString. > > The possibility of people starting to write code that depended on > whether strings were 'string' or 'istr', and all the breakage and > incompatibility that would result, seems much too ugly to contemplate. > Pass an 'istr' into a routine that expects strings, and it would > appear to be a string right up until someone tried to == it, whereupon > all hell would break loose. I don't understand your assumptions. What kind of hell? Are you assuming that == would be equivalent to 'is' for istrs? The == operator should work exactly the same, just possibly a little faster when comparing two istrs. > The acid test for subtyping is substitutability: type 'istr' would not > fulfill the contract of 'string', and neither would 'string' fulfill the > contract of 'istr'. Can you be more specific? As i see it an istr would be completely compatible to str with the exception of being non subclassable. 
It has the additional property that (type(s) is istr and type(t) is istr and s == t) implies (s is t). But that doesn't break anything. Oren From tdelaney@avaya.com Wed Jul 3 08:17:54 2002 From: tdelaney@avaya.com (Delaney, Timothy) Date: Wed, 3 Jul 2002 17:17:54 +1000 Subject: [Python-Dev] Re: Alternative implementation of string interning Message-ID: > From: Oren Tirosh [mailto:oren-py-d@hishome.net] > > > Oren Tirosh wrote: > > > alias for this type. No two istr instances are equal > unless they are > > > identical. I guess PyString_CheckExact would need to be > > you assuming > that == would be equivalent to 'is' for istrs? The == > operator should work > exactly the same, just possibly a little faster when > comparing two istrs. > > (type(s) is istr and type(t) is istr and s == t) implies (s is t). Do you mean that comparing two instances of istr would use *is*, but comparing an istr with any other instance would use the normal str compare? Because that is not how it has come across. My first thought when I saw this proposal was "neat". My second was "yuk". The #1 most important consideration here is backwards compatibility IMO. Whilst I would be personally unaffected by this change (allowing interned strings to be collected), we've already had examples of people and code that would be. Tim Delaney From mal@lemburg.com Wed Jul 3 08:37:26 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 03 Jul 2002 09:37:26 +0200 Subject: [Python-Dev] Death to WITH_CYCLE_GC References: Message-ID: <3D22A9B6.1050208@lemburg.com> Tim Peters wrote: > I don't consider cyclic gc to be an experiment anymore. It's proved to be > very solid code, and it hasn't become orphaned either . > > What say ye to nuking the #ifdefs conditionalizing it in the core for 2.3?
> They're irritating, the code base without cyclic gc is never tested, the > touchy trashcan mechanism works in a radically different way when cyclic gc > isn't compiled in, and if cyclic gc is compiled in it's easy to turn it off > at will (gc.disable()). It does cost memory for the gc header on > containers, but since we never test without it the ability to compile it out > isn't much of "a feature". > > +1 from me . Hmm, isn't the idea of having compile time options to give people a chance to eliminate the feature altogether ? I'm thinking in terms of memory footprint of the running interpreter and its binary. Platforms like e.g. Palm or Pocket PC are very touchy about this. Embedded devices even more. How much memory footprint would removing the #ifdefs cause on average ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Jul 3 08:55:05 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 03 Jul 2002 09:55:05 +0200 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com> Message-ID: <3D22ADD9.1030901@lemburg.com> Martin v. Loewis wrote: > "M.-A. Lemburg" writes: > > >>Patch level releases should *never* include new features (unless >>these are essential to fix a serious bug or a simple byproduct >>of a fix). I don't know where you got the impression that Python >>should move back to the 1.5 branch development process where patch >>levels added new features. > > >>From discussions on python-dev... > > >>Patch levels are there to stabilize a release, not make it >>more powerful. > > > What precisely does that mean? Mainly that only bugs should be fixed. 
Adding new features doesn't help in fixing bugs since you can't expect that existing code for a particular Python branch will get changed to make use of it. Stabilizing means that code using the existing features in a branch runs more stable, i.e. there are fewer situations where a program can trigger a bug hiding in the Python release. > Specific case in question: xml.dom.minidom.toxml does not support the > specification of an encoding of the resulting XML document. Instead, > if there are non-ASCII characters in the output document, it returns a > Unicode object that starts with u"". People > cannot write this to a file as-is, and they cannot encode it in > anything but UTF-8 (because the document would then be incorrect). > > So I added an optional encoding= argument to .toxml, for 2.3. The > question now is: should that argument also be made available for > 2.2.2? Adding the argument would only help applications which would make use of it. An application written for Python 2.2 couldn't do this since the optional argument wouldn't be available. BTW, the above is trying to fix an application bug rather than a Python one: if the application cannot deal with Unicode, it is not non-ASCII compatible. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Jul 3 09:02:35 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 03 Jul 2002 10:02:35 +0200 Subject: [Python-Dev] Some dull gc stats References: Message-ID: <3D22AF9B.5030104@lemburg.com> Tim Peters wrote: > [MaL, replying to me, but presumably bonding with Martin again ] > >>Patch level releases should *never* include new features (unless >>these are essential to fix a serious bug or a simple byproduct >>of a fix). 
I don't know where you got the impression that Python >>should move back to the 1.5 branch development process where patch >>levels added new features. > > > The pre-PBF Patch Czars generally took a hard "no new features!" stance, but > it seems to be up in the air now. I wonder why... just because Fossetts can't get back to solid ground doesn't mean we have to follow him ;-) >>W/r to the PBF: at EuroPython we did a poll to see which version >>to base the PBF's activities on. The result was that a majority >>voted for Python 2.2 as first target. > > > Cool! Good choice. > > >>Patch levels are there to stabilize a release, not make it >>more powerful. > > > This is one popular view, although there's plenty of wiggle room in what > "stabilize" means (e.g., is it "stabilizing" to port Python to a new > platform? to speed a bottleneck? to add a new encoding? etc). "Stabilize" should mean to make triggering bugs in a Python release less likely. I don't think that porting to a new platform falls under this definition, a new encoding might (but then only if the encoding is so popular that people consider its absence a bug), performance tweaks are probably within range if they are in the micro-optimization area and hidden within the interpreter. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... 
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From gherron@islandtraining.com Wed Jul 3 09:25:37 2002 From: gherron@islandtraining.com (Gary Herron) Date: Wed, 3 Jul 2002 01:25:37 -0700 Subject: [Python-Dev] [development doc updates] In-Reply-To: <20020702222813.8990118EC22@grendel.zope.com> References: <20020702222813.8990118EC22@grendel.zope.com> Message-ID: <200207030125.38106.gherron@islandtraining.com> Fred, Reading the "What's New in Python 2.3" section, I find the following sentence in "5 Extended Slices": Ever since Python 1.4 the slice syntax has supported a third ``Stride'' argument, but the builtin sequence types have not supported this feature (it was initially included at the behest of the developers of the Numerical Python package). This changes with Python 2.3. This is ambiguous. Exactly *HOW* does it change with Python 2.3? Does the stride argument go away, or do builtin sequence types now support the stride argument? If I'd followed this newsgroup more carefully, I'd probably know the answer. The paragraph about PendingDeprecationWarning, which follows the above quote, probably provides a clue, but it seems out of place, having nothing to do with slices. Gary Herron gherron@islandtraining.com From ping@zesty.ca Wed Jul 3 10:33:24 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Wed, 3 Jul 2002 02:33:24 -0700 (PDT) Subject: [Python-Dev] Re: Alternative implementation of string interning In-Reply-To: <20020703070617.GA25449@hishome.net> Message-ID: On Wed, 3 Jul 2002, Oren Tirosh wrote: > On Tue, Jul 02, 2002 at 11:07:14PM -0700, Ka-Ping Yee wrote: > > Oren Tirosh wrote: > > > No two istr instances are equal unless they are > > > identical. I guess PyString_CheckExact would need to be changed to > > > accept either String or InternedString. [...] 
> > Pass an 'istr' into a routine that expects strings, and it would > > appear to be a string right up until someone tried to == it, whereupon > > all hell would break loose. > > I don't understand your assumptions. I just went on what you wrote: "No two istr instances are equal unless they are identical." I read that to mean that == would be implemented with pointer comparison, which would break contracts the way i described. I see now that is not what you meant. It appears that what you are proposing is what interned string comparison already does (since == checks for pointer equality first). So, the only observable effect of the change would be to break all code that tests for type(s) == str. -- ?!ng From oren-py-d@hishome.net Wed Jul 3 10:59:15 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Wed, 3 Jul 2002 05:59:15 -0400 Subject: [Python-Dev] Re: Alternative implementation of string interning In-Reply-To: References: <20020703070617.GA25449@hishome.net> Message-ID: <20020703095915.GA43336@hishome.net> On Wed, Jul 03, 2002 at 02:33:24AM -0700, Ka-Ping Yee wrote: > I just went on what you wrote: "No two istr instances are equal unless > they are identical." I read that to mean that == would be implemented > with pointer comparison, which would break contracts the way i described. > I see now that is not what you meant. If all dutchmen like Monty Python it doesn't mean that anyone who likes Monty Python is a dutchman. > It appears that what you are proposing is what interned string > comparison already does (since == checks for pointer equality first). But INequality checking may still require strcmp. Inverse logic again. > So, the only observable effect of the change would be to break all > code that tests for type(s) == str. Yes, that's certainly a problem. This thought experiment is part of a strange fantasy I have that Python might one day use only interned strings to represent names. 
There are relatively few places where a string may be converted to a name (getattr, hasattr, etc) and these could be interned at the interface if interned strings are not immortal. I expect that nothing will ever come out of this, but it's fun to think about it anyway... Oren From ping@zesty.ca Wed Jul 3 11:14:33 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Wed, 3 Jul 2002 03:14:33 -0700 (PDT) Subject: [Python-Dev] Re: Alternative implementation of string interning In-Reply-To: <20020703095915.GA43336@hishome.net> Message-ID: On Wed, 3 Jul 2002, Oren Tirosh wrote: > > It appears that what you are proposing is what interned string > > comparison already does (since == checks for pointer equality first). > > But INequality checking may still require strcmp. Inverse logic again. I never claimed it wouldn't. All i'm saying is that string comparison already does this: compare pointers, then if not equal, compare strings. > > So, the only observable effect of the change would be to break all > > code that tests for type(s) == str. > > Yes, that's certainly a problem. But you haven't responded to my point. Would there be *any* effect other than breakage? -- ?!ng From oren-py-d@hishome.net Wed Jul 3 12:07:35 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Wed, 3 Jul 2002 07:07:35 -0400 Subject: [Python-Dev] Re: Alternative implementation of string interning In-Reply-To: References: <20020703095915.GA43336@hishome.net> Message-ID: <20020703110735.GA50268@hishome.net> On Wed, Jul 03, 2002 at 03:14:33AM -0700, Ka-Ping Yee wrote: > On Wed, 3 Jul 2002, Oren Tirosh wrote: > > > It appears that what you are proposing is what interned string > > > comparison already does (since == checks for pointer equality first). > > > > But INequality checking may still require strcmp. Inverse logic again. > > I never claimed it wouldn't. All i'm saying is that string comparison > already does this: compare pointers, then if not equal, compare strings. 
> > > > So, the only observable effect of the change would be to break all > > > code that tests for type(s) == str. > > > > Yes, that's certainly a problem. > > But you haven't responded to my point. Would there be *any* effect > other than breakage? The warm fuzzy feeling that you have a real symbol type :-) Just for the record: I am not a LISP zealot. Oren From mwh@python.net Wed Jul 3 13:03:49 2002 From: mwh@python.net (Michael Hudson) Date: 03 Jul 2002 13:03:49 +0100 Subject: [Python-Dev] [development doc updates] In-Reply-To: Gary Herron's message of "Wed, 3 Jul 2002 01:25:37 -0700" References: <20020702222813.8990118EC22@grendel.zope.com> <200207030125.38106.gherron@islandtraining.com> Message-ID: <2m1yalf26y.fsf@starship.python.net> Gary Herron writes: > Fred, > > Reading the "What's New in Python 2.3" section, I find the following > sentence in "5 Extended Slices": > > Ever since Python 1.4 the slice syntax has supported a third > ``Stride'' argument, but the builtin sequence types have not > supported this feature (it was initially included at the behest of > the developers of the Numerical Python package). This changes with > Python 2.3. > > This is ambiguous. Unfinished is closer to the truth. > Exactly *HOW* does it change with Python 2.3? Does the stride > argument go away, No. > or do builtin sequence types now support the > stride argument? Yes. > If I'd followed this newsgroup more carefully, I'd probably know the > answer. The section will be suitably fleshed out by the time of the first 2.3 alpha (I sincerely hope). > The paragraph about PendingDeprecationWarning, which follows the above > quote, probably provides a clue, Nope. > but it seems out of place, having nothing to do with slices. This is because there's a commented out section break in the source. I'll uncomment it. There probably needs to be some editorial work done on the whole document wrt. section ordering, whether things count as sections or subsections, etc. But not by me. 
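A short sketch of what stride support on the built-in sequence types looks like, assuming Python 2.3 or later (the sample values are arbitrary and chosen only for illustration):

```python
digits = list(range(10))

# The third slice argument is the stride: take every second item.
evens = digits[::2]

# A negative stride walks the sequence backwards.
backwards = "python"[::-1]

# Start and stride can be combined; tuples support it as well.
every_third = tuple(digits)[1::3]
```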
Cheers, M. -- Those who have deviant punctuation desires should take care of their own perverted needs. -- Erik Naggum, comp.lang.lisp From fdrake@acm.org Wed Jul 3 13:04:30 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 3 Jul 2002 08:04:30 -0400 Subject: [Python-Dev] [development doc updates] In-Reply-To: <200207030125.38106.gherron@islandtraining.com> References: <20020702222813.8990118EC22@grendel.zope.com> <200207030125.38106.gherron@islandtraining.com> Message-ID: <15650.59470.76955.48172@grendel.zope.com> Gary Herron writes: > Reading the "What's New in Python 2.3" section, I find the following > sentence in "5 Extended Slices": ... > This is ambiguous. Exactly *HOW* does it change with Python 2.3? > Does the stride argument go away, or do builtin sequence types now > support the stride argument? If I'd followed this newsgroup more > carefully, I'd probably know the answer. The built-in types now support stride. Thanks for pointing this ambiguity out; I've changed the explanation in the document so that this is clear. > The paragraph about PendingDeprecationWarning, which follows the above > quote, probably provides a clue, but it seems out of place, having > nothing to do with slices. There was a section heading that was commented out in the document source; I've uncommented the heading. More material will be added to the new section as we have time to complete the material. Thanks! -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry@zope.com Wed Jul 3 14:12:42 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 3 Jul 2002 09:12:42 -0400 Subject: [Python-Dev] Re: Alternative implementation of string interning References: Message-ID: <15650.63562.102139.566843@anthem.wooz.org> >>>>> "TD" == Timothy Delaney writes: TD> The #1 most important consideration here is backwards TD> compatibility IMO. 
Whilst I would be personally unaffected by TD> this change (allowing interned strings to be collected), we've TD> already had examples of people and code that would be. I still think most applications don't care about interned strings, and they really don't care whether they're immortal or not. Long running apps probably do care, but for them, I'd rather see the application writers have to take an explicit action to free the intern strings. Only they are going to know whether they're depending on immortal interns, and when it's "safe" and prudent to reclaim them. -Barry From barry@zope.com Wed Jul 3 14:26:15 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 3 Jul 2002 09:26:15 -0400 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> Message-ID: <15650.64375.162977.160780@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> Adding the argument would only help applications which would MAL> make use of it. An application written for Python 2.2 MAL> couldn't do this since the optional argument wouldn't be MAL> available. Ok, here's another question. When I updated the email package in Python 2.3, Guido wanted me to backport it to Python 2.2.x. I did that once, but there's been a lot of changes since then, both bug fixes, API "fixes", and new functionality. The email package can be installed separately as a distutils package, and it is compatible all the way back to Python 2.1.x. Which means someone /could/ install the latest version in their site-packages and have the new functionality in any of the last 3 versions of Python, although it would be tricky for Python 2.2.1. So does it make sense to backport the latest email package to Python 2.2.2? That's what Guido wanted, and I could argue that doing so improves stability of that branch, because while it adds a lot of new stuff, the old stuff was fairly well broken. E.g. 
you can't properly encode RFC 2047 headers in Python 2.2.1's email package. Backporting allows application writers to fix their code so that it works compatibly and correctly across more versions of Python than if we didn't backport. It also makes no sense to maintain two different code bases (especially now that that's been reduced from 3! :). OTOH, it definitely adds new features. Maybe email is special because it was so new in Python 2.2, and so I took a more naive approach to some issues that a wider use uncovered. it-ain't-always-simple-ly y'rs, -Barry From barry@zope.com Wed Jul 3 14:34:12 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 3 Jul 2002 09:34:12 -0400 Subject: [Python-Dev] Re: Alternative implementation of string interning References: <20020703070617.GA25449@hishome.net> Message-ID: <15650.64852.728915.275239@anthem.wooz.org> >>>>> "KY" == Ka-Ping Yee writes: KY> It appears that what you are proposing is what interned string KY> comparison already does (since == checks for pointer equality KY> first). So, the only observable effect of the change would be KY> to break all code that tests for type(s) == str. Shouldn't those already be written as isinstance(s, str)? Maybe with StringType for str? Even so, I'm not much in favor of adding more string types to the language. I think we should be /collapsing/ string types not proliferating them (i.e. removing the distinction between str and unicode -- Jython seems to get by just fine that way). -Barry From tim@zope.com Wed Jul 3 17:56:17 2002 From: tim@zope.com (Tim Peters) Date: Wed, 3 Jul 2002 12:56:17 -0400 Subject: [Python-Dev] Death to WITH_CYCLE_GC In-Reply-To: <3D22A9B6.1050208@lemburg.com> Message-ID: [M.-A. Lemburg] > Hmm, isn't the idea of having compile time options to give > people a chance to eliminate the feature altogether ? That may be the idea for some symbols; e.g., I suppose HAVE_UNICODE is of that nature, although PythonLabs never tests with that disabled either.
WITH_CYCLE_GC wasn't of that nature. Like pymalloc before it, cyclic gc was *thought* to be such a large change that it would be prudent to leave cyclic gc off for a release, but give adventurous people a symbol they could use to try it. WITH_CYCLE_GC was enabled by default in the first alpha release to get it some exercise. That didn't turn up any significant problems, so we left it on in the next alpha release too. Still no problems, so we left it on for all the alpha releases. Still no problems, so we left it on for all the beta releases. Still no problems, so we concluded "screw this, let's leave it enabled for the final release too". So the purpose for which WITH_CYCLE_GC was introduced went away before anyone had a chance to use it for that purpose. > I'm thinking in terms of memory footprint of the running > interpreter and its binary. Platforms like e.g. Palm > or Pocket PC are very touchy about this. Embedded devices > even more. I don't buy this. I don't work on embedded devices in this incarnation, and from what I've seen the people who do aren't helped at all by people who don't guessing about what they might need. If people on embedded devices need help in the core, they can speak for themselves, and get the help they *really* need. > How much memory footprint would removing the #ifdefs > cause on average ? 6, give or take. From skip@pobox.com Wed Jul 3 17:59:37 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 3 Jul 2002 11:59:37 -0500 Subject: [Python-Dev] Idle thoughts about array objects and xmlrpclib Message-ID: <15651.11641.32411.963918@12-248-8-148.client.attbi.com> I installed the latest version of MySQLdb yesterday and got mildly bitten by a change Andy Dustman made. He began returning BLOB fields as array objects created with a 'c' typecode. Since MySQL doesn't distinguish between TEXT and BLOB fields, I was temporarily unable to pass SQL results back through my XML-RPC interface (I use TEXT, but not BLOB).
Andy and I discussed it and he decided to back out this change to MySQLdb. That got me to thinking. Perhaps xmlrpclib should do the obvious thing with array objects. For all numeric typecodes it should marshal them to lists. For 'c' and 'u' typecodes it's a bit more problematic. Should they be lists or strings (or Unicode strings)? Fredrik, have you considered whether xmlrpclib could or should support array objects? Skip From jeremy@zope.com Wed Jul 3 18:49:28 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Wed, 3 Jul 2002 13:49:28 -0400 Subject: [Python-Dev] Death to WITH_CYCLE_GC In-Reply-To: References: Message-ID: <15651.14632.499800.910362@slothrop.zope.com> >>>>> "MvL" == Martin v Loewis writes: MvL> Tim Peters writes: >> What say ye to nuking the #ifdefs conditionalizing it in the core >> for 2.3? MvL> Good idea. +1 Jeremy From jeremy@zope.com Wed Jul 3 19:17:12 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Wed, 3 Jul 2002 14:17:12 -0400 Subject: [Python-Dev] Re: Alternative implementation of string interning In-Reply-To: <20020703095915.GA43336@hishome.net> References: <20020703070617.GA25449@hishome.net> <20020703095915.GA43336@hishome.net> Message-ID: <15651.16296.855975.476641@slothrop.zope.com> >>>>> "OT" == Oren Tirosh writes: OT> This thought experiment is part of a strange fantasy I have that OT> Python might one day use only interned strings to represent OT> names. There are relatively few places where a string may be OT> converted to a name (getattr, hasattr, etc) and these could be OT> interned at the interface if interned strings are not OT> immortal. I expect that nothing will ever come out of this, but OT> it's fun to think about it anyway... two responses: What do you mean by "represent names"? Code objects already use interned strings for names. Did you have something else in mind? You might have mentioned this thought experiment / strange fantasy at the outset of the thread <0.2 wink>. 
There was a lot of email thrashing on this subject, but none of it appears to have been necessary. Jeremy From fredrik@pythonware.com Wed Jul 3 19:20:39 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 3 Jul 2002 20:20:39 +0200 Subject: [Python-Dev] Death to WITH_CYCLE_GC References: Message-ID: <017f01c222be$5792e310$ced241d5@hagrid> tim wrote: > What say ye to nuking the #ifdefs conditionalizing it in the core for 2.3? +1. go ahead. From martin@v.loewis.de Wed Jul 3 19:10:57 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 03 Jul 2002 20:10:57 +0200 Subject: [Python-Dev] Some dull gc stats In-Reply-To: <3D22AF9B.5030104@lemburg.com> References: <3D22AF9B.5030104@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > I don't think that porting to a new platform falls under > this definition, a new encoding might (but then only if the encoding > is so popular that people consider its absence a bug) Here we go. If "many people consider absence of foo a bug" is enough to allow for a change, I can backport any change if I only find enough people to testify that absence of that change is a bug... Regards, Martin From faassen@vet.uu.nl Wed Jul 3 19:26:57 2002 From: faassen@vet.uu.nl (Martijn Faassen) Date: Wed, 3 Jul 2002 20:26:57 +0200 Subject: [Python-Dev] Death to WITH_CYCLE_GC In-Reply-To: References: Message-ID: <20020703182657.GA20472@vet.uu.nl> Tim Peters wrote: > but since we never test without it the ability to compile it out > isn't much of "a feature". If the problem is you don't have time to test it, what about talking to the Snake Farm people of the Python Business Forum? http://www.lysator.liu.se/~sfarmer/ They may be able and willing to help. Regards, Martijn From mal@lemburg.com Wed Jul 3 19:39:11 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 03 Jul 2002 20:39:11 +0200 Subject: [Python-Dev] Some dull gc stats References: <3D22AF9B.5030104@lemburg.com> Message-ID: <3D2344CF.1@lemburg.com> Martin v. Loewis wrote: > "M.-A.
Lemburg" writes: > > >>I don't think that porting to a new platform falls under >>this definition, a new encoding might (but then only if the encoding >>is so popular that people consider its absence a bug) > > > Here we go. If "many people consider absence of foo a bug" is enough > to allow for a change, I can backport any change if I only find enough > people to testify that absence of that change is a bug... No, I was not talking about a missing foo; the comment was specifically about an encoding. Adding a new encoding would not need applications to be changed since the encoding information is part of the processed data. Anyway, if this confuses too much, simply go for the more restrictive: no new features at all paradigm. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim@zope.com Wed Jul 3 19:39:59 2002 From: tim@zope.com (Tim Peters) Date: Wed, 3 Jul 2002 14:39:59 -0400 Subject: [Python-Dev] Death to WITH_CYCLE_GC In-Reply-To: <20020703182657.GA20472@vet.uu.nl> Message-ID: [Tim Peters] > but since we never test without it the ability to compile it out > isn't much of "a feature". [Martijn Faassen] > If the problem is you don't have time to test it, That's not a problem for me. In context, it was just one more reason why keeping WITH_CYCLE_GC has become a poor idea at best. > what about talking to the Snake Farm people of the Python Business > Forum? > > http://www.lysator.liu.se/~sfarmer/ > > They may be able and willing to help. This would be a good idea for an "optional feature" somebody actually wants. For example, is HAVE_UNICODE actually turned off out in the world? If that possibility is important to the PBF, then they should arrange to test it. We certainly don't. 
We shouldn't be the ones telling the PBF what's important to them, either -- they need to figure that out following their own lights. We don't test any variations beyond debug vs release build, and it appears that the debug build isn't tested much except on Windows (although I expect the debugging memory allocator in 2.3 will suck more Linux developers into running debug builds). From jeremy@zope.com Wed Jul 3 19:46:58 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Wed, 3 Jul 2002 14:46:58 -0400 Subject: [Python-Dev] Death to WITH_CYCLE_GC In-Reply-To: References: <20020703182657.GA20472@vet.uu.nl> Message-ID: <15651.18082.124379.188458@slothrop.zope.com> >>>>> "TP" == Tim Peters writes: TP> We don't test any variations beyond debug vs release build, and TP> it appears that the debug build isn't tested much except on TP> Windows (although I expect the debugging memory allocator in 2.3 TP> will suck more Linux developers into running debug builds). It was good enough to suck me in. What's more, it was so helpful that it motivated me to fix Zope so that it runs under a debug build. Jeremy From barry@zope.com Wed Jul 3 19:48:41 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 3 Jul 2002 14:48:41 -0400 Subject: [Python-Dev] Death to WITH_CYCLE_GC References: <20020703182657.GA20472@vet.uu.nl> Message-ID: <15651.18185.961668.124297@anthem.wooz.org> >>>>> "TP" == Tim Peters writes: TP> We don't test any variations beyond debug vs release build, TP> and it appears that the debug build isn't tested much except TP> on Windows (although I expect the debugging memory allocator TP> in 2.3 will suck more Linux developers into running debug TP> builds). Yep, I typically run Python2.3cvs --with-pydebug. 
-Barry From oren-py-d@hishome.net Wed Jul 3 20:21:34 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Wed, 3 Jul 2002 15:21:34 -0400 Subject: [Python-Dev] Re: Alternative implementation of string interning In-Reply-To: <15651.16296.855975.476641@slothrop.zope.com> References: <20020703070617.GA25449@hishome.net> <20020703095915.GA43336@hishome.net> <15651.16296.855975.476641@slothrop.zope.com> Message-ID: <20020703192134.GA2206@hishome.net> On Wed, Jul 03, 2002 at 02:17:12PM -0400, Jeremy Hylton wrote: > >>>>> "OT" == Oren Tirosh writes: > > OT> This thought experiment is part of a strange fantasy I have that > OT> Python might one day use only interned strings to represent > OT> names. There are relatively few places where a string may be > OT> converted to a name (getattr, hasattr, etc) and these could be > OT> interned at the interface if interned strings are not > OT> immortal. I expect that nothing will ever come out of this, but > OT> it's fun to think about it anyway... > > two responses: > > What do you mean by "represent names"? Code objects already use > interned strings for names. Did you have something else in mind? Not something else - just more of the same. Interned names in co_names tuples are a good start but there are tons of places where literal C-strings are used such as in descriptors. These names are converted to temporary Python strings on demand. My humble goal is for any name that has a predefined meaning in Python to appear exactly once in the executable and that instance will be in the form of a static preinitialized Python string object, not a C string literal. Here's how it might work: to use the name 'foo' you just refer to the C name PYSYMfoo. During build a helper program scans all C sources for names starting with PYSYM and automatically generates a .c file where each of these names appears once as a pre-initialized string object and an .h file included by Python.h. On startup all these string objects are interned, of course. 
So any name used from C is resolved by the linker to point to the interned single instance. Any name appearing unquoted in Python code is interned when it's compiled or loaded from the .pyc file. There are some cases where a string becomes a name such as the arguments to functions like getattr and hasattr. These would need to be interned before reaching the 100% interned core of the language. I guess this could be done by a new PyArg_ParseTuple format char. This obviously requires interned strings to be non-immortal. For example: if (strcmp(sname, "__class__") == 0) becomes if (sname == PYSYM__class__) This is a pretty trivial example but I have other ideas for optimizations and cleanups that this would enable. These might lead to significant improvements in code size and performance. Well, that's my fantasy. There are still some "minor" problems like totally breaking the C API. Oren From tim@zope.com Wed Jul 3 21:22:36 2002 From: tim@zope.com (Tim Peters) Date: Wed, 3 Jul 2002 16:22:36 -0400 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) In-Reply-To: <15650.64375.162977.160780@anthem.wooz.org> Message-ID: [people ask assorted hypothetical questions about backporting] Note that if the PBF is a success (and I sure hope that it is!), backporting stuff to the py-in-a-tie release line is supposed to become its job, not Python-Dev's. They'll backport whatever they see fit, and it won't matter what even Guido thinks then. In the meantime, I suggest *we* stick to backporting unarguable bugfixes. How can you tell whether something is unarguable? If in doubt, backport it, and if someone complains, tell them to revert it <0.8 wink>. From mal@lemburg.com Wed Jul 3 21:36:27 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 03 Jul 2002 22:36:27 +0200 Subject: [Python-Dev] Death to WITH_CYCLE_GC References: Message-ID: <3D23604B.4080408@lemburg.com> Tim Peters wrote: > [M.-A.
Lemburg] > >>Hmm, isn't the idea of having compile time options to give >>people a chance to eliminate the feature altogether ? > > > That may be the idea for some symbols; e.g., I suppose HAVE_UNICODE is of that > nature, although PythonLabs never tests with that disabled either. > > WITH_CYCLE_GC wasn't of that nature. Like pymalloc before it, cyclic gc was > *thought* to be such a large change that it would be prudent to leave cyclic > gc off for a release, but give adventurous people a symbol they could use to > try it. WITH_CYCLE_GC was enabled by default in the first alpha release to > get it some exercise. That didn't turn up any significant problems, so we > left it on in the next alpha release too. Still no problems, so we left it > on for all the alpha releases. Still no problems, so we left it on for all > the beta releases. Still no problems, so we concluded "screw this, let's > leave it enabled for the final release too". So the purpose for which > WITH_CYCLE_GC was introduced went away before anyone had a chance to use it > for that purpose. Fine. >>I'm thinking in terms of memory footprint of the running >>interpreter and its binary. Platforms like e.g. Palm >>or Pocket PC are very touchy about this. Embedded devices >>even more. > > > I don't buy this. I don't work on embedded devices in this incarnation, and > from what I've seen the people who do aren't helped at all by people who > don't guessing about what they might need. If people on embedded devices > need help in the core, they can speak for themselves, and get the help they > *really* need. Then why do we have a switch to optionally remove the Unicode support ? or for disabling interning of strings ? or for caching small integers ? >>How much memory footprint would removing the #ifdefs >>cause on average ? > > > 6, give or take. 6 what ? snakes, rabbits, swallows ?
I'm missing a concise concept here :-) If you want to make life hard for people who want to customize the interpreter, then you should remove *all* such #ifdefs. If not, then having the #ifdefs adds important meta-information to the code. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim@zope.com Wed Jul 3 22:05:18 2002 From: tim@zope.com (Tim Peters) Date: Wed, 3 Jul 2002 17:05:18 -0400 Subject: [Python-Dev] Death to WITH_CYCLE_GC In-Reply-To: <3D23604B.4080408@lemburg.com> Message-ID: [MAL] > Then why do we have a switch to optionally remove the Unicode > support ? I don't know, although I've asked that question myself. People used to be frightened of the Unicode database sizes; I'm not sure they are anymore. > or for disabling interning of strings ? There is no such switch now. There used to be one. Ditto for whether to cache hash values. Ditto whether to "cache-align" hash table entries. At the time those were nuked, Guido also said he wanted COUNT_ALLOCS to disappear (and to act as if it were always #define'd in a Py_TRACE_REFS build), but nobody has gotten to that yet. > or for caching small integers ? There isn't a switch for that either, although there are two undocumented symbols you can #define such that if their sum is <= 0, small ints waste *more* memory than if you leave the code alone. There's no way to disable the unbounded and immortal int free list, and never was. >>> How much memory footprint would removing the #ifdefs >>> cause on average ? >> 6, give or take. > 6 what ? snakes, rabbits, swallows ? 
You asked an unanswerable (not to mention unparseable) question, I gave a useless yet accurate answer -- if you can rephrase your question in a way that can be answered, attach whatever units you need to make 6 exactly correct . Although note that since WITH_CYCLE_GC has been #define'd by default since it was introduced, removing its #ifdefery would have no effect on default builds. > I'm missing a concise concept here :-) > > If you want to make life hard for people who want to customize > the interpreter, then you should remove *all* such #ifdefs. If > not, then having the #ifdefs adds important meta-information > to the code. If you don't personally use a specific preprocessor symbol routinely, I won't accept your bare assertion that it makes life easier for anyone. Against that, every preprocessor symbol certainly makes it-- a little to a lot --harder to maintain the code. We almost never hear from anyone that these little nightmares are being used; when we do hear about them, it's almost always from a dabbler who "just tried it" and then complains because Python no longer works (from won't compile to segfaults). Fixing unused code is a waste of time; I won't do it anymore, but I will devote time to getting rid of unused code. From neal@metaslash.com Wed Jul 3 22:15:51 2002 From: neal@metaslash.com (Neal Norwitz) Date: Wed, 03 Jul 2002 17:15:51 -0400 Subject: [Python-Dev] Death to WITH_CYCLE_GC References: Message-ID: <3D236987.B9BB266E@metaslash.com> Tim Peters wrote: > Fixing unused code is a waste of time; I won't do it anymore, > but I will devote time to getting rid of unused code. Amen. From tim.one@comcast.net Thu Jul 4 07:40:41 2002 From: tim.one@comcast.net (Tim Peters) Date: Thu, 04 Jul 2002 02:40:41 -0400 Subject: [Python-Dev] Re: Alternative implementation of string interning In-Reply-To: <15650.63562.102139.566843@anthem.wooz.org> Message-ID: [Timothy Delaney] > The #1 most important consideration here is backwards > compatibility IMO. 
Whilst I would be personally unaffected by > this change (allowing interned strings to be collected), we've > already had examples of people and code that would be. Have we? I posted an example I made up -- I've written and seen code *close* to that, but not close enough to actually break if interned strings were to get collected. I also saw Jack's interned-string refcount abuse in an isolated part of the core Mac support code, but breaking core code never counts because we have 100% control over the core (if interned strings were to get collected, we'd fiddle the Mac code for the same release, and nobody would be the wiser). I don't recall hearing about anything else here, and I don't know of anything else. Any subsystem that can waste an unbounded amount of memory is a potential cause of user headaches. I don't like immortal interned strings, and I don't like the unbounded int or float free lists either. It's also not good that pymalloc never returns arenas to the system, although at least that was carefully designed so that arenas not in use can become and stay paged out (e.g., it doesn't periodically "tickle" them as part of general bookkeeping -- when they're unused by the user, they're also untouched by pymalloc). So far, I don't know of any real loss that would occur as a result of reclaiming unreferenced interned strings. From mal@lemburg.com Thu Jul 4 10:05:54 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 04 Jul 2002 11:05:54 +0200 Subject: [Python-Dev] Incompatible changes to xmlrpclib Message-ID: <3D240FF2.3060708@lemburg.com> I noticed yesterday that the xmlrpclib.py version in CVS is incompatible with the version in Python 2.2: all the .dump_XXX() interfaces changed and now include a third argument. Since the Marshaller can be subclassed, this breaks all existing application space subclasses extending or changing the default xmlrpclib behaviour. I'd opt for moving back to the previous style of calling the write method via self.write.
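A minimal sketch of the kind of breakage described in the xmlrpclib message above. It does not use the real xmlrpclib code: OldMarshaller, NewMarshaller, HexInts, AppOld and AppNew are hypothetical stand-ins, and the only assumption taken from the message is that the dump methods gained an extra argument (here called write) after self and the value.

```python
class OldMarshaller:
    """Hypothetical stand-in for the old style: dump_XXX(self, value)."""

    def __init__(self):
        self.out = []

    def write(self, text):
        self.out.append(text)

    def dump(self, value):
        # Dispatch on the type name, e.g. dump_int for an int.
        getattr(self, "dump_" + type(value).__name__)(value)

    def dump_int(self, value):
        self.write("<int>%d</int>" % value)


class NewMarshaller(OldMarshaller):
    """Hypothetical stand-in for the new style: dump_XXX(self, value, write)."""

    def dump(self, value):
        # The dispatcher now passes the write callable as an extra argument.
        getattr(self, "dump_" + type(value).__name__)(value, self.write)

    def dump_int(self, value, write):
        write("<int>%d</int>" % value)


class HexInts:
    """Application-side override written against the old signature."""

    def dump_int(self, value):
        self.write("<hex>%x</hex>" % value)


class AppOld(HexInts, OldMarshaller):
    pass


class AppNew(HexInts, NewMarshaller):
    pass


# Against the old base class the override works as its author intended.
old = AppOld()
old.dump(255)

# Against the new base class the same override receives one argument too many.
new = AppNew()
broke = False
try:
    new.dump(255)
except TypeError:
    broke = True
```

Calling the write method through self.write, as the message proposes, would keep the one-value signature, so an override like HexInts would keep working unchanged.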
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Thu Jul 4 11:54:27 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 04 Jul 2002 12:54:27 +0200 Subject: [Python-Dev] Re: Alternative implementation of string interning References: Message-ID: <3D242963.4000204@lemburg.com> Tim Peters wrote: > So far, I don't know of any real loss that would occur as a result of > reclaiming unreferenced interned strings. Has anybody ever checked how many such strings live in the intern dict with ref count 1 in real life apps ? E.g. say you have Zope running on a standard web-site for 2 days -- how many such strings do you find in the interned dict ? Speaking for myself, I would have a problem with removing automatic interning of constant strings in Python source code since I rely on that "feature" for fast switching on values (if..elif..elif.......else). Since code objects usually don't go away while the interpreter is running, these would not be affected by the proposed strategy. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Thu Jul 4 12:38:33 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 04 Jul 2002 13:38:33 +0200 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> Message-ID: <3D2433B9.9080102@lemburg.com> Barry A. 
Warsaw wrote: >>>>>>"MAL" == M writes: >>>>> > > MAL> Adding the argument would only help applications which would > MAL> make use of it. An application written for Python 2.2 > MAL> couldn't do this since the optional argument wouldn't be > MAL> available. > > Ok, here's another question. When I updated the email package in > Python 2.3, Guido wanted me to backport it to Python 2.2.x. I did > that once, but there's been a lot of changes since then, both bug > fixes, API "fixes", and new functionality. > > The email package can be installed separately as a distutils package, > and it is compatible all the way back to Python 2.1.x. Which means > someone /could/ install the latest version in their site-packages and > have the new functionality in any of the last 3 versions of Python, > although it would be tricky for Python 2.2.1. > > So does it make sense to backport the latest email package to Python > 2.2.2? That's what Guido wanted, and I could argue that doing so > improves stability of that branch, because while it adds a lot of new > stuff, the old stuff was fairly well broken. E.g. you can't properly > encode RFC 2047 headers in Python 2.2.1's email package. Backporting > allows application writers to fix their code so that it works > compatibly and correctly across more versions of Python than if we > didn't backport. It also makes no sense to maintain two different > code bases (especially now that that's been reduced from 3! :). > > OTOH, it definitely adds new features. Maybe email is special because > it was so new in Python 2.2, and so I took a more naive approach to > some issues that a wider use uncovered. > > it-ain't-always-simple-ly y'rs, Never said it was... :-) For cases like the email package or distutils, I think it's perfectly OK to only provide the updates for older Python releases as separate download. Both have their own way of life, so IMHO this is acceptable. 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Thu Jul 4 12:34:07 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 04 Jul 2002 13:34:07 +0200 Subject: [Python-Dev] Death to WITH_CYCLE_GC References: Message-ID: <3D2432AF.8010000@lemburg.com> Tim Peters wrote: > [MAL] > >>Then why do we have a switch to optionally remove the Unicode >>support ? > > > I don't know, although I've asked that question myself. People used to be > frightened of the Unicode database sizes; I'm not sure they are anymore. > > >>or for disabling interning of strings ? > > > There is no such switch now. There used to be one. Ditto for whether to > cache hash values. Ditto whether to "cache-align" hash table entries. At > the time those were nuked, Guido also said he wanted COUNT_ALLOCS to > disappear (and to act as if it were always #define'd in a Py_TRACE_REFS > build), but nobody has gotten to that yet. Interesting. I don't recall any discussions about this... >>or for caching small integers ? > > > There isn't a switch for that either, although there are two undocumented > symbols you can #define such that if their sum is <= 0, small ints waste > *more* memory than if you leave the code alone. There's no way to disable > the unbounded and immortal int free list, and never was. I was talking about NSMALLNEGINTS and NSMALLPOSINTS. >>>>How much memory footprint would removing the #ifdefs >>>>cause on average ? >>> > >>>6, give or take. >> > >>6 what ? snakes, rabbits, swallows ? > > > You asked an unanswerable (not to mention unparseable) question, I gave a > useless yet accurate answer -- if you can rephrase your question in a way > that can be answered, attach whatever units you need to make 6 exactly > correct . 
Although note that since WITH_CYCLE_GC has been #define'd > by default since it was introduced, removing its #ifdefery would have no > effect on default builds. Ok, let's make it parseable then: a) When removing the GC code from the code base by #undef'ing WITH_CYCLE_GC, how much smaller is the Python interpreter ? b) ..., how is pybench affected by this (speedup/slowdown/unnoticeable) ? c) ..., how many bytes per object are saved for container objects which are GC aware ? If we're talking about just a few kB in interpreter size and only a few kB worth of lists and tuples, then removing is fine. If we're talking about 100kBs, then you ought to reconsider the move. >>I'm missing a concise concept here :-) >> >>If you want to make life hard for people who want to customize >>the interpreter, then you should remove *all* such #ifdefs. If >>not, then having the #ifdefs adds important meta-information >>to the code. > > > If you don't personally use a specific preprocessor symbol routinely, I > won't accept your bare assertion that it makes life easier for anyone. I personally know that developers who have tried to create a trimmed down version of the interpreter did like the #ifdefs for removing certain parts like e.g. the complex numbers very much. I'm just lobbying for them. After all, someone has to give you a hard time ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mwh@python.net Thu Jul 4 13:18:00 2002 From: mwh@python.net (Michael Hudson) Date: 04 Jul 2002 13:18:00 +0100 Subject: [Python-Dev] Death to WITH_CYCLE_GC In-Reply-To: "Tim Peters"'s message of "Wed, 3 Jul 2002 14:39:59 -0400" References: Message-ID: <2m65zvlm9z.fsf@starship.python.net> "Tim Peters" writes: > [Tim Peters] > > but since we never test without it the ability to compile it out > > isn't much of "a feature". > > [Martijn Faassen] > > If the problem is you don't have time to test it, > > That's not a problem for me. In context, it was just one more reason why > keeping WITH_CYCLE_GC has become a poor idea at best. > > > what about talking to the Snake Farm people of the Python Business > > Forum? > > > > http://www.lysator.liu.se/~sfarmer/ > > > > They may be able and willing to help. > > This would be a good idea for an "optional feature" somebody actually wants. > For example, is HAVE_UNICODE actually turned off out in the world? I build 3 debug builds (ucs2, ucs4 and without unicode) and run the test suites every night on linux/x86. test_unicode still fails in ucs4 builds... Cheers, M. -- For every complex problem, there is a solution that is simple, neat, and wrong. -- H. L. Mencken From martin@v.loewis.de Thu Jul 4 21:17:00 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 04 Jul 2002 22:17:00 +0200 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) In-Reply-To: <3D2433B9.9080102@lemburg.com> References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > For cases like the email package or distutils, I think it's > perfectly OK to only provide the updates for older Python > releases as separate download. Both have their own way of > life, so IMHO this is acceptable. 
In neither case, this is really possible: Once you have the package in the Python core, a separate installation in site-packages cannot override the core implementation. I believe that was the motivation for Barry to consider backporting large amounts of changes. The same holds for distutils, except that there aren't that many major changes. Regards, Martin From aleax@aleax.it Fri Jul 5 06:30:16 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 5 Jul 2002 07:30:16 +0200 Subject: [Python-Dev] Some dull gc stats In-Reply-To: References: Message-ID: <02070507301605.27343@arthur> On Tuesday 02 July 2002 21:06, Tim Peters wrote: > [martin@v.loewis.de] > > > I understand that it is not a requirement anymore that changes to > > Python 2.2 are "pure bugfixes". Instead, people expect that Python 2.2 > > evolves and continues to grow new features, as long as they are > > "strictly backwards compatible". > > Alex made a case here for "new features", but the Python Business Forum > hasn't shown interest in that. As a Python Business Forum member and board member, I think I can state that if a (business) case is indeed made, the PBF interest is there. > Like most businessfolk, I expect they'll > ignore such issues until someone discovers that the lack of a new feature > is putting them out of business <0.8 wink>. I suspect instead that a businessperson clever enough to pick Python rather than heavily-hyped Java or widely-popular Perl or PHP is most likely to be an unusually clever businessperson, with some level of perception of what programming productivity is worth and how to get it. > > For any user-visible feature, it is normally debatable whether it is > > "strictly backwards compatible", since it is, by nature, a change in > > observable behaviour. > > > > This specific case is not in that category (i.e. has no > > user-observable behaviour change), so I think it qualifies for 2.2 - > > provided there is enough trust in its correctness. 
> > The "bugfix part" of these changes certainly had user-visible aspects, in > that before it was possible for objects in older generations to get > yanked back into younger generations. This can affect when objects get > collected, and so throw off over-tuned programs slinging gc.enable() and > disable() "at exactly the best time(s)". Performance change is not quite the same thing as behavior change. I agree with Martin that, assuming a performance-oriented change is 'known' to be correct (no change in the inputs-to-outputs behavior of programs), the criterion should be one of overall benefit rather than one of Pareto optima. > > I'm concerned that backporting more changes to Python 2.2 will become > > difficult in that area, if the GC implementations vary significantly. > > Maintaining multiple branches is always a PITA. Yes, but the degree of pain varies with the branches' separation. > > Maybe this can be reconsidered when there actually is another change > > to backport. > > Anyone who is so inclined is welcome to reconsider it non-stop . I suspect we'll indeed reconsider it. Whether we do something about it after the reconsideration will depend on cost-benefit analysis... Alex From aleax@aleax.it Fri Jul 5 07:08:16 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 5 Jul 2002 08:08:16 +0200 Subject: [Python-Dev] List comprehensions In-Reply-To: References: Message-ID: <02070508081606.27343@arthur> On Thursday 27 June 2002 01:29, Tim Peters wrote: > [Gerald S. Williams, on listcomp (non)scopes] > > > No problem. As long as it was decided that there's a use for > > the current behavior, I won't question it. > > I'm not sure there's a use for it, but I am sure I'd shoot any coworker > who found one and relied on it . The real (and no doubt PSU-intended) use of list comprehensions is, of course, to finesse Python's _apparent_ lack of assignment-in-expression.
Instead of coding a vulgar, typo-prone, hoi-polloi oriented: while x = bluh(): whatever(x) you get to code an elegant, refined, hoi-oligoi reserved: while [ x for x in [bluh()] if x ] : whatever(x) (I'm told Beretta makes excellent small arms, should you need one...). Alex From gerhard.haering@gmx.de Fri Jul 5 07:47:09 2002 From: gerhard.haering@gmx.de (Gerhard Häring) Date: Fri, 5 Jul 2002 08:47:09 +0200 (Central Europe Daylight Time) Subject: [Python-Dev] Re[3]: Patch level versions and new features (Was: Some dull gc stats) In-Reply-To: References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> Message-ID: <20020705064452.A05DA3F1@gargamel.hqd-internal> "Martin v. Loewis" wrote: > "M.-A. Lemburg" writes: > > > For cases like the email package or distutils, I think it's > > perfectly OK to only provide the updates for older Python > > releases as separate download. Both have their own way of > > life, so IMHO this is acceptable. > > In neither case, this is really possible: Once you have the package in > the Python core, a separate installation in site-packages cannot > override the core implementation. This might sound clueless, but wouldn't it be a good idea to change that? So that site-packages comes before Lib/ in sys.path? Gerhard -- mail: gerhard.haering@gmx.de registered Linux user #64239 web: http://www.cs.fhm.edu/~ifw00065/ OpenPGP public key id 86AB43C0 public key fingerprint: DEC1 1D02 5743 1159 CD20 A4B6 7B22 6575 86AB 43C0 reduce(lambda x,y:x+y,map(lambda x:chr(ord(x)^42),tuple('zS^BED\nX_FOY\x0b'))) From mal@lemburg.com Fri Jul 5 09:45:36 2002 From: mal@lemburg.com (M.-A.
Lemburg) Date: Fri, 05 Jul 2002 10:45:36 +0200 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> Message-ID: <3D255CB0.9080502@lemburg.com> Martin v. Loewis wrote: > "M.-A. Lemburg" writes: > > >>For cases like the email package or distutils, I think it's >>perfectly OK to only provide the updates for older Python >>releases as separate download. Both have their own way of >>life, so IMHO this is acceptable. > > > In neither case, this is really possible: Once you have the package in > the Python core, a separate installation in site-packages cannot > override the core implementation. True, but it is easily possible to install those packages in a directory which is scanned before the standard lib, thus overriding the distribution versions: python setup.py install install-lib=~/lib > I believe that was the motivation for Barry to consider backporting > large amounts of changes. The same holds for distutils, except that > there aren't that many major changes. If that's the case, then we probably ought to make it easier for user-installed Python add-ons to override builtin packages. This would help to get rid of the hacks which the PyXML distribution has to use in order to achieve the same. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Fri Jul 5 17:03:30 2002 From: martin@v.loewis.de (Martin v.
Loewis) Date: 05 Jul 2002 18:03:30 +0200 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) In-Reply-To: <3D255CB0.9080502@lemburg.com> References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <3D255CB0.9080502@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > True, but it is easily possible to install those packages > in a directory which is scanned before the standard lib, thus > overriding the distribution versions: > > python setup.py install install-lib=~/lib Why is ~/lib scanned before the standard lib? Regards, Martin From martin@v.loewis.de Fri Jul 5 17:01:37 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 05 Jul 2002 18:01:37 +0200 Subject: [Python-Dev] Re[3]: Patch level versions and new features (Was: Some dull gc stats) In-Reply-To: <20020705064452.A05DA3F1@gargamel.hqd-internal> References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <20020705064452.A05DA3F1@gargamel.hqd-internal> Message-ID: Gerhard Häring writes: > This might sound clueless, but wouldn't it be a good idea to change that? > So that site-packages comes before Lib/ in sys.path? No, this is by design, to prevent people from overriding the standard library. Essentially, all module names in the standard library are reserved; this procedure enforces that (somewhat, you can always insert things in the beginning of sys.path). Regards, Martin From mal@lemburg.com Fri Jul 5 18:17:47 2002 From: mal@lemburg.com (M.-A.
Lemburg) Date: Fri, 05 Jul 2002 19:17:47 +0200 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <3D255CB0.9080502@lemburg.com> Message-ID: <3D25D4BB.6070407@lemburg.com> Martin v. Loewis wrote: > "M.-A. Lemburg" writes: > > >>True, but it is easily possible to install those packages >>in a directory which is scanned before the standard lib, thus >>overriding the distribution versions: >> >>python setup.py install install-lib=~/lib > > > Why is ~/lib scanned before the standard lib? Because I have it defined in PYTHONPATH :-) As I said, perhaps we need to make it easier to override std lib packages... -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Fri Jul 5 18:19:03 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 05 Jul 2002 19:19:03 +0200 Subject: [Python-Dev] Re[3]: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <20020705064452.A05DA3F1@gargamel.hqd-internal> Message-ID: <3D25D507.50805@lemburg.com> Martin v. Loewis wrote: > Gerhard Häring writes: > > >>This might sound clueless, but wouldn't it be a good idea to change that? >>So that site-packages comes before Lib/ in sys.path? > > > No, this is by design, to prevent people from overriding the standard > library. Essentially, all module names in the standard library are > reserved; this procedure enforces that (somewhat, you can always > insert things in the beginning of sys.path).
Uhm, just for the record: all paths defined in PYTHONPATH are inserted before the std lib dirs in sys.path on startup, so the "restriction" is not really all that restrictive. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim.one@comcast.net Fri Jul 5 18:48:11 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 05 Jul 2002 13:48:11 -0400 Subject: [Python-Dev] List comprehensions In-Reply-To: <02070508081606.27343@arthur> Message-ID: [Alex Martelli] > The real (and no doubt PSU-intended) use of list comprehensions is, of > course, to finesse Python's _apparent_ lack of assignment-in-expression. > > Instead of coding a vulgar, typo-prone, hoi-polloi oriented: > while x = bluh(): > whatever(x) > you get to code an elegant, refined, hoi-oligoi reserved: > while [ x for x in [bluh()] if x ] : > whatever(x) Excellent! Guido uses while [x for x in [bluh()]][0]: whatever(x) because he thinks it's "more elegant" (whatever that means to a Dutch guy), but either way it's major relief from the obscurity of embedded assignment. > (I'm told Beretta makes excellent small arms, should you need one...). Thanks for the suggestion! Some Americans consider it rude to shoot coworkers, so I'm always looking for ways to get across to them that it's more a matter of defending good taste than of killing people. Using a piece with sleek Italian design should go a long way toward helping to make this point. From tismer@tismer.com Sat Jul 6 00:01:43 2002 From: tismer@tismer.com (Christian Tismer) Date: Fri, 05 Jul 2002 23:01:43 +0000 Subject: [Python-Dev] GC bug with __slots__ ?
Message-ID: <3D262557.4000502@tismer.com> Hi Guido, I haven't been able to search lists since my laptop was stolen, so maybe this is a known issue: When I create a cyclic reference in a class with slots, it will not be detected by gc.

import gc

#This one works fine:
class a(int): pass
x=a(7)
x.x=x
del x
gc.collect() # frees cycle

#This one doesn't:
class a(int): __slots__=["x"]
x=a(7)
x.x=x
del x
gc.collect() # does not free the cycle

ciao - chris (greetings from iceland) [yes there is no .sig, was stolen, too :-]
From tim.one@comcast.net Sat Jul 6 00:09:03 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 05 Jul 2002 19:09:03 -0400 Subject: [Python-Dev] GC bug with __slots__ ? In-Reply-To: <3D262557.4000502@tismer.com> Message-ID: This is already fixed in CVS Python. From CVS NEWS: - Classes using __slots__ are now properly garbage collected. [SF bug 519621] I suspect your laptop may be hiding in the repository too! > -----Original Message----- > From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On > Behalf Of Christian Tismer > Sent: Friday, July 05, 2002 7:02 PM > Cc: python-dev@python.org > Subject: [Python-Dev] GC bug with __slots__ ? > > > Hi Guido, > > I haven't been able to search lists since > my laptop was stolen, so maybe this is a known issue: > > When I create a cyclic reference in a class with > slots, it will not be detected by gc.
> > #This one works fine: > > class a(int): pass > x=a(7) > x.x=x > del x > gc.collect # frees cycle > > #This one doesn't: > > class a(int): __slots__=["x"] > x=a(7) > x.x=x > del x > gc.collect # frees cycle > > ciao - chris (greetings from iceland) > > [yes there is no .sig, was stolen, too :-] > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From tim.one@comcast.net Sat Jul 6 00:21:12 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 05 Jul 2002 19:21:12 -0400 Subject: [Python-Dev] Re: Alternative implementation of string interning In-Reply-To: <3D242963.4000204@lemburg.com> Message-ID: [M.-A. Lemburg] > Has anybody ever checked how many such strings live in the > intern dict with ref count 1 in real life apps ? > > E.g. say you have Zope running on a standard web-site > for 2 days -- how many such strings do you find in the > interned dict ? I don't know. Jim Fulton has raised it as a Zope issue in the past, and my recollection is that each time this comes up we go through a dance like: OK, we'll turn off interning in that path. ... Oops! It looks like we already did! ... Oops! I guess we didn't on *that* path. ... *Which* paths does Zope use again? ... Ah, OK, no, we already turned off interning in those paths. ... Or at least we did in Python version i.j.k. *Which* versions are we worried about again? ... Does anyone remember which paths we're worried about? ... It fizzles out then due to terminal boredom . > Speaking for myself, I would have a problem with removing > automatic interning of constant strings in Python source > code I don't believe anyone has suggested doing so. Note that we don't automatically intern all constant strings in Python source, we only intern constant strings that "look like" identifiers. This is from fear of the immortality of interned strings. 
> since I rely on that "feature" for fast switching > on values (if..elif..elif.......else). Since code objects > usually don't go away while the interpreter is running, > these would not be affected by the proposed strategy. From tim.one@comcast.net Sat Jul 6 00:39:53 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 05 Jul 2002 19:39:53 -0400 Subject: [Python-Dev] Death to WITH_CYCLE_GC In-Reply-To: <3D2432AF.8010000@lemburg.com> Message-ID: [Tim] >> There is no such switch now. There used to be one. ... [MAL] > Interesting. I don't recall any discussions about this... It was a topic on Python-Dev at the time, very brief, possibly consisting of no more than a "how about this?" msg, and a "yes, and that too" reply from Guido. >> ... although there are two undocumented symbols you can #define such >> that if their sum is <= 0, small ints waste *more* memory than if you >> leave the code alone. > I was talking about NSMALLNEGINTS and NSMALLPOSINTS. Me too. They're not there to turn off caching, they're there to tune it. If you #define them both to 0, you will, as I said, end up using more memory, not less. If you #undef them, it will have no effect -- the code will #define them back to their defaults then. > Ok, let's make it parseable then: > > a) When removing the GC code from the code base by #undef'ing > WITH_CYCLE_GC, how much smaller is the Python interpreter ? I don't know. > b) ..., how is pybench affected by this (speedup/slowdown/ > unnoticable) ? Ditto. > c) ..., how many bytes per object are saved for container objects > which are GC aware ? That's platform-dependent. It's 16 bytes on Win32 using MSVC 6. > If we're talking about just a few kB in interpreter size > and only a few kB worth of list and tuples, then removing > is fine. If we're talking about 100kBs, then you ought to > reconsider the move. Why? 
Python-Dev is for Python developers, and if nobody here *uses* the non-feature of being able to compile out cyclic gc, and the hypothetical people who do use it aren't serious enough about Python to participate here, there's no paying audience for this continued unused complexity. If somebody wants it, they can step up and volunteer to (a) maintain this code, and (b) test it. Short of those two happening, it's history. >> If you don't personally use a specific preprocessor symbol routinely, I >> won't accept your bare assertion that it makes life easier for anyone. > I personally know that developers which have tried to create > a trimmed down version of the interpreter did like the #ifdefs > for removing certain parts like e.g. the complex numbers very > much. I'm just lobbying for them. They can lobby for themselves, provided they still exist. BTW, WITHOUT_COMPLEX is the only preprocessor symbol I can think of that was deliberately intended to make life easier on small platforms (HAVE_UNICODE may or may not be in that boat -- I don't know why it's there). Given the comparatively trivial savings WITHOUT_COMPLEX affords, it hardly seems worth the bother. People who have written up the results of serious Python ports to tiny platforms report needing *major* surgery, far beyond anything these goofy #ifdefs provide (tiny platforms are, from a std C plus std POSIX view, deeply broken in many ways). The best such effort I knew of used to live here http://www.abo.fi/~iporres/python but that link is dead now, and a Google search doesn't suggest the project has moved somewhere else. > After all, someone has to give you a hard time ;-) Very true, and I thank you for playing along . 
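For readers following the gc threads above, here is a minimal sketch of how one might check that a reference cycle routed through a `__slots__` attribute is reclaimed by the cyclic collector. It is written for present-day Python 3, not for the 2.2/2.3 interpreters under discussion, and the names `Node` and `cycle_is_collected` are illustrative only. A plain class is used because current CPython rejects non-empty `__slots__` on `int` subclasses.

```python
import gc
import weakref


class Node:
    # "__weakref__" is listed so that instances can be observed via weakref.
    __slots__ = ("other", "__weakref__")


def cycle_is_collected():
    """Build a self-referential cycle through a slot and see if gc frees it."""
    node = Node()
    node.other = node            # the cycle lives in a slot, not in a __dict__
    probe = weakref.ref(node)    # lets us see whether the object still exists
    del node                     # the cycle keeps the refcount above zero
    gc.collect()                 # only the cyclic collector can reclaim it
    return probe() is None
```

Calling `cycle_is_collected()` should return `True` on an interpreter whose collector traverses slots, which is the behaviour the CVS NEWS entry quoted earlier describes.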
From pinard@iro.umontreal.ca Sat Jul 6 17:41:40 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 06 Jul 2002 12:41:40 -0400 Subject: [Python-Dev] Re: Priority queue (binary heap) python code In-Reply-To: <20020628134254.GA14414@panix.com> References: <036101c21e68$8abed730$6501a8c0@boostconsulting.com> <20020628134254.GA14414@panix.com> Message-ID: [Aahz] > Thank Fredrik for a brilliant job of re-implementing Perl's regex syntax > into something that I assume is maintainable (haven't looked at the code > myself) *and* Unicode compliant. Yes, this seems to be a very good thing for Python. Speedy regexp engines are notoriously hard to maintain cleanly, at least, so told me a few successive maintainers of GNU regexp. Difficult points are deterministic matching (avoiding backtracking) and POSIX compliance, and the longest match criterion in particular. For one, I'm pretty happy with Python regexp implementation, even if it avoids the above points. It has other virtues that are well worth the trade, at least from the experience I have of it so far! So, in a word, thanks too! :-) -- François Pinard http://www.iro.umontreal.ca/~pinard From pinard@iro.umontreal.ca Sat Jul 6 17:56:06 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 06 Jul 2002 12:56:06 -0400 Subject: [Python-Dev] Re: Priority queue (binary heap) python code In-Reply-To: <20020624213318.A5740@arizona.localdomain> References: <20020624213318.A5740@arizona.localdomain> Message-ID: --=-=-= [Kevin O'Connor] > I often find myself needing priority queues in python, and I've finally > broken down and written a simple implementation. [...] Any chance > something like this could make it into the standard python library? Two years ago, I (too!) wrote one (appended below) and I offered it to Guido. He replied he was not feeling like adding into the Python standard library each and every interesting algorithm on this earth. So I did not insist :-). 
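As a companion to the attachment that follows, here is a small sketch of the same heap invariant using the `heapq` module found in the current standard library. That module is an assumption about the reader's environment and is not part of this message; `is_heap` and `drain` are illustrative helper names.

```python
import heapq


def is_heap(a):
    """Check a[k] <= a[2*k+1] and a[k] <= a[2*k+2] wherever those cells exist."""
    n = len(a)
    return all(
        a[k] <= a[child]
        for k in range(n)
        for child in (2 * k + 1, 2 * k + 2)
        if child < n
    )


def drain(items):
    """Push every item, then pop them all; pops come out smallest first."""
    heap = []
    for item in items:
        heapq.heappush(heap, item)
        assert is_heap(heap)     # the invariant holds after every push
    return [heapq.heappop(heap) for _ in range(len(heap))]
```

For example, `drain([5, 1, 4, 2, 3])` yields the items in ascending order, which is the "iterate over all items to get a sort" idea the attached docstring explains.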
--=-=-= Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: attachment; filename=heap.py Content-Transfer-Encoding: 8bit

#!/usr/bin/env python
# Copyright © 2000, 2002 Progiciels Bourbeau-Pinard inc.
# François Pinard , 2000.

"""\
Handle priority heaps.

Heaps are arrays for which a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for all k, counting elements from 0. For the sake of comparison, non-existing elements are considered to be infinite. The interesting property of a heap is that a[0] is always its smallest element.

The strange invariant above is meant to be an efficient memory representation for a tournament. The numbers below are `k', not a[k]:

                                   0

                  1                                 2

          3               4                5               6

      7       8       9       10      11      12      13      14

    15 16   17 18   19 20   21 22   23 24   25 26   27 28   29 30

In the tree above, each cell `k' is topping `2*k+1' and `2*k+2'. In a usual binary tournament we see in sports, each cell is the winner over the two cells it tops, and we can trace the winner down the tree to see all opponents s/he had. However, in many computer applications of such tournaments, we do not need to trace the history of a winner. To be more memory efficient, when a winner is promoted, we try to replace it by something else at a lower level, and the rule becomes that a cell and the two cells it tops contain three different items, but the top cell "wins" over the two topped cells.

If this heap invariant is protected at all times, index 0 is clearly the overall winner. The simplest algorithmic way to remove it and find the "next" winner is to move some loser (let's say cell 30 in the diagram above) into the 0 position, and then percolate this new 0 down the tree, exchanging values, until the invariant is re-established. This is clearly logarithmic on the total number of items in the tree. By iterating over all items, you get an O(n ln n) sort.

A nice feature of this sort is that you can efficiently insert new items while the sort is going on, provided that the inserted items are not "better" than the last 0'th element you extracted. This is especially useful in simulation contexts, where the tree holds all incoming events, and the "win" condition means the smallest scheduled time. When an event schedules other events for execution, they are scheduled into the future, so they can easily go into the heap. So, a heap is a good structure for implementing schedulers (this is what I used for my MIDI sequencer :-).

Various structures for implementing schedulers have been extensively studied, and heaps are good for this, as they are reasonably speedy, the speed is almost constant, and the worst case is not much different than the average case. However, there are other representations which are more efficient overall, yet the worst cases might be terrible.

Heaps are also very useful in big disk sorts. You most probably all know that a big sort implies producing "runs" (which are pre-sorted sequences, whose size is usually related to the amount of CPU memory), followed by merging passes for these runs, a merging that is often very cleverly organised[1]. It is very important that the initial sort produces the longest runs possible. Tournaments are a good way to achieve that. If, using all the memory available to hold a tournament, you replace and percolate items that happen to fit the current run, you'll produce runs which are twice the size of the memory for random input, and much better for input fuzzily ordered. Moreover, if you output the 0'th item on disk and get an input which may not fit in the current tournament (because the value "wins" over the last output value), it cannot fit in the heap, so the size of the heap decreases. The freed memory could be cleverly reused immediately for progressively building a second heap, which grows at exactly the same rate the first heap is melting.
When the first heap completely vanishes, you switch heaps and start a new run. Clever and quite effective!

In a word, heaps are useful memory structures to know. I use them in a few applications, and I think it is good to keep a `heap' module around. :-)

--------------------
[1] The disk balancing algorithms which are current, nowadays, are more annoying than clever, and this is a consequence of the seeking capabilities of the disks. On devices which cannot seek, like big tape drives, the story was quite different, and one had to be very clever to ensure (far in advance) that each tape movement will be the most effective possible (that is, will best participate at "progressing" the merge). Some tapes were even able to read backwards, and this was also used to avoid the rewinding time. Believe me, real good tape sorts were quite spectacular to watch! From all times, sorting has always been a Great Art! :-)
"""

class Heap:

    def __init__(self, compare=cmp):
        """\
        Set a new heap. If COMPARE is given, use it instead of built-in comparison.

        COMPARE, given two items, should return negative, zero or positive depending
        on the fact the first item compares smaller, equal or greater than the
        second item.
        """
        self.compare = compare
        self.array = []

    def __call__(self):
        """\
        A heap instance, when called as a function, return all its items.
        """
        return self.array

    def __len__(self):
        """\
        Return the number of items in the current heap instance.
        """
        return len(self.array)

    def __getitem__(self, index):
        """\
        Return the INDEX-th item from the heap instance. INDEX is usually zero.
        """
        return self.array[index]

    def push(self, item):
        """\
        Add ITEM to the current heap instance.
        """
        array = self.array
        compare = self.compare
        array.append(item)
        high = len(array) - 1
        while high > 0:
            low = (high-1)/2
            if compare(array[low], array[high]) <= 0:
                break
            array[low], array[high] = array[high], array[low]
            high = low

    def pop(self):
        """\
        Remove and return the smallest item from the current heap instance.
        """
        array = self.array
        item = array[0]
        if len(array) == 1:
            del array[0]
        else:
            compare = self.compare
            array[0] = array.pop()
            low, high = 0, 1
            while high < len(array):
                if ((high+1 < len(array)
                     and compare(array[high], array[high+1]) > 0)):
                    high = high+1
                if compare(array[low], array[high]) <= 0:
                    break
                array[low], array[high] = array[high], array[low]
                low, high = high, 2*high+1
        return item

def test(n=2000):
    heap = Heap()
    for k in range(n-1, -1, -1):
        heap.push(k)
    for k in range(n):
        assert k+len(heap) == n
        assert k == heap.pop()

--=-=-= Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit -- François Pinard http://www.iro.umontreal.ca/~pinard --=-=-=--
From pinard@iro.umontreal.ca Sat Jul 6 18:04:05 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 06 Jul 2002 13:04:05 -0400 Subject: [Python-Dev] Re: Priority queue (binary heap) python code In-Reply-To: <20020625065203.GA27183@hishome.net> References: <20020624213318.A5740@arizona.localdomain> <20020625065203.GA27183@hishome.net> Message-ID: [Oren Tirosh] > A sorted list is a much more general-purpose data structure than a priority > queue and can be used to implement a priority queue. [...] The only > advantage of a heap is O(1) peek which doesn't seem so critical. [...] > the internal order of a heap-based priority queue is very non-intuitive and > quite useless for other purposes while a sorted list is, umm..., sorted! It surely occurred to many of us to sort a file (or any set of data) from the most interesting entry to the least interesting entry, look at the first 5% to 10%, and drop all the rest. A heap is a good way to retain the first few percents of items, without going through the lengths of fully sorting all the rest. By comparison, it would not be efficient to use `.sort()' then truncate. Within a simulation, future events are scheduled while current events are being processed, so we do not have all the events to `.sort()' first.
It is likely that heaps would beat insertion after binary search, given of course that both are implemented with the same care, speed-wise. -- François Pinard http://www.iro.umontreal.ca/~pinard From pinard@iro.umontreal.ca Sat Jul 6 23:43:13 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 06 Jul 2002 18:43:13 -0400 Subject: [Python-Dev] Re: The OOM-Killer vs. Python In-Reply-To: References: <3c9e2b5f.9993062@news.t-online.de> Message-ID: [Martin v. Loewis] > If you don't create any cyclic garbage, you can find all container > objects with gc.get_objects. If you find that gc.get_objects does not > grow longer over time, but your process still consumes more memory, > one of your C extensions has a refcounting bug. Out of curiosity, I checked with the latest HTML documentation (2.3a0) and did not find documentation for `gc.get_objects'. Should it be there? P.S. - Looking at http://www.python.org/dev/doc/devel/lib/module-gc.html. -- François Pinard http://www.iro.umontreal.ca/~pinard From martin@v.loewis.de Sun Jul 7 08:50:36 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 07 Jul 2002 09:50:36 +0200 Subject: [Python-Dev] Re: The OOM-Killer vs. Python In-Reply-To: References: <3c9e2b5f.9993062@news.t-online.de> Message-ID: pinard@iro.umontreal.ca (François Pinard) writes: > Out of curiosity, I checked with the latest HTML documentation (2.3a0) > and did not find documentation for `gc.get_objects'. Should it be there? I think so, yes. I filed bug #578308.
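The diagnostic quoted in this exchange (watch whether `gc.get_objects` keeps growing) might be sketched as follows in present-day Python 3. The helper `tracked_counts` and the two toy workloads `leaky` and `tidy` are hypothetical names made up for illustration; they are not from the thread.

```python
import gc


def tracked_counts(workload, rounds=3):
    """Run workload repeatedly; record how many objects the collector tracks."""
    counts = []
    for _ in range(rounds):
        workload()
        gc.collect()                       # clear cyclic garbage before counting
        counts.append(len(gc.get_objects()))
    return counts


hoard = []


def leaky():
    # Keeps 1000 new lists alive on every call, so the count keeps rising.
    hoard.extend([] for _ in range(1000))


def tidy():
    # Builds the same lists but drops them, so the count stays roughly flat.
    [[] for _ in range(1000)]
```

A steadily rising series from `tracked_counts` points at containers kept alive from Python code. A flat series combined with growing process memory is the situation the quoted message attributes to a refcounting bug in a C extension.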
Regards, Martin From vinay_sajip@red-dove.com Mon Jul 8 02:16:32 2002 From: vinay_sajip@red-dove.com (Vinay Sajip) Date: Mon, 8 Jul 2002 02:16:32 +0100 Subject: [Python-Dev] PEP 282 Implementation Message-ID: <00e001c2261d$19bfc320$652b6992@alpha> I've uploaded my logging module, the proposed implementation for PEP 282, for committer review, to the SourceForge patch manager: http://sourceforge.net/tracker/index.php?func=detail&aid=578494&group_id=547 0&atid=305470 I've assigned it to Mark Hammond as (a) he had posted some comments to Trent Mick's original PEP posting, and (b) Barry Warsaw advised not assigning to PythonLabs people on account of their current workload. The file logging.py is (apart from some test scripts) all that's supposed to go into Python 2.3. The file logging-0.4.6.tar.gz contains the module, an updated version of the PEP (which I mailed to Barry Warsaw on 26th June), numerous test/example scripts, TeX documentation etc. You can also refer to http://www.red-dove.com/python_logging.html Here's hoping for a speedy review :-) Regards, Vinay Sajip From barry@zope.com Mon Jul 8 14:58:30 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 8 Jul 2002 09:58:30 -0400 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> Message-ID: <15657.39558.325764.651122@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: >> For cases like the email package or distutils, I think it's >> perfectly OK to only provide the updates for older Python >> releases as separate download. Both have their own way of life, >> so IMHO this is acceptable. MvL> In neither case, this is really possible: Once you have the MvL> package in the Python core, a separate installation in MvL> site-packages cannot override the core implementation. 
MvL> I believe that was the motivation for Barry to consider MvL> backporting large amounts of changes. The same holds for MvL> distutils, except that there aren't that many major changes. Exactly. For my own purposes (e.g. Mailman) it's no problem; I provide my own email package and arrange for MM to use it before the Python standard one. I actually think it's the right thing for a normal distutils install to not override the standard version. But I also think there is a use case for allowing a standard package to be separately upgraded for a particular Python installation. As more and more standard Python libraries are packagized, they will probably have life-cycles separate from the Python core themselves (this will only be more true once we evolve toward a CPAN-like arrangement). So I think we will eventually need a way to upgrade (not override :) a standard library package. My suggestion would be to prepend a new directory on the standard search path, let's call it site-upgrade for now. A normal "python setup.py install" would still install to site-packages, but we'd add a "python setup.py upgrade" command that would install to site-upgrade. -Barry From barry@zope.com Mon Jul 8 15:01:13 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 8 Jul 2002 10:01:13 -0400 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <3D255CB0.9080502@lemburg.com> Message-ID: <15657.39721.348050.614837@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> True, but it is easily possible to install those packages in MAL> a directory which is scanned before the standard lib, thus MAL> overriding the distribution versions: MAL> python setup.py install install-lib=~/lib A general solution that requires users to set environment variables isn't acceptable IMO.
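For reference, a minimal sketch of how one could check the ordering MAL describes in this thread, namely that a directory named in PYTHONPATH lands in `sys.path` ahead of the standard library directory. It targets current Python 3; the helper name `path_positions`, the `EXTRA_DIR` variable and the throwaway temporary directory are illustrative assumptions. It finds the standard library by looking for `os.py`, so it would not work with a zipped or frozen standard library.

```python
import os
import subprocess
import sys
import tempfile

# Code run in a fresh interpreter so that startup path handling is observed.
CHILD = r"""
import os, sys
extra = os.environ["EXTRA_DIR"]
extra_at = sys.path.index(extra)
stdlib_at = next(i for i, p in enumerate(sys.path)
                 if os.path.isfile(os.path.join(p, "os.py")))
print(extra_at, stdlib_at)
"""


def path_positions():
    """Return (index of the PYTHONPATH entry, index of the stdlib dir)."""
    with tempfile.TemporaryDirectory() as extra:
        env = dict(os.environ, PYTHONPATH=extra, EXTRA_DIR=extra)
        out = subprocess.run(
            [sys.executable, "-c", CHILD],
            env=env, capture_output=True, text=True, check=True,
        ).stdout
    extra_at, stdlib_at = map(int, out.split())
    return extra_at, stdlib_at
```

If the first number returned is smaller than the second, modules placed in the PYTHONPATH directory would shadow standard library modules of the same name, which is the override mechanism being debated here.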
>> I believe that was the motivation for Barry to consider >> backporting large amounts of changes. The same holds for >> distutils, except that there aren't that many major changes. MAL> If that's the case, then we probably ought to make it easier MAL> for user installed Python add-ons to override builtin MAL> packages. +1 MAL> This would help to get rid off the hacks which the PyXML MAL> distribution has to use in order to achieve the same. Yup, see my previous response. -Barry From mal@lemburg.com Mon Jul 8 15:14:26 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 08 Jul 2002 16:14:26 +0200 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <15657.39558.325764.651122@anthem.wooz.org> Message-ID: <3D299E42.70200@lemburg.com> Barry A. Warsaw wrote: > But I also think there is a use case for allowing a standard package > to be separately upgraded for a particular Python installation. As > more and more standard Python libraries are packagized, they will > probably have life-cycles separate from the Python core themselves > (this will only be more true once we evolve toward a CPAN-like > arrangement). So I think we will eventually need a way to upgrade > (not override :) a standard library package. +1 > My suggestion would be to prepend a new directory on the standard > search path, let's call it site-upgrade for now. A normal "python > setup.py install" would still install to site-packages, but we'd add a > "python setup.py upgrade" command that would install to site-upgrade. +1 (maybe with s/site-upgrade/system-packages) Not sure whether it's already possible or not, but I'd prefer to keep the install command and have the package provide this information (site-packages vs. system-packages) as part of the setup.py or setup.cfg file. 
Perhaps we could have some kind of category for distutils packages which marks them as system add-ons vs. site add-ons. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Mon Jul 8 17:24:28 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 08 Jul 2002 18:24:28 +0200 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) In-Reply-To: <3D299E42.70200@lemburg.com> References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > Perhaps we could have some kind of category for distutils > packages which marks them as system add-ons vs. site add-ons. One approach would be for distutils to have a list of system packages built-in, depending on the Python release. That list would cover PyXML and email; perhaps others. Of couse, taking the _xmlplus hack out of PyXML will cause backwards compatibility problems (regardless what the alternative hook is). Regards, Martin From gmcm@hypernet.com Mon Jul 8 17:39:55 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Mon, 8 Jul 2002 12:39:55 -0400 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) In-Reply-To: References: <3D299E42.70200@lemburg.com> Message-ID: <3D29881B.30590.6BF1A7E@localhost> On 8 Jul 2002 at 18:24, Martin v. Loewis wrote: > Of couse, taking the _xmlplus hack out of PyXML > will cause backwards compatibility problems > (regardless what the alternative hook is). How? As long as "import xml" gets them _xmlplus, I can't see how it would break anything. 
I'd say it's broken already, since code written for _xmlplus assumes a different contract, and that is completely implicit. -- Gordon http://www.mcmillan-inc.com/ From skip@pobox.com Sat Jul 6 17:17:30 2002 From: skip@pobox.com (Skip Montanaro) Date: Sat, 6 Jul 2002 11:17:30 -0500 Subject: [Python-Dev] Death to WITH_CYCLE_GC In-Reply-To: <3D2432AF.8010000@lemburg.com> References: <3D2432AF.8010000@lemburg.com> Message-ID: <15655.6170.182746.75875@localhost.localdomain> mal> I personally know that developers which have tried to create a mal> trimmed down version of the interpreter did like the #ifdefs for mal> removing certain parts like e.g. the complex numbers very much. I'm mal> just lobbying for them. I think complex numbers are a bit different. They had the #ifdef from start precisely because it was expected they wouldn't be needed in some cases where memory footprint mattered. WITH_CYCLE_GC was just a debugging #ifdef. Skip From barry@zope.com Mon Jul 8 17:51:58 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 8 Jul 2002 12:51:58 -0400 Subject: [Python-Dev] PEP 282 Implementation References: <00e001c2261d$19bfc320$652b6992@alpha> Message-ID: <15657.49966.835748.511346@anthem.wooz.org> >>>>> "VS" == Vinay Sajip writes: VS> The file logging.py is (apart from some test scripts) all VS> that's supposed to go into Python 2.3. The file VS> logging-0.4.6.tar.gz contains the module, an updated version VS> of the PEP (which I mailed to Barry Warsaw on 26th June), VS> numerous test/example scripts, TeX documentation etc. You can VS> also refer to PEP 282 update has been installed. One coment about the PEP: where `lvl' is used as an argument to methods and functions, I think we shouldn't be so cute. Please spell it out as `level'. -Barry From martin@v.loewis.de Mon Jul 8 18:03:44 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 08 Jul 2002 19:03:44 +0200 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) In-Reply-To: <3D29881B.30590.6BF1A7E@localhost> References: <3D299E42.70200@lemburg.com> <3D29881B.30590.6BF1A7E@localhost> Message-ID: "Gordon McMillan" writes: > > Of couse, taking the _xmlplus hack out of PyXML > > will cause backwards compatibility problems > > (regardless what the alternative hook is). > > How? As long as "import xml" gets them _xmlplus, I > can't see how it would break anything. Of course, once the hack that is taken out of PyXML, there won't be any _xmlplus anymore. I was thinking about applications that package Python applications, like freeze or Installer. People might have taken into account that they have to look inside _xmlplus as well. If the hack changes, they have to take into account that they need to look somewhere else, instead. Regards, Martin From gmcm@hypernet.com Mon Jul 8 18:20:02 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Mon, 8 Jul 2002 13:20:02 -0400 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) In-Reply-To: References: <3D29881B.30590.6BF1A7E@localhost> Message-ID: <3D299182.4516.6E3D582@localhost> On 8 Jul 2002 at 19:03, Martin v. Loewis wrote: > "Gordon McMillan" writes: > > > > Of couse, taking the _xmlplus hack out of PyXML > > > will cause backwards compatibility problems > > > (regardless what the alternative hook is). > > > > How? As long as "import xml" gets them _xmlplus, I > > can't see how it would break anything. > > Of course, once the hack that is taken out of PyXML, > there won't be any _xmlplus anymore. > > I was thinking about applications that package > Python applications, like freeze or Installer. > People might have taken into account that they have > to look inside _xmlplus as well. If the hack > changes, they have to take into account that they > need to look somewhere else, instead. 
py2exe doesn't do _xmlplus (unless that's changed recently) - Thomas has people overlay xml with _xmlplus. Installer does do it, but it's a horrid hack (one bad hack deserves another) and I'd be delighted to remove it. -- Gordon http://www.mcmillan-inc.com/ From barry@zope.com Mon Jul 8 18:23:35 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 8 Jul 2002 13:23:35 -0400 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com> Message-ID: <15657.51863.523283.977726@anthem.wooz.org> >>>>> "MAL" == M writes: >> My suggestion would be to prepend a new directory on the >> standard search path, let's call it site-upgrade for now. A >> normal "python setup.py install" would still install to >> site-packages, but we'd add a "python setup.py upgrade" command >> that would install to site-upgrade. MAL> +1 (maybe with s/site-upgrade/system-packages) I like that: system-packages. MAL> Not sure whether it's already possible or not, but I'd prefer MAL> to keep the install command and have the package provide this MAL> information (site-packages vs. system-packages) as part of MAL> the setup.py or setup.cfg file. Ok, yeah. I think it would be a good idea for the package to somehow register itself as an upgrade to an existing system package. I still want the install command to install to site-packages, but whether the upgrade happens as an upgrade command or "python setup.py install -U" or some other mechanism is up for grabs. -Barry From David Abrahams" I keep running into the problem that there is no reliable way to introspect about whether a type supports multi-pass iterability (in the sense that an input stream might support only a single pass, but a list supports multiple passes). 
I suppose you could check for __getitem__, but that wouldn't cover linked lists, for example. Can anyone channel Guido's intent for me? Is this an oversight or a deliberate design decision? Is there an interface for checking multi-pass-ability that I've missed? TIA, Dave From jacobs@penguin.theopalgroup.com Mon Jul 8 19:30:33 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Mon, 8 Jul 2002 14:30:33 -0400 (EDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> Message-ID: On Mon, 8 Jul 2002, David Abrahams wrote: > I keep running into the problem that there is no reliable way to introspect > about whether a type supports multi-pass iterability (in the sense that an > input stream might support only a single pass, but a list supports multiple > passes). I suppose you could check for __getitem__, but that wouldn't cover > linked lists, for example. > > Can anyone channel Guido's intent for me? Is this an oversight or a > deliberate design decision? Is there an interface for checking > multi-pass-ability that I've missed? As far as I can tell, there is no published Python mechanism that distinguishes "input iterators" from "forward iterators" (using the C++ parlance). -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From tim.one@comcast.net Mon Jul 8 19:44:25 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 08 Jul 2002 14:44:25 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> Message-ID: [David Abrahams] > I keep running into the problem that there is no reliable way to > introspect about whether a type supports multi-pass iterability (in the > sense that an input stream might support only a single pass, but a list > supports multiple passes). 
I suppose you could check for __getitem__, but > that wouldn't cover linked lists, for example. > > Can anyone channel Guido's intent for me? Is this an oversight or a > deliberate design decision? Is there an interface for checking > multi-pass-ability that I've missed? The language makes no such distinctions. If an app wants to make them, it's up to the app to implement them. Likewise for a way to tell a multipass iterator to "start over again". The Python iteration protocol has only two methods, .next() to get "the next" item, and .__iter__() to return self; given a random iterator, those are the only things you can rely on. From barry@zope.com Mon Jul 8 21:03:58 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 8 Jul 2002 16:03:58 -0400 Subject: [Python-Dev] New Persistence SIG created Message-ID: <15657.61486.222685.665859@anthem.wooz.org> As recently discussed on meta-sig@python.org, we have created a new SIG focussed on producing a common persistence and transactional framework for Python programs. This SIG is called persistence-sig@python.org. For more information on the SIG, its mission, and deadlines see http://www.python.org/sigs/persistence-sig/ To join the mailing list see http://mail.python.org/mailman-21/listinfo/persistence-sig -Barry From jacobs@penguin.theopalgroup.com Mon Jul 8 21:48:37 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Mon, 8 Jul 2002 16:48:37 -0400 (EDT) Subject: [Python-Dev] Are we having https/ssl problems? Message-ID: Hi all, This is not a bug report. It is more of a query to find out if there are known problems with the current Python 2.3 CVS regarding SSL, httplib w/ https, or urllib w/ https. I seem to remember tuning out some discussions on timeout sockets and SSL of late, so I thought I would ask.
Here is code that has worked previously, but does not in the current CVS:

import urllib

def get(url):
    u = urllib.urlopen(url)
    junk = ''
    while 1:
        chunk = u.read()
        if not chunk:
            break
        junk += chunk
    return junk

exlen = len(get('https://dbserv2.theopalgroup.com/mediumfile'))
aclen = len(get('https://dbserv2.theopalgroup.com/mediumfile'))
print "File 1 len = %d, File 2 len = %d" % (exlen,aclen)

> python2.0 testhttps.py
HTTP len = 37140, HTTPS len = 37140
> python2.1 testhttps.py
HTTP len = 37140, HTTPS len = 37140
> python2.2 testhttps.py
HTTP len = 37140, HTTPS len = 37140
> python2.3 testhttps.py
HTTP len = 37140, HTTPS len = 0

If this doesn't ring a bell with anyone, I will battle SourceForge once more and file a bug report. The interesting thing is that the problem is sensitive to the size of the file requested. Here is what happens when I use 'smallfile' instead of 'mediumfile':

> python2.3 testhttps.py
HTTP len = 3713, HTTPS len = 3713

-Kevin

-- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com

From jacobs@penguin.theopalgroup.com Mon Jul 8 21:51:16 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 8 Jul 2002 16:51:16 -0400 (EDT)
Subject: [Python-Dev] Are we having https/ssl problems?
In-Reply-To:
Message-ID:

On Mon, 8 Jul 2002, Kevin Jacobs wrote:
> exlen = len(get('https://dbserv2.theopalgroup.com/mediumfile'))
                   ^^^^^
Oops. Obviously, this should be http. The trials of cut-n-paste,
-Kevin

-- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com

From mal@lemburg.com Mon Jul 8 21:59:59 2002
From: mal@lemburg.com (M.-A.
Lemburg) Date: Mon, 08 Jul 2002 22:59:59 +0200 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com> <15657.51863.523283.977726@anthem.wooz.org> Message-ID: <3D29FD4F.4060607@lemburg.com> Barry A. Warsaw wrote: >>>>>>"MAL" == M writes: >>>>> > > >> My suggestion would be to prepend a new directory on the > >> standard search path, let's call it site-upgrade for now. A > >> normal "python setup.py install" would still install to > >> site-packages, but we'd add a "python setup.py upgrade" command > >> that would install to site-upgrade. > > MAL> +1 (maybe with s/site-upgrade/system-packages) > > I like that: system-packages. > > MAL> Not sure whether it's already possible or not, but I'd prefer > MAL> to keep the install command and have the package provide this > MAL> information (site-packages vs. system-packages) as part of > MAL> the setup.py or setup.cfg file. > > Ok, yeah. I think it would be a good idea for the package to somehow > register itself as an upgrade to an existing system package. I still > want the install command to install to site-packages, but whether the > upgrade happens as an upgrade command or "python setup.py install -U" > or some other mechanism is up for grabs. Hmm, maybe I wasn't clear enough: I think that a distutils package should have a flag in its setup.py which lets distutils tell whether it's a site package or a system package, e.g. setup(... pkgtype='site-package' ...) vs. setup(... pkgtype='system-package' ...) (with pkgtype='site-package' as default value if not given) The user would in both cases type 'python setup.py install' but the install command would automatically choose the right target subdir (site-packages/ or system-packages/). 
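A rough sketch of the selection step described above, under loud assumptions: 'pkgtype' and the 'system-packages' directory are only proposals from this thread, not options distutils supports, and the function name, the table and the lib_dir argument are hypothetical names chosen for illustration. Written in modern Python 3 spelling.

```python
# Sketch of the proposal only: 'pkgtype' and 'system-packages' are ideas from
# this thread, not real distutils features. All names here are hypothetical.
import posixpath

# Proposed package categories mapped to the directory each would install into.
TARGET_SUBDIRS = {
    "site-package": "site-packages",
    "system-package": "system-packages",
}


def choose_target_dir(lib_dir, pkgtype="site-package"):
    """Return the directory the proposed flag would select.

    'site-package' is the default, matching the suggestion that packages
    which say nothing keep installing into site-packages.
    """
    try:
        subdir = TARGET_SUBDIRS[pkgtype]
    except KeyError:
        raise ValueError("unknown pkgtype: %r" % (pkgtype,))
    # posixpath keeps the illustration independent of the host platform.
    return posixpath.join(lib_dir, subdir)


print(choose_target_dir("/usr/lib/python2.3"))
print(choose_target_dir("/usr/lib/python2.3", pkgtype="system-package"))
```

The point of the sketch is only that the user keeps typing 'python setup.py install' while the package's own metadata picks the target; how a real install command would read that metadata is left open here.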
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From andymac@bullseye.apana.org.au Mon Jul 8 13:53:01 2002 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Mon, 8 Jul 2002 23:53:01 +1100 (edt) Subject: [Python-Dev] test_socket failure on FreeBSD In-Reply-To: <200206192037.g5JKbSj03086@pcp02138704pcs.reston01.va.comcast.net> Message-ID: This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. ---888574994-8578-1026132781=:28004 Content-Type: TEXT/PLAIN; charset=US-ASCII On Wed, 19 Jun 2002, Guido van Rossum wrote: > There are probably some differences in the socket semantics. I'd > appreciate it if you could provide a patch or at least a clue! I've not read enough Stevens to grok sockets code (yet) :-( However, I hope that the instrumented verbose output of test_socket might give you a clue.... I've attached the diff from the version of test_socket (vs recent CVS) that I used, as well as output from test_socket on FreeBSD 4.4 and OS/2+EMX. Getting the FreeBSD issues sorted is a higher priority for me than getting OS/2+EMX working (though that would be nice too). Please let me know if there's more testing/debugging I can do. -- Andrew I MacIntyre "These thoughts are mine alone..." 
E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia ---888574994-8578-1026132781=:28004 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="test_socket.py.diff" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: test_socket.py.diff Content-Disposition: attachment; filename="test_socket.py.diff" KioqIHRlc3Rfc29ja2V0LnB5Lm9yaWcJU3VuIEp1biAzMCAyMjoyOTo1NyAy MDAyDQotLS0gdGVzdF9zb2NrZXQucHkJTW9uIEp1bCAgOCAyMzoxNTo0MSAy MDAyDQoqKioqKioqKioqKioqKioNCioqKiA4LDEzICoqKioNCi0tLSA4LDE0 IC0tLS0NCiAgaW1wb3J0IHRpbWUNCiAgaW1wb3J0IHRocmVhZCwgdGhyZWFk aW5nDQogIGltcG9ydCBRdWV1ZQ0KKyBpbXBvcnQgdHJhY2ViYWNrDQogIA0K ICBQT1JUID0gNTAwMDcNCiAgSE9TVCA9ICdsb2NhbGhvc3QnDQoqKioqKioq KioqKioqKioNCioqKiAzNDQsMzQ5ICoqKioNCi0tLSAzNDUsMzUxIC0tLS0N CiAgICAgIGRlZiB0ZXN0UmVjdkZyb20oc2VsZik6DQogICAgICAgICAgIiIi VGVzdGluZyBsYXJnZSByZWN2ZnJvbSgpIG92ZXIgVENQLiIiIg0KICAgICAg ICAgIG1zZywgYWRkciA9IHNlbGYuY2xpX2Nvbm4ucmVjdmZyb20oMTAyNCkN CisgICAgICAgICBwcmludCAiXG5tc2c9JyVzJywgYWRkcj0nJXMnIiAlICht c2csIHJlcHIoYWRkcikpDQogICAgICAgICAgaG9zdG5hbWUsIHBvcnQgPSBh ZGRyDQogICAgICAgICAgIyNzZWxmLmFzc2VydEVxdWFsKGhvc3RuYW1lLCBz b2NrZXQuZ2V0aG9zdGJ5bmFtZSgnbG9jYWxob3N0JykpDQogICAgICAgICAg c2VsZi5hc3NlcnRFcXVhbChtc2csIE1TRykNCioqKioqKioqKioqKioqKg0K KioqIDM1NCwzNjEgKioqKg0KLS0tIDM1NiwzNjUgLS0tLQ0KICAgICAgZGVm IHRlc3RPdmVyRmxvd1JlY3ZGcm9tKHNlbGYpOg0KICAgICAgICAgICIiIlRl c3RpbmcgcmVjdmZyb20oKSBpbiBjaHVua3Mgb3ZlciBUQ1AuIiIiDQogICAg ICAgICAgc2VnMSwgYWRkciA9IHNlbGYuY2xpX2Nvbm4ucmVjdmZyb20obGVu KE1TRyktMykNCisgICAgICAgICBwcmludCAiXG5zZWcxPSclcycsIGFkZHI9 JyVzJyIgJSAoc2VnMSwgcmVwcihhZGRyKSkNCiAgICAgICAgICBzZWcyLCBh ZGRyID0gc2VsZi5jbGlfY29ubi5yZWN2ZnJvbSgxMDI0KQ0KICAgICAgICAg IG1zZyA9IHNlZzEgKyBzZWcyDQorICAgICAgICAgcHJpbnQgInNlZzI9JyVz JywgYWRkcj0nJXMnIiAlIChzZWcyLCByZXByKGFkZHIpKQ0KICAgICAgICAg IGhvc3RuYW1lLCBwb3J0ID0gYWRkcg0KICAgICAgICAgICMjc2VsZi5hc3Nl cnRFcXVhbChob3N0bmFtZSwgc29ja2V0LmdldGhvc3RieW5hbWUoJ2xvY2Fs 
aG9zdCcpKQ0KICAgICAgICAgIHNlbGYuYXNzZXJ0RXF1YWwobXNnLCBNU0cp DQoqKioqKioqKioqKioqKioNCioqKiA0NDgsNDUzICoqKioNCi0tLSA0NTIs NDU4IC0tLS0NCiAgICAgICAgICBleGNlcHQgc29ja2V0LmVycm9yOg0KICAg ICAgICAgICAgICBwYXNzDQogICAgICAgICAgZWxzZToNCisgICAgICAgICAg ICAgcHJpbnQgIlxuY29ubj0iICsgcmVwcihjb25uKSArICJcbmFkZHI9IiAr IHJlcHIoYWRkcikNCiAgICAgICAgICAgICAgc2VsZi5mYWlsKCJFcnJvciB0 cnlpbmcgdG8gZG8gbm9uLWJsb2NraW5nIGFjY2VwdC4iKQ0KICAgICAgICAg IHJlYWQsIHdyaXRlLCBlcnIgPSBzZWxlY3Quc2VsZWN0KFtzZWxmLnNlcnZd LCBbXSwgW10pDQogICAgICAgICAgaWYgc2VsZi5zZXJ2IGluIHJlYWQ6DQoq KioqKioqKioqKioqKioNCioqKiA0NzUsNDgwICoqKioNCi0tLSA0ODAsNDg2 IC0tLS0NCiAgICAgICAgICBleGNlcHQgc29ja2V0LmVycm9yOg0KICAgICAg ICAgICAgICBwYXNzDQogICAgICAgICAgZWxzZToNCisgICAgICAgICAgICAg cHJpbnQgIlxuY29ubj0iICsgcmVwcihjb25uKSArICJcbmFkZHI9IiArIHJl cHIoYWRkcikNCiAgICAgICAgICAgICAgc2VsZi5mYWlsKCJFcnJvciB0cnlp bmcgdG8gZG8gbm9uLWJsb2NraW5nIHJlY3YuIikNCiAgICAgICAgICByZWFk LCB3cml0ZSwgZXJyID0gc2VsZWN0LnNlbGVjdChbY29ubl0sIFtdLCBbXSkN CiAgICAgICAgICBpZiBjb25uIGluIHJlYWQ6DQoqKioqKioqKioqKioqKioN CioqKiA1NDQsNTUwICoqKioNCiAgICAgICAgICBzZWxmLmNsaV9maWxlLndy aXRlKE1TRykNCiAgICAgICAgICBzZWxmLmNsaV9maWxlLmZsdXNoKCkNCiAg DQohIGRlZiBtYWluKCk6DQogICAgICBzdWl0ZSA9IHVuaXR0ZXN0LlRlc3RT dWl0ZSgpDQogICAgICBzdWl0ZS5hZGRUZXN0KHVuaXR0ZXN0Lm1ha2VTdWl0 ZShHZW5lcmFsTW9kdWxlVGVzdHMpKQ0KICAgICAgc3VpdGUuYWRkVGVzdCh1 bml0dGVzdC5tYWtlU3VpdGUoQmFzaWNUQ1BUZXN0KSkNCi0tLSA1NTAsNTU2 IC0tLS0NCiAgICAgICAgICBzZWxmLmNsaV9maWxlLndyaXRlKE1TRykNCiAg ICAgICAgICBzZWxmLmNsaV9maWxlLmZsdXNoKCkNCiAgDQohIGRlZiB0ZXN0 X21haW4oKToNCiAgICAgIHN1aXRlID0gdW5pdHRlc3QuVGVzdFN1aXRlKCkN CiAgICAgIHN1aXRlLmFkZFRlc3QodW5pdHRlc3QubWFrZVN1aXRlKEdlbmVy YWxNb2R1bGVUZXN0cykpDQogICAgICBzdWl0ZS5hZGRUZXN0KHVuaXR0ZXN0 Lm1ha2VTdWl0ZShCYXNpY1RDUFRlc3QpKQ0KKioqKioqKioqKioqKioqDQoq KiogNTU0LDU1NyAqKioqDQogICAgICB0ZXN0X3N1cHBvcnQucnVuX3N1aXRl KHN1aXRlKQ0KICANCiAgaWYgX19uYW1lX18gPT0gIl9fbWFpbl9fIjoNCiEg ICAgIG1haW4oKQ0KLS0tIDU2MCw1NjMgLS0tLQ0KICAgICAgdGVzdF9zdXBw 
b3J0LnJ1bl9zdWl0ZShzdWl0ZSkNCiAgDQogIGlmIF9fbmFtZV9fID09ICJf X21haW5fXyI6DQohICAgICB0ZXN0X21haW4oKQ0K ---888574994-8578-1026132781=:28004 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="test_socket.log.fbsd44" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: test_socket.log.fbsd44 Content-Disposition: attachment; filename="test_socket.log.fbsd44" dGVzdF9zb2NrZXQNClRlc3RpbmcgZm9yIG1pc3Npb24gY3JpdGljYWwgY29u c3RhbnRzLiAuLi4gb2sNClRlc3RpbmcgZ2V0c2VydmJ5bmFtZSgpLiAuLi4g b2sNClRlc3RpbmcgZ2V0c29ja29wdCgpLiAuLi4gb2sNClRlc3RpbmcgaG9z dG5hbWUgcmVzb2x1dGlvbiBtZWNoYW5pc21zLiAuLi4gb2sNCk1ha2luZyBz dXJlIGdldG5hbWVpbmZvIGRvZXNuJ3QgY3Jhc2ggdGhlIGludGVycHJldGVy LiAuLi4gb2sNClRlc3RpbmcgZm9yIGV4aXN0YW5jZSBvZiBub24tY3J1Y2lh bCBjb25zdGFudHMuIC4uLiBvaw0KVGVzdGluZyByZWZlcmVuY2UgY291bnQg Zm9yIGdldG5hbWVpbmZvLiAuLi4gb2sNClRlc3Rpbmcgc2V0c29ja29wdCgp LiAuLi4gb2sNClRlc3RpbmcgZ2V0c29ja25hbWUoKS4gLi4uIG9rDQpUZXN0 aW5nIHRoYXQgc29ja2V0IG1vZHVsZSBleGNlcHRpb25zLiAuLi4gb2sNClRl c3RpbmcgZnJvbWZkKCkuIC4uLiBvaw0KVGVzdGluZyByZWNlaXZlIGluIGNo dW5rcyBvdmVyIFRDUC4gLi4uIG9rDQpUZXN0aW5nIHJlY3Zmcm9tKCkgaW4g Y2h1bmtzIG92ZXIgVENQLiAuLi4gDQpzZWcxPSdNaWNoYWVsIEdpbGZpeCB3 YXMgaGUnLCBhZGRyPSdOb25lJw0Kc2VnMj0ncmUNCicsIGFkZHI9J05vbmUn DQpFUlJPUg0KVGVzdGluZyBsYXJnZSByZWNlaXZlIG92ZXIgVENQLiAuLi4g b2sNClRlc3RpbmcgbGFyZ2UgcmVjdmZyb20oKSBvdmVyIFRDUC4gLi4uIA0K bXNnPSdNaWNoYWVsIEdpbGZpeCB3YXMgaGVyZQ0KJywgYWRkcj0nTm9uZScN CkVSUk9SDQpUZXN0aW5nIHNlbmRhbGwoKSB3aXRoIGEgMjA0OCBieXRlIHN0 cmluZyBvdmVyIFRDUC4gLi4uIG9rDQpUZXN0aW5nIHNodXRkb3duKCkuIC4u LiBvaw0KVGVzdGluZyByZWN2ZnJvbSgpIG92ZXIgVURQLiAuLi4gb2sNClRl c3Rpbmcgc2VuZHRvKCkgYW5kIFJlY3YoKSBvdmVyIFVEUC4gLi4uIG9rDQpU ZXN0aW5nIG5vbi1ibG9ja2luZyBhY2NlcHQuIC4uLiANCmNvbm49PHNvY2tl dCBvYmplY3QsIGZkPTgsIGZhbWlseT0yLCB0eXBlPTEsIHByb3RvY29sPTA+ DQphZGRyPSgnMTI3LjAuMC4xJywgMzE0NCkNCkZBSUwNClRlc3Rpbmcgbm9u LWJsb2NraW5nIGNvbm5lY3QuIC4uLiBvaw0KVGVzdGluZyBub24tYmxvY2tp bmcgcmVjdi4gLi4uIA0KY29ubj08c29ja2V0IG9iamVjdCwgZmQ9OCwgZmFt 
aWx5PTIsIHR5cGU9MSwgcHJvdG9jb2w9MD4NCmFkZHI9KCcxMjcuMC4wLjEn LCAzMTQ2KQ0KRkFJTA0KVGVzdGluZyB3aGV0aGVyIHNldCBibG9ja2luZyB3 b3Jrcy4gLi4uIG9rDQpQZXJmb3JtaW5nIGZpbGUgcmVhZGxpbmUgdGVzdC4g Li4uIG9rDQpQZXJmb3JtaW5nIHNtYWxsIGZpbGUgcmVhZCB0ZXN0LiAuLi4g b2sNClBlcmZvcm1pbmcgdW5idWZmZXJlZCBmaWxlIHJlYWQgdGVzdC4gLi4u IG9rDQoNCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0NCkVSUk9SOiBUZXN0 aW5nIHJlY3Zmcm9tKCkgaW4gY2h1bmtzIG92ZXIgVENQLg0KLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLQ0KVHJhY2ViYWNrIChtb3N0IHJlY2VudCBjYWxs IGxhc3QpOg0KICBGaWxlICJMaWIvdGVzdC90ZXN0X3NvY2tldC5weSIsIGxp bmUgMzYzLCBpbiB0ZXN0T3ZlckZsb3dSZWN2RnJvbQ0KICAgIGhvc3RuYW1l LCBwb3J0ID0gYWRkcg0KVHlwZUVycm9yOiB1bnBhY2sgbm9uLXNlcXVlbmNl DQoNCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0NCkVSUk9SOiBUZXN0aW5n IGxhcmdlIHJlY3Zmcm9tKCkgb3ZlciBUQ1AuDQotLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tDQpUcmFjZWJhY2sgKG1vc3QgcmVjZW50IGNhbGwgbGFzdCk6 DQogIEZpbGUgIkxpYi90ZXN0L3Rlc3Rfc29ja2V0LnB5IiwgbGluZSAzNDks IGluIHRlc3RSZWN2RnJvbQ0KICAgIGhvc3RuYW1lLCBwb3J0ID0gYWRkcg0K VHlwZUVycm9yOiB1bnBhY2sgbm9uLXNlcXVlbmNlDQoNCj09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT0NCkZBSUw6IFRlc3Rpbmcgbm9uLWJsb2NraW5nIGFj Y2VwdC4NCi0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NClRyYWNlYmFjayAo bW9zdCByZWNlbnQgY2FsbCBsYXN0KToNCiAgRmlsZSAiTGliL3Rlc3QvdGVz dF9zb2NrZXQucHkiLCBsaW5lIDQ1NiwgaW4gdGVzdEFjY2VwdA0KICAgIHNl bGYuZmFpbCgiRXJyb3IgdHJ5aW5nIHRvIGRvIG5vbi1ibG9ja2luZyBhY2Nl cHQuIikNCiAgRmlsZSAiL2hvbWUvYW5keW1hYy9jdnMvcHl0aG9uL3B5dGhv bi10ZXN0L0xpYi91bml0dGVzdC5weSIsIGxpbmUgMjU0LCBpbiBmYWlsDQog ICAgcmFpc2Ugc2VsZi5mYWlsdXJlRXhjZXB0aW9uLCBtc2cNCkFzc2VydGlv bkVycm9yOiBFcnJvciB0cnlpbmcgdG8gZG8gbm9uLWJsb2NraW5nIGFjY2Vw 
dC4NCg0KPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQ0KRkFJTDogVGVzdGlu ZyBub24tYmxvY2tpbmcgcmVjdi4NCi0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0NClRyYWNlYmFjayAobW9zdCByZWNlbnQgY2FsbCBsYXN0KToNCiAgRmls ZSAiTGliL3Rlc3QvdGVzdF9zb2NrZXQucHkiLCBsaW5lIDQ4NCwgaW4gdGVz dFJlY3YNCiAgICBzZWxmLmZhaWwoIkVycm9yIHRyeWluZyB0byBkbyBub24t YmxvY2tpbmcgcmVjdi4iKQ0KICBGaWxlICIvaG9tZS9hbmR5bWFjL2N2cy9w eXRob24vcHl0aG9uLXRlc3QvTGliL3VuaXR0ZXN0LnB5IiwgbGluZSAyNTQs IGluIGZhaWwNCiAgICByYWlzZSBzZWxmLmZhaWx1cmVFeGNlcHRpb24sIG1z Zw0KQXNzZXJ0aW9uRXJyb3I6IEVycm9yIHRyeWluZyB0byBkbyBub24tYmxv Y2tpbmcgcmVjdi4NCg0KLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLQ0KUmFu IDI2IHRlc3RzIGluIDAuMzMzcw0KDQpGQUlMRUQgKGZhaWx1cmVzPTIsIGVy cm9ycz0yKQ0KdGVzdCB0ZXN0X3NvY2tldCBmYWlsZWQgLS0gZXJyb3JzIG9j Y3VycmVkOyBydW4gaW4gdmVyYm9zZSBtb2RlIGZvciBkZXRhaWxzDQoxIHRl c3QgZmFpbGVkOg0KICAgIHRlc3Rfc29ja2V0DQoa ---888574994-8578-1026132781=:28004 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="test_socket.log.os2emx" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: test_socket.log.os2emx Content-Disposition: attachment; filename="test_socket.log.os2emx" dGVzdF9zb2NrZXQNClRlc3RpbmcgZm9yIG1pc3Npb24gY3JpdGljYWwgY29u c3RhbnRzLiAuLi4gb2sNClRlc3RpbmcgZ2V0c2VydmJ5bmFtZSgpLiAuLi4g b2sNClRlc3RpbmcgZ2V0c29ja29wdCgpLiAuLi4gb2sNClRlc3RpbmcgaG9z dG5hbWUgcmVzb2x1dGlvbiBtZWNoYW5pc21zLiAuLi4gb2sNCk1ha2luZyBz dXJlIGdldG5hbWVpbmZvIGRvZXNuJ3QgY3Jhc2ggdGhlIGludGVycHJldGVy LiAuLi4gb2sNClRlc3RpbmcgZm9yIGV4aXN0YW5jZSBvZiBub24tY3J1Y2lh bCBjb25zdGFudHMuIC4uLiBvaw0KVGVzdGluZyByZWZlcmVuY2UgY291bnQg Zm9yIGdldG5hbWVpbmZvLiAuLi4gb2sNClRlc3Rpbmcgc2V0c29ja29wdCgp LiAuLi4gb2sNClRlc3RpbmcgZ2V0c29ja25hbWUoKS4gLi4uIG9rDQpUZXN0 aW5nIHRoYXQgc29ja2V0IG1vZHVsZSBleGNlcHRpb25zLiAuLi4gb2sNClRl c3RpbmcgZnJvbWZkKCkuIC4uLiBvaw0KVGVzdGluZyByZWNlaXZlIGluIGNo 
dW5rcyBvdmVyIFRDUC4gLi4uIG9rDQpUZXN0aW5nIHJlY3Zmcm9tKCkgaW4g Y2h1bmtzIG92ZXIgVENQLiAuLi4gDQpzZWcxPSdNaWNoYWVsIEdpbGZpeCB3 YXMgaGUnLCBhZGRyPSdOb25lJw0Kc2VnMj0ncmUNCicsIGFkZHI9J05vbmUn DQpFUlJPUg0KVGVzdGluZyBsYXJnZSByZWNlaXZlIG92ZXIgVENQLiAuLi4g b2sNClRlc3RpbmcgbGFyZ2UgcmVjdmZyb20oKSBvdmVyIFRDUC4gLi4uIA0K bXNnPSdNaWNoYWVsIEdpbGZpeCB3YXMgaGVyZQ0KJywgYWRkcj0nTm9uZScN CkVSUk9SDQpUZXN0aW5nIHNlbmRhbGwoKSB3aXRoIGEgMjA0OCBieXRlIHN0 cmluZyBvdmVyIFRDUC4gLi4uIEZBSUwNClRlc3Rpbmcgc2h1dGRvd24oKS4g Li4uIG9rDQpUZXN0aW5nIHJlY3Zmcm9tKCkgb3ZlciBVRFAuIC4uLiBvaw0K VGVzdGluZyBzZW5kdG8oKSBhbmQgUmVjdigpIG92ZXIgVURQLiAuLi4gb2sN ClRlc3Rpbmcgbm9uLWJsb2NraW5nIGFjY2VwdC4gLi4uIA0KY29ubj08c29j a2V0IG9iamVjdCwgZmQ9MTMsIGZhbWlseT0yLCB0eXBlPTEsIHByb3RvY29s PTA+DQphZGRyPSgnMTI3LjAuMC4xJywgMzQ0MykNCkZBSUwNClRlc3Rpbmcg bm9uLWJsb2NraW5nIGNvbm5lY3QuIC4uLiBFUlJPUg0KVGVzdGluZyBub24t YmxvY2tpbmcgcmVjdi4gLi4uIG9rDQpUZXN0aW5nIHdoZXRoZXIgc2V0IGJs b2NraW5nIHdvcmtzLiAuLi4gb2sNClBlcmZvcm1pbmcgZmlsZSByZWFkbGlu ZSB0ZXN0LiAuLi4gb2sNClBlcmZvcm1pbmcgc21hbGwgZmlsZSByZWFkIHRl c3QuIC4uLiBvaw0KUGVyZm9ybWluZyB1bmJ1ZmZlcmVkIGZpbGUgcmVhZCB0 ZXN0LiAuLi4gb2sNCg0KPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQ0KRVJS T1I6IFRlc3RpbmcgcmVjdmZyb20oKSBpbiBjaHVua3Mgb3ZlciBUQ1AuDQot LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tDQpUcmFjZWJhY2sgKG1vc3QgcmVj ZW50IGNhbGwgbGFzdCk6DQogIEZpbGUgIi4uLy4uL0xpYi90ZXN0L3Rlc3Rf c29ja2V0LnB5IiwgbGluZSAzNjMsIGluIHRlc3RPdmVyRmxvd1JlY3ZGcm9t DQogICAgaG9zdG5hbWUsIHBvcnQgPSBhZGRyDQpUeXBlRXJyb3I6IHVucGFj ayBub24tc2VxdWVuY2UNCg0KPT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQ0K RVJST1I6IFRlc3RpbmcgbGFyZ2UgcmVjdmZyb20oKSBvdmVyIFRDUC4NCi0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NClRyYWNlYmFjayAobW9zdCByZWNl bnQgY2FsbCBsYXN0KToNCiAgRmlsZSAiLi4vLi4vTGliL3Rlc3QvdGVzdF9z 
b2NrZXQucHkiLCBsaW5lIDM0OSwgaW4gdGVzdFJlY3ZGcm9tDQogICAgaG9z dG5hbWUsIHBvcnQgPSBhZGRyDQpUeXBlRXJyb3I6IHVucGFjayBub24tc2Vx dWVuY2UNCg0KPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQ0KRVJST1I6IFRl c3Rpbmcgbm9uLWJsb2NraW5nIGNvbm5lY3QuDQotLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tDQpUcmFjZWJhY2sgKG1vc3QgcmVjZW50IGNhbGwgbGFzdCk6 DQogIEZpbGUgIi4uLy4uL0xpYi90ZXN0L3Rlc3Rfc29ja2V0LnB5IiwgbGlu ZSAxMTcsIGluIF90ZWFyRG93bg0KICAgIHNlbGYuZmFpbChtc2cpDQogIEZp bGUgIkY6L0RFVi9DVlNfVEVTVC9QWVRIT04tVEVTVC9MaWIvdW5pdHRlc3Qu cHkiLCBsaW5lIDI1NCwgaW4gZmFpbA0KICAgIHJhaXNlIHNlbGYuZmFpbHVy ZUV4Y2VwdGlvbiwgbXNnDQpBc3NlcnRpb25FcnJvcjogKDU2LCAnU29ja2V0 IGlzIGFscmVhZHkgY29ubmVjdGVkJykNCg0KPT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PQ0KRkFJTDogVGVzdGluZyBzZW5kYWxsKCkgd2l0aCBhIDIwNDgg Ynl0ZSBzdHJpbmcgb3ZlciBUQ1AuDQotLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tDQpUcmFjZWJhY2sgKG1vc3QgcmVjZW50IGNhbGwgbGFzdCk6DQogIEZp bGUgIi4uLy4uL0xpYi90ZXN0L3Rlc3Rfc29ja2V0LnB5IiwgbGluZSAzNzYs IGluIHRlc3RTZW5kQWxsDQogICAgc2VsZi5hc3NlcnRfKGxlbihyZWFkKSA9 PSAxMDI0LCAiRXJyb3IgcGVyZm9ybWluZyBzZW5kYWxsLiIpDQogIEZpbGUg IkY6L0RFVi9DVlNfVEVTVC9QWVRIT04tVEVTVC9MaWIvdW5pdHRlc3QucHki LCBsaW5lIDI2MiwgaW4gZmFpbFVubGVzcw0KICAgIGlmIG5vdCBleHByOiBy YWlzZSBzZWxmLmZhaWx1cmVFeGNlcHRpb24sIG1zZw0KQXNzZXJ0aW9uRXJy b3I6IEVycm9yIHBlcmZvcm1pbmcgc2VuZGFsbC4NCg0KPT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PQ0KRkFJTDogVGVzdGluZyBub24tYmxvY2tpbmcgYWNj ZXB0Lg0KLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLQ0KVHJhY2ViYWNrICht b3N0IHJlY2VudCBjYWxsIGxhc3QpOg0KICBGaWxlICIuLi8uLi9MaWIvdGVz dC90ZXN0X3NvY2tldC5weSIsIGxpbmUgNDU2LCBpbiB0ZXN0QWNjZXB0DQog ICAgc2VsZi5mYWlsKCJFcnJvciB0cnlpbmcgdG8gZG8gbm9uLWJsb2NraW5n 
IGFjY2VwdC4iKQ0KICBGaWxlICJGOi9ERVYvQ1ZTX1RFU1QvUFlUSE9OLVRF U1QvTGliL3VuaXR0ZXN0LnB5IiwgbGluZSAyNTQsIGluIGZhaWwNCiAgICBy YWlzZSBzZWxmLmZhaWx1cmVFeGNlcHRpb24sIG1zZw0KQXNzZXJ0aW9uRXJy b3I6IEVycm9yIHRyeWluZyB0byBkbyBub24tYmxvY2tpbmcgYWNjZXB0Lg0K DQotLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tDQpSYW4gMjYgdGVzdHMgaW4g MC4xMzBzDQoNCkZBSUxFRCAoZmFpbHVyZXM9MiwgZXJyb3JzPTMpDQp0ZXN0 IHRlc3Rfc29ja2V0IGZhaWxlZCAtLSBlcnJvcnMgb2NjdXJyZWQ7IHJ1biBp biB2ZXJib3NlIG1vZGUgZm9yIGRldGFpbHMNCjEgdGVzdCBmYWlsZWQ6DQog ICAgdGVzdF9zb2NrZXQNCg== ---888574994-8578-1026132781=:28004-- From gward@python.net Tue Jul 9 02:20:56 2002 From: gward@python.net (Greg Ward) Date: Mon, 8 Jul 2002 21:20:56 -0400 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) In-Reply-To: <3D299E42.70200@lemburg.com> References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com> Message-ID: <20020709012056.GA2526@cthulhu.gerg.ca> On 08 July 2002, M.-A. Lemburg said: > Not sure whether it's already possible or not, but I'd prefer > to keep the install command and have the package provide this > information (site-packages vs. system-packages) as part of the > setup.py or setup.cfg file. > > Perhaps we could have some kind of category for distutils > packages which marks them as system add-ons vs. site add-ons. +1 -- this should definitely be up to the package author/packager, not the local admin. I once tried to convince Guido that the ability to occasionally upgrade standard library modules/packages would be a good thing, but he wasn't having it. Any change of heart, O Mighty BDFL? Greg -- Greg Ward - Python bigot gward@python.net http://starship.python.net/~gward/ What the hell, go ahead and put all your eggs in one basket. 
From tdelaney@avaya.com Tue Jul 9 02:30:51 2002 From: tdelaney@avaya.com (Delaney, Timothy) Date: Tue, 9 Jul 2002 11:30:51 +1000 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) Message-ID: > From: martin@v.loewis.de [mailto:martin@v.loewis.de] > > "M.-A. Lemburg" writes: > > > Perhaps we could have some kind of category for distutils > > packages which marks them as system add-ons vs. site add-ons. > > One approach would be for distutils to have a list of system packages > built-in, depending on the Python release. +1 Arbitrary package authors shouldn't be able to state that their package is a system package - that should be up to the core team. Of course, this would require that distutils can be updated (to allow new system packages). I don't see much point in putting any more security in place than that though ... make it a bit difficult, so people don't bother trying to circumvent it. If someone wants to modify distutils themself, then there isn't going to be much anyone can do about it. Tim Delaney From jeremy@zope.com Mon Jul 8 23:41:27 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Mon, 8 Jul 2002 18:41:27 -0400 Subject: [Python-Dev] Are we having https/ssl problems? In-Reply-To: References: Message-ID: <15658.5399.500893.599454@slothrop.zope.com> >>>>> "KJ" == Kevin Jacobs writes: KJ> Hi all, This is not a bug report. It is more of a query to find KJ> out if there are known problems with the current Python 2.3 CVS KJ> regarding SSL, httplib w/ https, or urllib w/ https. I seem to KJ> remember tuning out some discussions on timeout sockets and SSL KJ> of late, so I thought I would ask. Here is code that has worked KJ> previously, but does not in the current CVS: It sounds like a bug to me. KJ> If this doesn't ring a bell with anyone, I will battle KJ> SourceForge once more and file a bug report. I hope you can get away with at worst a minor skirmish. 
I've made several changes to httplib recently to fix other SSL related problems. It appears the new code has some bugs. Since you're using CVS, I'll mention that it provides many ways to look for changes -- e.g. cvs log / annotate of individual files. The SF pages show all files, last revision, & mod time. You can sort by any of those fields. Jeremy From oren-py-d@hishome.net Tue Jul 9 06:18:33 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 9 Jul 2002 01:18:33 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> Message-ID: <20020709051833.GA32041@hishome.net> On Mon, Jul 08, 2002 at 02:44:25PM -0400, Tim Peters wrote: > [David Abrahams] > > I keep running into the problem that there is no reliable way to > > introspect about whether a type supports multi-pass iterability (in the > > sense that an input stream might support only a single pass, but a list > > supports multiple passes). I suppose you could check for __getitem__, but > > that wouldn't cover linked lists, for example. > > > > Can anyone channel Guido's intent for me? Is this an oversight or a > > deliberate design decision? Is there an interface for checking > > multi-pass-ability that I've missed? > > The language makes no such distinctions. If an app wants to make them, it's > up to the app to implement them. Likewise for a way to tell a multipass > iterator to "start over again". The Python iteration protocol has only two > methods, .next() to get "the next" item, and .__iter__() to return self; given a > random iterator, those are the only things you can rely on. I believe that when David was talking about multi-pass iterability he wasn't referring to an iterator that can be told to "start over again" but to an iterable object that can produce multiple independent iterators of itself, each one good for a single iteration.
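A minimal sketch of that difference, assuming modern Python 3 spelling (the builtin next() and the __next__ method rather than the .next() method discussed in this thread). The helper name looks_one_shot is hypothetical, and the check is only a heuristic: it cannot spot an iterable that hands out fresh iterators over a shared, consumable source.

```python
# Sketch only, modern Python 3 spelling; 'looks_one_shot' is a hypothetical name.

def looks_one_shot(obj):
    """Heuristic: an iterator returns itself from iter(); a container does not."""
    return iter(obj) is obj


numbers = [1, 2, 3]

# A list hands out a fresh, independent iterator on every iter() call.
first, second = iter(numbers), iter(numbers)
assert first is not second
assert list(first) == [1, 2, 3]
assert list(second) == [1, 2, 3]   # the second pass still sees everything

# An iterator is its own iterator, so a "second pass" finds it exhausted.
one_shot = iter(numbers)
assert iter(one_shot) is one_shot
assert list(one_shot) == [1, 2, 3]
assert list(one_shot) == []

print(looks_one_shot(numbers), looks_one_shot(iter(numbers)))
```

Nothing in the protocol itself promises that a second iter() call on an arbitrary object yields the same values again, which is why this stays a heuristic rather than an answer to the original question.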
The language does make a distinction between an *iterable* object that may have only an __iter__ method and an *iterator* that has a next method. This distinction is blurred a bit by the fact that iterators also have an __iter__ method that also makes them appear as one-shot iterables.

Imagine an alternative universe where a South African programmer called Rossu van Guidom writes a wonderful language called Mamba and in that language iterator semantics are defined like this:

* Objects that wish to be iterable define an __iter__() method returning an iterator.

* An iterator is an object with a next() method. That's all.

* The for statement checks if an object has an __iter__ method. If it does, it calls it and uses the returned iterator. If it doesn't, it tries to use the object itself. If it doesn't have .next either it will fail and report that the object is not iterable.

A Mamba programmer called Nero Hsorit has speculated in a mamba-dev posting that in an alternative universe in a language called 'Cobra' people kept getting confused between iterators and iterables :-)

Oren

From tim.one@comcast.net Tue Jul 9 06:47:13 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 09 Jul 2002 01:47:13 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <20020709051833.GA32041@hishome.net>
Message-ID:

[Oren Tirosh]
> I believe that when David was talking about multi-pass iterability he
> wasn't referring to an iterator that can be told to "start over again" but
> to an iterable object that can produce multiple independent iterators of
> itself, each one good for a single iteration.

To an excellent first approximation, it makes no difference. I read David the same as you, and that's where my "no such distinctions" came from.
The "likewise" in "Likewise for a way to tell a multipass iterator to 'start over again'" means "and in addition to what you asked about, not that either" (which is something other people have asked about, and more often than what David asked about). > The language does make a distinction between an *iterable* object that > may have only an __iter__ method and an *iterator* that has a next > method. Sure. At the wrapper-level David works at, all Python supplies here is PyObject_GetIter(x), which returns an iterator or a NULL, and in the former case the only useful thing he can do with it is call PyIter_Next() on it. There's simply no way for him to know whether calling PyObject_GetIter(x) again will yield an iterator that produces the same sequence of values, or even whether it will yield an iterator again at all. He could hardcode knowledge about a few types, like, e.g., the builtin list type, but that wouldn't even extend to subclasses of list; similarly a subclass of file may well fiddle its iterator to be multi-pass despite that the builtin file doesn't. > ... > A Mamba programmer called Nero Hsorit has speculated in a > mamba-dev posting that in an alternative universe in a language > called 'Cobra' people kept getting confused between iterators and > iterables :-) David can't get there from here with or without confusion. From greg@cosc.canterbury.ac.nz Tue Jul 9 06:51:42 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 09 Jul 2002 17:51:42 +1200 (NZST) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <20020709051833.GA32041@hishome.net> Message-ID: <200207090551.g695pgb18970@oma.cosc.canterbury.ac.nz> > Imagine an alternative universe where a south african programmer called > Rossu van Guidom writes a wonderful language called Mamba and in that > language iterator semantics are defined like this: > > * Objects that wish to be iterable define an __iter__() method returning an > iterator.
> > * An iterator is an object with a next() method. That's all. But that doesn't allow for things like file objects, which, although not iterators themselves, are capable of producing iterators of different sorts which iterate over them in different ways -- and yet they can only be iterated over once. In other words, there are such things as one-shot iterables, even if iterables and iterators are kept separate. Maybe a one-shot iterable should raise an exception if you try to obtain a second iterator from it? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From mal@lemburg.com Tue Jul 9 08:33:45 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 09 Jul 2002 09:33:45 +0200 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: Message-ID: <3D2A91D9.3080301@lemburg.com> Delaney, Timothy wrote: >>From: martin@v.loewis.de [mailto:martin@v.loewis.de] >> >>"M.-A. Lemburg" writes: >> >> >>>Perhaps we could have some kind of category for distutils >>>packages which marks them as system add-ons vs. site add-ons. >> >>One approach would be for distutils to have a list of system packages >>built-in, depending on the Python release. > > > +1 > > Arbitrary package authors shouldn't be able to state that their package is a > system package - that should be up to the core team. Hmm, I don't really see the need to make this more complicated. Package authors should be sensible enough to not create system packages unless these are actually part of the core or understood as optional but standard add-on (e.g. the Japanese codecs could be such an add-on). Besides, it's easy enough to achieve the same effect by subclassing the install command in your setup.py, so there is not much gained security there. 
The only advantage I see in Martin's approach is that it would seem backwards compatible, but then: installing a system package in a pre-2.3 system would not have the desired effect at all, so the gained backwards compatibility can't really be put to use. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From David Abrahams" <20020709051833.GA32041@hishome.net> Message-ID: <06ec01c22724$bf1d2760$6601a8c0@boostconsulting.com> From: "Oren Tirosh" > > I believe that when David was talking about multi-pass iterability he > wasn't referring to an iterator that can be told to "start over again" but > to an iterable object that can produce multiple independent iterators of > itself, each one good for a single iteration. That's right. > The language does make a distinction between an *iterable* object that may > have only an __iter__ method and an *iterator* that has a next method. This > distinction is blurred a bit by the fact that iterators also have an > __iter__ method that also makes them appear as one-shot iterables. Yep. [Part of the reason I want to know whether I've got a one-shot sequence is that inspecting that sequence then becomes an information-destroying operation -- only being able to touch it once changes how you have to handle it] I was thinking one potentially nice way to introspect about multi-pass-ability might be to get an iterator and to see whether it was copyable. Currently even most multi-pass iterators can't be copied with copy.copy(). -Dave From Paul.Moore@atosorigin.com Tue Jul 9 09:57:35 2002 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Tue, 9 Jul 2002 09:57:35 +0100 Subject: [Python-Dev] Single- vs. 
Multi-pass iterability Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> IIRC from earlier discussions on the list, iterators "by design" do not expose this information. In C++ terms, all Python iterators are forward iterators (I could argue here that it's the C++ usage of the word "iterator" for something that "points" and often does more than just "iterate" that is misleading, but that's off topic). If you need to know more than that, I think that the design intent is that you pass the *container* around, and get at the iterator via iter(container). Of course, this sort of begs the question as to how you can introspect a container, to determine what properties its iterators can have (but lets not go there - I can see Alex Martelli popping up to claim that the adaption PEP will let you do that :-)). But you do have a better chance, by requiring that the container support a richer interface, or just by type testing. Paul. From aleax@aleax.it Tue Jul 9 10:09:19 2002 From: aleax@aleax.it (Alex Martelli) Date: Tue, 9 Jul 2002 11:09:19 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> Message-ID: On Tuesday 09 July 2002 10:57 am, Moore, Paul wrote: > IIRC from earlier discussions on the list, iterators "by design" do not > expose this information. In C++ terms, all Python iterators are forward > iterators I think they're _input_ iterators -- you can only "get" items through the iterator, not "set" them (as you can with forward, but not input, iterators in C++). > can introspect a container, to determine what properties its iterators can > have (but lets not go there - I can see Alex Martelli popping up to claim > that the adaption PEP will let you do that :-)). 
But you do have a better The adaptation (not "adaption") PEP 246 would just obviate the need to invent yet one more infrastructure/plumbing ad-hoc "solution" here, but would not by itself alone solve the need to design and designate one or more protocols for "containers that yield augmented-iterators of kind X" or for augmented-iterators themselves ("iterator able to replicate itself", "iterator able to 'rewind'", "iterator to which you can write an item", etc). The first step in studying such a need is whether it IS in fact a need. Sure, "rich iterators" might come in handy, but do we NEED them...? If so, then what kinds of rich-iterators do we in fact need? How to get at them seems a third-order problem at best (and here, of course, I would suggest that adaptation IS good for this tertiary problem:-). > chance, by requiring that the container support a richer interface, or just > by type testing. *Shudder*. You're advocating MORE type-testing...? Alex From Paul.Moore@atosorigin.com Tue Jul 9 10:31:01 2002 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Tue, 9 Jul 2002 10:31:01 +0100 Subject: [Python-Dev] Single- vs. Multi-pass iterability Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B42B@UKRUX002.rundc.uk.origin-it.com> From: Alex Martelli [mailto:aleax@aleax.it] >On Tuesday 09 July 2002 10:57 am, Moore, Paul wrote: > >> IIRC from earlier discussions on the list, iterators "by >> design" do not expose this information. In C++ terms, all >> Python iterators are forward iterators > >I think they're _input_ iterators -- you can only "get" >items through the iterator, not "set" them (as you can with >forward, but not input, iterators in C++). Rats, you're right. I can never get the C++ terminology correct... >> can introspect a container, to determine what properties >> its iterators can have (but lets not go there - I can see >> Alex Martelli popping up to claim that the adaption PEP >> will let you do that :-)). 
But you do have a better > >The adaptation (not "adaption") PEP 246 would just obviate >the need to invent yet one more infrastructure/plumbing >ad-hoc "solution" here, but would not by itself alone solve >the need to design and designate one or more protocols >for "containers that yield augmented-iterators of kind X" >or for augmented-iterators themselves ("iterator able to >replicate itself", "iterator able to 'rewind'", "iterator >to which you can write an item", etc). Oh, I agree. Sorry, that was just an offhand comment without enough detail to make sense on its own. In a way, I was expressing mild support for the PEP as a general solution to "issues like this". >The first step in studying such a need is whether it IS >in fact a need. Sure, "rich iterators" might come in >handy, but do we NEED them...? If so, then what kinds of >rich-iterators do we in fact need? How to get at them >seems a third-order problem at best (and here, of course, >I would suggest that adaptation IS good for this tertiary >problem:-). Yes. I was taking as a given that if the original question had been asked, then there was at least a perceived need. And refining that need into a protocol is David's problem (should he want to go down that route). Of course, David has since clarified his original question - what he's really concerned about is telling whether calling next() on an iterator destroys information (as it does for a file iterator). That's a valid concern, but as I pointed out it's a property of the container, not of the iterator [and querying the container as to whether its iterators *have* that property is back to where we started]. I think a key issue here is that Python iterators are real objects, not "concepts" as they are in C++. But my brain isn't up to understanding *why* that issue is key...:-) [I knew I shouldn't have started this]. >> chance, by requiring that the container support a richer >> interface, or just by type testing. > >*Shudder*. 
You're advocating MORE type-testing...? Definitely not. I was trying to point out that there may be a hole, if type testing *is* the only answer. But the hole could easily be in my ability to think of a better solution (quite possible, as I don't have the problem myself). Paul. From David Abrahams" Message-ID: <075a01c2272c$4acae390$6601a8c0@boostconsulting.com> From: "Alex Martelli" > On Tuesday 09 July 2002 10:57 am, Moore, Paul wrote: > > IIRC from earlier discussions on the list, iterators "by design" do not > > expose this information. In C++ terms, all Python iterators are forward > > iterators > > I think they're _input_ iterators -- you can only "get" items through the > iterator, not "set" them (as you can with forward, but not input, > iterators in C++). C++ also has forward constant iterators which are not writable. Take the const_iterator type of your favorite singly-linked-list implementation for example. Unfortunately, C++ iterators mix up a bunch of concepts which ought to be orthogonal, like single-vs-multipass, whether they iterate over lvalues or must use a proxy, direction of iterability. Excellent paper on the topic at http://groups.yahoo.com/group/boost/files/iterator-categories.html. > The first step in studying such a need is whether it IS in fact a need. > Sure, "rich iterators" might come in handy, but do we NEED them...? > If so, then what kinds of rich-iterators do we in fact need? How to get > at them seems a third-order problem at best (and here, of course, I > would suggest that adaptation IS good for this tertiary problem:-). I don't know if we need them, but I'm certainly finding that not having some more information is difficult for me. If I need to make multiple passes over the information in a generalized iterable object, the only solution AFAICT is to unconditionally copy all the information into a list first. 
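A minimal sketch of that "copy everything into a list first" fallback, assuming Python 3; `fractions_of_total` is a hypothetical helper made up for illustration, not something from the thread:

```python
def fractions_of_total(iterable):
    """Make two passes over possibly one-shot input via an up-front copy."""
    snapshot = list(iterable)               # the only pass over the source itself
    total = sum(snapshot)                   # pass 1, over the copy
    return [x / total for x in snapshot]    # pass 2, over the copy


one_shot = (n for n in [1, 1, 2])           # a generator can only be consumed once
result = fractions_of_total(one_shot)
```

The copy costs memory proportional to the input, which is the price of not knowing whether the source could have been iterated twice.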
-Dave From aleax@aleax.it Tue Jul 9 11:03:40 2002 From: aleax@aleax.it (Alex Martelli) Date: Tue, 9 Jul 2002 12:03:40 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <075a01c2272c$4acae390$6601a8c0@boostconsulting.com> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <075a01c2272c$4acae390$6601a8c0@boostconsulting.com> Message-ID: On Tuesday 09 July 2002 11:37 am, David Abrahams wrote: > From: "Alex Martelli" > > > On Tuesday 09 July 2002 10:57 am, Moore, Paul wrote: > > > IIRC from earlier discussions on the list, iterators "by design" do not > > > expose this information. In C++ terms, all Python iterators are forward > > > iterators > > > > I think they're _input_ iterators -- you can only "get" items through the > > iterator, not "set" them (as you can with forward, but not input, > > iterators in C++). > > C++ also has forward constant iterators which are not writable. Take the > const_iterator type of your favorite singly-linked-list implementation for > example. Right, but then you can at least get the current item as many times as you want before advancing -- input iterators and Python's iterators have in common that get-and-advance is inseparable. > I don't know if we need them, but I'm certainly finding that not having > some more information is difficult for me. If I need to make multiple > passes over the information in a generalized iterable object, the only > solution AFAICT is to unconditionally copy all the information into a list > first. Yes, I can see that making such a copy willy-nilly could be a pity from the performance viewpoint when, theoretically, one could otherwise guarantee that the information is inalterable. But is it all that frequent that one can make such guarantees, e.g. that the underlying list or dictionary (if any) cannot possibly be altered (e.g. by other threads) during multiple iterations over it? I.e. 
it might not be enough to know that you can iterate again if needed -- you might also need some guarantee that further iterations yield identical information, and that, in turn, might prove more problematic in many cases (although maybe not in yours -- I don't know enough details to tell!-). So maybe using a "snapshot" strategy for the general case, and then maybe specialcasing and optimizing a very few performance hotspots where information CAN be guaranteed to be unchangeable and multiply iterable (if you can locate any such hotspots) isn't quite as bad as all that. Just musing... Alex From oren-py-d@hishome.net Tue Jul 9 12:21:36 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 9 Jul 2002 07:21:36 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <06ec01c22724$bf1d2760$6601a8c0@boostconsulting.com> References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020709051833.GA32041@hishome.net> <06ec01c22724$bf1d2760$6601a8c0@boostconsulting.com> Message-ID: <20020709112136.GA73672@hishome.net> On Tue, Jul 09, 2002 at 04:43:25AM -0400, David Abrahams wrote: > Yep. [Part of the reason I want to know whether I've got a one-shot > sequence is that inspecting that sequence then becomes an > information-destroying operation -- only being able to touch it once > changes how you have to handle it] > > I was thinking one potentially nice way to introspect about > multi-pass-ability might be to get an iterator and to see whether it was > copyable. Currently even most multi-pass iterators can't be copied with > copy.copy(). I wouldn't call it a one-shot sequence - it's just an iterator. The name iterator is enough to suggest that it is disposable and good for just one pass through the container. If the object has an __iter__ method but no next it's not an iterator and therefore most likely re-iterable. One notable exception is a file object. File iterators affect the current position of the file. 
If you think about it you'll see that file objects aren't really containers - they are already iterators. The real container is the file on the disk and the file object represents a pointer to a position in this container used for scanning it, which is a pretty good definition of an iterator. The difference is cosmetic: the next method is called readline and it returns an empty string instead of raising StopIteration.

class ifile(file):
    def __iter__(self):
        return self
    def next(self):
        s = self.readline()
        if s:
            return s
        raise StopIteration

class xfile:
    def __init__(self, filename):
        self.filename = filename
    def __iter__(self):
        return ifile(self.filename)

This pair of objects has a proper container/iterator relationship. The xfile (stands for eXternal file, nothing to do with Mulder and Scully) represents the file on the disk and each call to iter(xfileobject) returns a new and independent iterator of the same container.

	Oren

From David Abrahams" <20020709051833.GA32041@hishome.net> <06ec01c22724$bf1d2760$6601a8c0@boostconsulting.com> <20020709112136.GA73672@hishome.net>
Message-ID: <07bb01c2273f$2ed8ac90$6601a8c0@boostconsulting.com>

From: "Oren Tirosh"

> On Tue, Jul 09, 2002 at 04:43:25AM -0400, David Abrahams wrote:
> > Yep. [Part of the reason I want to know whether I've got a one-shot
> > sequence is that inspecting that sequence then becomes an
> > information-destroying operation -- only being able to touch it once
> > changes how you have to handle it]
> >
> > I was thinking one potentially nice way to introspect about
> > multi-pass-ability might be to get an iterator and to see whether it was
> > copyable. Currently even most multi-pass iterators can't be copied with
> > copy.copy().
>
> I wouldn't call it a one-shot sequence - it's just an iterator. The name
> iterator is enough to suggest that it is disposable and good for just one
> pass through the container.
>
> If the object has an __iter__ method but no next it's not an iterator and
> therefore most likely re-iterable. One notable exception is a file object.
> File iterators affect the current position of the file.

No kidding, that's the problem I'm talking about. It does me no good to have a criterion for determining re-iterability which fails for the case I'm most concerned with ;-)

> If you think about
> it you'll see that file objects aren't really containers - they are already
> iterators. The real container is the file on the disk

There might not be a "real container" -- if it's an input pipe the data disappears as you iterate it.

-Dave

From pinard@iro.umontreal.ca Tue Jul 9 13:14:38 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 09 Jul 2002 08:14:38 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <20020709112136.GA73672@hishome.net>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020709051833.GA32041@hishome.net> <06ec01c22724$bf1d2760$6601a8c0@boostconsulting.com> <20020709112136.GA73672@hishome.net>
Message-ID:

[Oren Tirosh]

> class ifile(file):
>     def __iter__(self):
>         return self
>     def next(self):
>         s = self.readline()
>         if s:
>             return s
>         raise StopIteration

> class xfile:
>     def __init__(self, filename):
>         self.filename = filename
>     def __iter__(self):
>         return ifile(self.filename)

> This pair of objects has a proper container/iterator relationship.

This is all clear to me, except for one little thing. I wonder why class `ifile' has an `__iter__' method itself. I know it is said to be the "iterator protocol", and I wonder why it has to be. My understanding is that `__iter__' returns an iterator all ready to be enquired a number of times through `.next()' calls, and I presume that if any re-initialisation has to take place, it is within `__iter__'.
However, as the iterator maintains its own progressive state, I do not see the intent and purpose of the iterator having an `__iter__' method itself. Would it make sense using the iterator `__iter__' as the preferred place where it re-initialises itself? -- François Pinard http://www.iro.umontreal.ca/~pinard From oren-py-d@hishome.net Tue Jul 9 13:27:19 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 9 Jul 2002 08:27:19 -0400 Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability In-Reply-To: References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020709051833.GA32041@hishome.net> <06ec01c22724$bf1d2760$6601a8c0@boostconsulting.com> <20020709112136.GA73672@hishome.net> Message-ID: <20020709122718.GA85236@hishome.net> On Tue, Jul 09, 2002 at 08:14:38AM -0400, François Pinard wrote: > This is all clear to me, except for one little thing. I wonder why class > `ifile' has an `__iter__' method itself. I know it is said to be the > "iterator protocol", and I wonder why it has to be. I don't like it either. In my previous message about the language 'Mamba' in an alternative universe I have an example of an alternative: if object has a tp_iter it is called, otherwise the object must have a tp_next. > My understanding is that `__iter__' returns an iterator all ready to be > enquired a number of times through `.next()' calls, and I presume that > if any re-initialisation has to take place, it is within `__iter__'. > However, as the iterator maintains its own progressive state, I do not see > the intent and purpose of the iterator having an `__iter__' method itself. > Would it make sense using the iterator `__iter__' as the preferred place > where it re-initialises itself? As far as I can tell this was done so that for could iterate over both iterables and iterators. I just don't see why it has to be done by all iterators instead of in just one place, adding much confusion between iterators and iterables in the process. 
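A minimal sketch of the convention described above, assuming Python 3 spelling (`__next__` instead of the 2002-era `next`); `Countdown` and `CountdownIterator` are hypothetical names used only for illustration:

```python
class CountdownIterator:
    """One-shot iterator: it holds the progressive state."""

    def __init__(self, start):
        self._current = start

    def __iter__(self):
        # Returning self is what lets a `for` loop accept an iterator
        # in the same place where it accepts an iterable.
        return self

    def __next__(self):
        if self._current <= 0:
            raise StopIteration
        self._current -= 1
        return self._current + 1


class Countdown:
    """Re-iterable container: each iter() call builds a fresh iterator."""

    def __init__(self, start):
        self._start = start

    def __iter__(self):
        return CountdownIterator(self._start)


container = Countdown(3)
first_pass = list(container)     # fresh iterator
second_pass = list(container)    # another fresh iterator, same values

iterator = iter(container)
drained = list(iterator)         # consumes the iterator
leftover = list(iterator)        # empty: iter(iterator) is the same object
```

Because the iterator's `__iter__` hands back the same object, it does not re-initialise anything; the second `list(iterator)` call simply finds the state exhausted.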
Oren From jacobs@penguin.theopalgroup.com Tue Jul 9 13:49:13 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Tue, 9 Jul 2002 08:49:13 -0400 (EDT) Subject: [Python-Dev] Are we having https/ssl problems? In-Reply-To: <15658.5399.500893.599454@slothrop.zope.com> Message-ID: On Mon, 8 Jul 2002, Jeremy Hylton wrote: > >>>>> "KJ" == Kevin Jacobs writes: > > KJ> Hi all, This is not a bug report. It is more of a query to find > KJ> out if there are known problems with the current Python 2.3 CVS > KJ> regarding SSL, httplib w/ https, or urllib w/ https. I seem to > KJ> remember tuning out some discussions on timeout sockets and SSL > KJ> of late, so I thought I would ask. Here is code that has worked > KJ> previously, but does not in the current CVS: > > It sounds like a bug to me. Now it is: http://sourceforge.net/tracker/index.php?func=detail&aid=579107&group_id=5470&atid=105470 > I've made several changes to httplib recently to fix other SSL related > problems. It appears the new code has some bugs. It looks like your change to rework the fake SSL file exposed this one. It has something to do with the non-trivial way that httplib closes connections. In several places it just looks wrong. I've attached a patch that fixes the problem, but may break other things, and points out some other potential problems. -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From barry@zope.com Tue Jul 9 14:10:24 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Tue, 9 Jul 2002 09:10:24 -0400 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com> <15657.51863.523283.977726@anthem.wooz.org> <3D29FD4F.4060607@lemburg.com> Message-ID: <15658.57536.133296.126976@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> Hmm, maybe I wasn't clear enough: I think that a distutils MAL> package should have a flag in its setup.py which lets MAL> distutils tell whether it's a site package or a system MAL> package, e.g. | setup(... pkgtype='site-package' ...) | vs. | setup(... pkgtype='system-package' ...) MAL> (with pkgtype='site-package' as default value if not given) MAL> The user would in both cases type 'python setup.py install' MAL> but the install command would automatically choose the MAL> right target subdir (site-packages/ or system-packages/). Except you can't always tell if its a system package or an add-on. email for example is an add-on for Python 2.1, but a system package for Python 2.2. Ignoring this specific example for now (since none of this will exist until Py2.3 anyway), it seems to me that there will be future packages for which this is true too. In that case, hardwiring site vs. system in the package's setup.py isn't the right approach. -Barry From mal@lemburg.com Tue Jul 9 14:28:52 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 09 Jul 2002 15:28:52 +0200 Subject: [Python-Dev] AtExit Functions Message-ID: <3D2AE514.2050909@lemburg.com> While working with mxTextTools 2.1.0b2, Mike Fletcher found that he gets a fatal error when the interpreter exits. Some tracing indicates that the cause is the at exit function of mxTextTools which clears the cache of tag tables used by the Tagging Engine in mxTextTools. 
If these tables include references to (callable) Python instances, Python can't properly clean them up when decref'ing them at AtExit time.

Would it be safe to simply move the call_dll_exitfunc() call just before the "clear threat" code in Py_Finalize() ?

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From fredrik@pythonware.com Tue Jul 9 14:50:26 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 9 Jul 2002 15:50:26 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
References: <3D220A86.5070003@lemburg.com><3D22ADD9.1030901@lemburg.com><15650.64375.162977.160780@anthem.wooz.org><3D2433B9.9080102@lemburg.com><15657.39558.325764.651122@anthem.wooz.org><3D299E42.70200@lemburg.com><15657.51863.523283.977726@anthem.wooz.org><3D29FD4F.4060607@lemburg.com> <15658.57536.133296.126976@anthem.wooz.org>
Message-ID: <006301c2274f$95cfabf0$0900a8c0@spiff>

barry wrote:

> MAL> The user would in both cases type 'python setup.py install'
> MAL> but the install command would automatically choose the
> MAL> right target subdir (site-packages/ or system-packages/).
>
> Except you can't always tell if its a system package or an add-on.
> email for example is an add-on for Python 2.1, but a system package
> for Python 2.2.

assuming that the package maintainer is informed when a package is added to the standard library, that packages won't move in and out too much, and/or that most users probably don't want to downgrade to an older package version, you could of course write:

    if sys.version_info >= (2, 2):
        pkgtype = "system-package"
    else:
        pkgtype = "site-package"

    setup(... pkgtype=pkgtype ...)

in your setup.py file, once your package has been added.
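The version check can be sketched as a small runnable helper. This is only an illustration: `pkgtype` is the keyword proposed in this thread, not an argument that distutils actually accepts, and `choose_pkgtype` and `in_stdlib_since` are hypothetical names:

```python
import sys


def choose_pkgtype(version_info, in_stdlib_since=(2, 2)):
    """Return "system-package" once the package ships with Python itself.

    `pkgtype` is the keyword proposed in this thread; it is hypothetical
    and is not passed to any real setup() call here.
    """
    if tuple(version_info[:2]) >= in_stdlib_since:
        return "system-package"
    return "site-package"


pkgtype = choose_pkgtype(sys.version_info)
```

Keeping the cut-over version as a parameter means one setup.py can serve packages that entered the standard library in different releases.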
From mal@lemburg.com Tue Jul 9 15:15:15 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 09 Jul 2002 16:15:15 +0200 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com><3D22ADD9.1030901@lemburg.com><15650.64375.162977.160780@anthem.wooz.org><3D2433B9.9080102@lemburg.com><15657.39558.325764.651122@anthem.wooz.org><3D299E42.70200@lemburg.com><15657.51863.523283.977726@anthem.wooz.org><3D29FD4F.4060607@lemburg.com> <15658.57536.133296.126976@anthem.wooz.org> <006301c2274f$95cfabf0$0900a8c0@spiff> Message-ID: <3D2AEFF3.5000600@lemburg.com> Fredrik Lundh wrote: > barry wrote: > > >> MAL> The user would in both cases type 'python setup.py install' >> MAL> but the install command would automatically choose the >> MAL> right target subdir (site-packages/ or system-packages/). >> >>Except you can't always tell if its a system package or an add-on. >>email for example is an add-on for Python 2.1, but a system package >>for Python 2.2. > > > assuming that the package maintainer is informed when a package > is added to the standard library, that packages won't move in and > out too much, and/or that most users probably don't want to down- > grade to an older package version, you could of course write: > > if sys.version_info >= (2, 2): > pkgtype = "system-package" > else: > pkgtype = "site-package" > > setup(... pkgtype=pkgtype ...) > > in your setup.py file, once your package has been added. Right. A package author whose package moves into the core would have to do this anyway, if s/he wants to maintain backwards compatibility with older Python versions, since the distutils package in those versions would not accept the new keyword. Anyway, regardless of how we do it, we need to add the 'system-packages' dir to just before the '.../lib/pythonX.X' entry in sys.path. If there's consent about this, I'd suggest to move ahead in this direction as first step. 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Tue Jul 9 15:15:45 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 09 Jul 2002 10:15:45 -0400 Subject: [Python-Dev] AtExit Functions In-Reply-To: Your message of "Tue, 09 Jul 2002 15:28:52 +0200." <3D2AE514.2050909@lemburg.com> References: <3D2AE514.2050909@lemburg.com> Message-ID: <200207091415.g69EFjT01619@odiug.zope.com> > While working with mxTextTools 2.1.0b2, Mike Fletcher found that > he gets a fatal error when the interpreter exits. > > Some tracing indicates that the cause is the at exit function > of mxTextTools which clears the cache of tag tables used by > the Tagging Engine in mxTextTools. > > If these tables include references to (callable) Python instances, > Python can't properly clean them up when decref'ing them at > AtExit time. > > Would it be safe to simply move the call_dll_exitfunc() > call just before the "clear threat" code in Py_Finalize() ? You mean call_ll_exitfuncs(). :-) I think you may be making a wrong use of Py_AtExit(). The docs state (since 1998): Since Python's internal finallization will have completed before the cleanup function, no Python APIs should be called by *func*. I don't think it's safe to move the call forward. (I don't know which line you are referring to with ``"clear threat" code'' so I don't know how far back you want to move it, but I think the intention is very clear that this should be done at the very last.) You may want to use the atexit.py module instead to schedule your module's cleanup action; these exit functions are called much earlier. 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From mwh@python.net Tue Jul 9 15:37:38 2002
From: mwh@python.net (Michael Hudson)
Date: 09 Jul 2002 15:37:38 +0100
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
Message-ID: <2mr8id0xxp.fsf@starship.python.net>

I'm curious about this bit of code from Lib/distutils/sysconfig.py:

,-----------------------------------------------------------------------
| # python_build: (Boolean) if true, we're either building Python or
| # building an extension with an un-installed Python, so we use
| # different (hard-wired) directories.
|
| argv0_path = os.path.dirname(os.path.abspath(sys.executable))
| landmark = os.path.join(argv0_path, "Modules", "Setup")
| if not os.path.isfile(landmark):
|     python_build = 0
| elif os.path.isfile(os.path.join(argv0_path, "Lib", "os.py")):
|     python_build = 1
| else:
|     python_build = os.path.isfile(os.path.join(os.path.dirname(argv0_path),
|                                                "Lib", "os.py"))
| del argv0_path, landmark
`-----------------------------------------------------------------------

Well, curious is a bit weak. It's broken, and breaks (eg) the snake farm's builds (because that is set up to build python in a directory far away and over the hills from the source directory).

Why isn't it just

,-----------------------------------------------------------------------
| # python_build: (Boolean) if true, we're either building Python or
| # building an extension with an un-installed Python, so we use
| # different (hard-wired) directories.
|
| argv0_path = os.path.dirname(os.path.abspath(sys.executable))
| landmark = os.path.join(argv0_path, "Modules", "Setup")
|
| python_build = os.path.isfile(landmark)
|
| del argv0_path, landmark
`-----------------------------------------------------------------------

? What cases does that get wrong? I'd have changed it already, but I have this feeling I must be missing something.

Cheers,
M.

--
  The meaning of "brunch" is as yet undefined.
-- Simon Booth, ucam.chat

From mal@lemburg.com Tue Jul 9 16:00:39 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 09 Jul 2002 17:00:39 +0200
Subject: [Python-Dev] AtExit Functions
References: <3D2AE514.2050909@lemburg.com> <200207091415.g69EFjT01619@odiug.zope.com>
Message-ID: <3D2AFA97.2030402@lemburg.com>

Guido van Rossum wrote:
>>While working with mxTextTools 2.1.0b2, Mike Fletcher found that
>>he gets a fatal error when the interpreter exits.
>>
>>Some tracing indicates that the cause is the at exit function
>>of mxTextTools which clears the cache of tag tables used by
>>the Tagging Engine in mxTextTools.
>>
>>If these tables include references to (callable) Python instances,
>>Python can't properly clean them up when decref'ing them at
>>AtExit time.
>>
>>Would it be safe to simply move the call_dll_exitfunc()
>>call just before the "clear threat" code in Py_Finalize() ?
>
>
> You mean call_ll_exitfuncs(). :-)

Yeah... today's my typo day :-)

> I think you may be making a wrong use of Py_AtExit(). The docs state
> (since 1998):
>
> Since Python's internal finalization will have completed before the
> cleanup function, no Python APIs should be called by *func*.

Hmm, and that includes Py_DECREF() and PyObject_Del() ?

In that case, I have a problem since I'm using those two
to clean up caches and free lists in the mx tools.

> I don't think it's safe to move the call forward. (I don't know which
> line you are referring to with ``"clear threat" code'' so I don't know
> how far back you want to move it, but I think the intention is very
> clear that this should be done at the very last.)

Here's the snippet:

void
Py_Finalize(void)
{
    PyInterpreterState *interp;
    PyThreadState *tstate;

    if (!initialized)
        return;

    /* The interpreter is still entirely intact at this point, and the
     * exit funcs may be relying on that. In particular, if some thread
     * or exit func is still waiting to do an import, the import machinery
     * expects Py_IsInitialized() to return true. So don't say the
     * interpreter is uninitialized until after the exit funcs have run.
     * Note that Threading.py uses an exit func to do a join on all the
     * threads created thru it, so this also protects pending imports in
     * the threads created via Threading.
     */
    call_sys_exitfunc();
    initialized = 0;

    /* Get current thread state and interpreter pointer */
    tstate = PyThreadState_Get();
    interp = tstate->interp;

    /* Disable signal handling */
    PyOS_FiniInterrupts();

    /* Cleanup Codec registry */
    _PyCodecRegistry_Fini();

    /* Destroy all modules */
    PyImport_Cleanup();

    /* Destroy the database used by _PyImport_{Fixup,Find}Extension */
    _PyImport_Fini();

----------------------------------
move call_ll_exitfuncs() here
----------------------------------

    /* Debugging stuff */
#ifdef COUNT_ALLOCS
    dump_counts();
#endif

#ifdef Py_REF_DEBUG
    fprintf(stderr, "[%ld refs]\n", _Py_RefTotal);
#endif

#ifdef Py_TRACE_REFS
    if (Py_GETENV("PYTHONDUMPREFS")) {
        _Py_PrintReferences(stderr);
    }
#endif /* Py_TRACE_REFS */

    /* Now we decref the exception classes. After this point nothing
       can raise an exception. That's okay, because each Fini() method
       below has been checked to make sure no exceptions are ever
       raised.
    */
    _PyExc_Fini();

    /* Delete current thread */
    PyInterpreterState_Clear(interp);
    PyThreadState_Swap(NULL);
    PyInterpreterState_Delete(interp);

    PyMethod_Fini();
    PyFrame_Fini();
    PyCFunction_Fini();
    PyTuple_Fini();
    PyString_Fini();
    PyInt_Fini();
    PyFloat_Fini();

#ifdef Py_USING_UNICODE
    /* Cleanup Unicode implementation */
    _PyUnicode_Fini();
#endif

    /* XXX Still allocated:
       - various static ad-hoc pointers to interned strings
       - int and float free list blocks
       - whatever various modules and libraries allocate
    */

    PyGrammar_RemoveAccelerators(&_PyParser_Grammar);

#ifdef PYMALLOC_DEBUG
    if (Py_GETENV("PYTHONMALLOCSTATS"))
        _PyObject_DebugMallocStats();
#endif

--------------------------------
    call_ll_exitfuncs();
--------------------------------

#ifdef Py_TRACE_REFS
    _Py_ResetReferences();
#endif /* Py_TRACE_REFS */
}

> You may want to use the atexit.py module instead to schedule your
> module's cleanup action; these exit functions are called much earlier.

That's difficult to get right since I have to register such
a function from C. Also, atexit.py is not present in Python 1.5.2.

I could probably use a hack in the module dictionary which
then triggers calling a cleanup function when the dictionary
gets cleared, but there's a problem with this: clearing
the module is easily possible for a user as well and doing
so would cause seg faults if the user continues to call
API on the module (maybe unknowingly through destructors).

Looks like the only way to "solve" the problem is by simply leaking
memory :-(

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From mal@lemburg.com Tue Jul 9 16:08:20 2002
From: mal@lemburg.com (M.-A.
Lemburg)
Date: Tue, 09 Jul 2002 17:08:20 +0200
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
References: <2mr8id0xxp.fsf@starship.python.net>
Message-ID: <3D2AFC64.3030400@lemburg.com>

Michael Hudson wrote:
> I'm curious about this bit of code from Lib/distutils/sysconfig.py:
>
> ,-----------------------------------------------------------------------
> | # python_build: (Boolean) if true, we're either building Python or
> | # building an extension with an un-installed Python, so we use
> | # different (hard-wired) directories.
> |
> | argv0_path = os.path.dirname(os.path.abspath(sys.executable))
> | landmark = os.path.join(argv0_path, "Modules", "Setup")
> | if not os.path.isfile(landmark):
> |     python_build = 0
> | elif os.path.isfile(os.path.join(argv0_path, "Lib", "os.py")):
> |     python_build = 1
> | else:
> |     python_build = os.path.isfile(os.path.join(os.path.dirname(argv0_path),
> |                                                "Lib", "os.py"))
> | del argv0_path, landmark
> `-----------------------------------------------------------------------
>
> Well, curious is a bit weak. It's broken, and breaks (eg) the snake
> farm's builds (because that is set up to build python in a directory
> far away and over the hills from the source directory).
>
> Why isn't it just
>
> ,-----------------------------------------------------------------------
> | # python_build: (Boolean) if true, we're either building Python or
> | # building an extension with an un-installed Python, so we use
> | # different (hard-wired) directories.
> |
> | argv0_path = os.path.dirname(os.path.abspath(sys.executable))
> | landmark = os.path.join(argv0_path, "Modules", "Setup")
> |
> | python_build = os.path.isfile(landmark)
> |
> | del argv0_path, landmark
> `-----------------------------------------------------------------------
>
> ? What cases does that get wrong? I'd have changed it already, but I
> have this feeling I must be missing something.

Is Modules/Setup a landmark on all Python build platforms,
e.g.
on Macs, Windows and other non-Unix platforms as well ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mwh@python.net Tue Jul 9 16:14:13 2002 From: mwh@python.net (Michael Hudson) Date: 09 Jul 2002 16:14:13 +0100 Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py In-Reply-To: "M.-A. Lemburg"'s message of "Tue, 09 Jul 2002 17:08:20 +0200" References: <2mr8id0xxp.fsf@starship.python.net> <3D2AFC64.3030400@lemburg.com> Message-ID: <2mu1n9ndbu.fsf@starship.python.net> "M.-A. Lemburg" writes: > Michael Hudson wrote: > > I'm curious about this bit of code from Lib/distutils/sysconfig.py: > > [...] > > ? What cases does that get wrong? I'd have changed it already, but I > > have this feeling I must be missing something. > > Is Modules/Setup a landmark on all Python build platforms, > e.g. on Macs, Windows and other non-Unix platforms as well ? Given that distutils isn't used for building Python's own extension modules on non-Unix platforms, does it matter? Cheers, M. -- ... Windows proponents tell you that it will solve things that your Unix system people keep telling you are hard. The Unix people are right: they are hard, and Windows does not solve them, ... -- Tim Bradshaw, comp.lang.lisp From fdrake@acm.org Tue Jul 9 19:02:35 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 9 Jul 2002 14:02:35 -0400 Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py In-Reply-To: <3D2AFC64.3030400@lemburg.com> References: <2mr8id0xxp.fsf@starship.python.net> <3D2AFC64.3030400@lemburg.com> Message-ID: <15659.9531.709381.202975@grendel.zope.com> M.-A. Lemburg writes: > Is Modules/Setup a landmark on all Python build platforms, > e.g. on Macs, Windows and other non-Unix platforms as well ? 
It's only on Unix as far as I'm aware.

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From martin@v.loewis.de Tue Jul 9 19:42:08 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 09 Jul 2002 20:42:08 +0200
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
In-Reply-To: <2mr8id0xxp.fsf@starship.python.net>
References: <2mr8id0xxp.fsf@starship.python.net>
Message-ID:

Michael Hudson writes:

> ? What cases does that get wrong? I'd have changed it already, but I
> have this feeling I must be missing something.

This looks overly complex to me, too, but you may want to ask the
author specifically:

revision 1.46
date: 2002/06/04 15:28:21; author: fdrake; state: Exp; lines: +23 -15
When using a Python that has not been installed to build 3rd-party
modules, distutils does not understand that the build version of the
source tree is needed. This patch fixes distutils.sysconfig to
understand that the running Python is part of the build tree and needs
to use the appropriate "shape" of the tree. This does not assume
anything about the current directory, so can be used to build
3rd-party modules using Python's build tree as well.

This is useful since it allows us to use a non-installed debug-mode
Python with 3rd-party modules for testing.

It has the side-effect that set_python_build() is no longer needed
(the hack which was added to allow distutils to be used to build the
"standard" extension modules).

Regards,
Martin

From guido@python.org Tue Jul 9 21:04:13 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 09 Jul 2002 16:04:13 -0400
Subject: [Python-Dev] Re: [Python-checkins] Using string methods in stdlib
In-Reply-To: Your message of "Wed, 26 Jun 2002 19:05:03 EDT." <3D1A489F.93B8942B@metaslash.com>
References: <3D1A362B.F3D2D585@metaslash.com> <003901c21d5f$3970af20$ced241d5@hagrid> <3D1A489F.93B8942B@metaslash.com>
Message-ID: <200207092004.g69K4Do03854@odiug.zope.com>

> (moved to python-dev and changed title)

The move didn't succeed.
But I'm moving this response there. [Ping] > > > > > > > > Update of /cvsroot/python/python/dist/src/Lib > > > > > > > > Modified Files: > > > > cgitb.py > > > > > > > > + if name[:1] == '_': continue [Neal] > > > Any reason not to use: > > > > > > if name.startswith('_'): continue > > > > > > ? [Fredrik] > > tried benchmarking? [Neal again] > I wasn't asking because of speed. I don't know > which version is faster and I couldn't care less. > I think using the method is clearer. startswith() was added because it was observed that there were (relatively) frequent bugs involving tests for s[:I] == C where len(C) != I, either due to miscounting or due to an edit of C without a matching edit of I. startswith() avoids all that. > > and figuring out that "_" is exactly one character long isn't > > that hard, really. > > I agree that for a single character either way is clear. Agreed too. The startswith() use case is for string long enough that you don't "see" the length immediately. Probably that means anything longer than 4. But in order to create good habits I think it's fine to use it in all cases. > > (can we please cut this python newspeak enforcement crap > > now, btw. even if slicing hadn't been much faster, there's > > nothing wrong with using an idiom that has worked perfectly > > fine for the last decade...) Maybe Neal is showing a bit too much of youthful enthusiasm for the new way. But I don't see it as enforcement crap. When I see Python code I wrote 10 years ago that works fine, I usually still think, "um, I wouldn't have written it that way now." If I think that about code that I feel is important as an example for later generations I like to fix it. We're trying to stay out of the modules that need to remain 1.5.2 compatible. > I thought the stdlib used startswith/endswith. But I did > a simple grep just to find where startswith could be used and > was surprised to find about 150 cases. 
Many are 1 char, > but there are many others of 5+ chars which make it harder > to determine immediately if the code is correct. > > I also see several cases of code like this in mimify: > > line[:len(prefix)] == prefix > > and other places where the length is calculated elsewhere, > (rlcompleter) making it even harder to verify correctness. > > Part of the reason to prefer the methods is for defensive programming. > There is duplicate information by using slicing (str & length) and > it's possible to change half the information and not the other, > leading to bugs. That's not possible with the methods. > > I don't think the stdlib should use every new feature. But I > do think it should reflect the best programming practices and > should be programmed defensively in order to try to avoid future bugs. I agree for new code, but I think we should be conservative in migrating existing code to use new idioms. It's better only to do that as part of a general overhaul of a module. As I've remarked before, I'm no big fan of "peephole" changes, where lots of modules are changed to implement one particular style change (e.g. string methods). Historically, such peephole changes have always introduced bugs because it's 99% boring work, and then you start making mistakes. Also, it leads to anachronisms where ancient code suddenly makes use of a modern feature but otherwise still looks ancient. --Guido van Rossum (home page: http://www.python.org/~guido/) From pinard@iro.umontreal.ca Tue Jul 9 22:54:40 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 09 Jul 2002 17:54:40 -0400 Subject: [Python-Dev] Re: Single- vs. 
Multi-pass iterability In-Reply-To: <20020709122718.GA85236@hishome.net> References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020709051833.GA32041@hishome.net> <06ec01c22724$bf1d2760$6601a8c0@boostconsulting.com> <20020709112136.GA73672@hishome.net> <20020709122718.GA85236@hishome.net> Message-ID: [Oren Tirosh] > [François Pinard] > > However, as the iterator maintains its own progressive state, I do not see > > the intent and purpose of the iterator having an `__iter__' method itself. > As far as I can tell this was done so that for could iterate over both > iterables and iterators. That is, that an iterator is always itself an iterable. I guess the real question is: could we have the guarantee that if an iterable returns an iterator through the iterable's __iter__, the iterator's __iter__ method will never be called from looping over the iterable? If we do not have that guarantee, then, when (and why) will the iterator's __iter__ be called? I did not find an answer to these questions neither from the Reference Manual nor the PEP, yet I confess that the exposition of the C API might hold an answer I could not understand. Could it be explained without referring to C? -- François Pinard http://www.iro.umontreal.ca/~pinard From greg@cosc.canterbury.ac.nz Wed Jul 10 00:54:25 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 10 Jul 2002 11:54:25 +1200 (NZST) Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability In-Reply-To: Message-ID: <200207092354.g69NsP726034@oma.cosc.canterbury.ac.nz> pinard@iro.umontreal.ca: > could we have the guarantee that if an iterable returns an > iterator through the iterable's __iter__, the iterator's __iter__ method > will never be called from looping over the iterable? [...pause while Greg's brain parses that sentence...] Yes, I believe that's true. 
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From skip@pobox.com Wed Jul 10 00:57:56 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 9 Jul 2002 18:57:56 -0500 Subject: [Python-Dev] anydbm/whichdb fix? Message-ID: <15659.30852.807030.896503@gargle.gargle.HOWL> Folks, Jack has been having trouble with dbm stuff on Mac OS X since my recent changes to setup.py and configure, and I am supposed to be shepherding a test case for the whichdb module. The two sort of go hand-in-hand. I've seen the same problem as Jack under certain circumstances on Linux. I reimplemented Greg Ball's whichdb.py patch and would appreciate some feedback from others who've crossed this bit of dirt in the past before I check in the two files (Jack, Guido, Barry, I seem to recall all of you having anydbm/whichdb problems at one point). The patch in question is at http://python.org/sf/541694 Skip From pinard@iro.umontreal.ca Wed Jul 10 02:05:58 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 09 Jul 2002 21:05:58 -0400 Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability In-Reply-To: <200207092354.g69NsP726034@oma.cosc.canterbury.ac.nz> References: <200207092354.g69NsP726034@oma.cosc.canterbury.ac.nz> Message-ID: [Greg Ewing] > pinard@iro.umontreal.ca: > > could we have the guarantee that if an iterable returns an > > iterator through the iterable's __iter__, the iterator's __iter__ method > > will never be called from looping over the iterable? > [...pause while Greg's brain parses that sentence...] (Sorry for my bad English.) > Yes, I believe that's true. 
If yes, then, the Library Reference is misleading, at the page: http://www.python.org/dev/doc/devel/lib/typeiter.html when it strongly says that any iterator's __iter__ method is "required". I guess this is a possible source of confusion. The context does not make it clear that the iterator's __iter__ method is *only* required whenever one *also* wants to use an iterator as an iterable. Better would be to describe __iter__ only once, the first time through, saying everything there that has to be said, and only retain for iterators the requirement of having a `next()' method. We should describe the truth. P.S. - Also, I do not understand the tiny bit about the `in' statement in the above page. Has `in' ever been a statement? If it refers to the comparison operator `in', then has it any special properties when used with iterators? I'm unsuccessful at seeing any hint about this from the documentation: http://www.python.org/dev/doc/devel/ref/comparisons.html -- François Pinard http://www.iro.umontreal.ca/~pinard From tim_one@email.msn.com Wed Jul 10 04:48:33 2002 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 9 Jul 2002 23:48:33 -0400 Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability In-Reply-To: Message-ID: [François Pinard] > If yes, then, the Library Reference is misleading, at the page: > > http://www.python.org/dev/doc/devel/lib/typeiter.html > > when it strongly says that any iterator's __iter__ method is "required". > I guess this is a possible source of confusion. This is how Python the language is defined. The current C Python implementation doesn't enforce it, and you may be able to get away with defining an iterator that doesn't supply an __iter__ method in some contexts under the current C implementation. If you do, though, you're breaking the rules, and there's no guarantee your code will continue to work. 
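A minimal sketch of what "getting away with it in some contexts" can look like, assuming a current CPython 3, where the method is spelled __next__ rather than the next() of 2.2. The class and function names are made up for illustration: a plain for loop over the iterable works even when the iterator lacks __iter__, but a routine that calls iter() on the iterator itself does not.

```python
class BareIterator:
    """Iterator that deliberately omits __iter__ (breaks the documented protocol)."""

    def __init__(self, start):
        self.current = start

    def __next__(self):  # spelled next() in the Python 2.2 under discussion
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


class FullIterator(BareIterator):
    """Same iterator, but following the protocol: __iter__ returns self."""

    def __iter__(self):
        return self


class Countdown:
    """Iterable whose __iter__ hands out a fresh iterator of the given class."""

    def __init__(self, start, iterator_class):
        self.start = start
        self.iterator_class = iterator_class

    def __iter__(self):
        return self.iterator_class(self.start)


def collect(iterable):
    """Gather items with a plain for loop; only the iterable's __iter__ is used."""
    items = []
    for item in iterable:
        items.append(item)
    return items


def accepts_any_iterable(arg):
    """Stand-in for a routine written by someone else that calls iter() on its argument."""
    return list(iter(arg))


if __name__ == "__main__":
    print(collect(Countdown(3, BareIterator)))    # the for loop never asks the iterator for __iter__
    print(accepts_any_iterable(FullIterator(2)))  # fine: the iterator is itself iterable
    try:
        accepts_any_iterable(BareIterator(2))
    except TypeError as exc:
        print("rejected:", exc)
```

Passing the bare iterator directly is the case that fails, which is why the protocol asks every iterator to supply __iter__ even though the for loop over Countdown never calls it.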
> The context does not make it clear that the iterator's __iter__ method is
> *only* required whenever one *also* wants to use an iterator as an
> iterable.

That's not how the iteration protocol is defined, and isn't how it
should be defined either. Requiring *some* method with a reserved name
is an aid to introspection, lest it become impossible to distinguish,
say, an iterator from an instance of a doubly-linked list node class
that just happens to supply methods named .prev() and .next() for an
unrelated purpose.

> Better would be to describe __iter__ only once, the first time through,
> saying everything there that has to be said, and only retain for iterators
> the requirement of having a `next()' method. We should describe
> the truth.

Except that iterators are required to have an __iter__ method: this is
a matter of definition, not just of reverse-engineering the minimum you
can get away with under the current implementation in assorted
contexts. You'll discover this the hard way the first time you try to
pass an iterator without an __iter__ method to a routine you didn't
write that says it accepts any iterable object as an argument. Such a
routine is entitled-- by the documented requirements --to rely on its
argument responding sensibly to an __iter__ message.

> P.S. - Also, I do not understand the tiny bit about the `in'
> statement in the above page. Has `in' ever been a statement?

I figure you're talking about this:

    ... to be used with the for and in statements

The tail end of that is indeed worded poorly; 'in' isn't a statement.

> If it refers to the comparison operator `in',

Yes, that's the intent.

> then has it any special properties when used with iterators?

In

    x in y

y can be any iterable object. As an extreme example,

    if "error\n" in file('msgs'):

From barry@zope.com Wed Jul 10 05:46:19 2002
From: barry@zope.com (Barry A.
Warsaw) Date: Wed, 10 Jul 2002 00:46:19 -0400 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com> <15657.51863.523283.977726@anthem.wooz.org> <3D29FD4F.4060607@lemburg.com> <15658.57536.133296.126976@anthem.wooz.org> <006301c2274f$95cfabf0$0900a8c0@spiff> <3D2AEFF3.5000600@lemburg.com> Message-ID: <15659.48155.786445.996056@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> Anyway, regardless of how we do it, we need to add the MAL> 'system-packages' dir to just before the '.../lib/pythonX.X' MAL> entry in sys.path. If there's consent about this, I'd suggest MAL> to move ahead in this direction as first step. +1 Perhaps also backport to 2.2 and (maybe? maybe not?) 2.1. -Barry From Jack.Jansen@cwi.nl Wed Jul 10 09:58:35 2002 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Wed, 10 Jul 2002 10:58:35 +0200 Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py In-Reply-To: <2mr8id0xxp.fsf@starship.python.net> Message-ID: <37A35504-93E3-11D6-AC50-0030655234CE@cwi.nl> On Tuesday, July 9, 2002, at 04:37 , Michael Hudson wrote: > Why isn't it just > > ,----------------------------------------------------------------------- > | # python_build: (Boolean) if true, we're either building Python or > | # building an extension with an un-installed Python, so we use > | # different (hard-wired) directories. 
> |
> | argv0_path = os.path.dirname(os.path.abspath(sys.executable))
> | landmark = os.path.join(argv0_path, "Modules", "Setup")
> |
> | python_build = os.path.isfile(landmark)
> |
> | del argv0_path, landmark
> `-----------------------------------------------------------------------

This won't work for one of the standard use cases: having multiple
"build" subdirectories of the source directory (where you build for
different platforms or some such).

And on the other question: as of a week ago setup.py is also being used
to build at least some of the MacPython extension modules. But as for
MacPython the build tree and the install tree are one and the same there
is no problem.

And as to a general solution to the problem: how about parsing the
Makefile that sits beside the interpreter? In all use cases (I think
also in your example of build directories very far away over the hills)
the Makefile will sit in the same directory as the interpreter. And the
Makefile will have the srcdir variable that points to the source
directory. And we have a makefile parser in distutils.

--
- Jack Jansen http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -

From mwh@python.net Wed Jul 10 10:56:03 2002
From: mwh@python.net (Michael Hudson)
Date: 10 Jul 2002 10:56:03 +0100
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
In-Reply-To: Jack Jansen's message of "Wed, 10 Jul 2002 10:58:35 +0200"
References: <37A35504-93E3-11D6-AC50-0030655234CE@cwi.nl>
Message-ID: <2mznwzsy8c.fsf@starship.python.net>

Jack Jansen writes:

> On Tuesday, July 9, 2002, at 04:37 , Michael Hudson wrote:
>
> > Why isn't it just
> >
> > ,-----------------------------------------------------------------------
> > | # python_build: (Boolean) if true, we're either building Python or
> > | # building an extension with an un-installed Python, so we use
> > | # different (hard-wired) directories.
> > |
> > | argv0_path = os.path.dirname(os.path.abspath(sys.executable))
> > | landmark = os.path.join(argv0_path, "Modules", "Setup")
> > |
> > | python_build = os.path.isfile(landmark)
> > |
> > | del argv0_path, landmark
> > `-----------------------------------------------------------------------
>
> This won't work for one of the standard use cases: having multiple
> "build" subdirectories of the source directory (where you build for
> different platforms or some such).

How so? It worked for my cron jobs last night, which build in this
fashion.

> And on the other question: as of a week ago setup.py is also being used
> to build at least some of the MacPython extension modules.

Is this MacPython as built by CodeWarrior? I'm counting MacOS X as
unix when it's convenient to do so :)

> But as for MacPython the build tree and the install tree are one and
> the same there is no problem.

Don't understand, sorry.

> And as to a general solution to the problem: how about parsing the
> Makefile that sits beside the interpreter?

If there's a Modules/Setup file, that's what we do.

> In all use cases (I think also in your example of build directories
> very far away over the hills) the Makefile will sit in the same
> directory as the interpreter.

So you're suggesting that we use the Makefile as the landmark instead
of Modules/Setup?

> And the Makefile will have the srcdir variable that points to the
> source directory. And we have a makefile parser in distutils.

That's in effect what happens now.

Cheers,
M.

--
I hate leaving Windows95 boxes publically accessible, so shifting
even to NT is a blessing in some ways. At least I can reboot
them remotely in a sane manner, rather than having to send them
malformed packets.
-- http://bofhcam.org/journal/journal.html, 20/06/2000 From mwh@python.net Wed Jul 10 10:56:38 2002 From: mwh@python.net (Michael Hudson) Date: 10 Jul 2002 10:56:38 +0100 Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py In-Reply-To: martin@v.loewis.de's message of "09 Jul 2002 20:42:08 +0200" References: <2mr8id0xxp.fsf@starship.python.net> Message-ID: <2mwus3sy7d.fsf@starship.python.net> martin@v.loewis.de (Martin v. Loewis) writes: > Michael Hudson writes: > > > ? What cases does that get wrong? I'd have changed it already, but I > > have this feeling I must be missing something. > > This looks overly complex to me, too, but you may want to ask the > author specifically: I already did; didn't you see the Cc: line in my first post? Cheers, M. -- Counting lines is probably a good idea if you want to print it out and are short on paper, but I fail to see the purpose otherwise. -- Erik Naggum, comp.lang.lisp From mwh@python.net Wed Jul 10 10:58:36 2002 From: mwh@python.net (Michael Hudson) Date: 10 Jul 2002 10:58:36 +0100 Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py In-Reply-To: "Fred L. Drake, Jr."'s message of "Tue, 9 Jul 2002 14:02:35 -0400" References: <2mr8id0xxp.fsf@starship.python.net> <3D2AFC64.3030400@lemburg.com> <15659.9531.709381.202975@grendel.zope.com> Message-ID: <2mu1n7sy43.fsf@starship.python.net> "Fred L. Drake, Jr." writes: > M.-A. Lemburg writes: > > Is Modules/Setup a landmark on all Python build platforms, > > e.g. on Macs, Windows and other non-Unix platforms as well ? > > It's only on Unix as far as I'm aware. Now how about answering the initial question? It's your code. pesteringly-ly y'rs, M. -- Sufficiently advanced political correctness is indistinguishable from irony. 
-- Erik Naggum From guido@python.org Wed Jul 10 11:23:36 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 10 Jul 2002 06:23:36 -0400 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) In-Reply-To: Your message of "Wed, 10 Jul 2002 00:46:19 EDT." <15659.48155.786445.996056@anthem.wooz.org> References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com> <15657.51863.523283.977726@anthem.wooz.org> <3D29FD4F.4060607@lemburg.com> <15658.57536.133296.126976@anthem.wooz.org> <006301c2274f$95cfabf0$0900a8c0@spiff> <3D2AEFF3.5000600@lemburg.com> <15659.48155.786445.996056@anthem.wooz.org> Message-ID: <200207101023.g6AANaT25347@pcp02138704pcs.reston01.va.comcast.net> > >>>>> "MAL" == M writes: > > MAL> Anyway, regardless of how we do it, we need to add the > MAL> 'system-packages' dir to just before the '.../lib/pythonX.X' > MAL> entry in sys.path. If there's consent about this, I'd suggest > MAL> to move ahead in this direction as first step. > > +1 > > Perhaps also backport to 2.2 and (maybe? maybe not?) 2.1. Smells like a new feature to me, so -1 on a 2.2 backport. I haven't seen enough of this thread to comment on this for 2.3. --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh@python.net Wed Jul 10 11:27:24 2002 From: mwh@python.net (Michael Hudson) Date: 10 Jul 2002 11:27:24 +0100 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib httplib.py,1.57,1.58 In-Reply-To: jhylton@users.sourceforge.net's message of "Tue, 09 Jul 2002 14:22:38 -0700" References: Message-ID: <2mk7o3sws3.fsf@starship.python.net> jhylton@users.sourceforge.net writes: > Update of /cvsroot/python/python/dist/src/Lib > In directory usw-pr-cvs1:/tmp/cvs-serv26945 > > Modified Files: > httplib.py > Log Message: > Fix for SF bug 579107. 
> > The recent SSL changes resulted in important, but subtle changes to > close() semantics. Since builtin socket makefile() is not called for > SSL connections, we don't get separately closeable fds for connection > and response. Comments in the code explain how to restore makefile > semantics. > > Bug fix candidate. I have a feeling that it was this checkin that broke test_pyclbr. Certainly, something did. Perhaps this module could do with a better test? Cheers, M. -- Get out your salt shakers folks, this one's going to take more than one grain. -- Ator in an Ars Technica news item From mwh@python.net Wed Jul 10 11:36:03 2002 From: mwh@python.net (Michael Hudson) Date: 10 Jul 2002 11:36:03 +0100 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib httplib.py,1.57,1.58 In-Reply-To: Michael Hudson's message of "10 Jul 2002 11:27:24 +0100" References: <2mk7o3sws3.fsf@starship.python.net> Message-ID: <2mfzyrswdo.fsf@starship.python.net> Michael Hudson writes: > jhylton@users.sourceforge.net writes: [...] > > Modified Files: > > httplib.py [...] > > I have a feeling that it was this checkin that broke test_pyclbr. > > Certainly, something did. Oh, Tim fixed it already. > Perhaps this module could do with a better test? This still stands, though. Cheers, M. -- NUTRIMAT: That drink was individually tailored to meet your personal requirements for nutrition and pleasure. ARTHUR: Ah. So I'm a masochist on a diet am I? -- The Hitch-Hikers Guide to the Galaxy, Episode 9 From fdrake@acm.org Wed Jul 10 12:55:03 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Wed, 10 Jul 2002 07:55:03 -0400
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
In-Reply-To: <37A35504-93E3-11D6-AC50-0030655234CE@cwi.nl>
References: <2mr8id0xxp.fsf@starship.python.net> <37A35504-93E3-11D6-AC50-0030655234CE@cwi.nl>
Message-ID: <15660.8343.626355.368514@grendel.zope.com>

Jack Jansen writes:
> This won't work for one of the standard use cases: having multiple
> "build" subdirectories of the source directory (where you build for
> different platforms or some such).

Actually, it does support this case; Modules/Setup is relative to the
interpreter, and will have been created on Unix.

> And as to a general solution to the problem: how about parsing the
> Makefile that sits beside the interpreter? In all use cases (I think
> also in your example of build directories very far away over the hills)
> the Makefile will sit in the same directory as the interpreter. And the
> Makefile will have the srcdir variable that points to the source
> directory. And we have a makefile parser in distutils.

Does this work on Windows? Parsing the Makefile isn't a problem, but
I don't think there is one in the MSVC build.

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From wdraxinger@darkstargames.de Wed Jul 10 12:59:44 2002
From: wdraxinger@darkstargames.de (Wolfgang Draxinger)
Date: Wed, 10 Jul 2002 13:59:44 +0200
Subject: [Python-Dev] Embedding Python the extreme way
Message-ID: <3D2C21B0.4040108@darkstargames.de>

For my current 3D engine project I decided to use Python as an
important part of the whole design. And it works well. However, now I
want to make python an integral part of the engine, not just an external
lib. And I not want to statically link it with the engine. My goal is,
to discard all modules and builtin functions that I don't need, e.g.
sys. It's intended to control the 3D engine, not to write complex
scripts.

Can anybody give me some advice for that.
Embedding Python and writing extension modules is no problem at all, just "cleaning" the python sources. My base is Python 2.2.1 -- +------------------------------------------------+ | +----------------+ WOLFGANG DRAXINGER | | | ,-. DARKSTAR | lead programmer | | |( ) +---------+ wdraxinger@darkstargames.de | | | `-' / GAMES / | | +----+'''''''' http://www.darkstargames.de | +------------------------------------------------+ From mwh@python.net Wed Jul 10 15:17:57 2002 From: mwh@python.net (Michael Hudson) Date: 10 Jul 2002 15:17:57 +0100 Subject: [Python-Dev] The C API and wide unicode support Message-ID: <2mr8ibzmy2.fsf@starship.python.net> It may be best to allow this particular dead horse to go on being dead, but I thought I'd ask here. Beats work, anyway. Picture the situation: you're wrapping a C library that returns a unicode string (let's say encoded as UCS-2). You want to return this as a Python object. So you'd think you can write return PyUnicode_Decode(encstr, "ucs-2", NULL); (or something close to that). But for reasons that escape me, PyUnicode_Decode is included in the API renaming in Include/unicodeobject.h, so if you want to provide binaries you have to provide two, and you can be sure that users will have no idea which they need. So, questions: (1) am I correct in thinking that PyUnicode_Decode (and a bunch of others) could safely be omitted from the renaming? (2) if so, is it worth omitting those APIs that could be omitted for 2.3? This train of thinking came about because the version of 2.2 that comes with Redhat 7.3 is compiled with wide unicode support (which surprised me), and so the pygame RPMs broke. Cheers, M. -- Any form of evilness that can be detected without *too* much effort is worth it... I have no idea what kind of evil we're looking for here or how to detect is, so I can't answer yes or no. 
-- Guido Van Rossum, python-dev From walter@livinglogic.de Wed Jul 10 15:57:16 2002 From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=) Date: Wed, 10 Jul 2002 16:57:16 +0200 Subject: [Python-Dev] The C API and wide unicode support References: <2mr8ibzmy2.fsf@starship.python.net> Message-ID: <3D2C4B4C.6050204@livinglogic.de> Michael Hudson wrote: > It may be best to allow this particular dead horse to go on being > dead, but I thought I'd ask here. Beats work, anyway. > > Picture the situation: you're wrapping a C library that returns a > unicode string (let's say encoded as UCS-2). You want to return this > as a Python object. So you'd think you can write > > return PyUnicode_Decode(encstr, "ucs-2", NULL); There is no "ucs-2" encoding. This should be "utf-16", "utf-16-le" or "utf-16-be". > (or something close to that). But for reasons that escape me, > PyUnicode_Decode is included in the API renaming in > Include/unicodeobject.h, so if you want to provide binaries you have > to provide two, and you can be sure that users will have no idea which > they need. > > So, questions: > > (1) am I correct in thinking that PyUnicode_Decode (and a bunch of > others) could safely be omitted from the renaming? No, because the unicode objects generated will consist of either UCS-2 or UCS-4 "characters". This has nothing to do with the encoding of the byte array which you use to create the unicode object. Any C function that uses Unicode objects in any way needs name mangling, because the storage layout of the Unicode objects changes. > (2) if so, is it worth omitting those APIs that could be omitted for 2.3? > > This train of thinking came about because the version of 2.2 that > comes with Redhat 7.3 is compiled with wide unicode support (which > surprised me), and so the pygame RPMs broke. I don't know, probably because sizeof(wchar_t)==4 ? 
Bye, Walter Dörwald From guido@python.org Wed Jul 10 16:02:53 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 10 Jul 2002 11:02:53 -0400 Subject: [Python-Dev] The C API and wide unicode support In-Reply-To: Your message of "Wed, 10 Jul 2002 16:57:16 +0200." <3D2C4B4C.6050204@livinglogic.de> References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> Message-ID: <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> > Any C function that uses Unicode objects in any way needs name > mangling, because the storage layout of the Unicode objects > changes. Really? If I am only using the published APIs and not peeking directly inside the Unicode object, why should I care about its internal lay-out? Shouldn't only functions whose signature uses PY_UNICODE_TYPE be name-mangled? What am I missing? --Guido van Rossum (home page: http://www.python.org/~guido/) From walter@livinglogic.de Wed Jul 10 16:25:09 2002 From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=) Date: Wed, 10 Jul 2002 17:25:09 +0200 Subject: [Python-Dev] The C API and wide unicode support References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D2C51D5.8060000@livinglogic.de> Guido van Rossum wrote: >>Any C function that uses Unicode objects in any way needs name >>mangling, because the storage layout of the Unicode objects >>changes. > > > Really? If I am only using the published APIs and not peeking > directly inside the Unicode object, why should I care about its > internal lay-out? That's what I meant with "using". Function that only pass unicode objects around don't need to know (as long as they pass the objects only to functions that themselves either "know" or "don't need to know" the layout). PyUnicode_Decode creates unicode objects, so I guess it needs to know. 
> Shouldn't only functions whose signature uses PY_UNICODE_TYPE be > name-mangled? What am I missing? What about the functions that use the C macros (PyUnicode_AS_UNICODE etc.) directly or indirectly? Those functions will rely on the internal lay-out. Bye, Walter Dörwald From mwh@python.net Wed Jul 10 16:29:14 2002 From: mwh@python.net (Michael Hudson) Date: 10 Jul 2002 16:29:14 +0100 Subject: [Python-Dev] The C API and wide unicode support In-Reply-To: =?ISO-8859-15?Q?Walter_D=F6rwald?='s message of "Wed, 10 Jul 2002 17:25:09 +0200" References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> Message-ID: <2m1yabiotx.fsf@starship.python.net> =?ISO-8859-15?Q?Walter_D=F6rwald?= writes: > Guido van Rossum wrote: > > >>Any C function that uses Unicode objects in any way needs name > >>mangling, because the storage layout of the Unicode objects > >>changes. > > > > > > Really? If I am only using the published APIs and not peeking > > directly inside the Unicode object, why should I care about its > > internal lay-out? > > That's what I meant with "using". Function that only pass > unicode objects around don't need to know (as long as they pass > the objects only to functions that themselves either "know" > or "don't need to know" the layout). > > PyUnicode_Decode creates unicode objects, so I guess it needs > to know. *It* needs to know, yes. But surely the caller doesn't? > > Shouldn't only functions whose signature uses PY_UNICODE_TYPE be > > name-mangled? What am I missing? > > What about the functions that use the C macros (PyUnicode_AS_UNICODE > etc.) directly or indirectly? Those functions will rely on the > internal lay-out. They're verboten in extension modules anyway, so I don't care. Cheers, M. 
-- Like most people, I don't always agree with the BDFL (especially when he wants to change things I've just written about in very large books), ... -- Mark Lutz, http://python.oreilly.com/news/python_0501.html From walter@livinglogic.de Wed Jul 10 17:00:17 2002 From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=) Date: Wed, 10 Jul 2002 18:00:17 +0200 Subject: [Python-Dev] The C API and wide unicode support References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> Message-ID: <3D2C5A11.6020501@livinglogic.de> Michael Hudson wrote: > =?ISO-8859-15?Q?Walter_D=F6rwald?= writes: > >>Guido van Rossum wrote: >> >> >>>>Any C function that uses Unicode objects in any way needs name >>>>mangling, because the storage layout of the Unicode objects >>>>changes. >>> >>> >>>Really? If I am only using the published APIs and not peeking >>>directly inside the Unicode object, why should I care about its >>>internal lay-out? >> >>That's what I meant with "using". Function that only pass >>unicode objects around don't need to know (as long as they pass >>the objects only to functions that themselves either "know" >>or "don't need to know" the layout). >> >>PyUnicode_Decode creates unicode objects, so I guess it needs >>to know. > > *It* needs to know, yes. But surely the caller doesn't? This depends on what the caller does with the result of PyUnicode_Decode. >>>Shouldn't only functions whose signature uses PY_UNICODE_TYPE be >>>name-mangled? What am I missing? >> >>What about the functions that use the C macros (PyUnicode_AS_UNICODE >>etc.) directly or indirectly? Those functions will rely on the >>internal lay-out. > > They're verboten in extension modules anyway, so I don't care. I didn't know that. Neither Include/unicodeobject.h nor Doc/api/concrete.tex mention it. 
Is there any other location where this is mentioned?

I think to forbid the use of the macros is too restrictive.
What if I want to implement a version of

    foo.replace(u"&", u"&amp;")
       .replace(u"<", u"&lt;")
       .replace(u"\"", u"&quot;")
       .replace(u">", u"&gt;")

in C for performance reasons? How is this possible without
using the C macros?

And if extension modules are not allowed to access the internal
layout of unicode objects, what's the use of name mangling?

Bye,
   Walter Dörwald

From mal@lemburg.com Wed Jul 10 18:19:12 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 10 Jul 2002 19:19:12 +0200
Subject: [Python-Dev] The C API and wide unicode support
References: <2mr8ibzmy2.fsf@starship.python.net>
 <3D2C4B4C.6050204@livinglogic.de>
 <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net>
 <3D2C51D5.8060000@livinglogic.de>
 <2m1yabiotx.fsf@starship.python.net>
Message-ID: <3D2C6C90.7090006@lemburg.com>

Michael Hudson wrote:
>>>Shouldn't only functions whose signature uses PY_UNICODE_TYPE be
>>>name-mangled? What am I missing?
>>
>>What about the functions that use the C macros (PyUnicode_AS_UNICODE
>>etc.) directly or indirectly? Those functions will rely on the
>>internal lay-out.
>
> They're verboten in extension modules anyway, so I don't care.

They are not disallowed in extensions... don't know where you
have that idea from.

Note that the name mangling is done to prevent an extension
which uses Unicode in some way from loading if the interpreter
and extension Unicode "width" doesn't match.

If we would allow this, extensions using the macros would cause
memory corruption since they'd index differently. That's not only
a potential cause for a seg fault, it's also a security risk.

The name mangling does not provide a 100% bullet proof way
of preventing this (an extension might use Py_UNICODE and
the Unicode macros without touching any of the other C APIs),
but it goes a long way in that direction.
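
The width mismatch that M.-A. Lemburg describes can also be checked from the Python side before a binary extension is loaded. What follows is only a minimal sketch and not code from the thread: it assumes that sys.maxunicode reflects the Py_UNICODE range of the running interpreter (0xFFFF on a narrow build), and the function name unicode_build_width is made up for illustration. On Python 3.3 and later every build reports the full range, so the function can only return 4 there.

```python
import sys


def unicode_build_width():
    """Return the assumed size in bytes of one Py_UNICODE unit.

    Hypothetical helper: 2 means a narrow (UCS-2) build,
    4 means a wide (UCS-4) build.
    """
    # Narrow builds cap code points at 0xFFFF; anything larger is wide.
    return 2 if sys.maxunicode == 0xFFFF else 4


width = unicode_build_width()
print("this interpreter uses %d-byte Unicode units" % width)
```

An installer script could compare this value with the width a prebuilt extension was compiled for and pick the matching binary, instead of leaving the choice to the user.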
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From aahz@pythoncraft.com Wed Jul 10 18:28:57 2002 From: aahz@pythoncraft.com (Aahz) Date: Wed, 10 Jul 2002 13:28:57 -0400 Subject: [Python-Dev] Embedding Python the extreme way In-Reply-To: <3D2C21B0.4040108@darkstargames.de> References: <3D2C21B0.4040108@darkstargames.de> Message-ID: <20020710172857.GA5093@panix.com> python-dev is the wrong place for this discussion. Please post your message to comp.lang.python or look on www.python.org for other resources if you think that's not suitable. On Wed, Jul 10, 2002, Wolfgang Draxinger wrote: > > For my current 3D engine project I decided to use Python as a important > part of the whole design. And it works well. > However, now I want to make python an integal part of the engine, not > just a external lib. And I not want to statically link it with the > engine. My goal is, to discard all modules and builtin functions that I > don't need, e.g. sys. It's intended to control the 3D engine, not to > write complex scripts. > > Can anybody give me some advice for that. Embedding Python and writing > extension modules is no problem at all, just "cleaning" the python > sources. My base is Python 2.2.1 -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From tim@zope.com Wed Jul 10 19:00:42 2002 From: tim@zope.com (Tim Peters) Date: Wed, 10 Jul 2002 14:00:42 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib httplib.py,1.57,1.58 In-Reply-To: <2mk7o3sws3.fsf@starship.python.net> Message-ID: >> Modified Files: >> httplib.py >> Log Message: >> Fix for SF bug 579107. [Michael Hudson] > ... > I have a feeling that it was this checkin that broke test_pyclbr. Yes. > ... 
> Perhaps this module could do with a better test? "This module" is ambiguous given that two modules are involved, but it's hard to disagree either way <0.9 wink>. The change to httplib that broke test_pyclbr should not, of course, have been checked in in that state regardless. Whatever, it's fixed now. From guido@python.org Wed Jul 10 19:26:28 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 10 Jul 2002 14:26:28 -0400 Subject: [Python-Dev] The C API and wide unicode support In-Reply-To: Your message of "Wed, 10 Jul 2002 19:19:12 +0200." <3D2C6C90.7090006@lemburg.com> References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> <3D2C6C90.7090006@lemburg.com> Message-ID: <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net> > > They're verboten in extension modules anyway, so I don't care. > > They are not disallowed in extensions... don't know where you > have that idea from. Maybe because other macros are often disallowed in (3rd party) extensions, the reason being that the macros dig in the internal representation which isn't guaranteed to be binary compatible? It would make sense that the same rules applies to the Unicode macros in 3rd party extensions. (I admit that these restrictions may be underdocumented. Nevertheless they were intended and I believe they were discussed.) > Note that the name mangling is done to prevent an extension > which uses Unicode in some way from loading if the interpreter > and extension Unicode "width" doesn't match. > > If we would allow this, extensions using the macros would cause > memory corruption since they'd index differently. That's not only > a potential cause for a seg fault, it's also a security risk. 
If there was a way so that only extensions that use the macros or APIs whose signature uses Py_UNICODE_TYPE would fail to load, that would be better. But I don't know how to enforce that. > The name mangling does not provide a 100% bullet proof way > of preventing this (an extension might use Py_UNICODE and > the Unicode macros without touching any of the other C APIs), > but it goes a long way in that direction. Maybe it goes too far. OTOH, Michael, is this really something you cannot live with? Or is it simply a surprise? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jul 10 20:26:55 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 10 Jul 2002 15:26:55 -0400 Subject: [Python-Dev] Provide a Python wrapper for any new C extension In-Reply-To: Your message of "Fri, 21 Jun 2002 14:18:39 PDT." References: Message-ID: <200207101926.g6AJQtg27630@pcp02138704pcs.reston01.va.comcast.net> > The only obvious objection I can see to this is a performance hit for > having to go through the Python stub to call the C extension. But I just > did a very simple test of calling strftime('%c') 25,000 times from time > directly and using a Python stub and it was .470 and .490 secs total > respectively according to profile.run(). If the Python module does "from _Cmodule import *", there should be *no* difference in performance, since you get the same object in either case. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jul 10 20:25:22 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 10 Jul 2002 15:25:22 -0400 Subject: [Python-Dev] Provide a Python wrapper for any new C extension In-Reply-To: Your message of "Fri, 21 Jun 2002 20:31:27 BST." 
<5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> Message-ID: <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> [Hamish Lawson] > One of the arguments put forward against renaming the existing time > module to _time (as part of incorporating a pure-Python strptime > function) is that it could break some builds. Therefore I'd suggest > that it could be a useful principle for any C extension added in the > future to the standard library to have an accompanying pure-Python > wrapper that would be the one that client code would usually import. There are too many distinct use cases to make this a hard and fast rule. The problem with maintaining many builds is best served by keeping the number of extensions small, period. [Marc-Andre Lemburg] > BTW, this reminds me of the old idea to move that standard > lib into a package, eg. 'python'... > > from python import time. Maybe in Python 3000. In 2.x, I think rearranging the standard library will just cause more upheaval without much benefits. > We should at least reserve such a name RSN so that we don't > run into problems later on. I can guarantee you that that name won't be used as a standard Python module or package name any time soon. If someone creates a 3rd party package or module named 'python' I'd question their sanity. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From Jack.Jansen@oratrix.com Wed Jul 10 21:17:19 2002 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Wed, 10 Jul 2002 22:17:19 +0200 Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py In-Reply-To: <2mznwzsy8c.fsf@starship.python.net> Message-ID: <0938EF98-9442-11D6-B6F5-003065517236@oratrix.com> On woensdag, juli 10, 2002, at 11:56 , Michael Hudson wrote: >> This won't work for one of the standard use cases: having multiple >> "build" subdirectories of the source directory (where you build for >> different platforms or some such). 
> > How so? It worked for my cron jobs last night, which build in this > fashion. You're right, of course. I was confusing this with the sys.path initialization code. I think there's absolutely nothing wrong with your solution. > >> And on the other question: as of a week ago setup.py is also >> being used >> to build at least some of the MacPython extension modules. > > Is this MacPython as built by CodeWarrior? I'm counting MacOS X as > unix when it's convenient to do so :) Correct. Over on the pythonmac-sig the use of "MacPython" is (for the time being) reserved to mean the CodeWarrior-built Python that will also run on OS9. MachoPython is used for the OSX-only unix Python. And, as I said (or try to say:-), "don't worry about MacPython builds, they'll continue to work. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From martin@v.loewis.de Wed Jul 10 21:32:11 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 10 Jul 2002 22:32:11 +0200 Subject: [Python-Dev] Embedding Python the extreme way In-Reply-To: <3D2C21B0.4040108@darkstargames.de> References: <3D2C21B0.4040108@darkstargames.de> Message-ID: Wolfgang Draxinger writes: > Can anybody give me some advice for that. Not on this list, which is for the development *of* Python, not for the development *with* Python. Regards, Martin From martin@v.loewis.de Wed Jul 10 21:39:55 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 10 Jul 2002 22:39:55 +0200 Subject: [Python-Dev] The C API and wide unicode support In-Reply-To: <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > Really? If I am only using the published APIs and not peeking > directly inside the Unicode object, why should I care about its > internal lay-out? 
The safeguard is to tell apart module that use Unicode objects from modules which don't. If a module uses Unicode objects, it might be using PyUnicode_AS_UNICODE. Unfortunately, this does not result in a symbol reference, so using a module that only uses PyUnicode_AS_UNICODE would break if it was compiled for the wrong width of Py_UNICODE. Mangling all Unicode functions is the best safeguard we could find to protect against this case. It is still possible to cheat that, but it is unlikely that somebody breaks the safeguard by accident. Likewise, it is unlikely that a single platform has builds for two different Py_UNICODE sizes simultaneously, so the safeguard does not add additional burden, either. Regards, Martin From martin@v.loewis.de Wed Jul 10 21:44:03 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 10 Jul 2002 22:44:03 +0200 Subject: [Python-Dev] The C API and wide unicode support In-Reply-To: <2mr8ibzmy2.fsf@starship.python.net> References: <2mr8ibzmy2.fsf@starship.python.net> Message-ID: Michael Hudson writes: > (or something close to that). But for reasons that escape me, > PyUnicode_Decode is included in the API renaming in > Include/unicodeobject.h, so if you want to provide binaries you have > to provide two, and you can be sure that users will have no idea which > they need. That is not true. One option is to provide the sources; if you do so, you do not need to provide binaries at all (thanks to distutils (*)). Another option is to provide binaries for the default installation only, which will be UCS-2. Nobody will notice. Regards, Martin (*) If distutils is unacceptable, it is probably because it requires users to have a C compiler. In that case, you are probably targeting Win32. In that case, you can be certain how the binaries have been built. From mal@lemburg.com Wed Jul 10 22:21:30 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Wed, 10 Jul 2002 23:21:30 +0200 Subject: [Python-Dev] The C API and wide unicode support References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> <3D2C6C90.7090006@lemburg.com> <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D2CA55A.6080803@lemburg.com> Guido van Rossum wrote: >>>They're verboten in extension modules anyway, so I don't care. >> >>They are not disallowed in extensions... don't know where you >>have that idea from. > > > Maybe because other macros are often disallowed in (3rd party) > extensions, the reason being that the macros dig in the internal > representation which isn't guaranteed to be binary compatible? It > would make sense that the same rules applies to the Unicode macros in > 3rd party extensions. Which macros would that be ? I modelled the macros in the Unicode implementation after those of the string implementation. And those macros are certainly used in a lot of 3rd party extensions. > (I admit that these restrictions may be underdocumented. Nevertheless > they were intended and I believe they were discussed.) I guess, having the macros in the header files without an explicit warning marks them as public interface. That's how I have used them in tons of code and I think that I'm not alone in using this approach. >>Note that the name mangling is done to prevent an extension >>which uses Unicode in some way from loading if the interpreter >>and extension Unicode "width" doesn't match. >> >>If we would allow this, extensions using the macros would cause >>memory corruption since they'd index differently. That's not only >>a potential cause for a seg fault, it's also a security risk. 
> > If there was a way so that only extensions that use the macros or > APIs whose signature uses Py_UNICODE_TYPE would fail to load, that > would be better. But I don't know how to enforce that. That's certainly possible for C API, but not for the macros (without defeating their purpose). You also have a problem in case the extension defines its own Unicode routines relying on the Python types and macros, e.g. for extensions which subclass the Unicode type. These don't necessarily need to use the APIs; not even the macros... but they do rely on the binary layout used in the Unicode type. >>The name mangling does not provide a 100% bullet proof way >>of preventing this (an extension might use Py_UNICODE and >>the Unicode macros without touching any of the other C APIs), >>but it goes a long way in that direction. > > > Maybe it goes too far. > > OTOH, Michael, is this really something you cannot live with? Or is > it simply a surprise? I think that the fact that Michael is seeing breakage is a good thing. Otherwise, he would probably not have noticed that RedHat chose to use the wide build as default. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Jul 10 22:33:18 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 10 Jul 2002 23:33:18 +0200 Subject: [Python-Dev] python package References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D2CA81E.6060408@lemburg.com> Guido van Rossum wrote: >>BTW, this reminds me of the old idea to move that standard >>lib into a package, eg. 'python'... >> >>from python import time. > > > Maybe in Python 3000. 
In 2.x, I think rearranging the standard > library will just cause more upheaval without much benefits. > > >>We should at least reserve such a name RSN so that we don't >>run into problems later on. > > > I can guarantee you that that name won't be used as a standard Python > module or package name any time soon. If someone creates a 3rd party > package or module named 'python' I'd question their sanity. :-) How about adding python.py: __path__ = ['.'] This would not only reserve the name in the global namespace, but also enable applications to start using 'from python import x' now without much fuzz. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Wed Jul 10 23:51:59 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 10 Jul 2002 18:51:59 -0400 Subject: [Python-Dev] python package In-Reply-To: Your message of "Wed, 10 Jul 2002 23:33:18 +0200." <3D2CA81E.6060408@lemburg.com> References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> Message-ID: <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net> > How about adding > > python.py: > __path__ = ['.'] > > This would not only reserve the name in the global namespace, > but also enable applications to start using 'from python import x' > now without much fuzz. Then I have to ask the question I originally wanted to ask: what problem would that solve? And is this the right solution? Also, it would make *all* standard modules accessible through the python package -- surely this isn't what we want (not if we use the Java example at least). 
Also, for some modules (that keep some global state) it's a bad idea
if they are imported twice, since their initialization code would be
run twice, and there would be two separate instances of the module.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aahz@pythoncraft.com Thu Jul 11 01:26:36 2002
From: aahz@pythoncraft.com (Aahz)
Date: Wed, 10 Jul 2002 20:26:36 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207090551.g695pgb18970@oma.cosc.canterbury.ac.nz>
References: <20020709051833.GA32041@hishome.net>
 <200207090551.g695pgb18970@oma.cosc.canterbury.ac.nz>
Message-ID: <20020711002636.GA6958@panix.com>

On Tue, Jul 09, 2002, Greg Ewing wrote:
>
> Maybe a one-shot iterable should raise an exception
> if you try to obtain a second iterator from it?

Then you couldn't do this:

    done = False
    for line in f:
        if not check(line):
            break
        process(line)
    else:
        done = True

    if not done:
        for line in file:
            another_process(line)

--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/

From guido@python.org Thu Jul 11 02:10:18 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 10 Jul 2002 21:10:18 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Wed, 10 Jul 2002 20:26:36 EDT."
 <20020711002636.GA6958@panix.com>
References: <20020709051833.GA32041@hishome.net>
 <200207090551.g695pgb18970@oma.cosc.canterbury.ac.nz>
 <20020711002636.GA6958@panix.com>
Message-ID: <200207110110.g6B1AIb28525@pcp02138704pcs.reston01.va.comcast.net>

> Then you couldn't do this:
>
>     done = False
>     for line in f:
>         if not check(line):
>             break
>         process(line)
>     else:
>         done = True
>
>     if not done:
>         for line in file:
>             another_process(line)

That's already broken, see SF bug 524804.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From oren-py-d@hishome.net Thu Jul 11 07:15:28 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Thu, 11 Jul 2002 02:15:28 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207110110.g6B1AIb28525@pcp02138704pcs.reston01.va.comcast.net>
References: <20020709051833.GA32041@hishome.net>
 <200207090551.g695pgb18970@oma.cosc.canterbury.ac.nz>
 <20020711002636.GA6958@panix.com>
 <200207110110.g6B1AIb28525@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020711061528.GA6367@hishome.net>

On Wed, Jul 10, 2002 at 09:10:18PM -0400, Guido van Rossum wrote:
> > Then you couldn't do this:
> >
> >     done = False
> >     for line in f:
> >         if not check(line):
> >             break
> >         process(line)
> >     else:
> >         done = True
> >
> >     if not done:
> >         for line in file:
> >             another_process(line)
>
> That's already broken, see SF bug 524804.

Xreadlines is buffered and therefore leaves the file position of the file
in an unexpected state. If you use xreadlines explicitly you should expect
that. The fact that file.__iter__ returns an xreadlines object implicitly
is therefore a bit surprising.

What's the reason for using xreadlines as a file iterator? Was it
performance or was it just the easiest way to implement it using an
existing object?

"Files support the iterator protocol. Each iteration returns the same
result as file.readline()"

This is not correct. Files support what I call the iterable protocol.
Objects supporting the iterator protocol have a .next() method, files don't.
While it's true that each iteration has the same result as readline it
doesn't have the same side effects.

Proposal: make files really support the iterator protocol. __iter__ would
return self and next() would call readline and raise StopIteration if ''.
If anyone wants the xreadline performance improvement it should be explicit.
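
Oren's proposal can be sketched in a few lines. This is an illustration only, not code from the thread: the class name LineIterator and the sample data are made up, and the sketch wraps a file-like object instead of changing the file type itself. It defines the method as __next__ and aliases it to next, the spelling used in the message.

```python
import io


class LineIterator:
    """Hypothetical wrapper that makes a file-like object its own iterator."""

    def __init__(self, fileobj):
        self._file = fileobj

    def __iter__(self):
        # An iterator returns itself, so every loop shares one position.
        return self

    def __next__(self):
        # Unbuffered: each step is exactly one readline() call.
        line = self._file.readline()
        if line == '':
            raise StopIteration
        return line

    next = __next__  # the method name used in the thread


f = LineIterator(io.StringIO("a\nb\nstop\nc\nd\n"))

first = []
for line in f:
    if line.startswith("stop"):
        break  # the "stop" line itself has been consumed here
    first.append(line)

# A second loop over the same object resumes after the break.
rest = list(f)
print(first, rest)
```

Because the wrapper reads one line per step and hands out itself as the iterator, the second loop picks up exactly where the first one stopped, which is the behaviour the two-loop example in the thread relies on.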
definitions: iterable := hasattr(obj, '__iter__') iterator := hasattr(obj, '__iter__') and hasattr(obj, 'next') If object is iterable and not an iterator it would be reasonable to expect that it is also re-iterable. I don't know if this should be a requirement but I think it would be a good idea if all builtin objects should conform to it anyway. Currently files are the only builtin that is iterable, not an iterator and not re-iterable. explicit-is-better-than-implicit-ly yours, Oren From martin@v.loewis.de Thu Jul 11 07:49:42 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 11 Jul 2002 08:49:42 +0200 Subject: [Python-Dev] The C API and wide unicode support In-Reply-To: <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net> References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> <3D2C6C90.7090006@lemburg.com> <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > Maybe because other macros are often disallowed in (3rd party) > extensions, the reason being that the macros dig in the internal > representation which isn't guaranteed to be binary compatible? In this specific case, using a function vs. using a macro makes no difference: the function exposes implementation details just as the macro. In theory, using the function would allow to rearrange Unicode objects to have their characters in the same memory block as the object, which would break applications of the macro - but apparently, the risk of the result relying too much on implementation details (i.e. wide or narrow Unicode) is more serious. > It would make sense that the same rules applies to the Unicode > macros in 3rd party extensions. 
Given the potential change of the layout of Unicode objects, I would agree that it is good to ban PyUnicode_UPPER_CASE from use in extension modules. > If there was a way so that only extensions that use the macros or > APIs whose signature uses Py_UNICODE_TYPE would fail to load, that > would be better. But I don't know how to enforce that. That is indeed the problem, and the last time we concluded that it would be best to bind all Unicode functions to the unicode width, to be on the safe side. > Maybe it goes too far. > > OTOH, Michael, is this really something you cannot live with? Or is > it simply a surprise? That is the central question here. As I said before, I would expect this to be a non-issue, in real life. Regards, Martin From mal@lemburg.com Thu Jul 11 08:43:28 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 11 Jul 2002 09:43:28 +0200 Subject: [Python-Dev] python package References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D2D3720.9040100@lemburg.com> Guido van Rossum wrote: >>How about adding >> >>python.py: >>__path__ = ['.'] >> >>This would not only reserve the name in the global namespace, >>but also enable applications to start using 'from python import x' >>now without much fuzz. > > > Then I have to ask the question I originally wanted to ask: what > problem would that solve? And is this the right solution? It solves the namespace issue. Every time we add a module or package to the standard lib, there is a chance that we break someones code out there by overriding his/her own module/package (e.g. take the addition of the email package -- such generic names tend to be used a lot). Whether it's the right solution depends on how you see it. IMHO it would be ideal to move the complete std lib under a single package. 
You might want to use a more diverse hierarchy but I don't think
that is really needed for the existing code base. Using a single
package also makes the transition from non-package imports to
python-package imports a lot easier.

> Also, it would make *all* standard modules accessible through the
> python package -- surely this isn't what we want (not if we use the
> Java example at least).

Are you sure that you want to make things complicated ?
(see above)

> Also, for some modules (that keep some global state) it's a bad idea
> if they are imported twice, since their initialization code would be
> run twice, and there would be two separate instances of the module.

That's true for the trick I proposed above since the modules
are reachable in two ways with the standard way of
writing 'import ' being used in tons of code.

Now there is also a different way to approach this problem,
though: that of directing Python to the right package
by providing stubs for all current standard lib modules.

I have used such a stub for my mx stuff when I moved everything
from top-level to under the 'mx' umbrella:

    # Redirect all imports to the corresponding mx package
    def _redirect(mx_subpackage):
        global __path__
        import os,mx
        __path__ = [os.path.join(mx.__path__[0],mx_subpackage)]
    _redirect('DateTime')

    # Now load all important symbols
    from mx.DateTime import *

This works great -- it even lets you load pickles which store
the old import names and automagically converts them to the new
names.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From just@letterror.com Thu Jul 11 08:44:13 2002
From: just@letterror.com (Just van Rossum)
Date: Thu, 11 Jul 2002 09:44:13 +0200
Subject: [Python-Dev] Single- vs.
Multi-pass iterability
In-Reply-To: <20020711061528.GA6367@hishome.net>
Message-ID:

Oren Tirosh wrote:

> Xreadlines is buffered and therefore leaves the file position of the file
> in an unexpected state. If you use xreadlines explicitly you should expect
> that. The fact that file.__iter__ returns an xreadlines object implicitly is
> therefore a bit surprising.
>
> What's the reason for using xreadlines as a file iterator? Was it
> performance or was it just the easiest way to implement it using an existing
> object?

The rationale was something like "the simplest way to iterate over the lines
in a file should be the fastest". I'd agree with that, but not at the expense of
the surprises mentioned in the bug. It would perhaps help if the file object
would cache the xreadlines iterator; that would limit the scope of the problem
to the case where iteration and explicit .read() calls are mixed.

> "Files support the iterator protocol. Each iteration returns the same
> result as file.readline()"
>
> This is not correct. Files support what I call the iterable protocol. Objects
> supporting the iterator protocol have a .next() method, files don't. While
> it's true that each iteration has the same result as readline it doesn't
> have the same side effects.
>
> Proposal: make files really support the iterator protocol. __iter__ would
> return self and next() would call readline and raise StopIteration if ''.
> If anyone wants the xreadline performance improvement it should be explicit.

+1

(But, since the bug is closed as "won't fix" I doubt this has a big chance of
happening.)
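
A rough sketch of the caching idea, in present-day Python and under stated assumptions: `CachingFile`, its `chunk` parameter, and the `_read_ahead` generator are all hypothetical stand-ins (the generator plays the part of the buffered xreadlines object), and `io.StringIO` stands in for a real file. It is not the implementation of any actual patch:

```python
import io

class CachingFile:
    """Hypothetical file wrapper that hands out one shared, buffered
    line iterator instead of a fresh one per __iter__() call."""

    def __init__(self, fileobj, chunk=3):
        self._f = fileobj
        self._chunk = chunk      # lines read ahead per batch (assumption)
        self._lines = None       # the cached iterator

    def _read_ahead(self):
        # Stand-in for xreadlines: pull lines from the file in batches.
        while True:
            batch = []
            for _ in range(self._chunk):
                line = self._f.readline()
                if line == '':
                    break
                batch.append(line)
            if not batch:
                return
            yield from batch

    def __iter__(self):
        if self._lines is None:
            self._lines = self._read_ahead()
        return self._lines       # always the same iterator object

raw = io.StringIO("1\n2\n3\n4\n5\n")
f = CachingFile(raw)

head = []
for line in f:
    head.append(line)
    if line == "2\n":
        break

tail = list(f)               # second loop continues after "2\n"

# The remaining surprise: mixing iteration with direct reads.
raw2 = io.StringIO("1\n2\n3\n4\n5\n")
g = CachingFile(raw2)
peek = next(iter(g))         # buffers three lines, yields the first
mixed = raw2.readline()      # skips the two lines still sitting in the buffer
```

Two `for` loops over the same object now pick up where the previous one stopped, while a direct read on the underlying file still jumps past whatever has been buffered, which is the mixed case described above.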
Just From mwh@python.net Thu Jul 11 10:05:56 2002 From: mwh@python.net (Michael Hudson) Date: 11 Jul 2002 10:05:56 +0100 Subject: [Python-Dev] The C API and wide unicode support In-Reply-To: Guido van Rossum's message of "Wed, 10 Jul 2002 14:26:28 -0400" References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> <3D2C6C90.7090006@lemburg.com> <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <2my9cihbwr.fsf@starship.python.net> Guido van Rossum writes: > OTOH, Michael, is this really something you cannot live with? Or is > it simply a surprise? Here's where the problem came up. A user posted to pygame-users saying that when he tried to import pygame.event, along the lines of PyUnicodeUCS2_Unicode undefined. This obviously made a light go on in my head, and I asked where he'd got his Python and his pygame. He'd got his Python from the Redhat 7.3 RPM and his pygame from pygame.org. I suggested building pygame from source, which he did and everything worked[*]. Prediction: this is going to cause pain. For instance, if this user decides that he wants to upgrade to 2.2.1, he might download Sean's RPMs from python.org which are narrow unicode builds -- and then his extensions will break. The problem here is that the kind of users this is going to trouble are exactly the users who will not know what's going on. We can't prevent this sort of thing totally, but I think it should be possible to carry out simple unicode manipulations (like this example of returning a buffer) without incurring this kind of binary compatibility worry. Maybe a "safe" api, plastered with warning signs in the docs about poking into the internal structure of the objects. I wonder why Redhat distribute wide unicode builds? That's the immediate cause of the problem. Maybe we could ask them... Cheers, M. 
[*] actually, I think pygame might break with a wide unicode build. -- For every complex problem, there is a solution that is simple, neat, and wrong. -- H. L. Mencken From guido@python.org Thu Jul 11 11:41:45 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 11 Jul 2002 06:41:45 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Thu, 11 Jul 2002 02:15:28 EDT." <20020711061528.GA6367@hishome.net> References: <20020709051833.GA32041@hishome.net> <200207090551.g695pgb18970@oma.cosc.canterbury.ac.nz> <20020711002636.GA6958@panix.com> <200207110110.g6B1AIb28525@pcp02138704pcs.reston01.va.comcast.net> <20020711061528.GA6367@hishome.net> Message-ID: <200207111041.g6BAfjg29839@pcp02138704pcs.reston01.va.comcast.net> > What's the reason for using xreadlines as a file iterator? Was it > performance or was it just the easiest way to implement it using an > existing object? I thought this was answered adequately by my last entry in the SF bug report. The short answer is performance in the common case. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jul 11 11:47:53 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 11 Jul 2002 06:47:53 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Thu, 11 Jul 2002 09:44:13 +0200." References: Message-ID: <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net> > > Proposal: make files really support the iterator > > protocol. __iter__ would return self and next() would call > > readline and raise StopIteration if ''. If anyone wants the > > xreadline performance improvement it should be explicit. No. I won't have "for line in file" be slower than attainable. The only solution I accept is a complete rewrite of the I/O system without using stdio, so xreadlines can be integrated. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jul 11 11:56:59 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 11 Jul 2002 06:56:59 -0400 Subject: [Python-Dev] The C API and wide unicode support In-Reply-To: Your message of "11 Jul 2002 10:05:56 BST." <2my9cihbwr.fsf@starship.python.net> References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> <3D2C6C90.7090006@lemburg.com> <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net> <2my9cihbwr.fsf@starship.python.net> Message-ID: <200207111056.g6BAuxR29969@pcp02138704pcs.reston01.va.comcast.net> > > OTOH, Michael, is this really something you cannot live with? Or is > > it simply a surprise? > > Here's where the problem came up. > > A user posted to pygame-users saying that when he tried to import > pygame.event, along the lines of PyUnicodeUCS2_Unicode undefined. > This obviously made a light go on in my head, and I asked where he'd > got his Python and his pygame. He'd got his Python from the Redhat > 7.3 RPM and his pygame from pygame.org. I suggested building pygame > from source, which he did and everything worked[*]. > > Prediction: this is going to cause pain. For instance, if this user > decides that he wants to upgrade to 2.2.1, he might download Sean's > RPMs from python.org which are narrow unicode builds -- and then his > extensions will break. The problem here is that the kind of users > this is going to trouble are exactly the users who will not know > what's going on. > > We can't prevent this sort of thing totally, but I think it should be > possible to carry out simple unicode manipulations (like this example > of returning a buffer) without incurring this kind of binary > compatibility worry. 
Maybe a "safe" api, plastered with warning signs > in the docs about poking into the internal structure of the objects. That might work. Or you could call the Python APIs from C. :-) > I wonder why Redhat distribute wide unicode builds? That's the > immediate cause of the problem. Maybe we could ask them... I've had little luck trying to communicate with RedHat about their Python releases. Anyway, I think it's obvious why they do this: because it's there, and because they don't want surprises with customers who use wide Unicode characters. > Cheers, > M. > [*] actually, I think pygame might break with a wide unicode build. Hm, so maybe you should fix that first before you start complaining. --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax@aleax.it Thu Jul 11 12:09:14 2002 From: aleax@aleax.it (Alex Martelli) Date: Thu, 11 Jul 2002 13:09:14 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net> References: <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Thursday 11 July 2002 12:47 pm, Guido van Rossum wrote: > > > Proposal: make files really support the iterator > > > protocol. __iter__ would return self and next() would call > > > readline and raise StopIteration if ''. If anyone wants the > > > xreadline performance improvement it should be explicit. > > No. I won't have "for line in file" be slower than attainable. +1. I _intensely_ want to be able to teach beginners to use "for line in file" and have it be fast in the common case. "Nice" behavior for rarer cases of prematurely interrupted loops is OK, if feasible, but secondary. Having "for line in file" play nicely with other method calls on 'file' has no importance to me in this context -- no more than, e.g., having "for item in alist" play nicely with calls to mutating methods of object alist. 
> The only solution I accept is a complete rewrite of the I/O system > without using stdio, so xreadlines can be integrated. I thought Just's suggestion (about having the file object remember the xreadlines object in use, so that another for loop would continue right where the first one exited) seemed like a reasonable hack -- a compromise of reasonably little effort for some small secondary gain. Guess I must be missing something? Of course the "complete rewrite" is an alluring prospect -- for many other reasons, such as enabling user control of file buffering in cross-platform ways, *yum* -- but it's not going to happen in time for 2.3 anyway, is it? Alex From mal@lemburg.com Thu Jul 11 12:46:28 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 11 Jul 2002 13:46:28 +0200 Subject: [Python-Dev] The C API and wide unicode support References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> <3D2C6C90.7090006@lemburg.com> <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net> <2my9cihbwr.fsf@starship.python.net> Message-ID: <3D2D7014.6050609@lemburg.com> Michael Hudson wrote: > Guido van Rossum writes: > > >>OTOH, Michael, is this really something you cannot live with? Or is >>it simply a surprise? > > > Here's where the problem came up. > > A user posted to pygame-users saying that when he tried to import > pygame.event, along the lines of PyUnicodeUCS2_Unicode undefined. > This obviously made a light go on in my head, and I asked where he'd > got his Python and his pygame. He'd got his Python from the Redhat > 7.3 RPM and his pygame from pygame.org. I suggested building pygame > from source, which he did and everything worked[*]. > > Prediction: this is going to cause pain. 
For instance, if this user
> decides that he wants to upgrade to 2.2.1, he might download Sean's
> RPMs from python.org which are narrow unicode builds -- and then his
> extensions will break. The problem here is that the kind of users
> this is going to trouble are exactly the users who will not know
> what's going on.

It's a pain, yes, but still better than having seg faults
due to memory corruption afterwards.

> We can't prevent this sort of thing totally, but I think it should be
> possible to carry out simple unicode manipulations (like this example
> of returning a buffer) without incurring this kind of binary
> compatibility worry. Maybe a "safe" api, plastered with warning signs
> in the docs about poking into the internal structure of the objects.

Perhaps we need an additional abstract API PyObject_UnicodeEx()
which provides a way to additionally define the encoding to assume
for decoding string objects ? (PyObject_Unicode() always assumes
the default encoding)

> I wonder why Redhat distribute wide unicode builds? That's the
> immediate cause of the problem. Maybe we could ask them...
>
> Cheers,
> M.
> [*] actually, I think pygame might break with a wide unicode build.

Why's that ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mwh@python.net Thu Jul 11 12:58:11 2002 From: mwh@python.net (Michael Hudson) Date: 11 Jul 2002 12:58:11 +0100 Subject: [Python-Dev] The C API and wide unicode support In-Reply-To: Guido van Rossum's message of "Thu, 11 Jul 2002 06:56:59 -0400" References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> <3D2C6C90.7090006@lemburg.com> <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net> <2my9cihbwr.fsf@starship.python.net> <200207111056.g6BAuxR29969@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <2melea8oj0.fsf@starship.python.net> Guido van Rossum writes: > > We can't prevent this sort of thing totally, but I think it should be > > possible to carry out simple unicode manipulations (like this example > > of returning a buffer) without incurring this kind of binary > > compatibility worry. Maybe a "safe" api, plastered with warning signs > > in the docs about poking into the internal structure of the objects. > > That might work. Or you could call the Python APIs from C. :-) That's what I'm doing for pygame. It's probably the best option, really -- complaining ain't gonna get this changed for the 2.2 series, for one thing. Better docs would help; I'll put that on my list, and stop moaning about this. > > I wonder why Redhat distribute wide unicode builds? That's the > > immediate cause of the problem. Maybe we could ask them... > > I've had little luck trying to communicate with RedHat about their > Python releases. There's an email address in the spec file; teg (at) obvious.domain. I might ask him. [...] > > [*] actually, I think pygame might break with a wide unicode build. > > Hm, so maybe you should fix that first before you start complaining. Hey, I can do two things at once! 
Patches are on their way to Pete.

Cheers,
M.

-- 
  While preceding your entrance with a grenade is a good tactic in
  Quake, it can lead to problems if attempted at work. -- C Hacking
  -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html

From mwh@python.net Thu Jul 11 13:01:46 2002
From: mwh@python.net (Michael Hudson)
Date: 11 Jul 2002 13:01:46 +0100
Subject: [Python-Dev] The C API and wide unicode support
In-Reply-To: "M.-A. Lemburg"'s message of "Thu, 11 Jul 2002 13:46:28 +0200"
References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> <3D2C6C90.7090006@lemburg.com> <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net> <2my9cihbwr.fsf@starship.python.net> <3D2D7014.6050609@lemburg.com>
Message-ID: <2mbs9e8od1.fsf@starship.python.net>

"M.-A. Lemburg" writes:

> > Prediction: this is going to cause pain. For instance, if this user
> > decides that he wants to upgrade to 2.2.1, he might download Sean's
> > RPMs from python.org which are narrow unicode builds -- and then his
> > extensions will break. The problem here is that the kind of users
> > this is going to trouble are exactly the users who will not know
> > what's going on.
>
> It's a pain, yes, but still better than having seg faults
> due to memory corruption afterwards.

Probably true. At least the tracebacks make the problem obvious.

> > We can't prevent this sort of thing totally, but I think it should be
> > possible to carry out simple unicode manipulations (like this example
> > of returning a buffer) without incurring this kind of binary
> > compatibility worry. Maybe a "safe" api, plastered with warning signs
> > in the docs about poking into the internal structure of the objects.
>
> Perhaps we need an additional abstract API PyObject_UnicodeEx()
> which provides a way to additionally define the encoding to assume
> for decoding string objects ? (PyObject_Unicode() always assumes
> the default encoding)

That would be nice, yes. Beats digging "unicode" out of __builtin__...

> > [*] actually, I think pygame might break with a wide unicode build.
>
> Why's that ?

Oh, the obvious thing: assuming sizeof(Py_UNICODE) == 2; or rather
assuming that Python's idea of what a unicode buffer is is the same as
SDL's idea (which I can't find written down anywhere, but I assume it's
the same kind of UCS-2 thing narrow builds use).

So, I retract my complaint, and propose to write some docs on the
subject.

Cheers,
M.

-- 
  Two things I learned for sure during a particularly intense acid
  trip in my own lost youth: (1) everything is a trivial special case
  of something else; and, (2) death is a bunch of blue spheres.
  -- Tim Peters, 1 May 1998

From guido@python.org Thu Jul 11 13:19:31 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jul 2002 08:19:31 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Thu, 11 Jul 2002 13:09:14 +0200."
References: <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207111219.g6BCJVU30095@pcp02138704pcs.reston01.va.comcast.net>

> > No. I won't have "for line in file" be slower than attainable.
>
> +1. I _intensely_ want to be able to teach beginners to use "for line in
> file" and have it be fast in the common case. "Nice" behavior for rarer
> cases of prematurely interrupted loops is OK, if feasible, but secondary.
> Having "for line in file" play nicely with other method calls on 'file' has
> no importance to me in this context -- no more than, e.g., having "for item
> in alist" play nicely with calls to mutating methods of object alist.

Exactly.

> > The only solution I accept is a complete rewrite of the I/O system
> > without using stdio, so xreadlines can be integrated.
>
> I thought Just's suggestion (about having the file object remember
> the xreadlines object in use, so that another for loop would continue
> right where the first one exited) seemed like a reasonable hack -- a
> compromise of reasonably little effort for some small secondary gain.

Oops, I missed that. That seems reasonable indeed.

> Guess I must be missing something? Of course the "complete rewrite"
> is an alluring prospect -- for many other reasons, such as enabling
> user control of file buffering in cross-platform ways, *yum* -- but it's not
> going to happen in time for 2.3 anyway, is it?

I'm not going to hold up the 2.3 release, but if a patch lands in the
SF patch manager, I'm not going to reject it. Hint, hint. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From ark@research.att.com Thu Jul 11 16:31:47 2002
From: ark@research.att.com (Andrew Koenig)
Date: 11 Jul 2002 11:31:47 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>
Message-ID:

David> I keep running into the problem that there is no reliable way
David> to introspect about whether a type supports multi-pass
David> iterability (in the sense that an input stream might support
David> only a single pass, but a list supports multiple passes). I
David> suppose you could check for __getitem__, but that wouldn't
David> cover linked lists, for example.

Here's a suggestion for a long-term strategy for solving this
problem, should it be deemed desirable to do so:

        Right now, every iterator, and every object that supports
        iteration, must have an __iter__() method. Suppose we augment
        that with the following:

        A new kind of iterator, called a multiple iterator, that
        supports multiple iterations over the same sequence.

        A requirement that every object that supports multiple
        iteration have a __multiter__() method that yields a
        multiple iterator over its sequence, in addition to
        an __iter__() method that yields a (multiple or single)
        iterator (so that every sequence that supports multiple
        iteration also supports single iteration).

        A requirement that every multiple iterator support the
        following methods:

                __multiter__()  yields the iterator object itself
                __iter__()      also yields the iterator object itself
                                (so that every multiple iterator is
                                also an iterator)
                __next__()      return the next item from the container
                                or raise StopIteration
                __copy__()      return a distinct, newly created multiple
                                iterator that iterates over the same
                                sequence as the original, starting from
                                the current element.

Note that when the last multiple iterator has left an element, there
is no possibility of going back to that element again unless the
sequence itself provides a way of doing so. Therefore, for example,
it might be possible for files to provide multiple iterators without
undue space inefficiency.

-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark

From David Abrahams" Message-ID: <12e101c228f5$42908020$6601a8c0@boostconsulting.com>

From: "Andrew Koenig"

> David> I keep running into the problem that there is no reliable way
> David> to introspect about whether a type supports multi-pass
> David> iterability (in the sense that an input stream might support
> David> only a single pass, but a list supports multiple passes). I
> David> suppose you could check for __getitem__, but that wouldn't
> David> cover linked lists, for example.
>
> Here's a suggestion for a long-term strategy for solving this
> problem, should it be deemed desirable to do so:
>
>         Right now, every iterator, and every object that supports
>         iteration, must have an __iter__() method.
Suppose we augment > that with the following: > > A new kind of iterator, called a multiple iterator, that > supports multiple iterations over the same sequence. > > A requirement that every object that supports multiple > iteration have a __multiter__() method that yields a > multiple iterator over its sequence, in addition to > an __iter__() method that yields a (multiple or single) > iterator (so that every sequence that supports multiple > iteration also supports single iteration). > > A requirement that every multiple iterator support the > following methods: > > __multiter__() yields the iterator object itself > __iter__() also yields the iterator object itself > (so that every multiple iterator is > also an iterator) > __next__() return the next item from the container > or raise StopIteration > __copy__() return a distinct, newly created multiple > iterator that iterates over the same > sequence as the original, starting from > the current element. Why bother with __multiter__? If you can distinguish a multiple iterator by the presence of __copy__, You can always do hasattr(x.__iter__(),"__copy__") to find out whether something is multi-iteratable. -Dave From ark@research.att.com Thu Jul 11 17:16:15 2002 From: ark@research.att.com (Andrew Koenig) Date: Thu, 11 Jul 2002 12:16:15 -0400 (EDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <12e101c228f5$42908020$6601a8c0@boostconsulting.com> (david.abrahams@rcn.com) References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <12e101c228f5$42908020$6601a8c0@boostconsulting.com> Message-ID: <200207111616.g6BGGFB16385@europa.research.att.com> David> Why bother with __multiter__? If you can distinguish a multiple David> iterator by the presence of __copy__, You can always do David> hasattr(x.__iter__(),"__copy__") to find out whether something David> is multi-iteratable. 
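
To make the two positions concrete, here is a small sketch in present-day Python of a "multiple iterator" that supports `__copy__`, together with the `hasattr` test suggested in the quoted message. `ListMultiIter` and `supports_multipass` are hypothetical names chosen for illustration; nothing like this convention exists in the standard library, and built-in iterators do not follow it:

```python
import copy

class ListMultiIter:
    """Hypothetical multiple iterator over a list-like sequence."""

    def __init__(self, data, pos=0):
        self._data = data
        self._pos = pos

    def __iter__(self):
        return self              # every multiple iterator is an iterator

    def __next__(self):
        if self._pos >= len(self._data):
            raise StopIteration
        item = self._data[self._pos]
        self._pos += 1
        return item

    def __copy__(self):
        # A distinct iterator over the same sequence, starting from
        # the current element.
        return ListMultiIter(self._data, self._pos)

def supports_multipass(obj):
    """The hasattr-based check: does obj's iterator offer __copy__?"""
    return hasattr(iter(obj), "__copy__")

it = ListMultiIter([1, 2, 3])
next(it)                     # advance past the first element
fork = copy.copy(it)         # independent position from here on

remaining = list(it)
forked = list(fork)
one_shot = (x for x in [1, 2, 3])   # a generator: single pass only
```

Note that the check has to create an iterator just to inspect it, which is part of what the reply that follows objects to.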
Because explicit is better than implicit :-) More seriously, I can imagine distinguishing a multiple iterator by the presence of __copy__, but I can't imagine using the presence of __copy__ to determine whether a *container* supports multiple iteration. For example, there surely exist containers today that support __copy__ but whose __iter__ methods yield iterators that do not themselves support __copy__. Another reason is that I can imagine this idea extended to encompass, say, ambidextrous iterators that support prev() as well as next(), and I would want to use __ambiter__ as a marker for those rather than having to create an iterator and see if it has prev(). From guido@python.org Thu Jul 11 17:40:12 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 11 Jul 2002 12:40:12 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Tue, 09 Jul 2002 05:37:47 EDT." <075a01c2272c$4acae390$6601a8c0@boostconsulting.com> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <075a01c2272c$4acae390$6601a8c0@boostconsulting.com> Message-ID: <200207111640.g6BGeCb13218@odiug.zope.com> > I don't know if we need them, but I'm certainly finding that not having > some more information is difficult for me. If I need to make multiple > passes over the information in a generalized iterable object, the only > solution AFAICT is to unconditionally copy all the information into a list > first. Or you could just document "this argument must support multiple independent iterators." --Guido van Rossum (home page: http://www.python.org/~guido/) From David Abrahams" <075a01c2272c$4acae390$6601a8c0@boostconsulting.com> <200207111640.g6BGeCb13218@odiug.zope.com> Message-ID: <13be01c22917$aa306480$6601a8c0@boostconsulting.com> The real reason to be able to introspect is so that you can handle both kinds. 
Even if you're willing to destroy the data by examining it, if you know you have a single-pass sequence, you might need to copy its elements into a multi-pass sequence (e.g. file.lines()) in order to get your work done. From: "Guido van Rossum" > > I don't know if we need them, but I'm certainly finding that not having > > some more information is difficult for me. If I need to make multiple > > passes over the information in a generalized iterable object, the only > > solution AFAICT is to unconditionally copy all the information into a list > > first. > > Or you could just document "this argument must support multiple > independent iterators." From guido@python.org Thu Jul 11 22:48:22 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 11 Jul 2002 17:48:22 -0400 Subject: [Python-Dev] Re: *Simpler* string substitutions In-Reply-To: Your message of "Sat, 22 Jun 2002 21:36:31 EDT." <15637.9759.111784.481102@anthem.wooz.org> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B3B2@UKRUX002.rundc.uk.origin-it.com> <15637.9759.111784.481102@anthem.wooz.org> Message-ID: <200207112148.g6BLmMw14591@odiug.zope.com> > PM> 4. Access to variables is also problematic. Without > PM> compile-time support, access to nested scopes is impossible > PM> (AIUI). > > Is this really true? I think it was two IPC's ago that Jeremy and I > discussed the possibility of adding a method to frame objects that > would basically yield you the equivalent of globals+freevars+locals. If f is a function and g is a function nested inside f, only those locals of f that are also used in g get turned into cells. So if f has a local variable x that isn't used by g (as far as the compiler can see), there's no way for g to find f's value for x. Remember that f may not be on g's call stack at all! 
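
The point about cells can be checked directly. A small sketch in present-day Python (the attributes are spelled `__code__` and `__closure__` now; in 2002 they were `func_code` and `func_closure`). The function and variable names simply mirror the ones used above:

```python
def f():
    x = 1          # not referenced in g: stays an ordinary local of f
    y = 2          # referenced in g: the compiler turns it into a cell

    def g():
        return y

    return g

g = f()            # f has returned; it is no longer on any call stack

free_in_g = g.__code__.co_freevars     # names g receives through cells
cells_in_f = f.__code__.co_cellvars    # f's locals that became cells
plain_in_f = f.__code__.co_varnames    # f's ordinary locals
captured = [c.cell_contents for c in g.__closure__]
```

Only `y` travels with `g`; once `f` has returned, the value it held for `x` is not reachable from `g` at all.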
--Guido van Rossum (home page: http://www.python.org/~guido/) From oren-py-d@hishome.net Thu Jul 11 22:59:28 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 12 Jul 2002 00:59:28 +0300 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207111219.g6BCJVU30095@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Thu, Jul 11, 2002 at 08:19:31AM -0400 References: <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net> <200207111219.g6BCJVU30095@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020712005928.A9833@hishome.net> On Thu, Jul 11, 2002 at 08:19:31AM -0400, Guido van Rossum wrote: > > Guess I must be missing something? Of course the "complete rewrite" > > is an alluring prospect -- for many other reasons, such as enabling > > user control of file buffering in cross-platform ways, *yum* -- but it's not > > going to happen in time for 2.3 anyway, is it? > > I'm not going to hold up the 2.3 release, but if a patch lands in the > SF patch manager, I'm not going to reject it. Hint, hint. :-) http://www.python.org/sf/580331 No, it's not a complete rewrite of file buffering. This patch implements Just's idea of xreadlines caching in the file object. It also makes a file into an iterator: __iter__ returns self and next calls the next method of the cached xreadlines object. See my previous postings for why I think a file should be an iterator. With this patch any combination of multiple xreadlines and iterator protocol operations on a file object is safe. Using xreadlines/iterator followed by regular readline has the same buffering problem as before. Oren From oren-py-d@hishome.net Thu Jul 11 23:07:25 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 12 Jul 2002 01:07:25 +0300 Subject: [Python-Dev] Alternative implementation of interning, take 2 Message-ID: <20020712010725.A10686@hishome.net> Thanks for all feedback on the previous version. 
This one supports both immortal interned strings created with PyString_InternInPlace and mortal interned strings created with the new function PyString_Intern. Places that might affect compatibility still use immortals but most interned strings are now mortal. This version, like the previous one, does not support indirect interning of strings. Is there any evidence that this optimization is still important? Nothing in the Python distribution itself needs it. http://www.python.org/sf/576101 Oren From martin@v.loewis.de Thu Jul 11 23:16:29 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 12 Jul 2002 00:16:29 +0200 Subject: [Python-Dev] Alternative implementation of interning, take 2 In-Reply-To: <20020712010725.A10686@hishome.net> References: <20020712010725.A10686@hishome.net> Message-ID: Oren Tirosh writes: > This version, like the previous one, does not support indirect interning of > strings. Is there any evidence that this optimization is still important? > Nothing in the Python distribution itself needs it. That is still factually incorrect; the code is triggered in a test case. Regards, Martin From oren-py-d@hishome.net Thu Jul 11 23:20:00 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 12 Jul 2002 01:20:00 +0300 Subject: [Python-Dev] Alternative implementation of interning, take 2 In-Reply-To: ; from martin@v.loewis.de on Fri, Jul 12, 2002 at 12:16:29AM +0200 References: <20020712010725.A10686@hishome.net> Message-ID: <20020712012000.A11330@hishome.net> On Fri, Jul 12, 2002 at 12:16:29AM +0200, Martin v. Loewis wrote: > Oren Tirosh writes: > > > This version, like the previous one, does not support indirect interning of > > strings. Is there any evidence that this optimization is still important? > > Nothing in the Python distribution itself needs it. > > That is still factually incorrect; the code is triggered in a test case. 
A few indirectly interned strings are still created, but I couldn't find
any case where one was actually used as a key to PyDict_GetItem.

	Oren

From pinard@iro.umontreal.ca Fri Jul 12 00:56:30 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 11 Jul 2002 19:56:30 -0400
Subject: [Python-Dev] Re: PendingDeprecationWarning
In-Reply-To:
References: <00e001c2072f$7b8d2460$5d61accf@othello> <200205291642.g4TGgaf18754@odiug.zope.com>
Message-ID:

[Alex Martelli]

> On Wednesday 29 May 2002 06:42 pm, Guido van Rossum wrote:
> ...
> > > oct()
> > > hex()
> >
> > Why? I use these a lot...

> I assume the duplication of oct and hex wrt '%o'% and '%x'% was the
> reason to suggest silently-deprecating the former (trying to have 'just
> one obvious way' and all that).

Hi, people. I'm revising many accumulated notes, while writing the draft
of a Python style and migration guide (in French) for a small team of
Python programmers, here. By the way, I thank you all for the richness of
the exchanged ideas in that area, lately. Also, poking around, I see even
a bit deeper than before how beautiful the Python project is!

Stumbling on the above message, I feel like making a further comment.

When I was learning Python, I found it elegant to discover that Python had
all that is required so one could rewrite the `FORMAT % THINGS' operator,
if one wanted to. If we deprecate built-ins (like `repr', `hex' and `oct')
in favour of leaving `%' as the only way, we would lose that elegance.
Moreover, it might be faster not to have to go through the interpretation
of a format string, and this might matter in some circumstances.
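
The overlap between the built-ins and the format operator can be shown in a few lines. This is a minimal sketch, assuming Python 3, whose output spelling for octal differs from the 2002 interpreter; the value `n` is arbitrary.

```python
n = 255

# The built-ins add a base prefix; the bare format codes do not.
builtin_hex = hex(n)
format_hex = '%x' % n

# The '#' flag asks the format operator for the prefixed form,
# which is what hex() and oct() produce on their own.
prefixed_hex = '%#x' % n
prefixed_oct = '%#o' % n

print(builtin_hex, format_hex, prefixed_hex)
print(oct(n), prefixed_oct)

# repr() has the same relationship to the '%r' conversion.
print(repr('abc'), '%r' % 'abc')
```

With the built-ins available, each of these conversions can be written without a format string at all, which is the property the message argues for keeping.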
--
François Pinard   http://www.iro.umontreal.ca/~pinard

From David Abrahams" <200207092050.g69KoxW04101@odiug.zope.com> <31E5E26A.BE7C5DC@home.se> <31E5E3C3.569EBAFA@home.se> Message-ID: <146001c22939$a3335440$6601a8c0@boostconsulting.com>
From: "Sverker Nilsson"

> Sverker Nilsson wrote:
> >
> > Guido van Rossum wrote:
> > > > types-sig
> > > +1
> >
> > -1. I think there may come interesting discussions on this list,
> > when the time is due and things come up. Why dismiss it? It is
> > a good place to have.
> >
> > Sverker Nilsson

Plus, if you dismiss it I might be tempted to bring my thread about
multimethods/overload resolution back to this list, and that would be
messy... it was so neatly killed by diverting it to types-sig. ;-)

-Dave

From guido@python.org Fri Jul 12 01:41:20 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jul 2002 20:41:20 -0400
Subject: [Python-Dev] Re: [Types-sig] Re: [meta-sig] SIG charters
In-Reply-To: Your message of "Thu, 11 Jul 2002 20:18:20 EDT." <146001c22939$a3335440$6601a8c0@boostconsulting.com>
References: <15657.62121.520364.556758@anthem.wooz.org> <200207092050.g69KoxW04101@odiug.zope.com> <31E5E26A.BE7C5DC@home.se> <31E5E3C3.569EBAFA@home.se> <146001c22939$a3335440$6601a8c0@boostconsulting.com>
Message-ID: <200207120041.g6C0fLi30951@pcp02138704pcs.reston01.va.comcast.net>

> > > > > types-sig
> > > > +1
> > >
> > > -1. I think there may come interesting discussions on this list,
> > > when the time is due and things come up. Why dismiss it? It is
> > > a good place to have.
> > >
> > > Sverker Nilsson
>
> Plus, if you dismiss it I might be tempted to bring my thread about
> multimethods/overload resolution back to this list, and that would be
> messy... it was so neatly killed by diverting it to types-sig. ;-)
>
> -Dave

I apologize for that! I had expected that some people on the types-sig
would be interested. But it proves that the types-sig is dead. It's had
its chance.
If there's really a need to revive it, well, there's a procedure for reviving SIG, too. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 12 01:43:57 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 11 Jul 2002 20:43:57 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Thu, 11 Jul 2002 16:10:28 EDT." <13be01c22917$aa306480$6601a8c0@boostconsulting.com> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <075a01c2272c$4acae390$6601a8c0@boostconsulting.com> <200207111640.g6BGeCb13218@odiug.zope.com> <13be01c22917$aa306480$6601a8c0@boostconsulting.com> Message-ID: <200207120043.g6C0hvD30978@pcp02138704pcs.reston01.va.comcast.net> > The real reason to be able to introspect is so that you can handle both > kinds. > Even if you're willing to destroy the data by examining it, if you know you > have a single-pass sequence, you might need to copy its elements into a > multi-pass sequence (e.g. file.lines()) in order to get your work done. Hm. I think it's just as good to make it the responsibility of the caller to pass a multi-iterable. There could be a standard tool that takes a single-iterable and produces a multi-iterable. --Guido van Rossum (home page: http://www.python.org/~guido/) From David Abrahams" Hi, I recently came across a nasty configuration conflict between boost and python. In LongObject.h we have: #ifdef HAVE_LONG_LONG /* Hopefully this is portable... 
*/
#ifndef ULONG_MAX
#define ULONG_MAX 4294967295U
#endif
#ifndef LONGLONG_MAX
#define LONGLONG_MAX 9223372036854775807LL
#endif
#ifndef ULONGLONG_MAX
#define ULONGLONG_MAX 0xffffffffffffffffULL
#endif

Well, it turns out that boost detects whether the compiler supports long
long by #including <limits.h> and looking for these macros:

#include <limits.h>
# if !defined(BOOST_MSVC) && !defined(__BORLANDC__) \
   && (defined(ULLONG_MAX) || defined(ULONG_LONG_MAX) || defined(ULONGLONG_MAX))
# define BOOST_HAS_LONG_LONG
#endif

So it turns out that on some platforms, Python's configuration sets
HAVE_LONG_LONG even when limits.h doesn't include definitions of these
macros. For example, there's MSVC6, where Python substitutes __int64 for
long long using its LONG_LONG macro. However, I didn't actually notice the
problem until I tried linking something at LLNL where they're using an
older KCC. Two translation units had different ideas of
BOOST_HAS_LONG_LONG, so linking failed when one of them was looking for
the long long support supposedly provided by another. I'm surprised it
wasn't a worse problem with MSVC6, because after all, it doesn't even
supply a type called "long long".

Is there any chance that something can be done to prevent this sort of
conflict?

Thanks,
Dave

From tim.one@comcast.net Fri Jul 12 05:54:15 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 00:54:15 -0400
Subject: [Python-Dev] long long configuration
In-Reply-To: <14b201c2294d$fdebb9e0$6601a8c0@boostconsulting.com>
Message-ID:

[David Abrahams]
> I recently came across a nasty configuration conflict between boost and
> python.
>
> In LongObject.h we have:
>
> #ifdef HAVE_LONG_LONG
>
> /* Hopefully this is portable...
*/ > #ifndef ULONG_MAX > #define ULONG_MAX 4294967295U > #endif > #ifndef LONGLONG_MAX > #define LONGLONG_MAX 9223372036854775807LL > #endif > #ifndef ULONGLONG_MAX > #define ULONGLONG_MAX 0xffffffffffffffffULL > #endif > > Well, it turns out that boost detects whether the compiler supports long > long by #including and looking for these macros: > > #include > # if !defined(BOOST_MSVC) && !defined(__BORLANDC__) \ > && (defined(ULLONG_MAX) || defined(ULONG_LONG_MAX) || > defined(ULONGLONG_MAX)) > # define BOOST_HAS_LONG_LONG > #endif > > So it turns out that on some platforms, Python's configuration sets > HAVE_LONG_LONG even when limits.h doesn't include definitions of these > macros. Yes. Python cares about the conceptual type, not about how a platform spells it. > For example, there's MSVC6, where Python substitutes __int64 for > long long using its LONG_LONG macro. However, I didn't actually notice > the problem What problem? > until I tried linking something at LLNL where they're using an older KCC. > Two translation units had different ideas of BOOST_HAS_LONG_LONG, Why was that? Nothing you showed us for it, unless there's an implied #include of Python.h before the Boost limits.h block you did show us. > so linking failed when one of them was looking for the long long support > supposedly provided by another. I'm surprised it wasn't a worse problem > with MSVC6, because after all, it doesn't even supply a type called > "long long". > > Is there any chance that something can be done to prevent this sort of > conflict? Rather than try to extract a clear question out of this , let me turn it around: would your problem go away if this code in LongObject.h went away entirely? Python has no business defining ULONG_MAX anymore (that's left over from K&R C days), and I'm sure I got rid of all uses of LONGLONG_MAX and ULONGLONG_MAX in 2.2 (I vaguely recall some; they weren't really needed, and won't be needed again). 
From tim.one@comcast.net Fri Jul 12 06:18:39 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 01:18:39 -0400 Subject: [Python-Dev] Alternative implementation of interning, take 2 In-Reply-To: <20020712010725.A10686@hishome.net> Message-ID: [Oren Tirosh] > ... > This version, like the previous one, does not support indirect > interning of strings. Is there any evidence that this optimization is > still important? Nothing in the Python distribution itself needs it. We've already been thru the last part at length: indirect interning wasn't targeted at the core, so that the core doesn't need it is evidence of no more than that Guido's implementation worked as he intended it to in this respect. It would help if you could get Marc-Andre and /F to pronounce on whether their code benefits from it -- they're the most prolific extension authors we've got. From tim.one@comcast.net Fri Jul 12 06:39:47 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 01:39:47 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <13be01c22917$aa306480$6601a8c0@boostconsulting.com> Message-ID: [David Abrahams] > The real reason to be able to introspect is so that you can handle both > kinds. Even if you're willing to destroy the data by examining it, if > you know you have a single-pass sequence, you might need to copy its > elements into a multi-pass sequence (e.g. file.lines()) in order to get > your work done. Note that Python uses PySequence_Fast() internally in such cases. This does whatever it takes to turn an iterable object into something that can be indexed at random via PySequence_Fast_GET_ITEM(fastseq, int_index). Under the covers it leaves lists and tuples alone, and materializes everything else into a temp tuple. 
I haven't felt a need for something fancier than that in practice; the lack of participation in this thread from other old-timers suggests they haven't either (piling on more protocols would allow to optimize some cases, but it's not clear such cases are important enough in Python Life to be worth the bother). From oren-py-d@hishome.net Fri Jul 12 06:43:32 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 12 Jul 2002 01:43:32 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> Message-ID: <20020712054332.GA77883@hishome.net> On Thu, Jul 11, 2002 at 11:31:47AM -0400, Andrew Koenig wrote: > Right now, every iterator, and every object that supports > iteration, must have an __iter__() method. Suppose we augment > that with the following: > > A new kind of iterator, called a multiple iterator, that > supports multiple iterations over the same sequence. ... > __copy__() return a distinct, newly created multiple > iterator that iterates over the same > sequence as the original, starting from > the current element. There is no need for a new type of iterator. It's ok that iterators are disposable. If I need multiple iterations I don't want to copy the iterator - I prefer to ask the original iterable object for a new iterator. All I need is some way to know whether the iterable object (container) can produce multiple iterators that generate the same sequence. An object is re-iterable if its iterators do not modify its state. The iterator of an iterator is itself. Calling the next method, by definition, modifies the internal state of an object. Therefore anything that has a next method is not re-iterable. "hasattr(obj,'__iter__') and hasattr(obj, 'next')" is a good signature of a non re-iterable object. Unfortunately, the opposite is not true. One iterable object in Python produces iterators that modify its state when their .next() method is called - the file object. 
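
The signature quoted above can be written as a small predicate. This is a minimal sketch, assuming Python 3, where the method is spelled `__next__` rather than `next`; the helper name `looks_single_pass` is hypothetical.

```python
import io


def looks_single_pass(obj):
    """Passive check: the object offers both __iter__ and __next__,
    so it is an iterator and iterating it consumes it."""
    return hasattr(obj, '__iter__') and hasattr(obj, '__next__')


# A list hands out fresh iterators and has no __next__ of its own.
print(looks_single_pass([1, 2, 3]))

# The iterator obtained from a list is consumed as it is advanced.
print(looks_single_pass(iter([1, 2, 3])))

# In current Python a file-like object is its own iterator.
print(looks_single_pass(io.StringIO("a\nb\n")))
```

The check only reads attributes; it never calls `__iter__` or advances anything, so it cannot disturb the object being examined.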
I have just submitted a patch that makes a file into an iterator (i.e. adds a .next method to files). With this change all Python objects that have an __iter__ method and no next method produce iterators that do not modify the container. Another possibility would be to make file iterators that use seek or re-open the file to avoid modifying the file position of the parent file object. I don't think that would be a good idea because files can be devices, pipes or sockets which are not seekable. I think it may be a good idea to add a note to the documentation pages about the iterator protocol that the iterators of a container should not modify the state of the container. If you think they must it's probably a good sign that your 'container' is not really a container and maybe it should be an iterator rather than produce iterators of itself. Oren From aleax@aleax.it Fri Jul 12 07:43:38 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 12 Jul 2002 08:43:38 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207120043.g6C0hvD30978@pcp02138704pcs.reston01.va.comcast.net> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <13be01c22917$aa306480$6601a8c0@boostconsulting.com> <200207120043.g6C0hvD30978@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Friday 12 July 2002 02:43 am, Guido van Rossum wrote: > > The real reason to be able to introspect is so that you can handle both > > kinds. > > Even if you're willing to destroy the data by examining it, if you know > > you have a single-pass sequence, you might need to copy its elements into > > a multi-pass sequence (e.g. file.lines()) in order to get your work done. > > Hm. I think it's just as good to make it the responsibility of the > caller to pass a multi-iterable. There could be a standard tool that > takes a single-iterable and produces a multi-iterable. 
At the risk of sounding like a broken record -- doesn't protocol adaptation stand out as a good way to package up such a "standard tool"? Why should we keep inventing a variety of different ways to ask the same kind of service -- "Here is an object X, please return it or a wrapper on it in such a way that it satisfies protocol Y, if possible"...? In this specific case, Y is "a multi-iterable". Last time the subject came up in this list, as I recall, Y was "usable as an index on a sequence". Having protocol-adaptation machinery would not save the work of designing protocols and adapters, and there would still be the need to decide case by case "do we want to standardize this specific adaptation". However, it would save the work involved in "given that we DO want to standardize this adaptation, how do we dress it up" -- how do we present the service to client-code. The greatest benefits might be to authors of client code (and aren't we all, from time to time?-) -- reducing the amount of learning involved with each protocol-adaptation to "what is the protocol Y I want". I don't think it's strictly necessary for Y to be "an interface", and thus that protocol adaptation must necessarily wait for interfaces to become a recognized and formalized Python concept. I think that accepting any type as "a protocol" would be fine, and pragmatically equivalent to requiring a protocol to be "an interface". Alex From martin@v.loewis.de Fri Jul 12 08:08:36 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 12 Jul 2002 09:08:36 +0200 Subject: [Python-Dev] long long configuration In-Reply-To: <14b201c2294d$fdebb9e0$6601a8c0@boostconsulting.com> References: <14b201c2294d$fdebb9e0$6601a8c0@boostconsulting.com> Message-ID: "David Abrahams" writes: > #include > # if !defined(BOOST_MSVC) && !defined(__BORLANDC__) \ > && (defined(ULLONG_MAX) || defined(ULONG_LONG_MAX) || > defined(ULONGLONG_MAX)) > # define BOOST_HAS_LONG_LONG > #endif [...] 
> I'm surprised it wasn't a > worse problem with MSVC6, because after all, it doesn't even supply a type > called "long long". Could that have resulted from defining BOOST_MSVC? Regards, Martin From mal@lemburg.com Fri Jul 12 09:24:10 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 12 Jul 2002 10:24:10 +0200 Subject: [Python-Dev] Alternative implementation of interning, take 2 References: Message-ID: <3D2E922A.4040005@lemburg.com> Tim Peters wrote: > [Oren Tirosh] > >>... >>This version, like the previous one, does not support indirect >>interning of strings. Is there any evidence that this optimization is >>still important? Nothing in the Python distribution itself needs it. > > > We've already been thru the last part at length: indirect interning wasn't > targeted at the core, so that the core doesn't need it is evidence of no > more than that Guido's implementation worked as he intended it to in this > respect. > > It would help if you could get Marc-Andre and /F to pronounce on whether > their code benefits from it -- they're the most prolific extension authors > we've got. Gee, thanks :-) If you could spell out what exactly you mean by "indirect interning" that would help. What I do need and rely on is the fact that the Python compiler interns all constant strings and identifiers in Python programs. This makes switching like so: if a == 'x': elif a == 'y': else: also work like this (only faster): if a is 'x': elif a is 'y': else: provided that 'a' only uses interned strings. If that's what you mean by "indirect interning" then I do need this. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... 
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From David Abrahams" Message-ID: <14ea01c2298b$b91a59a0$6601a8c0@boostconsulting.com> From: "Tim Peters" > [David Abrahams] > > I recently came across a nasty configuration conflict between boost and > > python. > > > > In LongObject.h we have: > > > > #ifdef HAVE_LONG_LONG > > > > /* Hopefully this is portable... */ > > #ifndef ULONG_MAX > > #define ULONG_MAX 4294967295U > > #endif > > #ifndef LONGLONG_MAX > > #define LONGLONG_MAX 9223372036854775807LL > > #endif > > #ifndef ULONGLONG_MAX > > #define ULONGLONG_MAX 0xffffffffffffffffULL > > #endif > > > > Well, it turns out that boost detects whether the compiler supports long > > long by #including and looking for these macros: > > > > #include > > # if !defined(BOOST_MSVC) && !defined(__BORLANDC__) \ > > && (defined(ULLONG_MAX) || defined(ULONG_LONG_MAX) || > > defined(ULONGLONG_MAX)) > > # define BOOST_HAS_LONG_LONG > > #endif > > > > So it turns out that on some platforms, Python's configuration sets > > HAVE_LONG_LONG even when limits.h doesn't include definitions of these > > macros. > > Yes. Python cares about the conceptual type, not about how a platform > spells it. > > > For example, there's MSVC6, where Python substitutes __int64 for > > long long using its LONG_LONG macro. However, I didn't actually notice > > the problem > > What problem? Uh, sorry. Depending on the order of #includes, Python's headers can confuse Boost's configuration. > > until I tried linking something at LLNL where they're using an older KCC. > > Two translation units had different ideas of BOOST_HAS_LONG_LONG, > > Why was that? Nothing you showed us for it, unless there's an implied > #include of Python.h before the Boost limits.h block you did show us. 
Because one translation unit said (in effect): #include // defines ULONGLONG_MAX #include // decides long long is available and the other said: #include // decides long long is unavailable #include // defines ULONGLONG_MAX (harmless this time) > > so linking failed when one of them was looking for the long long support > > supposedly provided by another. I'm surprised it wasn't a worse problem > > with MSVC6, because after all, it doesn't even supply a type called > > "long long". > > > > Is there any chance that something can be done to prevent this sort of > > conflict? > > Rather than try to extract a clear question out of this , Too late (I hope!) > let me turn > it around: would your problem go away if this code in LongObject.h went > away entirely? Python has no business defining ULONG_MAX anymore (that's > left over from K&R C days), and I'm sure I got rid of all uses of > LONGLONG_MAX and ULONGLONG_MAX in 2.2 (I vaguely recall some; they weren't > really needed, and won't be needed again). Actually, that was the answer I was hoping you'd come up with. I'd also suggest prefixing HAVE_LONG_LONG with some kind of PYTHON_ grist to keep it out of the way of more-naive applications, but I don't want to push my luck \ -- I still remember what happened when I suggested that _Py_... names should be avoided! -Dave From David Abrahams" Message-ID: <14fb01c2298c$71a145b0$6601a8c0@boostconsulting.com> From: "Tim Peters" > Note that Python uses PySequence_Fast() internally in such cases. This does > whatever it takes to turn an iterable object into something that can be > indexed at random via PySequence_Fast_GET_ITEM(fastseq, int_index). Under > the covers it leaves lists and tuples alone, and materializes everything > else into a temp tuple. 
I haven't felt a need for something fancier than
> that in practice; the lack of participation in this thread from other
> old-timers suggests they haven't either (piling on more protocols would
> allow to optimize some cases, but it's not clear such cases are important
> enough in Python Life to be worth the bother).

Yep, I know about PySequence_Fast(), and we're currently using that.
However I have a bunch of numerics users who will undoubtedly be working
with some kind of array from NumPy or something -- they'll be really
unimpressed with me when PySequence_Fast() copies their huge multi-pass
sequence without individual Python objects for the elements into a tuple
with each double expressed as a separate Python float.

can-you-say-PySequence_SLOW?-ly y'rs,
dave

From David Abrahams" Message-ID: <151501c2298d$25186b00$6601a8c0@boostconsulting.com>
From: "Martin v. Loewis"

> "David Abrahams" writes:
>
> > #include <limits.h>
> > # if !defined(BOOST_MSVC) && !defined(__BORLANDC__) \
> > && (defined(ULLONG_MAX) || defined(ULONG_LONG_MAX) ||
> > defined(ULONGLONG_MAX))
> > # define BOOST_HAS_LONG_LONG
> > #endif
> [...]
> > I'm surprised it wasn't a
> > worse problem with MSVC6, because after all, it doesn't even supply a type
> > called "long long".
>
> Could that have resulted from defining BOOST_MSVC?

Sorry, I don't understand the question. Could *what* have resulted from
defining BOOST_MSVC?

-Dave

From David Abrahams" <20020712054332.GA77883@hishome.net> Message-ID: <152601c2298d$f9a71740$6601a8c0@boostconsulting.com>

Oren,

I like the direction this is going in, but I have some reservations about
any protocol which requires users to avoid using a simple method name like
next() on their own multi-pass sequence types unless they intend their
sequence to be treated as single-pass.

One other possibility: if x.__iter__() is x, it's a single-pass sequence.
I realize this involves actually invoking the __iter__ method and conjuring up a new iterator, but that's generally a lightweight operation... -Dave From: "Oren Tirosh" > There is no need for a new type of iterator. It's ok that iterators are > disposable. If I need multiple iterations I don't want to copy the > iterator - I prefer to ask the original iterable object for a new iterator. > All I need is some way to know whether the iterable object (container) can > produce multiple iterators that generate the same sequence. > > An object is re-iterable if its iterators do not modify its state. > > The iterator of an iterator is itself. Calling the next method, by > definition, modifies the internal state of an object. Therefore anything > that has a next method is not re-iterable. > > "hasattr(obj,'__iter__') and hasattr(obj, 'next')" is a good signature of > a non re-iterable object. Unfortunately, the opposite is not true. One > iterable object in Python produces iterators that modify its state when > their .next() method is called - the file object. > > I have just submitted a patch that makes a file into an iterator (i.e. adds > a .next method to files). With this change all Python objects that have > an __iter__ method and no next method produce iterators that do not modify > the container. Another possibility would be to make file iterators that > use seek or re-open the file to avoid modifying the file position of the > parent file object. I don't think that would be a good idea because files > can be devices, pipes or sockets which are not seekable. > > I think it may be a good idea to add a note to the documentation pages > about the iterator protocol that the iterators of a container should not > modify the state of the container. If you think they must it's probably > a good sign that your 'container' is not really a container and maybe it > should be an iterator rather than produce iterators of itself. 
> > Oren > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From oren-py-d@hishome.net Fri Jul 12 12:01:13 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 12 Jul 2002 07:01:13 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <152601c2298d$f9a71740$6601a8c0@boostconsulting.com> References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <152601c2298d$f9a71740$6601a8c0@boostconsulting.com> Message-ID: <20020712110113.GA13526@hishome.net> On Fri, Jul 12, 2002 at 06:22:05AM -0400, David Abrahams wrote: > Oren, > > I like the direction this is going in, but I have some reservations about > any protocol which requires users to avoid using a simple method name like > next() on their own multi-pass sequence types unless they intend their > sequence to be treated as single-pass. I'm not too thrilled about it, either, but I don't think it's too bad. If you implement an object with an __iter__ method you must be aware of the iteration protocol and the next method. If you put a next method on an iterable you are most probably confusing iterators and iterables and not just using the name 'next' for some other innocent purpose. > One other possibility: if x.__iter__() is x, it's a single-pass sequence. I > realize this involves actually invoking the __iter__ method and conjuring > up a new iterator, but that's generally a lightweight operation... I think it is critical that all protocols should be defined by something passive like presence of attributes and attributes of attributes and not by active probing. I don't see how a future typing system could be retrofitted to Python otherwise (pssst, don't tell anyone, but I'm working on such a system...) 
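
For contrast, the active probe under discussion looks like this. This is a minimal sketch, assuming Python 3; `is_single_pass` and the `Countdown` container are hypothetical names used only for illustration.

```python
def is_single_pass(obj):
    """Active probe: request an iterator and see whether obj came back.
    Note that this runs obj.__iter__(), i.e. arbitrary user code."""
    return iter(obj) is obj


class Countdown:
    """Hypothetical re-iterable container: every __iter__ call starts afresh."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


c = Countdown(3)

print(is_single_pass(c))        # the container is not its own iterator
print(is_single_pass(iter(c)))  # the iterator it hands out is
print(list(c), list(c))         # two passes give the same elements
```

The probe gives the right answer here, but only by executing `__iter__`, which is the cost a purely attribute-based protocol would avoid.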
Oren From oren-py-d@hishome.net Fri Jul 12 12:15:03 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 12 Jul 2002 07:15:03 -0400 Subject: [Python-Dev] Alternative implementation of interning, take 2 In-Reply-To: <3D2E922A.4040005@lemburg.com> References: <3D2E922A.4040005@lemburg.com> Message-ID: <20020712111503.GA16058@hishome.net> On Fri, Jul 12, 2002 at 10:24:10AM +0200, M.-A. Lemburg wrote: > >It would help if you could get Marc-Andre and /F to pronounce on whether > >their code benefits from it -- they're the most prolific extension authors > >we've got. > > Gee, thanks :-) > > If you could spell out what exactly you mean by "indirect interning" > that would help. That's how I call a string whose ob_sinterned is not NULL but doesn't point to itself, either. Such strings are relatively rare. In order to create one you need to call PyString_InternInPlace on a string that has more than one reference. The pointer used for the interning is replaced with a "true" interned string (i.e s->ob_sinterned == s). The other references still point to the original string which is now "indirectly interned". Indirectly interned strings can't be used to speed up comparisons using 'is' instead of '=='. Using them as dictionary keys does save an strcmp, though. If I understand this correctly they are used as an optimization for extension modules that use PyString_FromString instead of PyString_InternFromString for their string constants using a hack in PyDict_SetItem that interns the key it gets. 
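
At the Python level, what interning buys can be seen with `sys.intern`. This is a minimal sketch, assuming Python 3; it illustrates ordinary (direct) interning only, not the 2.2-era `ob_sinterned` indirection described above, which is a C-level detail of that release.

```python
import sys

# Build two equal strings at run time so the compiler cannot share them.
a = ''.join(['py', 'thon', '-dev'])
b = ''.join(['pyth', 'on-dev'])

# The two strings are equal by value.
print(a == b)

# Interning maps every equal string to one canonical object,
# so the interned results can be compared by identity.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)
```

Identity comparison is only safe between strings that are known to have been interned; for anything else, `==` remains the correct test.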
Oren From David Abrahams" <20020712054332.GA77883@hishome.net> <152601c2298d$f9a71740$6601a8c0@boostconsulting.com> <20020712110113.GA13526@hishome.net> Message-ID: <157d01c22995$53ae8f50$6601a8c0@boostconsulting.com> From: "Oren Tirosh" > On Fri, Jul 12, 2002 at 06:22:05AM -0400, David Abrahams wrote: > > Oren, > > > > I like the direction this is going in, but I have some reservations about > > any protocol which requires users to avoid using a simple method name like > > next() on their own multi-pass sequence types unless they intend their > > sequence to be treated as single-pass. > > I'm not too thrilled about it, either, but I don't think it's too bad. If > you implement an object with an __iter__ method you must be aware of the > iteration protocol and the next method. If you put a next method on an > iterable you are most probably confusing iterators and iterables and not > just using the name 'next' for some other innocent purpose. People may have already written that innocent code, but I'm not sure the consequences of misinterpreting such sequences as single-pass are so terrible. Still, I would prefer if we were looking for "__next__" instead of next(). > > One other possibility: if x.__iter__() is x, it's a single-pass sequence. I > > realize this involves actually invoking the __iter__ method and conjuring > > up a new iterator, but that's generally a lightweight operation... > > I think it is critical that all protocols should be defined by something > passive like presence of attributes and attributes of attributes and not by > active probing. Isn't that passive/active distinction illusory though? What about __getattr__ methods? > I don't see how a future typing system could be retrofitted > to Python otherwise (pssst, don't tell anyone, but I'm working on such a > system...) Nifty! I'd love to get a preview, if possible. Types come into play at the Python/C++ boundary, and I'm interested in how our systems will interact (c.f. 
http://aspn.activestate.com/ASPN/Mail/Message/types-sig/1222793) -Dave From oren-py-d@hishome.net Fri Jul 12 12:50:10 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 12 Jul 2002 14:50:10 +0300 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <157d01c22995$53ae8f50$6601a8c0@boostconsulting.com>; from david.abrahams@rcn.com on Fri, Jul 12, 2002 at 07:14:21AM -0400 References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <152601c2298d$f9a71740$6601a8c0@boostconsulting.com> <20020712110113.GA13526@hishome.net> <157d01c22995$53ae8f50$6601a8c0@boostconsulting.com> Message-ID: <20020712145010.A29279@hishome.net> On Fri, Jul 12, 2002 at 07:14:21AM -0400, David Abrahams wrote: > > I'm not too thrilled about it, either, but I don't think it's too bad. If > > you implement an object with an __iter__ method you must be aware of the > > iteration protocol and the next method. If you put a next method on an > > iterable you are most probably confusing iterators and iterables and not > > just using the name 'next' for some other innocent purpose. > > People may have already written that innocent code, but I'm not sure the > consequences of misinterpreting such sequences as single-pass are so > terrible. Still, I would prefer if we were looking for "__next__" instead > of next(). I'm not actually suggesting this as a reliable way to detect re-iterable objects, it's more of an observation. If you want something that can be relied upon for optimizations that would probably require a new __magic__ attribute. Any suggestions? > Isn't that passive/active distinction illusory though? What about > __getattr__ methods? I can't believe that any static or semi-static typing system will be able to handle __getattr__ virtual attributes. An object simply won't match a type predicate if any of the attributes checked by the predicate are virtual. 
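The passive/active distinction can be sketched as follows, in present-day Python. The class Virtual and the helper has_declared_attr are hypothetical names made up for this illustration; the helper is only one possible reading of "passive", namely looking in class and instance dictionaries without running any attribute hooks.

```python
class Virtual:
    """Supplies 'extra' on demand; it is stored in no class or instance dict."""

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for "virtual" attributes.
        if name == "extra":
            return 42
        raise AttributeError(name)


def has_declared_attr(obj, name):
    """Passive check: inspect dictionaries only, never call __getattr__."""
    in_instance = name in vars(obj)
    in_classes = any(name in vars(cls) for cls in type(obj).__mro__)
    return in_instance or in_classes


v = Virtual()

# Active probing runs __getattr__ and therefore sees the virtual attribute.
print(hasattr(v, "extra"))

# The passive check sees only what is declared, so it misses 'extra'
# but does find the __getattr__ method itself.
print(has_declared_attr(v, "extra"))
print(has_declared_attr(v, "__getattr__"))
```

Under this reading, an object whose attributes are all virtual simply fails a passive predicate, which is the behaviour described in the paragraph above.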
> > I don't see how a future typing system could be retrofitted > > to Python otherwise (pssst, don't tell anyone, but I'm working on such a > > system...) > > Nifty! I'd love to get a preview, if possible. Types come into play at the > Python/C++ boundary, and I'm interested in how our systems will interact > (c.f. http://aspn.activestate.com/ASPN/Mail/Message/types-sig/1222793) I don't know what you're talking about. :-) Oren From ark@research.att.com Fri Jul 12 13:27:56 2002 From: ark@research.att.com (Andrew Koenig) Date: Fri, 12 Jul 2002 08:27:56 -0400 (EDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <20020712054332.GA77883@hishome.net> (message from Oren Tirosh on Fri, 12 Jul 2002 01:43:32 -0400) References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> Message-ID: <200207121227.g6CCRtI24509@europa.research.att.com> Oren> There is no need for a new type of iterator. It's ok that Oren> iterators are disposable. If I need multiple iterations I don't Oren> want to copy the iterator - I prefer to ask the original Oren> iterable object for a new iterator. All I need is some way to Oren> know whether the iterable object (container) can produce Oren> multiple iterators that generate the same sequence. You are assuming that you still have access to the original iterable object. But what if all you have is an iterator? Then you need to be able to ask the iterator for a new iterator. Oren> An object is re-iterable if its iterators do not modify its state. Oren> The iterator of an iterator is itself. Calling the next method, Oren> by definition, modifies the internal state of an Oren> object. Therefore anything that has a next method is not Oren> re-iterable. That's not the only possible definition of an iterator. I'm thinking, in part, about how one might translate some of the C++ standard-library algorithms into Python. 
If that translation requires that the user always supply the original container, rather than using iterators only, then some algorithms become harder to express or less useful. From guido@python.org Fri Jul 12 13:46:26 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 08:46:26 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Fri, 12 Jul 2002 01:43:32 EDT." <20020712054332.GA77883@hishome.net> References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> Message-ID: <200207121246.g6CCkQi32082@pcp02138704pcs.reston01.va.comcast.net> > I have just submitted a patch that makes a file into an iterator > (i.e. adds a .next method to files). With this change all Python > objects that have an __iter__ method and no next method produce > iterators that do not modify the container. Another possibility > would be to make file iterators that use seek or re-open the file to > avoid modifying the file position of the parent file object. I > don't think that would be a good idea because files can be devices, > pipes or sockets which are not seekable. Cute trick, but I think it's too fragile. You don't know about 3rd party iterables that have the same problem as file. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 12 13:50:32 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 08:50:32 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Fri, 12 Jul 2002 08:43:38 +0200." 
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <13be01c22917$aa306480$6601a8c0@boostconsulting.com> <200207120043.g6C0hvD30978@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207121250.g6CCoWr32099@pcp02138704pcs.reston01.va.comcast.net> > At the risk of sounding like a broken record -- doesn't protocol > adaptation stand out as a good way to package up such a "standard > tool"? Why should we keep inventing a variety of different ways to > ask the same kind of service -- "Here is an object X, please return > it or a wrapper on it in such a way that it satisfies protocol Y, if > possible"...? Protocol adaptation sounds like a great reason to be very conservative in inventing other ways to address such problems. I don't see protocol adaptation go into Python 2.3. As Tim channeled me just after I went on vacation, it's such a tremendous change in how users will view things that we need to be conservative in introducing it. I would encourage experimenting with protocol adaptation though. Maybe the next steps would be to (a) revise the PEP and (b) produce a more usable reference implementation as a 3rd party package? I think Alex is in a great position to become co-author of PEP 246. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 12 13:52:01 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 08:52:01 -0400 Subject: [Python-Dev] Alternative implementation of interning, take 2 In-Reply-To: Your message of "Fri, 12 Jul 2002 10:24:10 +0200." <3D2E922A.4040005@lemburg.com> References: <3D2E922A.4040005@lemburg.com> Message-ID: <200207121252.g6CCq1u32115@pcp02138704pcs.reston01.va.comcast.net> > What I do need and rely on is the fact that the > Python compiler interns all constant strings and identifiers in > Python programs. 
This makes switching like so: > > if a == 'x': > elif a == 'y': > else: > > also work like this (only faster): > > if a is 'x': > elif a is 'y': > else: > > provided that 'a' only uses interned strings. Yuck. This is an implementation detail. While it's unlikely to go away in Python 2.0, please don't rely on this in portable Python. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 12 13:57:14 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 08:57:14 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Fri, 12 Jul 2002 07:01:13 EDT." <20020712110113.GA13526@hishome.net> References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <152601c2298d$f9a71740$6601a8c0@boostconsulting.com> <20020712110113.GA13526@hishome.net> Message-ID: <200207121257.g6CCvEQ07265@pcp02138704pcs.reston01.va.comcast.net> > If you put a next method on an iterable you are most probably > confusing iterators and iterables and not just using the name 'next' > for some other innocent purpose. Quite to the contrary. You might have a multi-iterable class that was defined before the iterator protocol existed, and had a "built-in" iterator that keeps the iteration state in the object itself (a common design, e.g. BSD db files have this). This is OK for simple uses. But with iterators available the class might grow a proper iterator class that keeps state external from the object. But for backward compatibility reasons you cannot remove the next() method on the class itself. QED: you have a multi-iterable object that has both an __iter__ method and a next method. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 12 14:05:21 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 09:05:21 -0400 Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: Your message of "Fri, 12 Jul 2002 08:27:56 EDT." <200207121227.g6CCRtI24509@europa.research.att.com> References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <200207121227.g6CCRtI24509@europa.research.att.com> Message-ID: <200207121305.g6CD5Lb07338@pcp02138704pcs.reston01.va.comcast.net> > I'm thinking, in part, about how one might translate some of the C++ > standard-library algorithms into Python. If that translation requires > that the user always supply the original container, rather than using > iterators only, then some algorithms become harder to express or less > useful. Indeed. There's a whole slew of interesting things you can do with iterators that means you won't have a container, only an iterator. For example, you can define "iterator algebra" functions that take iterators and return iterators. A simple example is this generator, which yields alternating elements of a given iterator.

    def alternating(it):
        while 1:
            yield it.next()
            it.next()

The nice thing is that you can combine these easily. For example alternating(alternating(it)) would yield every 4th element. It would be a pity if the results of iterator algebra operations would not be acceptable to Andrew's proposed algorithm library. --Guido van Rossum (home page: http://www.python.org/~guido/) From oren-py-d@hishome.net Fri Jul 12 14:04:18 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 12 Jul 2002 16:04:18 +0300 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207121227.g6CCRtI24509@europa.research.att.com>; from ark@research.att.com on Fri, Jul 12, 2002 at 08:27:56AM -0400 References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <200207121227.g6CCRtI24509@europa.research.att.com> Message-ID: <20020712160418.A412@hishome.net> On Fri, Jul 12, 2002 at 08:27:56AM -0400, Andrew Koenig wrote: > Oren> There is no need for a new type of iterator. 
It's ok that > Oren> iterators are disposable. If I need multiple iterations I don't > Oren> want to copy the iterator - I prefer to ask the original > Oren> iterable object for a new iterator. All I need is some way to > Oren> know whether the iterable object (container) can produce > Oren> multiple iterators that generate the same sequence. > > You are assuming that you still have access to the original iterable > object. But what if all you have is an iterator? Then you need to > be able to ask the iterator for a new iterator. Here are two cases I can think of where I don't have access to the iterable object: 1. There is no iterable object. An iterator object was created directly. For example, the result of a generator function is an iterator which isn't the result of some container's __iter__ method. 2. The iterator was received as an argument and the caller sent iter(x) instead of x. In that case I guess it means that the caller doesn't *want* to give me access to x. > Oren> An object is re-iterable if its iterators do not modify its state. > > Oren> The iterator of an iterator is itself. Calling the next method, > Oren> by definition, modifies the internal state of an > Oren> object. Therefore anything that has a next method is not > Oren> re-iterable. > > That's not the only possible definition of an iterator. It isn't a definition of an iterator. It isn't even a definition of a re-iterable object, it's a sufficient (but not required) condition for objects to be re-iterable. > I'm thinking, in part, about how one might translate some of the C++ > standard-library algorithms into Python. Why not translate *what* they do instead of *how* they do it? I'm pretty sure the Python way would be shorter and simpler anyway. Oren From oren-py-d@hishome.net Fri Jul 12 14:17:31 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 12 Jul 2002 16:17:31 +0300 Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: <200207121246.g6CCkQi32082@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jul 12, 2002 at 08:46:26AM -0400 References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <200207121246.g6CCkQi32082@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020712161731.A977@hishome.net> On Fri, Jul 12, 2002 at 08:46:26AM -0400, Guido van Rossum wrote: > > I have just submitted a patch that makes a file into an iterator > > (i.e. adds a .next method to files). With this change all Python > > objects that have an __iter__ method and no next method produce > > iterators that do not modify the container. Another possibility > > would be to make file iterators that use seek or re-open the file to > > avoid modifying the file position of the parent file object. I > > don't think that would be a good idea because files can be devices, > > pipes or sockets which are not seekable. > > Cute trick, but I think it's too fragile. You don't know about 3rd > party iterables that have the same problem as file. I don't understand what you mean by fragile. I'm not suggesting anything that actually depends on this behavior so I don't see what could break. I think it's semantically cleaner for iterable objects to produce iterators that do not modify the state of the original iterable object. There's no way to force extension writers to adhere to this but Python should at least set a good example. Python file objects are not a good example. The xrange object that was its own iterator was not a good example. Oren From guido@python.org Fri Jul 12 14:26:37 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 09:26:37 -0400 Subject: [Python-Dev] python package In-Reply-To: Your message of "Thu, 11 Jul 2002 09:43:28 +0200." 
<3D2D3720.9040100@lemburg.com> References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net> <3D2D3720.9040100@lemburg.com> Message-ID: <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net> I have thought some more about the idea of moving the entire stdlib into a package named "python" and I reject the idea. Think of the impact the change would have on the tutorial. Think of the amount of needless changes to perfectly working code it would entail. If you want to avoid 3rd party module/package names to be invalidated by additions to the standard library, you might just as well introduce a "nonstd" package into which all 3rd party extensions must be placed. This at least doesn't require people who don't use 3rd party code to change their programs. Maybe we should create a standard package hierarchy; Eric Raymond once started working on such a proposal but I have discouraged him because I think it would cause too much upheaval. But for Python 3 I would consider it. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 12 14:31:36 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 09:31:36 -0400 Subject: [Python-Dev] The C API and wide unicode support In-Reply-To: Your message of "Wed, 10 Jul 2002 23:21:30 +0200." 
<3D2CA55A.6080803@lemburg.com> References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> <3D2C6C90.7090006@lemburg.com> <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net> <3D2CA55A.6080803@lemburg.com> Message-ID: <200207121331.g6CDVa707544@pcp02138704pcs.reston01.va.comcast.net> [me] > > Maybe because other macros are often disallowed in (3rd party) > > extensions, the reason being that the macros dig in the internal > > representation which isn't guaranteed to be binary compatible? It > > would make sense that the same rules applies to the Unicode macros in > > 3rd party extensions. > > Which macros would that be ? I modelled the macros in the > Unicode implementation after those of the string > implementation. And those macros are certainly used in > a lot of 3rd party extensions. I take it back. We're anal about binary compatibility in part because of this. There are (or were? it's changed so much!) a few macros in the memory allocator API that were not supposed to be used except in core code; I think I was thinking of those. > I guess, having the macros in the header files without an > explicit warning marks them as public interface. That's how > I have used them in tons of code and I think that I'm not > alone in using this approach. If there was a warning in the docs, that would prove you wrong, but fortunately for you there isn't. :-) > I think that the fact that Michael is seeing breakage is > a good thing. Otherwise, he would probably not have noticed > that RedHat chose to use the wide build as default. Exactly. --Guido van Rossum (home page: http://www.python.org/~guido/) From ark@research.att.com Fri Jul 12 14:32:36 2002 From: ark@research.att.com (Andrew Koenig) Date: Fri, 12 Jul 2002 09:32:36 -0400 (EDT) Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: <20020712160418.A412@hishome.net> (message from Oren Tirosh on Fri, 12 Jul 2002 16:04:18 +0300) References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <200207121227.g6CCRtI24509@europa.research.att.com> <20020712160418.A412@hishome.net> Message-ID: <200207121332.g6CDWaE25713@europa.research.att.com> Oren> 1. There is no iterable object. An iterator object was created Oren> directly. For example, the result of a generator function is an Oren> iterator which isn't the result of some container's __iter__ Oren> method. Yes. Oren> 2. The iterator was received as an argument and the caller sent Oren> iter(x) instead of x. In that case I guess it means that the Oren> caller doesn't *want* to give me access to x. 3. The caller sent an iterator that refers to an element of the container other than the initial one. For example:

    def findafter(it, x):
        it = iter(it)
        while it.next() != x:
            pass
        return it

This function locates the first element equal to x in the sequence denoted by iter, and returns an iterator that refers to the element after the one equal to x. It raises StopIteration if no such element exists. Now, suppose you want to use this function to find all of the elements in a sequence that are equal to x. On the second and subsequent calls, you're going to have to pass an iterator as the first argument, because passing the container isn't going to give you the right answer. For another, more detailed example of how sensitive library design is to the details of iterator behavior, please look at http://www.research.att.com/~ark/design.pdf (I hope I have uttered the right incantations to make it available outside our firewall; if I haven't, please let me know) From guido@python.org Fri Jul 12 14:36:22 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 09:36:22 -0400 Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: Your message of "Fri, 12 Jul 2002 16:17:31 +0300." <20020712161731.A977@hishome.net> References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <200207121246.g6CCkQi32082@pcp02138704pcs.reston01.va.comcast.net> <20020712161731.A977@hishome.net> Message-ID: <200207121336.g6CDaMD07592@pcp02138704pcs.reston01.va.comcast.net> > > > I have just submitted a patch that makes a file into an iterator > > > (i.e. adds a .next method to files). With this change all Python > > > objects that have an __iter__ method and no next method produce > > > iterators that do not modify the container. Another possibility > > > would be to make file iterators that use seek or re-open the file to > > > avoid modifying the file position of the parent file object. I > > > don't think that would be a good idea because files can be devices, > > > pipes or sockets which are not seekable. > > > > Cute trick, but I think it's too fragile. You don't know about 3rd > > party iterables that have the same problem as file. > > I don't understand what you mean by fragile. I'm not suggesting anything > that actually depends on this behavior so I don't see what could break. If nothing depends on it, what's the point? > I think it's semantically cleaner for iterable objects to produce > iterators that do not modify the state of the original iterable > object. Too bad. Files are only the first, but certainly not the only, example, and saying it's cleaner doesn't make it so. > There's no way to force extension writers to adhere to this but > Python should at least set a good example. Python file objects are > not a good example. The xrange object that was its own iterator was > not a good example. That version of the xrange object was broken. I don't see what's wrong with the file object. Iterating over a file changes the file's state, that's just a fact of life. 
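The behaviour being argued over can be seen in a minimal sketch, written in present-day Python. It assumes io.StringIO as a stand-in for a real file object, so that the example needs no file on disk; the variable names are illustrative only.

```python
import io

# A file-like object: iterating over it advances its own position.
text = io.StringIO("one\ntwo\nthree\n")
first_pass = list(text)
second_pass = list(text)   # nothing left, the position is at the end

# A list: each iteration gets a fresh iterator and leaves the list alone.
lines = ["one\n", "two\n", "three\n"]
first_list_pass = list(lines)
second_list_pass = list(lines)

print(first_pass)
print(second_pass)
print(first_list_pass == second_list_pass)

# The file-like object is its own iterator; the list is not.
print(iter(text) is text)
print(iter(lines) is lines)
```

The last two lines are the "x.__iter__() is x" probe suggested earlier in the thread: it distinguishes the two cases here, at the cost of actually asking each object for an iterator.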
--Guido van Rossum (home page: http://www.python.org/~guido/) From ark@research.att.com Fri Jul 12 14:37:39 2002 From: ark@research.att.com (Andrew Koenig) Date: 12 Jul 2002 09:37:39 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <20020712160418.A412@hishome.net> References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <200207121227.g6CCRtI24509@europa.research.att.com> <20020712160418.A412@hishome.net> Message-ID: Oh yes -- one more use for being able to copy an iterator: If you can't copy an iterator, how can you determine the value to which the iterator refers without losing access to that value? Oren> Why not translate *what* they do instead of *how* they do it? Oren> I'm pretty sure the Python way would be shorter and simpler Oren> anyway. Maybe yes, maybe no. It would certainly be different, because the C++ algorithms generally assume that iterators support comparison operations. That assumption makes possible algorithms in C++ that are difficult to express at all using Python iterators as they stand. On the other hand, the availability of garbage collection in Python, combined with the dynamic nature of its type system, makes it possible to express algorithms in Python that cannot be expressed easily in C++ using C++ iterators as they now stand. Details about language design can have a profound effect on usage, which in turn has a profound effect on future design. -- Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark From aleax@aleax.it Fri Jul 12 14:46:10 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 12 Jul 2002 15:46:10 +0200 Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: <200207121250.g6CCoWr32099@pcp02138704pcs.reston01.va.comcast.net> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <200207121250.g6CCoWr32099@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Friday 12 July 2002 02:50 pm, Guido van Rossum wrote: ... > I think Alex is in a great position to become co-author of PEP 246. Aye aye, cap'n. What's the procedure for "becoming co-author" -- edit python/nondist/peps/pep-0246.txt and send the cvs diff to Barry, or ... ? Alex From aahz@pythoncraft.com Fri Jul 12 15:07:57 2002 From: aahz@pythoncraft.com (Aahz) Date: Fri, 12 Jul 2002 10:07:57 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <200207121250.g6CCoWr32099@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020712140757.GB24795@panix.com> On Fri, Jul 12, 2002, Alex Martelli wrote: > On Friday 12 July 2002 02:50 pm, Guido van Rossum wrote: >> >> I think Alex is in a great position to become co-author of PEP 246. > > Aye aye, cap'n. What's the procedure for "becoming co-author" -- edit > python/nondist/peps/pep-0246.txt and send the cvs diff to Barry, or ... ? Get the original author's permission first, if possible. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From guido@python.org Fri Jul 12 15:15:23 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 10:15:23 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Fri, 12 Jul 2002 15:46:10 +0200." References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <200207121250.g6CCoWr32099@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207121415.g6CEFNr07738@pcp02138704pcs.reston01.va.comcast.net> > > I think Alex is in a great position to become co-author of PEP 246. 
> > Aye aye, cap'n. What's the procedure for "becoming co-author" -- edit > python/nondist/peps/pep-0246.txt and send the cvs diff to Barry, or ... ? I expect Barry won't accept your changes unless the original author agrees. This just happened to the logging PEP (which was completely transferred to the new author). (Barry: maybe PEP 1 should discuss transfer of PEP ownership? I think that Trent should actually have remained co-author of PEP 282, even if he intends not to contribute another line.) --Guido van Rossum (home page: http://www.python.org/~guido/) From oren-py-d@hishome.net Fri Jul 12 15:16:26 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 12 Jul 2002 17:16:26 +0300 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207121336.g6CDaMD07592@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jul 12, 2002 at 09:36:22AM -0400 References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <200207121246.g6CCkQi32082@pcp02138704pcs.reston01.va.comcast.net> <20020712161731.A977@hishome.net> <200207121336.g6CDaMD07592@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020712171626.A2253@hishome.net> On Fri, Jul 12, 2002 at 09:36:22AM -0400, Guido van Rossum wrote: > > I don't understand what you mean by fragile. I'm not suggesting anything > > that actually depends on this behavior so I don't see what could break. > > If nothing depends on it, what's the point? To satisfy my perverted obsession for semantic hygiene, of course! > That version of the xrange object was broken. That's exactly my point. There will be more broken code like this as long as people keep confusing iterators and iterables. Making the language semantically cleaner should help prevent things like this in the long run. I remember it was pretty hard to actually convince anyone that xrange was broken. 
When I pointed out that the xrange 'iterator' modified the state of the xrange 'container' people responded that it's ok because this happens with file objects, too... > I don't see what's wrong with the file object. Iterating over a file > changes the file's state, that's just a fact of life. A file object is an iterator pretending to be a container. For historical reasons it uses 'readline' instead of 'next' and an empty string instead of StopIteration but it basically does the same job. A file object is not really a container that can produce iterators of itself. Oren From guido@python.org Fri Jul 12 15:30:08 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 10:30:08 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Fri, 12 Jul 2002 17:16:26 +0300." <20020712171626.A2253@hishome.net> References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <200207121246.g6CCkQi32082@pcp02138704pcs.reston01.va.comcast.net> <20020712161731.A977@hishome.net> <200207121336.g6CDaMD07592@pcp02138704pcs.reston01.va.comcast.net> <20020712171626.A2253@hishome.net> Message-ID: <200207121430.g6CEU8t07911@pcp02138704pcs.reston01.va.comcast.net> I think this thread is ready to die. > > That version of the xrange object was broken. > > That's exactly my point. There will be more broken code like this > as long as people keep confusing iterators and iterables. Making the > language semantically cleaner should help prevent things like this > in the long run. I don't think that the language can help this. There's nothing you can do to remove the wart from file objects. > I remember it was pretty hard to actually convince anyone that > xrange was broken. Huh? IIRC I said it was broken right away and pushed Raymond to fix it. 
> When I pointed out that the xrange 'iterator' modified the state of > the xrange 'container' people responded that it's ok because this > happens with file objects, too... A confusion that you don't stamp out by "fixing" files. > > I don't see what's wrong with the file object. Iterating over a file > > changes the file's state, that's just a fact of life. > > A file object is an iterator pretending to be a container. In what sense does it pretend to be a container? File objects are what they are; they have rich semantics for a reason. > For historical reasons it uses 'readline' instead of 'next' and an > empty string instead of StopIteration but it basically does the same > job. A file object is not really a container that can produce > iterators of itself. I think this thread is ready to die. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 12 15:40:45 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 10:40:45 -0400 Subject: [Python-Dev] Status of various Python branches Message-ID: <200207121440.g6CEeka07973@pcp02138704pcs.reston01.va.comcast.net> I believe the PBF has reached consensus that Python 2.2 will be the tie-wearing release. IMO this means that backporting fixes to 2.2 will continue to be valuable; I don't see the PBF coming up with a volunteer to do this right away. If you can't backport a fix yourself, at least add something like "bugfix candidate" to the checkin message. I think that backporting fixes to 2.1 is *not* worth our time any more, with the exception of (a) critical security fixes, and (b) fixes for severe problems that we know affect Python 2.1 users who cannot upgrade to 2.2. Example: Zope 2.5 requires Python 2.1. I'm not aware of any such fixes now. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 12 15:47:34 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 10:47:34 -0400 Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting In-Reply-To: Your message of "Sun, 23 Jun 2002 15:16:30 -0300." <20020623181630.GN25927@laranja.org> References: <20020623181630.GN25927@laranja.org> Message-ID: <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net> > Guido, can you please, for our enlightenment, tell us what are the > reasons you feel %(foo)s was a mistake? Because of the trailing 's'. It's very easy to leave it out by mistake, and because the definition of printf formats skips over spaces (don't ask me why), the first character of the following word is used as the type indicator. (FWIW, I agree with your other observations -- this was why I support exploring an alternative in PEP 292.) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 12 16:01:22 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 11:01:22 -0400 Subject: [Python-Dev] PEP 294: Type Names in the types Module In-Reply-To: Your message of "Mon, 24 Jun 2002 23:01:40 +0300." <20020624230140.B3555@hishome.net> References: <20020624230140.B3555@hishome.net> Message-ID: <200207121501.g6CF1ME08105@pcp02138704pcs.reston01.va.comcast.net> I'd like to reject PEP 294. Adding the type names that are already builtins to types.py is definitely a bad idea (the patch is full of lines like "int = int" -- this can only serve to confuse). I propose to leave types.py alone. If we need a place to name types that don't deserve being builtins, perhaps new.py is a better place? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Fri Jul 12 16:12:07 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 11:12:07 -0400 Subject: [Python-Dev] Status of various Python branches In-Reply-To: <15662.61124.842141.265751@slothrop.zope.com> Message-ID: > Speaking of maintenance branches, the test suite currently fails on > the release22-maint branch. test_descr encounters a fatal Python > error. The tail of the output is: > > Testing deepcopy of recursive objects... > Testing uninitialized module objects... > Testing pickling of classes with __slots__ ... > Testing __doc__ descriptor... > Testing for __imul__ problems... > Testing that copy.*copy() correctly uses __setstate__... > Testing resurrection of new-style instance... > Fatal Python error: GC object already in linked list Did you do an update and a fresh build? That's exactly how the current branch test_gc would fail if you're using the released 2.2.1 Python, or anything after that older than about yesterday. From jeremy@zope.com Fri Jul 12 15:59:16 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Fri, 12 Jul 2002 10:59:16 -0400 Subject: [Python-Dev] Status of various Python branches In-Reply-To: <200207121440.g6CEeka07973@pcp02138704pcs.reston01.va.comcast.net> References: <200207121440.g6CEeka07973@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15662.61124.842141.265751@slothrop.zope.com> Speaking of maintenance branches, the test suite currently fails on the release22-maint branch. test_descr encounters a fatal Python error. The tail of the output is: Testing deepcopy of recursive objects... Testing uninitialized module objects... Testing pickling of classes with __slots__ ... Testing __doc__ descriptor... Testing for __imul__ problems... Testing that copy.*copy() correctly uses __setstate__... Testing resurrection of new-style instance... 
Fatal Python error: GC object already in linked list Jeremy From jeremy@zope.com Fri Jul 12 16:20:06 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Fri, 12 Jul 2002 11:20:06 -0400 Subject: [Python-Dev] Status of various Python branches In-Reply-To: <200207121440.g6CEeka07973@pcp02138704pcs.reston01.va.comcast.net> References: <200207121440.g6CEeka07973@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15662.62374.495006.119684@slothrop.zope.com> >>>>> "GvR" == Guido van Rossum writes: GvR> I believe the PBF has reached consensus that Python 2.2 will be GvR> the tie-wearing release. Did we ever establish what kind of tie? I was thinking a string tie would be distinctive! GvR> IMO this means that backporting fixes to 2.2 will continue to GvR> be valuable; I don't see the PBF coming up with a volunteer to GvR> do this right away. If you can't backport a fix yourself, at GvR> least add something like "bugfix candidate" to the checkin GvR> message. It would be helpful if the Snake Farm was more accessible to developers. Specifically, I see that they are running regular builds of Python and, apparently, collecting the output of "make test." It is hard, however, to find the actual results of these test runs. I've got a bunch of concrete suggestions, but I don't know who to make them to. The test results we get from the Zope CVS are quite helpful, and I'd find similar results for the Python CVS equally helpful. The results could show several branches, debug vs. normal build, and different platforms. Getting those results every night would notify us of errors much more reliably than depending on individual developers checking in changes to run all those various tests. GvR> I think that backporting fixes to 2.1 is *not* worth our time GvR> any more, with the exception of (a) critical security fixes, GvR> and (b) fixes for severe problems that we know affect Python GvR> 2.1 users who cannot upgrade to 2.2. Example: Zope 2.5 GvR> requires Python 2.1.
I'm not aware of any such fixes now. I'm going to make one more change on the release21-maint branch, because my earlier httplib bug fix had a few bugs of its own. Jeremy From guido@python.org Fri Jul 12 16:26:41 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 11:26:41 -0400 Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ In-Reply-To: Your message of "Fri, 21 Jun 2002 12:00:27 EDT." <3D134D9B.7030601@stsci.edu> References: <3D123B1E.6050600@stsci.edu> <200206202053.g5KKrCA05552@odiug.zope.com> <3D1254E5.6010007@stsci.edu> <200206210129.g5L1TV509345@pcp02138704pcs.reston01.va.comcast.net> <3D132D2A.7080801@stsci.edu> <200206211359.g5LDxoF25028@pcp02138704pcs.reston01.va.comcast.net> <3D134D9B.7030601@stsci.edu> Message-ID: <200207121526.g6CFQfI08263@pcp02138704pcs.reston01.va.comcast.net> I've thought about this more, and I think I don't want to make the requested change (accept objects which implement __int__ as valid sequence indices). I also don't want to add a new protocol (the proposed __index__). I suggest that you try to find a solution that works without requiring changes to Python -- that way you have a much better chance that your code will work with Python 2.2, which will very likely have a lifetime comparable to that of Python 1.5.2 (in parallel with 2.3, for sure). I understand your desire to equate 0-D arrays and scalars, but I'm afraid that's not how the rest of Python works. I don't think we should change Python's semantic framework with APL's. I'm neutral on what you should do instead; personally, I'd continue to return Python scalars for 0-D arrays, but you could switch to 0-D arrays if you think the advantages outweigh the disadvantages. 
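A minimal sketch of the behaviour described above, written for a present-day Python 3 interpreter rather than the 2.2 release under discussion. The class name ZeroDArray is hypothetical and merely stands in for a 0-D array. It shows that an object defining only __int__ is rejected as a list index, and that an explicit int() call is one way to index without any change to Python:

```python
class ZeroDArray:
    """Hypothetical stand-in for a 0-D array wrapping a single scalar."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        # Explicit conversion is supported ...
        return int(self.value)


items = ["a", "b", "c"]
idx = ZeroDArray(1)

# ... but defining __int__ alone does not make the object a valid index.
try:
    items[idx]
    accepted = True
except TypeError:
    accepted = False

# Converting explicitly works with no change to the language.
converted = items[int(idx)]
```

Returning plain Python scalars for 0-D arrays, as suggested in the message, avoids the explicit conversion at the call site.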
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 12 16:30:55 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 11:30:55 -0400 Subject: [Python-Dev] Status of various Python branches In-Reply-To: Your message of "Fri, 12 Jul 2002 11:20:06 EDT." <15662.62374.495006.119684@slothrop.zope.com> References: <200207121440.g6CEeka07973@pcp02138704pcs.reston01.va.comcast.net> <15662.62374.495006.119684@slothrop.zope.com> Message-ID: <200207121530.g6CFUtA08320@pcp02138704pcs.reston01.va.comcast.net> > It would be helpful if the Snake Farm was more accessible to > developers. Specifically, I see that they are running regular builds > of Python and, apparently, collecting the output of "make test." It > is hard, however, to find the actual results of these test runs. I've > got a bunch of concrete suggestions, but I don't know who to make them > to. Subscribe to http://lists.lysator.liu.se/mailman/listinfo/snake-farm --Guido van Rossum (home page: http://www.python.org/~guido/) From nas@python.ca Fri Jul 12 16:37:56 2002 From: nas@python.ca (Neil Schemenauer) Date: Fri, 12 Jul 2002 08:37:56 -0700 Subject: [Python-Dev] Alternative implementation of interning, take 2 In-Reply-To: <200207121252.g6CCq1u32115@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jul 12, 2002 at 08:52:01AM -0400 References: <3D2E922A.4040005@lemburg.com> <200207121252.g6CCq1u32115@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020712083756.A18124@glacier.arctrix.com> Guido van Rossum wrote: > Yuck. This is an implementation detail. While it's unlikely to go > away in Python 2.0, please don't rely on this in portable Python. The time machine in action. :-) Neil From guido@python.org Fri Jul 12 16:36:52 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 11:36:52 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: Your message of "Sun, 23 Jun 2002 15:22:09 PDT." 
<20020623222209.62675.qmail@web40105.mail.yahoo.com> References: <20020623222209.62675.qmail@web40105.mail.yahoo.com> Message-ID: <200207121536.g6CFaqr09850@pcp02138704pcs.reston01.va.comcast.net> I'm a little surprised. Raymond Hettinger checked in a change that makes all slices of buffer objects return strings. His comments on SF bug 546434 say that only one person replied and that they agreed returning strings was the better solution. But that's not how I read the only response to his query that I see in python-dev, from Scott Gilbert: > Since the array module already has a way to create a ByteArray (and a > ShortArray, and...), buffer objects don't really need to duplicate that > effort. Except creating an array from your own "special memory" (mmap, > DMA, third party API), and backwards compatibility in general. :-) > > > > BTW: I chuckled when I saw you post this the first time. This topic seems > to draw a lot of silence. > > I know that I would suggest deprecating the PyBufferObject to just being a > BufferInspector, and taking what little extra functionality was in there > and stuffing it into arraymodule.c. Another solution would be to factor > PyBufferObject into PyBufferInspector and a "bytes" object. A few months > ago, I was tempted to submit a PEP saying as much, but I think that would > have quietly fallen to the floor. Nobody seems to like this topic too > much... I read this as a recommendation to forget about returning strings. Am I mistaken? Also, I wish you'd submitted that PEP. IMO the reason that nobody likes this topic is that there is much confusion about why we have buffer objects in the first place. Any attempt at clarifying this (e.g. proposing separate byte arrays and buffer inspectors) would be welcome. 
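A minimal sketch of the "separate byte array and buffer inspector" split mentioned above. It uses bytearray and memoryview from present-day Python 3 as stand-ins. Those names are an assumption made for illustration and are not what was on the table in this thread. The point it shows is that a slice of the inspector stays a view on the same memory, and a copy is only made on request:

```python
# A mutable byte array owns the memory.
data = bytearray(b"hello world")

# The "inspector" exposes that memory without copying it.
view = memoryview(data)

# Slicing the view yields another view, not a string or a copy.
part = view[0:5]

# Writing through the slice is visible in the underlying array.
part[0] = ord("J")

# An immutable copy is made only when explicitly asked for.
copied = bytes(part)
```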
--Guido van Rossum (home page: http://www.python.org/~guido/) From oren-py-d@hishome.net Fri Jul 12 16:44:28 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 12 Jul 2002 18:44:28 +0300 Subject: [Python-Dev] PEP 294: Type Names in the types Module In-Reply-To: <200207121501.g6CF1ME08105@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jul 12, 2002 at 11:01:22AM -0400 References: <20020624230140.B3555@hishome.net> <200207121501.g6CF1ME08105@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020712184428.A4777@hishome.net> On Fri, Jul 12, 2002 at 11:01:22AM -0400, Guido van Rossum wrote: > I'd like to reject PEP 294. > > Adding the type names that are already builtins to types.py is > definitely a bad idea (the patch is full of lines like "int = int" -- > this can only serve to confuse). > > I propose to leave types.py alone. > > If we need a place to name types that don't deserve being builtins, > perhaps new.py is a better place? The new. prefix is natural enough for m = new.module('name') type but it looks pretty awkward in if isinstance(obj, new.generator): What's the meaning of 'new' in this context? The idea of using the types module turned out to have more problems than appeared at first but new doesn't look much better to me. Anyone has other suggestions? Oren From guido@python.org Fri Jul 12 16:51:27 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 11:51:27 -0400 Subject: [Python-Dev] PEP 294: Type Names in the types Module In-Reply-To: Your message of "Fri, 12 Jul 2002 18:44:28 +0300." <20020712184428.A4777@hishome.net> References: <20020624230140.B3555@hishome.net> <200207121501.g6CF1ME08105@pcp02138704pcs.reston01.va.comcast.net> <20020712184428.A4777@hishome.net> Message-ID: <200207121551.g6CFpRg10647@pcp02138704pcs.reston01.va.comcast.net> > > If we need a place to name types that don't deserve being builtins, > > perhaps new.py is a better place? > > The new. 
prefix is natural enough for > > m = new.module('name') > > type but it looks pretty awkward in > > if isinstance(obj, new.generator): > > What's the meaning of 'new' in this context? Sometimes you ask too many questions. :-) Let's just say that this is a historically available name. I don't expect that isinstance(obj, generator) is a very common question to ask, so I don't mind if you have to ask it in a somewhat awkward way. > The idea of using the types module turned out to have more problems than > appeared at first but new doesn't look much better to me. Using new.py looks much better to me because it already works. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 12 16:44:38 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 11:44:38 -0400 Subject: [Python-Dev] Status of various Python branches In-Reply-To: Your message of "Fri, 12 Jul 2002 10:59:16 EDT." <15662.61124.842141.265751@slothrop.zope.com> References: <200207121440.g6CEeka07973@pcp02138704pcs.reston01.va.comcast.net> <15662.61124.842141.265751@slothrop.zope.com> Message-ID: <200207121544.g6CFick10578@pcp02138704pcs.reston01.va.comcast.net> > Speaking of maintenance branches, the test suite currently fails on > the release22-maint branch. test_descr encounters a fatal Python > error. The tail of the output is: > [...] > Fatal Python error: GC object already in linked list I don't see this.
But for me, two tests fail in the "release22-maint" branch: test_httplib test_pyclbr --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@zope.com Fri Jul 12 17:06:25 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Fri, 12 Jul 2002 12:06:25 -0400 Subject: [Python-Dev] PEP 294: Type Names in the types Module In-Reply-To: <200207121551.g6CFpRg10647@pcp02138704pcs.reston01.va.comcast.net> References: <20020624230140.B3555@hishome.net> <200207121501.g6CF1ME08105@pcp02138704pcs.reston01.va.comcast.net> <20020712184428.A4777@hishome.net> <200207121551.g6CFpRg10647@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15662.65153.90462.540450@slothrop.zope.com> >>>>> "GvR" == Guido van Rossum writes: >> > If we need a place to name types that don't deserve being >> > builtins, perhaps new.py is a better place? >> >> The new. prefix is natural enough for >> >> m = new.module('name') >> >> type but it looks pretty awkward in >> >> if isinstance(obj, new.generator): >> >> What's the meaning of 'new' in this context? GvR> Sometimes you ask too many questions. :-) GvR> Let's just say that this is a historically available name. I GvR> don't expect that isinstance(obj, generator) is a very common GvR> question to ask, so I don't mind if you have to ask it in a GvR> somewhat awkward way. I recently wrote some code that needed to look for functions. I wrote it this way: from new import function # ... if isinstance(obj, function): # ... It didn't look odd at all. And I don't care much where I import function from. I wouldn't mind if all the type objects defined in new where available in types. IOW, the names exported by new could also be exported by types. This means types would fall into two categories: types with builtin names and types available in the types module. I expect the current set of types with builtin names is sufficient. Jeremy From barry@zope.com Fri Jul 12 16:48:57 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Fri, 12 Jul 2002 11:48:57 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <200207121227.g6CCRtI24509@europa.research.att.com> Message-ID: <15662.64105.997215.294990@anthem.wooz.org> >>>>> "AK" == Andrew Koenig writes: AK> You are assuming that you still have access to the original AK> iterable object. But what if all you have is an iterator? AK> Then you need to be able to ask the iterator for a new AK> iterator. Would it be useful to add to the iterator "interface" a method which would retrieve the original iterable object? I've no idea what that method should be called, but it seems like it would be trivial to add since most (all?) iterators have a pointer to their underlying object anyway, don't they? -Barry From tim.one@comcast.net Fri Jul 12 17:10:30 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 12:10:30 -0400 Subject: [Python-Dev] long long configuration In-Reply-To: <14ea01c2298b$b91a59a0$6601a8c0@boostconsulting.com> Message-ID: >>> In LongObject.h we have: >>> >>> #ifdef HAVE_LONG_LONG >>> >>> /* Hopefully this is portable... */ >>> #ifndef ULONG_MAX >>> #define ULONG_MAX 4294967295U >>> #endif >>> #ifndef LONGLONG_MAX >>> #define LONGLONG_MAX 9223372036854775807LL >>> #endif >>> #ifndef ULONGLONG_MAX >>> #define ULONGLONG_MAX 0xffffffffffffffffULL >>> #endif Note that I already removed all this from current CVS (except for the #ifdef HAVE_LONG_LONG, which is still needed for code following the quoted block). That's for 2.3. Would it be of value to remove it from 2.2.2 too? >> What problem? > Uh, sorry. Depending on the order of #includes, Python's headers can > confuse Boost's configuration. > ...
> Because one translation unit said (in effect): > > #include // defines ULONGLONG_MAX > #include // decides long long is available > > and the other said: > > #include // decides long long is unavailable > #include // defines ULONGLONG_MAX (harmless this time) OK, that's what I figured -- blatant user error, and probably a deliberate and malicious one too . > ... > I'd also suggest prefixing HAVE_LONG_LONG with some kind of PYTHON_ > grist to keep it out of the way of more-naive applications, but I don't > want to push my luck \ -- I still remember what happened when I > suggested that _Py_... names should be avoided! IIRC, we said we wouldn't avoid them, and I agree that if you were to suggest it, you'd likely get the same kind of response to suggesting we slap PYTHON_ in front of HAVE_XYZ names. A problem is that those more-naive applications are at least equally likely to *rely* on Python.h continuing to expose the same set of names it currently exposes, advertised or not. Indeed, I'm afraid there's a real chance I broke someone's extension by removing the unadvertised LONGLONG_MAX name. In any case, it's too much fiddling just to save you the effort of ordering a pair of includes consistently <0.9 wink>. From guido@python.org Fri Jul 12 17:12:58 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 12:12:58 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Fri, 12 Jul 2002 11:48:57 EDT." <15662.64105.997215.294990@anthem.wooz.org> References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <200207121227.g6CCRtI24509@europa.research.att.com> <15662.64105.997215.294990@anthem.wooz.org> Message-ID: <200207121612.g6CGCwF12306@pcp02138704pcs.reston01.va.comcast.net> > Would it be useful to add to the interator "interface" a method which > would retrieve the original iterable object? 
I've no idea what that > method should be called, but it seems like it would be trivial to add > since most (all?) iterators have a pointer to their underlying object > anyway, don't they? No. The (important!) class of generator-iterators does not have an underlying container object. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Fri Jul 12 17:12:00 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 12 Jul 2002 12:12:00 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <200207121250.g6CCoWr32099@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15662.65488.741894.155099@anthem.wooz.org> >>>>> "AM" == Alex Martelli writes: >> I think Alex is in a great position to become co-author of PEP >> 246. AM> Aye aye, cap'n. What's the procedure for "becoming co-author" AM> -- edit python/nondist/peps/pep-0246.txt and send the cvs diff AM> to Barry, or ... ? That would work fine, although I would like to get /some/ acknowledgement from Clark Evans that passing the torch (or sharing the flame as it were) was okay with him. -Barry From jeremy@zope.com Fri Jul 12 17:11:03 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Fri, 12 Jul 2002 12:11:03 -0400 Subject: [Python-Dev] Status of various Python branches In-Reply-To: References: <15662.61124.842141.265751@slothrop.zope.com> Message-ID: <15662.65431.129271.558320@slothrop.zope.com> >>>>> "TP" == Tim Peters writes: >> Speaking of maintenance branches, the test suite currently fails >> on the release22-maint branch. test_descr encounters a fatal >> Python error. The tail of the output is: >> >> Testing deepcopy of recursive objects... Testing uninitialized >> module objects... Testing pickling of classes with __slots__ ... >> Testing __doc__ descriptor... Testing for __imul__ problems... >> Testing that copy.*copy() correctly uses __setstate__... 
Testing >> resurrection of new-style instance... Fatal Python error: GC >> object already in linked list TP> Did you do an update and a fresh build? That's exactly how the TP> current branch test_gc would fail if you're using the released TP> 2.2.1 Python, or anything after that older than about yesterday. I thought I was, but apparently not. Another round of update and build and the problem went away. Jeremy From guido@python.org Fri Jul 12 17:15:02 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 12:15:02 -0400 Subject: [Python-Dev] PEP 294: Type Names in the types Module In-Reply-To: Your message of "Fri, 12 Jul 2002 12:06:25 EDT." <15662.65153.90462.540450@slothrop.zope.com> References: <20020624230140.B3555@hishome.net> <200207121501.g6CF1ME08105@pcp02138704pcs.reston01.va.comcast.net> <20020712184428.A4777@hishome.net> <200207121551.g6CFpRg10647@pcp02138704pcs.reston01.va.comcast.net> <15662.65153.90462.540450@slothrop.zope.com> Message-ID: <200207121615.g6CGF2q12335@pcp02138704pcs.reston01.va.comcast.net> > from new import function > > # ... > > if isinstance(obj, function): > # ... > > It didn't look odd at all. And I don't care much where I import > function from. I wouldn't mind if all the type objects defined in new > where available in types. IOW, the names exported by new could also > be exported by types. No, the docs for types.py promises that it only exports names ending in 'Type'. That's not a promise I want to break lightly. > This means types would fall into two categories: types with builtin > names and types available in the types module. I expect the current > set of types with builtin names is sufficient. This is already the case, but the names exported by types.py don't match the __name__ attribute of those types. Is that a problem? 
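A minimal sketch of the mismatch raised in the last paragraph, assuming a present-day Python 3 interpreter rather than 2.2. The generator function sample is hypothetical. The names exported by types.py all end in 'Type', while the __name__ of each type object is the short lowercase form:

```python
import types


def sample():
    """Hypothetical generator function used only to obtain instances."""
    yield 1


# Exported name in types.py -> the type's own __name__ attribute.
names = {
    "FunctionType": types.FunctionType.__name__,
    "GeneratorType": types.GeneratorType.__name__,
    "ModuleType": types.ModuleType.__name__,
}

# isinstance() checks work with the exported names regardless.
is_function = isinstance(sample, types.FunctionType)
is_generator = isinstance(sample(), types.GeneratorType)
```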
--Guido van Rossum (home page: http://www.python.org/~guido/) From aleax@aleax.it Fri Jul 12 17:16:54 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 12 Jul 2002 18:16:54 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <15662.65488.741894.155099@anthem.wooz.org> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <15662.65488.741894.155099@anthem.wooz.org> Message-ID: On Friday 12 July 2002 06:12 pm, Barry A. Warsaw wrote: > >>>>> "AM" == Alex Martelli writes: > >> I think Alex is in a great position to become co-author of PEP > >> 246. > > AM> Aye aye, cap'n. What's the procedure for "becoming co-author" > AM> -- edit python/nondist/peps/pep-0246.txt and send the cvs diff > AM> to Barry, or ... ? > > That would work fine, although I would like to get /some/ > acknowledgement from Clark Evans that passing the torch (or sharing > the flame as it were) was okay with him. Makes sense (& thanks to the others who suggested the same thing). I mailed Clark and I'll wait to hear from him. Alex From guido@python.org Fri Jul 12 17:23:37 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 12:23:37 -0400 Subject: [Python-Dev] String substitution: compile-time versus runtime In-Reply-To: Your message of "Thu, 20 Jun 2002 20:05:17 PDT." <3D1297ED.3990C30F@prescod.net> References: <200206210141.g5L1fDv09800@pcp02138704pcs.reston01.va.comcast.net> <3D1297ED.3990C30F@prescod.net> Message-ID: <200207121623.g6CGNbf12384@pcp02138704pcs.reston01.va.comcast.net> [Paul] > I think that what I hear you saying is that interpolation should ideally > be done at a compile time for simple uses and at runtime for i18n. The > compile-time version should have the ability to do full expressions > (array indexes and self.members at the very least) and will have access > to nested scopes. The runtime version should only work with > dictionaries. Yes. 
> I think you also said that they should both use named parameters instead > of positional parameters. And presumably just for simplicity they would > use similar syntax although one would be triggered at compile time and > one at runtime. Yes. > If "%" survives, it would be used for positional parameters, instead of > named parameters. Yes (in Python 3). I can also see the viewpoint that the printf syntax should be abandoned entirely (in Python 3), in favor of a different (and probably more verbose) way to spell things like "%6.3f" or "%04x". Although there may be application areas (like producing output from numeric programs) where the formatting options are very convenient. In that case Python 3 could retain the positional % syntax but drop the by-name syntax. I'm undecided on this. > Is that your current thinking on the matter? Yes. But based on a lot of feedback (e.g. Alex's anecdote) I'm inclined to let the matter rest rather than rush to add a new language feature. > I think we are making progress if we're coming to understand that the > two different problem domains (simple scripts versus i18n) have > different needs and that there is probably no one solution that fits > both. OTOH, there's François's position: [François] > The mantra I repeated all along had two key points: > > 1) internationalisation will only be successful if designed to be > unobtrusive, otherwise average maintainers and implementors will > resist it. > > 2) programmer duties and translation duties are to be kept separate, > so these activities could be done asynchronously from one > another.[1] > > I really, really think that with enough and proper care, Python > could be set so internationalisation of Python scripts is just > unobtrusive routine. There should not be one way to write Python > when one does not internationalise, and another different way to use > it when one internationalises. 
The full power and facilities of > Python should be available at all times, unrelated to > internationalisation intents. Non-English people should not have to > pay a penalty, or if they do, the penalty should be minimised As > Much As Possible. However, he fails to suggest even a glimpse of a solution that satisfies his requirements, so I'm intended to write him off as the crank he usually is. ;-) > Our BDFL, Guido, should favour internationalisation as a principle > in the evolution for the language, that is, more than a random > negligible feature. I sincerely hope he will do. For many people, > internationalisation issues cannot be separated out that simply, or > otherwise dismissed. We should rather learn to collaborate at > properly addressing and solving them at each evolutionary step, so > Python really remains a language for everybody. To the contrary, I think most users don't care about writing code that can be switched easily from one language to the next. They only care about being able to write code that prints text in their own language (and perhaps about being able to use words in their own language as identifiers). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 12 17:37:10 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 12:37:10 -0400 Subject: [Python-Dev] Minor socket timeout quibble - timeout raises socket.error In-Reply-To: Your message of "Tue, 25 Jun 2002 09:29:34 MDT." <3D188C5D.D519DD90@3captus.com> References: <15639.52525.481846.601961@12-248-8-148.client.attbi.com> <3D188C5D.D519DD90@3captus.com> Message-ID: <200207121637.g6CGbAE12463@pcp02138704pcs.reston01.va.comcast.net> [Skip Montanaro] > > I just noticed in the development docs that when a timeout on a socket > > occurs, socket.error is raised. I rather liked the idea that a different > > exception was raised for timeouts (I used Tim O'Malley's timeout_socket > > module). 
Making a TimeoutError exception a subclass of socket.error would > > be fine so you can catch it with existing code, but I could see recovering > > differently for a timeout as opposed to other possible errors: > > > > sock.settimeout(5.0) > > try: > > data = sock.recv(8192) > > except socket.TimeoutError: > > # maybe requeue the request > > ... > > except socket.error, codes: > > # some more drastic solution is needed > > ... > > [Bernard Yue] > +1 on your suggestion. Anyway, under windows, the current > implementation returns incorrect socket.error code for timeout. I am > working on the test suite as well as a fix for problem found. Once the > code is bug free maybe we can put the TimeoutError in. > > I will leave it to Guido for the approval of the change. When he comes > back from his holiday. The way I restructured the code it is impossible to distinguish a timeout error from other errors; you simply get the "no data available" error from the socket operation. This is the same error you'd get in non-blocking mode. Before I recomplicate the code so that it can raise a separate error when the select fails, I'd like to understand the use case better. Why would you want to make this distinction? Requeueing the request (as in Skip's example) doesn't make sense IMO: you set the timeout for a reason, and that reason is that you want to give up if it takes too long. If you really intend to retry you're better off disabling the timeout! If you really want to, you can already distinguish the timeout case, because you get an EAGAIN error then (maybe something else on Windows -- Bernard, if you have a fix for that, please send it to me). So a -0 unless more evidence is brought forward. --Guido van Rossum (home page: http://www.python.org/~guido/) From ark@research.att.com Fri Jul 12 17:40:03 2002 From: ark@research.att.com (Andrew Koenig) Date: Fri, 12 Jul 2002 12:40:03 -0400 (EDT) Subject: [Python-Dev] Single- vs.
Multi-pass iterability In-Reply-To: <15662.64105.997215.294990@anthem.wooz.org> (barry@zope.com) References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <200207121227.g6CCRtI24509@europa.research.att.com> <15662.64105.997215.294990@anthem.wooz.org> Message-ID: <200207121640.g6CGe3O26775@europa.research.att.com> Barry> Would it be useful to add to the interator "interface" a method Barry> which would retrieve the original iterable object? What if there isn't one? From oren-py-d@hishome.net Fri Jul 12 17:41:51 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 12 Jul 2002 19:41:51 +0300 Subject: [Python-Dev] PEP 294: Type Names in the types Module In-Reply-To: <15662.65153.90462.540450@slothrop.zope.com>; from jeremy@alum.mit.edu on Fri, Jul 12, 2002 at 12:06:25PM -0400 References: <20020624230140.B3555@hishome.net> <200207121501.g6CF1ME08105@pcp02138704pcs.reston01.va.comcast.net> <20020712184428.A4777@hishome.net> <200207121551.g6CFpRg10647@pcp02138704pcs.reston01.va.comcast.net> <15662.65153.90462.540450@slothrop.zope.com> Message-ID: <20020712194151.A6406@hishome.net> On Fri, Jul 12, 2002 at 12:06:25PM -0400, Jeremy Hylton wrote: > I recently wrote some code that needed to look for functions. I wrote > it this way: > > from new import function > > # ... > > if isinstance(obj, function): > # ... > > It didn't look odd at all. And I don't care much where I import > function from. I wouldn't mind if all the type objects defined in new > where available in types. IOW, the names exported by new could also > be exported by types. That's exactly what PEP 294 proposed. The primary objection was that the documentation for the types module says that names exported by future versions will all end in "Type". People that do 'from types import *' based on this promise will tend to get offended if a variable called 'code' is clobbered. 
Anyway, my mother also told me that breaking promises is not a nice thing to do so I try to keep that in mind when I design programming interfaces. Oren From barry@zope.com Fri Jul 12 17:48:16 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 12 Jul 2002 12:48:16 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <200207121227.g6CCRtI24509@europa.research.att.com> <15662.64105.997215.294990@anthem.wooz.org> <200207121612.g6CGCwF12306@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15663.2128.143056.795328@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: >> Would it be useful to add to the interator "interface" a method >> which would retrieve the original iterable object? I've no >> idea what that method should be called, but it seems like it >> would be trivial to add since most (all?) iterators have a >> pointer to their underlying object anyway, don't they? GvR> No. The (important!) class of generator-iterators does not GvR> have an underlying container object. Yup, but in that case I think it would be fine if it.gimme_the_underlying_iteratable_object() returned None. It still may be useless. ;) -Barry From martin@v.loewis.de Fri Jul 12 17:49:59 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 12 Jul 2002 18:49:59 +0200 Subject: [Python-Dev] long long configuration In-Reply-To: <151501c2298d$25186b00$6601a8c0@boostconsulting.com> References: <14b201c2294d$fdebb9e0$6601a8c0@boostconsulting.com> <151501c2298d$25186b00$6601a8c0@boostconsulting.com> Message-ID: "David Abrahams" writes: > > > I'm surprised it wasn't a > > > worse problem with MSVC6, because after all, it doesn't even supply a > type > > > called "long long". > > > > Could that have resulted from defining BOOST_MSVC? > > Sorry, I don't understand the question. Could *what* have resulted from > defining BOOST_MSVC? 
That it (the Python long long configuration) wasn't a worse problem with MSVC6, even though it doesn't even supply a type called "long long". Regards, Martin From mal@lemburg.com Fri Jul 12 17:58:41 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 12 Jul 2002 18:58:41 +0200 Subject: [Python-Dev] Fw: Behavior of buffer() References: <20020623222209.62675.qmail@web40105.mail.yahoo.com> <200207121536.g6CFaqr09850@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D2F0AC1.508@lemburg.com> Guido van Rossum wrote: > I'm a little surprised. Raymond Hettinger checked in a change that > makes all slices of buffer objects return strings. His comments on SF > bug 546434 say that only one person replied and that they agreed > returning strings was the better solution. But that's not how I read > the only response to his query that I see in python-dev, from Scott > Gilbert: Interesting. I must have skipped that message. IMHO, all slices of buffer object should return buffer objects, but since all Python releases return strings, I guess this is too late to change. Note that the only case where a buffer object is returned in Python 2.x (x < 3) is if you write buffer()[:], i.e. you want a copy of the buffer object. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From David Abrahams" Message-ID: <178d01c229c3$b27e5320$6601a8c0@boostconsulting.com> From: "Tim Peters" > In any case, it's too much > fiddling just to save you the effort of ordering a pair of includes > consistently <0.9 wink>. Just addressing the <0.1 wink> you left out: even if I get the include order "right", my users are still screwed if they don't do it the same way. 
-Dave From guido@python.org Fri Jul 12 18:00:10 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 13:00:10 -0400 Subject: [Python-Dev] New Subscriber Introduction In-Reply-To: Your message of "Tue, 25 Jun 2002 12:06:26 PDT." References: Message-ID: <200207121700.g6CH0AO12617@pcp02138704pcs.reston01.va.comcast.net> > Ah, OK. Well, that is handy, but since this is meant to be a > drop-in replacement for strptime, I don't think it is warranted > here. Perhaps something like that could be put into Python when > Guido starts putting in new fxns for the forthcoming new datetime > type? No, parsing dates is specifically not part of the datetime proposal. The examples shown of mxDateTime.Parser behavior here reinforce my desire to stay out of the time parsing business. :-) > And I do agree that strptime is not need most of the time. But it is > there so might as well fix that non-portable wart. Exactly. Brett: I'm reviewing your SF patch 474274, but I'm finding problems. I've added a comment to the SF page. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Fri Jul 12 17:58:59 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 12 Jul 2002 12:58:59 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <200207121227.g6CCRtI24509@europa.research.att.com> <15662.64105.997215.294990@anthem.wooz.org> <200207121640.g6CGe3O26775@europa.research.att.com> Message-ID: <15663.2771.621927.778230@anthem.wooz.org> >>>>> "AK" == Andrew Koenig writes: Barry> Would it be useful to add to the interator "interface" a Barry> method which would retrieve the original iterable object? AK> What if there isn't one? The method would return None. -Barry From ark@research.att.com Fri Jul 12 18:02:15 2002 From: ark@research.att.com (Andrew Koenig) Date: Fri, 12 Jul 2002 13:02:15 -0400 (EDT) Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: <15663.2771.621927.778230@anthem.wooz.org> (barry@zope.com) References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <200207121227.g6CCRtI24509@europa.research.att.com> <15662.64105.997215.294990@anthem.wooz.org> <200207121640.g6CGe3O26775@europa.research.att.com> <15663.2771.621927.778230@anthem.wooz.org> Message-ID: <200207121702.g6CH2F827735@europa.research.att.com> >>>>>> "AK" == Andrew Koenig writes: Barry> Would it be useful to add to the interator "interface" a Barry> method which would retrieve the original iterable object? AK> What if there isn't one? Barry> The method would return None. But then you can't rely on it. That is, if you want to write code that depends on the ability to retrieve the original iterable, you have to give up the ability for that code to work on generators, for example. I'm not saying it's not a useful thing to have; I'm just saying it might not be as useful as it appears at first. From barry@zope.com Fri Jul 12 18:06:57 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 12 Jul 2002 13:06:57 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <20020712054332.GA77883@hishome.net> <200207121227.g6CCRtI24509@europa.research.att.com> <15662.64105.997215.294990@anthem.wooz.org> <200207121640.g6CGe3O26775@europa.research.att.com> <15663.2771.621927.778230@anthem.wooz.org> <200207121702.g6CH2F827735@europa.research.att.com> Message-ID: <15663.3249.952554.249795@anthem.wooz.org> >>>>> "AK" == Andrew Koenig writes: Barry> Would it be useful to add to the interator "interface" a Barry> method which would retrieve the original iterable object? AK> What if there isn't one? Barry> The method would return None. AK> But then you can't rely on it. 
That is, if you want to write AK> code that depends on the ability to retrieve the original AK> iterable, you have to give up the ability for that code to AK> work on generators, for example. AK> I'm not saying it's not a useful thing to have; I'm just AK> saying it might not be as useful as it appears at first. I'm not sure it's even useful at all, e.g. I've never had a use for it. But if you have code that depends on the ability to retrieve the original iterable, and you have iterators for which there /is no/ original iterable, it doesn't matter how you spell it, you're going to have to special case around that fact. -Barry From guido@python.org Fri Jul 12 18:09:32 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 13:09:32 -0400 Subject: [Python-Dev] Xrange and Slices In-Reply-To: Your message of "Sun, 30 Jun 2002 13:39:03 EDT." <20020630173903.GA37045@hishome.net> References: <000d01c21cdb$eb03b720$91d8accf@othello> <20020630173903.GA37045@hishome.net> Message-ID: <200207121709.g6CH9Wb12714@pcp02138704pcs.reston01.va.comcast.net> [Raymond Hettinger] > > Merge the code for xrange() into slice(). [Oren Tirosh] > There's a patch pending for this: www.python.org/sf/575515 I've rejected this. It's better to let these two be different, so that it's clear what the intended use is. --Guido van Rossum (home page: http://www.python.org/~guido/) From David Abrahams" <151501c2298d$25186b00$6601a8c0@boostconsulting.com> Message-ID: <17a901c229c5$1a54dae0$6601a8c0@boostconsulting.com> Sorry, I just wasn't looking. Yes, that's probably the right explanation. Thanks, Dave From: "Martin v. Loewis" > "David Abrahams" writes: > > > > > I'm surprised it wasn't a > > > > worse problem with MSVC6, because after all, it doesn't even supply a > > type > > > > called "long long". > > > > > > Could that have resulted from defining BOOST_MSVC? > > > > Sorry, I don't understand the question. Could *what* have resulted from > > defining BOOST_MSVC? 
> > That it (the Python long long configuration) wasn't a worse problem > with MSVC6, even though it doesn't even supply a type called "long > long". > > Regards, > Martin From barry@zope.com Fri Jul 12 17:42:14 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 12 Jul 2002 12:42:14 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <200207121250.g6CCoWr32099@pcp02138704pcs.reston01.va.comcast.net> <200207121415.g6CEFNr07738@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15663.1766.101214.339822@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> I expect Barry won't accept your changes unless the original GvR> author agrees. This just happened to the logging PEP (which GvR> was completely transferred to the new author). Right. I sent a previous message about this, but my email's been flakey lately. GvR> (Barry: maybe PEP 1 should discuss transfer of PEP ownership? Yes, good idea. I've added a paragraph. GvR> I think that Trent should actually have remained co-author of GvR> PEP 282, even if he intends not to contribute another line.) I'll leave that up to the original author for each specific transfer. In this case, Trent, let me know if you'd like to remain a co-author of PEP 282 (it's not like this stuff is set in stone. :). -Barry From guido@python.org Fri Jul 12 18:17:57 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 13:17:57 -0400 Subject: [Python-Dev] Dict constructor In-Reply-To: Your message of "Wed, 26 Jun 2002 03:36:21 EDT." <008101c21ce4$2b504fc0$91d8accf@othello> References: <008101c21ce4$2b504fc0$91d8accf@othello> Message-ID: <200207121717.g6CHHvr12817@pcp02138704pcs.reston01.va.comcast.net> > Second wild idea of the day: > > The dict constructor currently accepts sequences where each element has > length 2, interpreted as a key-value pair.
> > Let's have it also accept sequences with elements of length 1, interpreted > as a key:None pair. > > The benefit is that it provides a way to rapidly construct sets: > > lowercase = dict('abcdefghijklmnopqrstuvwxyz') > if char in lowercase: ... > > dict([key1, key2, key3, key1]).keys() # eliminate duplicate keys Rejecting (even in the modified form you showed after prompting from Tim). I think the dict() constructor is already overloaded to the brim. Let's do a set module instead. There's only one hurdle to take for a set module, and that's the issue with using mutable sets as keys. Let's just pick one solution and implement it (my favorite being that sets simply cannot be used as keys, since it's the simplest, and matches dicts and lists). --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Fri Jul 12 18:22:18 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 13:22:18 -0400 Subject: [Python-Dev] Alternative implementation of interning, take 2 In-Reply-To: <3D2E922A.4040005@lemburg.com> Message-ID: This is a multi-part message in MIME format. --Boundary_(ID_RM57m/IJOIS03WJjE8ryaQ) Content-type: text/plain; charset=Windows-1252 Content-transfer-encoding: 7BIT [M.-A. Lemburg] > If you could spell out what exactly you mean by "indirect interning" > that would help. Actually, I don't think it would -- the issue is whether the possibility for the ob_sinterned member of a PyStringObject not to *be* the string object itself ever saves time in your extensions, and it's darned hard to guess that. If you apply the attached patch to current CVS, though, it will tell you whenever your code benefits from it. AFAICT, there are only 3 routines where it *might* save cycles (but note that checking for the possibility costs cycles whether or not it pays; it's a net loss when it doesn't pay): + PyDict_SetItem: I believe this is the only real possibility for gain.
If it ever helps you here, the patch arranges to print ii paid on a setitem to stderr whenever it does pay. I haven't yet seen that get printed. + PyString_InternInPlace: Whenever it pays here, the patch spits ii paid on an InternInPlace That triggers 6 times in the Python test suite, all from test_descr. Since this one is an optimization *of* setting ob_sinterned, it's a snake-eating-its-tail kind of thing -- it's of no real benefit unless ob_sintered pays off somewhere else too. + string_hash: The patch spits ii paid on a hash??? The question marks are there because I don't see how it's possible for this to get printed. > What I do need and rely on is the fact that the > Python compiler interns all constant strings and identifiers in > Python programs. This makes switching like so: Ya, while that's evil, it's not affected by indirect interning. --Boundary_(ID_RM57m/IJOIS03WJjE8ryaQ) Content-type: text/plain; name=ii.txt Content-transfer-encoding: 7BIT Content-disposition: attachment; filename=ii.txt Index: Objects/dictobject.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Objects/dictobject.c,v retrieving revision 2.126 diff -c -c -r2.126 dictobject.c *** Objects/dictobject.c 13 Jun 2002 20:32:57 -0000 2.126 --- Objects/dictobject.c 12 Jul 2002 17:14:19 -0000 *************** *** 512,517 **** --- 512,519 ---- mp = (dictobject *)op; if (PyString_CheckExact(key)) { if (((PyStringObject *)key)->ob_sinterned != NULL) { + if (key != ((PyStringObject *)key)->ob_sinterned) + fprintf(stderr, "ii paid on a setitem\n"); key = ((PyStringObject *)key)->ob_sinterned; hash = ((PyStringObject *)key)->ob_shash; } Index: Objects/stringobject.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Objects/stringobject.c,v retrieving revision 2.169 diff -c -c -r2.169 stringobject.c *** Objects/stringobject.c 11 Jul 2002 06:23:50 -0000 2.169 --- 
Objects/stringobject.c 12 Jul 2002 17:14:20 -0000 *************** *** 925,933 **** if (a->ob_shash != -1) return a->ob_shash; ! if (a->ob_sinterned != NULL) return (a->ob_shash = ((PyStringObject *)(a->ob_sinterned))->ob_shash); len = a->ob_size; p = (unsigned char *) a->ob_sval; x = *p << 7; --- 925,940 ---- if (a->ob_shash != -1) return a->ob_shash; ! if (a->ob_sinterned != NULL) { ! if ((PyObject *)a != a->ob_sinterned) ! /* This shouldn't be possible? 'a' would have ! * had its ob_shash set as part of a->ob_sinterned ! * getting set. ! */ ! fprintf(stderr, "ii paid on a hash???\n"); return (a->ob_shash = ((PyStringObject *)(a->ob_sinterned))->ob_shash); + } len = a->ob_size; p = (unsigned char *) a->ob_sval; x = *p << 7; *************** *** 3829,3834 **** --- 3836,3842 ---- if ((t = s->ob_sinterned) != NULL) { if (t == (PyObject *)s) return; + fprintf(stderr, "ii paid on an InternInPlace\n"); Py_INCREF(t); *p = t; Py_DECREF(s); --Boundary_(ID_RM57m/IJOIS03WJjE8ryaQ)-- From guido@python.org Fri Jul 12 18:24:31 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 13:24:31 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: Your message of "Fri, 12 Jul 2002 18:58:41 +0200." <3D2F0AC1.508@lemburg.com> References: <20020623222209.62675.qmail@web40105.mail.yahoo.com> <200207121536.g6CFaqr09850@pcp02138704pcs.reston01.va.comcast.net> <3D2F0AC1.508@lemburg.com> Message-ID: <200207121724.g6CHOVt12881@pcp02138704pcs.reston01.va.comcast.net> > Guido van Rossum wrote: > > I'm a little surprised. Raymond Hettinger checked in a change that > > makes all slices of buffer objects return strings. His comments on SF > > bug 546434 say that only one person replied and that they agreed > > returning strings was the better solution. But that's not how I read > > the only response to his query that I see in python-dev, from Scott > > Gilbert: > > Interesting. I must have skipped that message. You blink, and you find that the world has changed. 
> IMHO, all slices of buffer object should return buffer objects, > but since all Python releases return strings, I guess this is too > late to change. That was my preference too, but Raymond disagreed and somehow tried to find support for his position :-). Since buffer objects (of course :-) support the C-level buffer protocol, they can still be used in most places where strings are needed. But it would be incompatible. But so is Raymond's solution (because it changes buffer()[:] to also return a string). > Note that the only case where a buffer object > is returned in Python 2.x (x < 3) is if you write > buffer()[:], i.e. you want a copy of the buffer object. What does a copy of a buffer object buy you? It's not too late to revert Raymond's changes. --Guido van Rossum (home page: http://www.python.org/~guido/) From mclay@nist.gov Fri Jul 12 18:24:21 2002 From: mclay@nist.gov (Michael McLay) Date: Fri, 12 Jul 2002 13:24:21 -0400 Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting In-Reply-To: <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net> References: <20020623181630.GN25927@laranja.org> <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207121324.21609.mclay@nist.gov> On Friday 12 July 2002 10:47 am, Guido van Rossum wrote: > (FWIW, I agree with your other observations -- this was why I > support exploring an alternative in PEP 292.) The syntax rules of PEP 292 are likely to cause confusion for newbies who have never used sh or perl. They will ask why Python has two syntaxes for doing string substitutions. Why not always spell the substitution string with ${identifier} or %(identifier)? The third rule of PEP 292 in particular looks like a patch to fix a kludge when an unanticipated exception was discovered. 3. ${identifier} is equivalent to $identifier.
It is required for when valid identifier characters follow the placeholder but are not part of the placeholder, e.g. "${noun}ification". > On Sunday 23 June 2002 02:16 pm, Lalo Martins wrote: > > More, I'm completely opposed to "<> is <> years old" > > because it's still cryptic and invasive. This should instead read similar > > to "<> is <> years old".sub({'name': x.name, 'age': > > x.age.format(None, 0)}) > > > Guido, can you please, for our enlightenment, tell us what are the > > reasons you feel %(foo)s was a mistake? > > Because of the trailing 's'. It's very easy to leave it out by > mistake, and because the definition of printf formats skips over > spaces (don't ask me why), the first character of the following word > is used as the type indicator. It's easy to leave it out by mistake, but the error is almost always immediately obvious. In the interest of keeping the language as simple as possible, I hope no changes are made. If a method based .sub() capability is to be added, why not reuse the %(identifier) syntax instead of introducing $ and ${} syntax? The .sub() string method would use the %(identifier) syntax without the 's' to spell the new substitution format. Instead of the proposed: '$name was born in ${country}'.sub() the phrase would be spelled: '%(name) was born in %(country)'.sub() This approach would introduce one new string method with a small variation on the existing '%' substitution syntax. From mal@lemburg.com Fri Jul 12 18:30:45 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Fri, 12 Jul 2002 19:30:45 +0200 Subject: [Python-Dev] python package References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net> <3D2D3720.9040100@lemburg.com> <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D2F1245.1030804@lemburg.com> Guido van Rossum wrote: > I have thought some more about the idea of moving the entire stdlib > into a package named "python" and I reject the idea. > > Think of the impact the change would have on the tutorial. > > Think of the amount of needless changes to perfectly working code it > would entail. > > If you want to avoid 3rd party module/package names to be invalidated > by additions to the standard library, you might just as well introduce > a "nonstd" package into which all 3rd party extensions must be placed. > This at least doesn't require people who don't use 3rd party code to > change their programs. Uhm, the point I was trying to make was to provide a long running upgrade path from the current situation (everything is top-level) to the single package structure. It is fairly easy to move from 'import os' to 'from python import os', but I understand that people will not want to do this until Python 3. I was not suggesting to start breaking code by enforcing this strategy in some way, I just thought it would be a good idea to start providing means to work with the single python package approach now to make the transition less painful in Python 3. > Maybe we should create a standard package hierarchy; Eric Raymond once > started working on such a proposal but I have discouraged him because > I think it would cause too much upheaval. But for Python 3 I would > consider it.
That's what I was targetting :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Fri Jul 12 18:36:38 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 13:36:38 -0400 Subject: [Python-Dev] python package In-Reply-To: Your message of "Fri, 12 Jul 2002 19:30:45 +0200." <3D2F1245.1030804@lemburg.com> References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net> <3D2D3720.9040100@lemburg.com> <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net> <3D2F1245.1030804@lemburg.com> Message-ID: <200207121736.g6CHact13010@pcp02138704pcs.reston01.va.comcast.net> > Guido van Rossum wrote: > > I have thought some more about the idea of moving the entire stdlib > > into a package named "python" and I reject the idea. > > > > Think of the impact the change would have on the tutorial. > > > > Think of the amount of needless changes to perfectly working code it > > would entail. > > > > If you want to avoid 3rd party module/package names to be invalidated > > by additions to the standard library, you might just as well introduce > > a "nonstd" package into which all 3rd party extensions must be placed. > > This at least doesn't require people who don't use 3rd party code to > > change their programs. [MAL] > Uhm, the point I was trying to make was to provide a long > running upgrade path from the current situation (everthing is > top-level) to the single package structure. And my suggestion of a "nonstd" toplevel package had the same goal. 
:-) > It is fairly easy to move from 'import os' to 'from python import os', > but I understand that people will not want to do this until > Python 3. > > I was not suggesting to start breaking code by enforcing this > strategy in some way, I just though it would be a good idea > to start providing means to work with the single python package > approach now to make the transition less painful in Python 3. Two problems. First, your proposal has lots of practical warts that I already pointed out; your suggestion to fix one of them by making all the old names stubs would require a massive set of changes to the CVS repository. Second, I don't think a 'python' toplevel package is the right solution. > > Maybe we should create a standard package hierarchy; Eric Raymond once > > started working on such a proposal but I have discouraged him because > > I think it would cause too much upheaval. But for Python 3 I would > > consider it. > > That's what I was targetting :-) Then please think about a proper solution rather than proposing something whose only virtue seems to be that you can implement a poor approximation of it in two lines. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Fri Jul 12 18:36:22 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 13:36:22 -0400 Subject: [Python-Dev] long long configuration In-Reply-To: <178d01c229c3$b27e5320$6601a8c0@boostconsulting.com> Message-ID: [David Abrahams] > Just addressing the <0.1 wink> you left out: even if I get the include > order "right", my users are still screwed if they don't do it the > same way. Give them a pythonboost.h instead that contains the includes in the right order, or make your boost.h smarter, or ... From barry@zope.com Fri Jul 12 18:33:35 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Fri, 12 Jul 2002 13:33:35 -0400 Subject: [Python-Dev] Dict constructor References: <008101c21ce4$2b504fc0$91d8accf@othello> <200207121717.g6CHHvr12817@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15663.4847.187726.359608@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> Let's do a set module instead. +1 GvR> There's only one hurdle to take for a set module, and that's GvR> the issue with using mutable sets as keys. Let's just pick GvR> one solution and implement it (my favorite being that sets GvR> simply cannot be used as keys, since it's the simplest, and GvR> matches dicts and lists). +1 -Barry From mal@lemburg.com Fri Jul 12 18:47:09 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 12 Jul 2002 19:47:09 +0200 Subject: [Python-Dev] python package References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net> <3D2D3720.9040100@lemburg.com> <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net> <3D2F1245.1030804@lemburg.com> <200207121736.g6CHact13010@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D2F161D.40005@lemburg.com> Guido van Rossum wrote: >>Guido van Rossum wrote: >> >>>I have thought some more about the idea of moving the entire stdlib >>>into a package named "python" and I reject the idea. >>> >>>Think of the impact the change would have on the tutorial. >>> >>>Think of the amount of needless changes to perfectly working code it >>>would entail. >>> >>>If you want to avoid 3rd party module/package names to be invalidated >>>by additions to the standard library, you might just as well introduce >>>a "nonstd" package into which all 3rd party extensions must be placed. >>>This at least doesn't require people who don't use 3rd party code to >>>change their programs. 
>> > > [MAL] > >>Uhm, the point I was trying to make was to provide a long >>running upgrade path from the current situation (everthing is >>top-level) to the single package structure. > > > And my suggestion of a "nonstd" toplevel package had the same goal. :-) With the exception that we have control over the Python core code while we don't over third party extensions, so providing means to simplify the transition for the standard lib is easier than trying to enforce your proposed 'nonstd' package. >>It is fairly easy to move from 'import os' to 'from python import os', >>but I understand that people will not want to do this until >>Python 3. >> >>I was not suggesting to start breaking code by enforcing this >>strategy in some way, I just though it would be a good idea >>to start providing means to work with the single python package >>approach now to make the transition less painful in Python 3. > > > Two problems. First, your proposal has lots of practical warts that I > already pointed out; your suggestion to fix one of them by making all > the old names stubs would require a massive set of changes to the CVS > repository. Second, I don't think a 'python' toplevel package is the > right solution. > > >>>Maybe we should create a standard package hierarchy; Eric Raymond once >>>started working on such a proposal but I have discouraged him because >>>I think it would cause too much upheaval. But for Python 3 I would >>>consider it. >> >>That's what I was targetting :-) > > > Then please think about a proper solution rather than proposing > something whose only virtue seems to be that you can implement a poor > approximation of it in two lines. Just testing waters here... there's no point in trying to find a solution to something which is not regarded as problem anyway. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... 
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From David Abrahams" Message-ID: <181901c229ca$ee810280$6601a8c0@boostconsulting.com> From: "Tim Peters" > [David Abrahams] > > Just addressing the <0.1 wink> you left out: even if I get the include > > order "right", my users are still screwed if they don't do it the > > same way. > > Give them a pythonboost.h instead that contains the includes in the right > order, or make your boost.h smarter, or ... You fixed the LONGLONG_MAX stuff already, so I don't think there's anything to discuss here, is there? None of my code is confused by HAVE_LONG_LONG. -Dave From guido@python.org Fri Jul 12 18:54:20 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 13:54:20 -0400 Subject: [Python-Dev] python package In-Reply-To: Your message of "Fri, 12 Jul 2002 19:47:09 +0200." <3D2F161D.40005@lemburg.com> References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net> <3D2D3720.9040100@lemburg.com> <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net> <3D2F1245.1030804@lemburg.com> <200207121736.g6CHact13010@pcp02138704pcs.reston01.va.comcast.net> <3D2F161D.40005@lemburg.com> Message-ID: <200207121754.g6CHsKQ13108@pcp02138704pcs.reston01.va.comcast.net> > With the exception that we have control over the Python core > code while we don't over third party extensions, so providing > means to simplify the transition for the standard lib is easier > than trying to enforce your proposed 'nonstd' package. I think you could get a long way with minor changes along the lines of making site-packages a package itself. > > Then please think about a proper solution rather than proposing > > something whose only virtue seems to be that you can implement a poor > > approximation of it in two lines. 
> > Just testing waters here... there's no point in trying to > find a solution to something which is not regarded as problem > anyway. You started by claiming that there's a problem: expansion of the stdlib could conflict with 3rd party module/package names. I don't regard it as a problem that's so bad that we need to make big changes to solve it. If you still think a solution is desired, you could start by proposing a new standard package hierarchy. Then new standard modules could be placed in that new hierarchy rather than at the top level. I'm rejecting the proposal of a single top-level package named "python". --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Fri Jul 12 18:57:59 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 13:57:59 -0400 Subject: [Python-Dev] python package In-Reply-To: <3D2F161D.40005@lemburg.com> Message-ID: [M.-A. Lemburg] > ... > Just testing waters here... there's no point in trying to > find a solution to something which is not regarded as problem > anyway. There is something to be solved here. Anecdote: I sucked an early version of Greg's textwrap.py module into my build directory. After he checked it in, I changed regrtest.py to use textwrap. This kept failing with baffling errors, until I realized I was still picking up an incompatible textwrap.py from the build directory. So I got rid of the latter. Somewhere in between, I synched my desktop and laptop machines and so got another copy on my laptop that way, which I didn't notice. When I got home and synched the laptop back to the desktop, it then restored the deleted testwrap.py to the desktop machine, and I got the same round of impossible errors all over again. I deleted it from home machine again, but the next time I used my laptop to run the test suite got the impossible errors yet another time -- and had synched the machines again in the meantime so that it once again showed up on the desktop disk. 
So there's one use case . From guido@python.org Fri Jul 12 19:02:45 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 14:02:45 -0400 Subject: [Python-Dev] python package In-Reply-To: Your message of "Fri, 12 Jul 2002 13:57:59 EDT." References: Message-ID: <200207121802.g6CI2j113182@pcp02138704pcs.reston01.va.comcast.net> > There is something to be solved here. Anecdote: I sucked an early > version of Greg's textwrap.py module into my build directory. After > he checked it in, I changed regrtest.py to use textwrap. This kept > failing with baffling errors, until I realized I was still picking > up an incompatible textwrap.py from the build directory. So I got > rid of the latter. Somewhere in between, I synched my desktop and > laptop machines and so got another copy on my laptop that way, which > I didn't notice. When I got home and synched the laptop back to the > desktop, it then restored the deleted testwrap.py to the desktop > machine, and I got the same round of impossible errors all over > again. I deleted it from home machine again, but the next time I > used my laptop to run the test suite got the impossible errors yet > another time -- and had synched the machines again in the meantime > so that it once again showed up on the desktop disk. This just shows that having the current directory on sys.path (especially at the front) causes problems. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Fri Jul 12 19:01:58 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 14:01:58 -0400 Subject: [Python-Dev] long long configuration In-Reply-To: <181901c229ca$ee810280$6601a8c0@boostconsulting.com> Message-ID: [David Abrahams] >>> Just addressing the <0.1 wink> you left out: even if I get the include >>> order "right", my users are still screwed if they don't do it the >>> same way. 
>> Give them a pythonboost.h instead that contains the includes in >> the right order, or make your boost.h smarter, or ... > You fixed the LONGLONG_MAX stuff already, so I don't think > there's anything to discuss here, is there? I thought you thought there was, else there was no apparent reason for the "Just addressing" message I replied to. BTW, do you need this in 2.2.2 too, or is 2.3 good enough? I didn't change anything on the 2.2 branch. From tim.one@comcast.net Fri Jul 12 19:11:28 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 14:11:28 -0400 Subject: [Python-Dev] python package In-Reply-To: <200207121802.g6CI2j113182@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > This just shows that having the current directory on sys.path > (especially at the front) causes problems. I thought it showed I shouldn't be so careless when synching machines, but I'll take an excuse to blame Python instead . Still, it's something that would not have happened had I needed to prefix the import of the standard textwrap with a "standard" name -- or of my private textwrap with a "non-standard" name. Putting the current directory in sys.path is just too useful to give up. I suspect that putting it specifically at the front is only "a feature" for Python library developers, though, and "a bug" for others -- end users stumble into this a lot by unhappy accident, like when creating a random.py to hold their initial experiments with Python's random-number facilities. From xscottg@yahoo.com Fri Jul 12 19:17:06 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Fri, 12 Jul 2002 11:17:06 -0700 (PDT) Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: <200207121536.g6CFaqr09850@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020712181706.63704.qmail@web40101.mail.yahoo.com> --- Guido van Rossum wrote: > > I'm a little surprised. Raymond Hettinger checked in a change that > makes all slices of buffer objects return strings. 
His comments on SF > bug 546434 say that only one person replied and that they agreed > returning strings was the better solution. But that's not how I read > the only response to his query that I see in python-dev, from Scott > Gilbert: > After the message you're referring to, Raymond Hettinger and I corresponded a little bit off of the list. I think these are probably the most relevant snippets: --- Raymond Hettinger: > > For the problem at hand, do you recommend returning buffer objects or > strings? > --- To which I responded: > > I wish I could give you an easy A or B answer. What I would like to see > is for the PyBufferObject to be nothing more than a BufferInspector. As > such, it would make more sense to have slices return another BufferObject > that is inspecting the same data. In other words, "View Behavior". In > this context, repetition of buffer objects doesn't make any sense and > should raise an exception. > > However, that's going to break somebody's code somewhere, so I can't see > Guido allowing that for a problem he doesn't really care about. I think > you're stuck returning strings until Python 3000. So the best bet would > be to have it just always return a string... > Forgive the bit about "Guido not caring about it", it seemed that way to me at the time. Silence comes off as disinterest or annoyance. So my suggestion was that since taking away the implicit promotion of buffer slices/repetitions/concatenations to strings was going to break someone's code, that just can't be done. If we want sane behavior, then any slice, be it buf[1:2] or buf[:], ought to at least return the same type of object. Those two in conjunction mean they ought to always returns strings. --- Raymond Hettinger also wrote: > > Thanks for your input, this topic doesn't seem to interest anyone, > --- To which I responded: > > I think there are others that are interested, but it's pretty tough to get > anything done without breaking backwards compatibility. 
Mark Hammond > indicated he wants a usable buffer object for some asynchronous I/O > stuff, and the Numarray stuff addresses the shortcomings of the buffer > object by reinventing yet another wheel. > > I've said this before, but I think the problem basically boils down to the > following - once you realize what the limitations of the buffer object > are, you realize that even if you fixed it, it isn't useful for what you > wanted to use it for. > --- Back to Guido van Rossum: > > I read this as a recommendation to forget about returning strings. Am > I mistaken? > Only if breaking backwards compatibility is an option. I'd like to see that happen, but I think that would take a pronouncement from someone in authority. --- More of Guido van Rossum: > > Also, I wish you'd submitted that PEP. IMO the reason that nobody > likes this topic is that there is much confusion about why we have > buffer objects in the first place. Any attempt at clarifying this > (e.g. proposing separate byte arrays and buffer inspectors) would be > welcome. > I'm glad to hear this. I'll submit the PEP sometime in the next week. Cheers, -Scott Gilbert __________________________________________________ Do You Yahoo!? Sign up for SBC Yahoo! Dial - First Month Free http://sbc.yahoo.com From oren-py-d@hishome.net Fri Jul 12 19:21:05 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 12 Jul 2002 21:21:05 +0300 Subject: [Python-Dev] Xrange and Slices In-Reply-To: <200207121709.g6CH9Wb12714@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jul 12, 2002 at 01:09:32PM -0400 References: <000d01c21cdb$eb03b720$91d8accf@othello> <20020630173903.GA37045@hishome.net> <200207121709.g6CH9Wb12714@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020712212105.A8666@hishome.net> On Fri, Jul 12, 2002 at 01:09:32PM -0400, Guido van Rossum wrote: > [Raymond Hettinger] > > > Merge the code for xrange() into slice(). 
>
> [Oren Tirosh]
> > There's a patch pending for this: www.python.org/sf/575515
>
> I've rejected this. It's better to let these two be different, so
> that it's clear what the intended use is.

When I was going through the sources of sliceobject.c I found the function
PySlice_GetIndicesEx. It performs the magic of trimming a slice into the
range of indices of a sequence, including negative indices and intervals
with None as start or stop value. A comment in this function says:

    /* this is harder to get right than you might think */

Wouldn't it be a good idea to expose this nontrivial functionality to
Python code as a method of slice objects? The method would take an integer
argument (length) and return an xrange object. It should make it much
easier to implement user types that support extended slicing:

    def __getitem__(self, index):
        if isinstance(index, slice):
            return [get_item_at(i) for i in index.trim(len(self))]
        else:
            return get_item_at(index)

Suggestions for a better name than trim?

Any reason why this API should stay exposed only to C and not to Python?

    Oren

From guido@python.org Fri Jul 12 19:34:34 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 14:34:34 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: Your message of "Fri, 12 Jul 2002 11:17:06 PDT."
    <20020712181706.63704.qmail@web40101.mail.yahoo.com>
References: <20020712181706.63704.qmail@web40101.mail.yahoo.com>
Message-ID: <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net>

It seems we're still in the same boat.

It would be saner to change buffer slices to return buffer objects,
except for backward compatibility. I was hoping to hear from someone
who uses buffer objects and knows that this would break his code.
Scott apparently doesn't have this problem with his own code, so his
opinion doesn't help. :-(

Raymond's change still breaks compatibility, though, for slices
without begin and end points.
So we have a contradiction: out of fear of breaking compatibility, we
make a change that breaks compatibility.

Maybe we should do the same with the buffer object as we did with
xrange(), and plan to remove all functionality that we aren't sure is
useful? In 2.3, we would have to maintain compatibility but we could
warn about features that will go away; in 2.4, we could remove
unwanted features.

Maybe the name 'buffer' suggests false expectations? It's not a
buffer, it's an alias for a memory area.

Maybe we should do something stronger, and deprecate the buffer type
altogether.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From mclay@nist.gov Fri Jul 12 19:31:30 2002
From: mclay@nist.gov (Michael McLay)
Date: Fri, 12 Jul 2002 14:31:30 -0400
Subject: [Python-Dev] python package
In-Reply-To: <200207121754.g6CHsKQ13108@pcp02138704pcs.reston01.va.comcast.net>
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk>
    <3D2F161D.40005@lemburg.com>
    <200207121754.g6CHsKQ13108@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207121431.30722.mclay@nist.gov>

On Friday 12 July 2002 01:54 pm, Guido van Rossum wrote:

> If you still think a solution is desired, you could start by proposing
> a new standard package hierarchy. Then new standard modules could be
> placed in that new hierarchy rather than at the top level.
>
> I'm rejecting the proposal of a single top-level package named "python".

I've read the entire thread and still do not understand why you are
suggesting the new standard package hierarchy should be named "new". The
contents will eventually grow old and they will still be in something
called "new". Why not use a name like "std", "misc", "core", or "sph" for
the top of the standard package hierarchy? It doesn't matter what the name
will be, but I hope it will be something that isn't confusing.

From guido@python.org Fri Jul 12 19:38:31 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 14:38:31 -0400
Subject: [Python-Dev] Python version of PySlice_GetIndicesEx
In-Reply-To: Your message of "Fri, 12 Jul 2002 21:21:05 +0300."
    <20020712212105.A8666@hishome.net>
References: <000d01c21cdb$eb03b720$91d8accf@othello>
    <20020630173903.GA37045@hishome.net>
    <200207121709.g6CH9Wb12714@pcp02138704pcs.reston01.va.comcast.net>
    <20020712212105.A8666@hishome.net>
Message-ID: <200207121838.g6CIcV813352@pcp02138704pcs.reston01.va.comcast.net>

(I changed the subject)

> When I was going through the sources of sliceobject.c I found the function
> PySlice_GetIndicesEx. It performs the magic of trimming a slice into the
> range of indices of a sequence, including negative indices and intervals
> with None as start or stop value. A comment in this function says:
>
> /* this is harder to get right than you might think */
>
> Wouldn't it be a good idea to expose this nontrivial functionality to
> Python code as a method of slice objects?

I dunno. It seems that most code that actually uses slices is written
in C anyway.

> The method would take an integer argument (length) and return an
> xrange object.

Why an xrange object? That's not inspectable. *If* we were to do this
(which I doubt) it should return a tuple of three ints.

> It should make it much
> easier to implement user types that support extended slicing:
>
> def __getitem__(self, index):
>     if isinstance(index, slice):
>         return [get_item_at(i) for i in index.trim(len(self))]
>     else:
>         return get_item_at(index)
>
> Suggestions for a better name than trim?

getindices()

> Any reason why this API should stay exposed only to C and not to
> Python?

Have you got a real use case? I'm a bit weary of hypothetical use
cases (that's what got us xrange repetition in the first place).
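
For readers following the slicing discussion, the idea can be sketched in a
few lines of Python. This is only an illustration and not code from the
thread: the Squares class and its _get_item_at helper are hypothetical, and
the sketch assumes a Python version in which slice objects offer an
indices(length) method returning a (start, stop, step) tuple, as later
releases do.

```python
class Squares:
    """Toy read-only sequence whose i-th item is i*i (hypothetical example)."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def _get_item_at(self, i):
        # Normalise a single (possibly negative) index, then bounds-check it.
        if i < 0:
            i += self._n
        if not 0 <= i < self._n:
            raise IndexError(i)
        return i * i

    def __getitem__(self, index):
        if isinstance(index, slice):
            # indices() clips start/stop to the sequence length and fills in
            # None values, so the range below never leaves the valid bounds.
            start, stop, step = index.indices(len(self))
            return [self._get_item_at(i) for i in range(start, stop, step)]
        return self._get_item_at(index)
```

With this in place, Squares(10)[2:5] and Squares(10)[::-3] both work without
any hand-written handling of negative or missing bounds. Returning a plain
tuple of three ints keeps the result inspectable, which is the property asked
for in the reply.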
--Guido van Rossum (home page: http://www.python.org/~guido/) From mclay@nist.gov Fri Jul 12 19:34:26 2002 From: mclay@nist.gov (Michael McLay) Date: Fri, 12 Jul 2002 14:34:26 -0400 Subject: [Python-Dev] String substitution: compile-time versus runtime In-Reply-To: <200207121623.g6CGNbf12384@pcp02138704pcs.reston01.va.comcast.net> References: <3D1297ED.3990C30F@prescod.net> <200207121623.g6CGNbf12384@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207121434.26976.mclay@nist.gov> On Friday 12 July 2002 12:23 pm, Guido van Rossum wrote: > (and perhaps about being able to use words in their own language as > identifiers). Beware of possible lookalike characters. I recently learned that it is possible to register for domain name with Unicode characters and since there are indistinguishable character symbols on different code pages (for instance, the Cyrillic 'o' is indistinguishable from the English 'o') this has created an interesting opportunity for domain name exploits. It probably isn't dangerous in the Python source code, but limiting the character set of identifiers to a small number of characters seems prudent. From guido@python.org Fri Jul 12 19:42:26 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 14:42:26 -0400 Subject: [Python-Dev] python package In-Reply-To: Your message of "Fri, 12 Jul 2002 14:31:30 EDT." <200207121431.30722.mclay@nist.gov> References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <3D2F161D.40005@lemburg.com> <200207121754.g6CHsKQ13108@pcp02138704pcs.reston01.va.comcast.net> <200207121431.30722.mclay@nist.gov> Message-ID: <200207121842.g6CIgQo13399@pcp02138704pcs.reston01.va.comcast.net> [me] > > If you still think a solution is desired, you could start by > > proposing a new standard package hierarchy. Then new standard > > modules could be placed in that new hierarchy rather than at the > > top level. > > > > I'm rejecting the proposal of a single top-level package named "python". 
[Michael]
> I've read the entire thread and still do not understand why you are
> suggesting the new standard package hierarchy should be named
> "new". The contents will eventually grow old and they will
> still be in something called "new". Why not use a name like "std",
> "misc", "core", or "sph" for the top of the standard package
> hierarchy? It doesn't matter what the name will be, but I hope it
> will be something that isn't confusing.

Uh? Who is proposing to name it "new"? Not me! Maybe you should read
the entire thread again? :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From thomas.heller@ion-tof.com Fri Jul 12 19:44:31 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 12 Jul 2002 20:44:31 +0200
Subject: [Python-Dev] Fw: Behavior of buffer()
References: <20020712181706.63704.qmail@web40101.mail.yahoo.com>
    <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <0b3901c229d4$2924d340$e000a8c0@thomasnotebook>

I'm not too interested in this anymore (I _was_ a year ago, IIRC).
I have given up using the buffer object myself, I've written
my own (maybe in the same way as others).

> Maybe the name 'buffer' suggests false expectations? It's not a
> buffer, it's an alias for a memory area.
>
Hm. The name could be right (and I could give up my own memory
object) if there were a way to create a buffer owning its
own memory.

> Maybe we should do something stronger, and deprecate the buffer type
> altogether.

Or this.

Thomas

From guido@python.org Fri Jul 12 19:51:06 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 14:51:06 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: Your message of "Fri, 12 Jul 2002 20:44:31 +0200."
<0b3901c229d4$2924d340$e000a8c0@thomasnotebook> References: <20020712181706.63704.qmail@web40101.mail.yahoo.com> <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net> <0b3901c229d4$2924d340$e000a8c0@thomasnotebook> Message-ID: <200207121851.g6CIp6i13450@pcp02138704pcs.reston01.va.comcast.net> > I'm not too interested in this anymore (I _was_ a year ago, IIRC). > I have given up using the buffer object myself, I've written > my own (maybe in the same way as others). Right. > > Maybe the name 'buffer' suggests false expectations? It's not a > > buffer, it's an alias for a memory area. > > > Hm. The name could be right (and I cold give up my own memory > object) if there were a way to create a buffer owning it's > own memory. Maybe your memory object could become a standard Python extension. Extra points if it works well with the memmap and the array modules. --Guido van Rossum (home page: http://www.python.org/~guido/) From David Abrahams" Message-ID: <187601c229d4$fd7a2000$6601a8c0@boostconsulting.com> I'm trying to work around it for 2.2. I'll let you know if there are insurmountable problems. -Dave ----- Original Message ----- From: "Tim Peters" To: "David Abrahams" Cc: Sent: Friday, July 12, 2002 2:01 PM Subject: RE: [Python-Dev] long long configuration > [David Abrahams] > >>> Just addressing the <0.1 wink> you left out: even if I get the include > >>> order "right", my users are still screwed if they don't do it the > >>> same way. > > >> Give them a pythonboost.h instead that contains the includes in > >> the right order, or make your boost.h smarter, or ... > > > You fixed the LONGLONG_MAX stuff already, so I don't think > > there's anything to discuss here, is there? > > I thought you thought there was, else there was no apparent reason for the > "Just addressing" message I replied to. > > BTW, do you need this in 2.2.2 too, or is 2.3 good enough? I didn't change > anything on the 2.2 branch. 
> From thomas.heller@ion-tof.com Fri Jul 12 20:03:58 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 12 Jul 2002 21:03:58 +0200 Subject: [Python-Dev] Fw: Behavior of buffer() References: <20020712181706.63704.qmail@web40101.mail.yahoo.com> <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net> <0b3901c229d4$2924d340$e000a8c0@thomasnotebook> <200207121851.g6CIp6i13450@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <0b5b01c229d6$e0df5cb0$e000a8c0@thomasnotebook> > > > Maybe the name 'buffer' suggests false expectations? It's not a > > > buffer, it's an alias for a memory area. > > > > > Hm. The name could be right (and I cold give up my own memory > > object) if there were a way to create a buffer owning it's > > own memory. > > Maybe your memory object could become a standard Python extension. > Extra points if it works well with the memmap and the array modules. > What do you mean by 'works well with the mmap and array modules'? Thomas From guido@python.org Fri Jul 12 20:07:06 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 15:07:06 -0400 Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting In-Reply-To: Your message of "Fri, 12 Jul 2002 13:24:21 EDT." <200207121324.21609.mclay@nist.gov> References: <20020623181630.GN25927@laranja.org> <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net> <200207121324.21609.mclay@nist.gov> Message-ID: <200207121907.g6CJ76N13511@pcp02138704pcs.reston01.va.comcast.net> > The syntax rules of PEP 292 are likely to cause confusion for > newbies who have never used sh or perl. They will ask why Python > have two syntaxes for doing string substitutions? Why not always > spell the substitution string with ${identifier} or %(identifier)? > The third rule of PEP292 in particular look like a patch to fix a > kludge when an unanticipated exception was discovered. > > 3. ${identifier} is equivalent to $identifier. 
It is required for > when valid identifier characters follow the placeholder but are > not part of the placeholder, e.g. "${noun}ification". > > > On Sunday 23 June 2002 02:16 pm, Lalo Martins wrote: > > > More, I'm completely opposed to "<> is <> years > > > old" because it's still cryptic and invasive. This should > > > instead read similar to "<> is <> years > > > old".sub({'name': x.name, 'age': x.age.format(None, 0)}) > > > > > Guido, can you please, for our enlightenment, tell us what are the > > > reasons you feel %(foo)s was a mistake? > > > > Because of the trailing 's'. It's very easy to leave it out by > > mistake, and because the definition of printf formats skips over > > spaces (don't ask me why), the first character of the following word > > is used as the type indicator. > > It's easy to leave it out by mistake, but the error is almost always > immediately obvious. In the interest of keeping the language as > simple as possible, I hope no changes are made. If a method based > .sub() capability is to be added, why not reuse the %(identifier) > syntax instead of introducing $ and ${} syntax? The .sub() string > method would use the %(identifier) syntax without the 's' to spell > the new substitution format. Instead of the proposed: > > '$name was born in ${country}'.sub() > > the phrase would be spelled: > > '%(name) was born in %(country)'.sub() > > This approach would introduce one new string method with a small > variation on the existing '%' substitution syntax. An argument can be made that since this works rather different than the current % operator, it's better to avoid confusion by using a different character. One can also argue that many Perl and shell programmers are migrating to Python, for whom this would be helpful -- for others, $ or % makes little difference (DOS batch file programmers aren't that common, most Windows users never get to this). But the exact syntax to use in the template is a relatively trivial detail IMO. 
Whether to pick `name`, <>, $name, $(name), ${name}, %name, %{name}, or %(name), is a choice we can make later. Ditto about whether to allow full expressions, dotted names only, or simple names only, and whether to allow leaving off the brackets for simple names (or even for dotted names, as in PEP 215). User testing would be good. User testing has already shown that the current %(name)s notation causes too many mistakes, because of the odd trailing 's'. These errors may be immediately obvious when you run the code, but constructs that are easily mistyped should still be avoided if possible. Also, I believe that the error has actually been puzzling for many people (e.g. sometimes no error is raised but on close inspection a few characters appear to be omitted from the output). The real issues are IMO: - Compile-time vs. run-time parsing. I've become convinced that the compiler should do the parsing: this is the only way to make access to variables in nested scopes work, avoids security issues, and makes it easier to diagnose errors (e.g. in PyChecker). - How to support translation. Here the template must be replaced at run-time, but it is still desirable that the collection of available names is known at compile time (to avoid the security issues). - Optional formatting specifiers. I agree with Lalo that these should not be part of the interpolation syntax but need to be dealt with at a different level. I think these are only relevant for numeric data. Funny, there's still a (now-deprecated) module fpformat.py that supports arbitrary floating point formatting, and string.zfill() supports a bit of integer formatting. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 12 20:08:54 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 15:08:54 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: Your message of "Fri, 12 Jul 2002 21:03:58 +0200." 
<0b5b01c229d6$e0df5cb0$e000a8c0@thomasnotebook> References: <20020712181706.63704.qmail@web40101.mail.yahoo.com> <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net> <0b3901c229d4$2924d340$e000a8c0@thomasnotebook> <200207121851.g6CIp6i13450@pcp02138704pcs.reston01.va.comcast.net> <0b5b01c229d6$e0df5cb0$e000a8c0@thomasnotebook> Message-ID: <200207121908.g6CJ8sx13530@pcp02138704pcs.reston01.va.comcast.net> > What do you mean by 'works well with the mmap and array modules'? I'm not sure, since I don't know what your memory object does (and frankly, I don't really understand what the mmap module does either :-). I was just mentioning these because they are other modules that have been used and/or proposed for buffering needs. --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Fri Jul 12 20:19:53 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 12 Jul 2002 21:19:53 +0200 Subject: [Python-Dev] Fw: Behavior of buffer() References: <20020712181706.63704.qmail@web40101.mail.yahoo.com> <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net> <0b3901c229d4$2924d340$e000a8c0@thomasnotebook> <200207121851.g6CIp6i13450@pcp02138704pcs.reston01.va.comcast.net> <0b5b01c229d6$e0df5cb0$e000a8c0@thomasnotebook> <200207121908.g6CJ8sx13530@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <0b8501c229d9$19a34e10$e000a8c0@thomasnotebook> > > What do you mean by 'works well with the mmap and array modules'? > > I'm not sure, since I don't know what your memory object does (and > frankly, I don't really understand what the mmap module does either > :-). > "Memory-mapped file objects behave like both strings and like file objects. Unlike normal string objects, however, these are mutable." More in the Python manual... Optionally they can be backed up by files in the file system, and optionally they can be shared between processes. At least that's what they are under Windows. 
> I was just mentioning these because they are other modules that have > been used and/or proposed for buffering needs. Now that you mention this, mmap could be used as a 'memory' object, although it would have to be converted into a new style class. My own memory object currently supports a private protocol which dosn't make sense for core Python. But that can be fixed. Thomas From oren-py-d@hishome.net Fri Jul 12 20:23:26 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 12 Jul 2002 22:23:26 +0300 Subject: [Python-Dev] Re: Python version of PySlice_GetIndicesEx In-Reply-To: <200207121838.g6CIcV813352@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jul 12, 2002 at 02:38:31PM -0400 References: <000d01c21cdb$eb03b720$91d8accf@othello> <20020630173903.GA37045@hishome.net> <200207121709.g6CH9Wb12714@pcp02138704pcs.reston01.va.comcast.net> <20020712212105.A8666@hishome.net> <200207121838.g6CIcV813352@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020712222326.A10011@hishome.net> On Fri, Jul 12, 2002 at 02:38:31PM -0400, Guido van Rossum wrote: > Have you got a real use case? I'm a bit weary of hypothetical use > cases (that's what got us xrange repetition in the first place). Umm.. implementing slicable user types? I've written some indexable objects with a __getitem__ magic method. Making them fully slicable with extended slicing format almost for free would have been really nice. Yes, you're right. It's just a nice-to-have. I don't care about it that much. 
Oren From aahz@pythoncraft.com Fri Jul 12 20:49:54 2002 From: aahz@pythoncraft.com (Aahz) Date: Fri, 12 Jul 2002 15:49:54 -0400 Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting In-Reply-To: <200207121907.g6CJ76N13511@pcp02138704pcs.reston01.va.comcast.net> References: <20020623181630.GN25927@laranja.org> <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net> <200207121324.21609.mclay@nist.gov> <200207121907.g6CJ76N13511@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020712194954.GA18925@panix.com> On Fri, Jul 12, 2002, Guido van Rossum wrote: > > - Optional formatting specifiers. I agree with Lalo that these should > not be part of the interpolation syntax but need to be dealt with at > a different level. I think these are only relevant for numeric > data. I've used "%20s" * 5 frequently enough in the past to do crude tables. That's not a feature I'd like to lose. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From guido@python.org Fri Jul 12 21:00:21 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 16:00:21 -0400 Subject: [Python-Dev] Minor socket timeout quibble - timeout raises socket.error In-Reply-To: Your message of "Fri, 12 Jul 2002 13:29:17 MDT." <3D2F2E0D.2C4FD92F@3captus.com> References: <15639.52525.481846.601961@12-248-8-148.client.attbi.com> <3D188C5D.D519DD90@3captus.com> <200207121637.g6CGbAE12463@pcp02138704pcs.reston01.va.comcast.net> <3D2F2E0D.2C4FD92F@3captus.com> Message-ID: <200207122000.g6CK0Lw13863@pcp02138704pcs.reston01.va.comcast.net> > > The way I restructured the code it is impossible to distinguish a > > timeout error from other errors; you simply get the "no data > > available" error from the socket operation. This is the same error > > you'd get in non-blocking mode. 
> >
>
> To distinguish a timeout error, the caller can check s->sock_timeout
> when a non-blocking mode error occurred, or just return an error code
> from internal_select() (I guess you must have your reason to have taken
> it out in the first place)

I don't understand your first suggestion. Not all errors mean that
the timeout triggered!

I took it out because it is much less code this way.

> > Before I recomplicate the code so that it can raise a separate error
> > when the select fails, I'd like to understand the use case better.
> > Why would you want to make this distinction? Requeueing the request
> > (as in Skip's example) doesn't make sense IMO: you set the timeout for
> > a reason, and that reason is that you want to give up if it takes too
> > long. If you really intend to retry you're better off disabling the
> > timeout!
> >
>
> How about the following (assume we have socket.setDefaultTimeout()):
>
> import socket
> import urllib
>
> socket.setDefaultTimeout(5.0)
> retry = 0
> url = 'some url'
>
> while retry < 3:
>     try:
>         file = urllib.urlretrieve(url)
>     except socket.TimeoutError:
>         if retry == 2:
>             print "Server too busy, given up!"
>             raise
>         else:
>             print "Server busy, retry!"
>             retry += 1
>     else:
>         break
>
> MS IIS behave strangely to http request. When the server is very busy,
> it will randomly drop some requests without disconnecting the client.
> So the best approach for the client is to timeout and retry. I guess
> that might be the reason why people needed timeoutsocket in the first
> place.

One of the reasons (there are lots of reasons why a connect or receive
attempt may be very slow to time out, or even never time out).

Of course, this still doesn't distinguish between a timeout from
connect() and one from recv().

Have you ever written code like this?
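
For readers following along today, here is a rough sketch of the same retry idea in present-day Python 3. The quoted code relies on names that were only proposals at the time (socket.setDefaultTimeout() and socket.TimeoutError); the sketch instead uses socket.timeout, which current versions of the socket module do provide. The fetch_with_retry helper and its fetch argument are hypothetical and stand in for whatever network call is being retried.

```python
import socket


def fetch_with_retry(fetch, attempts=3):
    """Call fetch() until it succeeds or `attempts` timeouts have occurred.

    `fetch` is a hypothetical zero-argument callable that performs the
    network request and raises socket.timeout when it times out.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except socket.timeout:
            if attempt == attempts:
                # Out of retries: let the caller see the timeout.
                raise
            print("Server busy, retrying (attempt %d timed out)" % attempt)
```

A caller would typically set a default with socket.setdefaulttimeout(5.0) first and then pass in a small function wrapping the actual request. As noted in the thread, this still does not say whether the timeout came from connect() or recv().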
> > If you really want to, you can already distinguish the timeout case, > > because you get an EAGAIN error then (maybe something else on Windows > > -- Bernard, if you have a fix for that, please send it to me). > > I am struggling with the test case for the new socket code. The timeout > test case I've send you works with the old socketmodule.c (attached), > but not with the lastest version (on linux or windows). It's strange, > your new implementation looks much cleaner. No need to attach copies of old versions -- just give me the CVS revision number. :-) > Please bear with me a bit longer for a patch :.( OK. Anyway, I have no time to play with this right now, so I'm glad you aren't giving up just yet. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Fri Jul 12 21:13:11 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 12 Jul 2002 22:13:11 +0200 Subject: [Python-Dev] python package References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net> <3D2D3720.9040100@lemburg.com> <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net> <3D2F1245.1030804@lemburg.com> <200207121736.g6CHact13010@pcp02138704pcs.reston01.va.comcast.net> <3D2F161D.40005@lemburg.com> <200207121754.g6CHsKQ13108@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D2F3857.3010700@lemburg.com> Guido van Rossum wrote: >>With the exception that we have control over the Python core >>code while we don't over third party extensions, so providing >>means to simplify the transition for the standard lib is easier >>than trying to enforce your proposed 'nonstd' package. > > > I think you could get a long way with minor changes along the lines of > making site-packages a package itself. 
This wouldn't work since in that case you'd have the problem of having to fix class names in e.g. pickles for objects which you don't know anything about. We do know about objects in the Python standard lib, so we could take care to have mechanisms like pickle deal with them properly. >>>Then please think about a proper solution rather than proposing >>>something whose only virtue seems to be that you can implement a poor >>>approximation of it in two lines. >> >>Just testing waters here... there's no point in trying to >>find a solution to something which is not regarded as problem >>anyway. > > > You started by claiming that there's a problem: expansion of the > stdlib could conflict with 3rd party module/package names. > > I don't regard it as a problem that's so bad that we need to make big > changes to solve it. I believe that the more Python grows (not only the core, but the complete set of available modules and packages in the Python universe), the less likely we are going to hit a problem. > If you still think a solution is desired, you could start by proposing > a new standard package hierarchy. Then new standard modules could be > placed in that new hierarchy rather than at the top level. > > I'm rejecting the proposal of a single top-level package named "python". You've written that before, but you still haven't given any explanation of why a single package would be worse than a multi-level hierarchy of modules (e.g. grouped by application space). I think that simply moving to one package would cause less breakage and make the whole transition process much easier than having to tweak code into using some complicated multi-package structure. FWIW, I've been through all this with the mx packages and using a single new package caused the least amount of work. Even better: it turned out to be easy to provide backwards compatibility code so that applications still using the old layout continue to run, but start using the new structure in their pickles. 
No need to get heated, though. I just thought that it would be a good time to start thinking about this option again. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim.one@comcast.net Fri Jul 12 21:11:29 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 16:11:29 -0400 Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting In-Reply-To: <20020712194954.GA18925@panix.com> Message-ID: [Aahz] > I've used "%20s" * 5 frequently enough in the past to do crude tables. > That's not a feature I'd like to lose. So has Guido -- he'll remember that before it's too late . Ditto "-" to switch string justification. Prediction: the $(name:optional_format) notation will win in the end. From tim.one@comcast.net Fri Jul 12 21:15:42 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 16:15:42 -0400 Subject: [Python-Dev] Re: Python version of PySlice_GetIndicesEx In-Reply-To: <20020712222326.A10011@hishome.net> Message-ID: Just to be helpfully irritating, I'll note that Zope's C implementation of slice index normalization for BTreeItems objects was off in nearly every way possible, until a few weeks ago. It really is difficult to get this right. From gmcm@hypernet.com Fri Jul 12 21:26:29 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 12 Jul 2002 16:26:29 -0400 Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting In-Reply-To: References: <20020712194954.GA18925@panix.com> Message-ID: <3D2F0335.25365.5FE8BAB@localhost> On 12 Jul 2002 at 16:11, Tim Peters wrote: [Aahz] > I've used "%20s" * 5 frequently enough in the past to > do crude tables. That's not a feature I'd like to > lose. 
[Tim] > So has Guido -- he'll remember that before it's too > late . Ditto "-" to switch string > justification. Prediction: the > $(name:optional_format) notation will win in the > end. Good. I use both a just enough that I'd really miss them, but not frequently enough to remember exactly what each modifier does what with each data type. -- Gordon http://www.mcmillan-inc.com/ From mal@lemburg.com Fri Jul 12 21:31:41 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 12 Jul 2002 22:31:41 +0200 Subject: [Python-Dev] Alternative implementation of interning, take 2 References: Message-ID: <3D2F3CAD.801@lemburg.com> Tim Peters wrote: > [M.-A. Lemburg] > >>If you could spell out what exactly you mean by "indirect interning" >>that would help. > > > Actually, I don't think it would -- the issue is whether the possibility for > the ob_sinterned member of a PyStringObject not to *be* the string object > itself ever saves time in your extensions, and it's darned hard to guess > that. If you apply the attached patch to current CVS, though, it will tell > you whenever your code benefits from it. Cool, I'll try that... hmm, I'll have to backport it to Python 2.1.3 though ;-) > AFAICT, there are only 3 routines where it *might* save cycles (but note > that checking for the possibility costs cycles whether or not it pays; it's > a net loss when it doesn't pay): > > + PyDict_SetItem: I believe this is the only real possibility for gain. If > it ever helps you here, the patch arranges to print > > ii paid on a setitem Scanning the source code: I hardly use PyDict_SetItem(); most usages are PyDict_SetItemString(). > to stderr whenever it does pay. I haven't yet seen that get printed. > > + PyString_InternInPlace: Whenever it pays here, the patch spits > > ii paid on an InternInPlace I do use this API, but only in mxURL and mxXMLTools (which is closed source and works with the evil code below I mentioned ;-). > That triggers 6 times in the Python test suite, all from test_descr. 
Since > this one is an optimization *of* setting ob_sinterned, it's a > snake-eating-its-tail kind of thing -- it's of no real benefit unless > ob_sintered pays off somewhere else too. > > + string_hash: The patch spits > > ii paid on a hash??? > > The question marks are there because I don't see how it's possible for this > to get printed. > > >>What I do need and rely on is the fact that the >>Python compiler interns all constant strings and identifiers in >>Python programs. This makes switching like so: > > > Ya, while that's evil, it's not affected by indirect interning. Cool :-) If Guido should ever decide to rip this out, I can always switch to a different technique, e.g. use my own interning token type. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim.one@comcast.net Fri Jul 12 21:33:51 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 16:33:51 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net> Message-ID: {Guido, to Scott Gilbert] > It seems we're still in the same boat. It would be saner to change > buffer slices to return buffer objects, except for backward > compatibility. I was hoping to hear from someone who uses buffer > objects and knows that this would break his code. Raymond did a survey on c.l.py, asking anyone who used buffer objects at *all* to speak up. IIRC, he got no replies. On Python-Dev, apart from musing whether they might conceivably use them, the only person who eventually said they actually used them was Marc-Andre. Fredrik pressed for details, but we haven't seen any concrete use cases. 
In the absence of the latter, it's impossible to guess what would be backward compatible for MAL's purposes. > ... > Maybe we should do something stronger, and deprecate the buffer type > altogether. I told everyone you forgot the essay you wrote suggesting this the last time this rose above everyone's pain threshold. It's a comfort to know that my channeling powers have not diminished with exponentially advancing age : http://mail.python.org/pipermail/python-dev/2000-October/009974.html From guido@python.org Fri Jul 12 21:37:41 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 16:37:41 -0400 Subject: [Python-Dev] python package In-Reply-To: Your message of "Fri, 12 Jul 2002 22:13:11 +0200." <3D2F3857.3010700@lemburg.com> References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net> <3D2D3720.9040100@lemburg.com> <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net> <3D2F1245.1030804@lemburg.com> <200207121736.g6CHact13010@pcp02138704pcs.reston01.va.comcast.net> <3D2F161D.40005@lemburg.com> <200207121754.g6CHsKQ13108@pcp02138704pcs.reston01.va.comcast.net> <3D2F3857.3010700@lemburg.com> Message-ID: <200207122037.g6CKbf514140@pcp02138704pcs.reston01.va.comcast.net> > > I think you could get a long way with minor changes along the lines of > > making site-packages a package itself. > > This wouldn't work since in that case you'd have the problem > of having to fix class names in e.g. pickles for objects > which you don't know anything about. We do know about objects > in the Python standard lib, so we could take care to have mechanisms > like pickle deal with them properly. IOW you're suggesting we do a near-infinite amount of work to the core just so that others can be sloppy in their choice of names for their modules. Bah. 
> I believe that the more Python grows (not only the core, > but the complete set of available modules and packages in > the Python universe), the less likely we are going to > hit a problem. I would say, OK, so it will go away by itself, but I guess you made a typo there, and really meant "the more likely...". :-) But making the core go away doesn't reduce the problem enough: the more likely problem is two 3rd parties unaware of each other each picking the same name. > > I'm rejecting the proposal of a single top-level package named "python". > > You've written that before, but you still haven't given any > explanation of why a single package would be worse than a > multi-level hierarchy of modules (e.g. grouped by application > space). Because a single package doesn't have any other benefits besides getting out of the way from 3rd party developers. At least a proper hierarchy would have the other benefits of grouping. (But better make it a shallow hierarchy! remember "Flat is better than nested.") > I think that simply moving to one package would cause less > breakage and make the whole transition process much easier > than having to tweak code into using some complicated > multi-package structure. Given that you now want us to add special counter-measure to pickle, I doubt that very much. > FWIW, I've been through all this with the mx packages > and using a single new package caused the least amount > of work. Even better: it turned out to be easy to provide > backwards compatibility code so that applications still > using the old layout continue to run, but start using the > new structure in their pickles. So it's no big deal for 3rd party developers to do what they should do to deal with this problem. Good to hear. Given that when we change the standard library, *every* Python user (and developer) is affected, I prefer the status quo. > No need to get heated, though. I just thought that it would > be a good time to start thinking about this option again. 
And this would be a good time to end this thread. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 12 21:39:12 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 16:39:12 -0400 Subject: [Python-Dev] Alternative implementation of interning, take 2 In-Reply-To: Your message of "Fri, 12 Jul 2002 22:31:41 +0200." <3D2F3CAD.801@lemburg.com> References: <3D2F3CAD.801@lemburg.com> Message-ID: <200207122039.g6CKdD314156@pcp02138704pcs.reston01.va.comcast.net> > If Guido should ever decide to rip this out, I can always switch > to a different technique, e.g. use my own interning token type. Why wait? Rip it out now! --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 12 21:41:29 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 16:41:29 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: Your message of "Fri, 12 Jul 2002 16:33:51 EDT." References: Message-ID: <200207122041.g6CKfTt14197@pcp02138704pcs.reston01.va.comcast.net> > Raymond did a survey on c.l.py, asking anyone who used buffer objects at > *all* to speak up. IIRC, he got no replies. On Python-Dev, apart from > musing whether they might conceivably use them, the only person who > eventually said they actually used them was Marc-Andre. Fredrik pressed for > details, but we haven't seen any concrete use cases. In the absence of the > latter, it's impossible to guess what would be backward compatible for MAL's > purposes. > > > ... > > Maybe we should do something stronger, and deprecate the buffer type > > altogether. > > I told everyone you forgot the essay you wrote suggesting this the last time > this rose above everyone's pain threshold. It's a comfort to know that my > channeling powers have not diminished with exponentially advancing age > : > > http://mail.python.org/pipermail/python-dev/2000-October/009974.html But at least I didn't change my mind. 
:-) So let's deprecate buffer(). I also suggest to roll back Raymond's changes to make slices more consistent -- there's no point in changing something that's only kept for backwards compatibility reasons. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Fri Jul 12 21:39:23 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 12 Jul 2002 22:39:23 +0200 Subject: [Python-Dev] Fw: Behavior of buffer() References: <20020623222209.62675.qmail@web40105.mail.yahoo.com> <200207121536.g6CFaqr09850@pcp02138704pcs.reston01.va.comcast.net> <3D2F0AC1.508@lemburg.com> <200207121724.g6CHOVt12881@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D2F3E7B.5030101@lemburg.com> Guido van Rossum wrote: >>Guido van Rossum wrote: >> >>>I'm a little surprised. Raymond Hettinger checked in a change that >>>makes all slices of buffer objects return strings. His comments on SF >>>bug 546434 say that only one person replied and that they agreed >>>returning strings was the better solution. But that's not how I read >>>the only response to his query that I see in python-dev, from Scott >>>Gilbert: >> >>Interesting. I must have skipped that message. > > > You blink, and you find that the world has changed. Indeed :-) >>IMHO, all slices of buffer object should return buffer objects, >>but since all Python releases return strings, I guess this is too >>late to change. > > > That was my preference too, but Raymond disagreed and somehow tried to > find support for his position :-). > > Since buffer objects (of course :-) support the C-level buffer > protocol, they can still be used in most places where strings are > needed. But it would be incompatible. But so is Raymond's solution > (because it changes buffer()[:] to also return a string). > >>Note that the only case where a buffer object >>is returned in Python 2.x (x < 3) is if you write >>buffer()[:], i.e. you want a copy of the buffer object. > > What does a copy of a buffer object buy you? Nothing... 
since you only get a new reference, not an independent copy. > It's not too late to revert Raymond's changes. Why not try the buffer slice returns buffer logic for a few alphas, then betas, and then if noone complains the final release ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Fri Jul 12 21:45:39 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 12 Jul 2002 16:45:39 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: Your message of "Fri, 12 Jul 2002 22:39:23 +0200." <3D2F3E7B.5030101@lemburg.com> References: <20020623222209.62675.qmail@web40105.mail.yahoo.com> <200207121536.g6CFaqr09850@pcp02138704pcs.reston01.va.comcast.net> <3D2F0AC1.508@lemburg.com> <200207121724.g6CHOVt12881@pcp02138704pcs.reston01.va.comcast.net> <3D2F3E7B.5030101@lemburg.com> Message-ID: <200207122045.g6CKje914283@pcp02138704pcs.reston01.va.comcast.net> > Why not try the buffer slice returns buffer logic for > a few alphas, then betas, and then if noone complains > the final release ? Since nobody cares, we won't get complaints. But it's a waste of time. I'm going to deprecate it. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Fri Jul 12 21:48:09 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 16:48:09 -0400 Subject: [Python-Dev] Alternative implementation of interning, take 2 In-Reply-To: <3D2F3CAD.801@lemburg.com> Message-ID: [M.-A. Lemburg, on my "does ii help at all?" ii patch] > Cool, I'll try that... hmm, I'll have to backport it to Python 2.1.3 > though ;-) Your codebase doesn't run under current CVS? If so, I would have guessed you would have mentioned that before this . 
> Scanning the source code: I hardly use PyDict_SetItem(); most usages
> are PyDict_SetItemString().

That's why you shouldn't try to guess. The latter calls the former, and
the real target here is actually indirect optimization of different ways
to spell setattr. They all end up in PyDict_SetItem; it doesn't matter
whether you call that directly.

>> + PyString_InternInPlace: Whenever it pays here, the patch spits
>>
>>     ii paid on an InternInPlace

> I do use this API, but only in mxURL and mxXMLTools (which is
> closed source and works with the evil code below I mentioned ;-).

As mentioned before, the optimization in this doesn't do you any good
overall unless it triggers in PyDict_SetItem() later. If it doesn't
trigger in the latter, your code will run faster overall if we removed
the optimization from PyString_InternInPlace (although probably not
measurably faster in this routine; a never-pays anti-optimization in
PyDict_SetItem is a much more serious matter).

>> Ya, while that's evil, it's not affected by indirect interning.

> Cool :-)
>
> If Guido should ever decide to rip this out,

He won't, but it's quite likely to either not do you any good, or
actually do you harm, in an alternate implementation of Python (e.g., I
doubt-- but don't know --that Jython bothers with this_).

> I can always switch to a different technique, e.g. use my own interning
> token type.

Or you could call intern() explicitly. That's what I usually do.

    IF_TOKEN, ELSE_TOKEN, ... = map(intern, "if else ...".split())

From mclay@nist.gov Fri Jul 12 21:46:13 2002
From: mclay@nist.gov (Michael McLay)
Date: Fri, 12 Jul 2002 16:46:13 -0400
Subject: [Python-Dev] python package
In-Reply-To: <200207121842.g6CIgQo13399@pcp02138704pcs.reston01.va.comcast.net>
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk>
 <200207121431.30722.mclay@nist.gov>
 <200207121842.g6CIgQo13399@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207121646.13992.mclay@nist.gov>

On Friday 12 July 2002 02:42 pm, Guido van Rossum wrote:
> [me]
...
> Uh? Who is proposing to name it "new"? Not me! Maybe you should
> read the entire thread again? :-)

Ok, I guess I'm just a bit more confused than usual today. I had also
read the following message and made the unfortunate assumption that you
were proposing "new" as the name of a new top level module to contain
all the standard python modules. Oops, I merged the threads in my head.

On Friday 12 July 2002 11:51 am, Guido van Rossum wrote:
> > > If we need a place to name types that don't deserve being builtins,
> > > perhaps new.py is a better place?
> >
> > The new. prefix is natural enough for
> >
> >     m = new.module('name')
> >
> > type but it looks pretty awkward in
> >
> >     if isinstance(obj, new.generator):
> >
> > What's the meaning of 'new' in this context?
>
> Sometimes you ask too many questions. :-)
>
> Let's just say that this is a historically available name. I don't
> expect that isinstance(obj, generator) is a very common question to
> ask, so I don't mind if you have to ask it in a somewhat awkward way.

Now back to the issue of moving all the top level names in the standard
distribution into a "python" namespace. For the remainder of the 2.X
release cycle it is important to not remove the existing names from the
top level namespace.
However, it might be reasonable to move all standard distribution names into a single top level namespace and grandfather the existing top level names into the top level namespace for the remainder of the 2.x series. The existing set of names would be available from either namespace. All new names for the standard distribution would only be placed in the new top level standard package namespace. With this approach all old names would still be accessible to the existing code base as top level names and introducing new names to the standard distribution will not clobber third party modules and packages. For the remainder of 2.X the rules will be messy because some standard names will be accessible from either the top level namespace or from the standard "python" namespace. Then for Python 3.0 the grandfathered names would be removed from the top level namespace. This approach should enable a smoother transition in the documentation and coding practices. The preferred coding style guide, the tutorial, and other documentation would be used to explain the transition plan. The new guidelines would promote the use of the new namespace for all cases, but it would not preclude the use of the older coding style. I"m not keen on the use the name "python" for the top level namespace. Perhaps the name "std" would be more desirable (and shorter to type). From mal@lemburg.com Fri Jul 12 21:41:42 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 12 Jul 2002 22:41:42 +0200 Subject: [Python-Dev] Incompatible changes to xmlrpclib References: <3D240FF2.3060708@lemburg.com> Message-ID: <3D2F3F06.1060800@lemburg.com> Any news on this one ? M.-A. Lemburg wrote: > I noticed yesterday that the xmlrcplib.py version in CVS > is incompatible with the version in Python 2.2: all the > .dump_XXX() interfaces changed and now include a third > argument. 
> > Since the Marshaller can be subclassed, this breaks all > existing application space subclasses extending or changing > the default xmlrpclib behaviour. > > I'd opt for moving back to the previous style of calling the > write method via self.write. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Fri Jul 12 21:54:11 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 12 Jul 2002 22:54:11 +0200 Subject: [Python-Dev] Fw: Behavior of buffer() References: Message-ID: <3D2F41F3.9070805@lemburg.com> Tim Peters wrote: > {Guido, to Scott Gilbert] > >>It seems we're still in the same boat. It would be saner to change >>buffer slices to return buffer objects, except for backward >>compatibility. I was hoping to hear from someone who uses buffer >>objects and knows that this would break his code. > > > Raymond did a survey on c.l.py, asking anyone who used buffer objects at > *all* to speak up. IIRC, he got no replies. On Python-Dev, apart from > musing whether they might conceivably use them, the only person who > eventually said they actually used them was Marc-Andre. Fredrik pressed for > details, but we haven't seen any concrete use cases. In the absence of the > latter, it's impossible to guess what would be backward compatible for MAL's > purposes. For my purposes, the strategy buffer slice returns a buffer would be more appropriate because it would save the buffer type information across the slicing operation... I mean, you don't want to get bananas when you slice an apple in real life either ;-) I use buffers to mean: this is a chunk of binary data. The purpose is to recognize this type of data for pickling via xml-rpc, soap and other rpc mechanisms etc. 
Strings don't provide this information (since they can be a mix of text and binary data). Buffers are compatible enough with most tools working on strings that they represent a good alternative to tag data as being binary while not losing all the nice advantages of strings. The downside is that most of these tools return their results as strings :-( Now it would be nice if at least the type itself would behave in a sane way. >>Maybe we should do something stronger, and deprecate the buffer type >>altogether. > > > I told everyone you forgot the essay you wrote suggesting this the last time > this rose above everyone's pain threshold. It's a comfort to know that my > channeling powers have not diminished with exponentially advancing age > : > > http://mail.python.org/pipermail/python-dev/2000-October/009974.html Oh yeah, that was during the Unicode implementation wars... :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Fri Jul 12 22:04:15 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 12 Jul 2002 23:04:15 +0200 Subject: [Python-Dev] Alternative implementation of interning, take 2 References: Message-ID: <3D2F444F.8030203@lemburg.com> Tim Peters wrote: > [M.-A. Lemburg, on my "does ii help at all?" ii patch] > >>Cool, I'll try that... hmm, I'll have to backport it to Python 2.1.3 >>though ;-) > > > Your codebase doesn't run under current CVS? If so, I would have guessed > you would have mentioned that before this . I don't test against the current CVS -- no time for that. >>Scanning the source code: I hardly use PyDict_SetItem(); most usages >>are PyDict_SetItemString(). > > > That's why you shouldn't try to guess. 
The latter calls the former, and the > real target here is actually indirect optimization of different ways to > spell setattr. They all end up in PyDict_SetItem; it doesn't matter whether > you call that directly. Sure, but SetItemString() does some extra magic: it interns the key for me. >>>+ PyString_InternInPlace: Whenever it pays here, the patch spits >>> >>> ii paid on an InternInPlace >> > >>I do use this API, but only in mxURL and mxXMLTools (which is >>closed source and works with the evil code below I mentioned ;-). > > > As mentioned before, the optimization in this doesn't do you any good > overall unless it triggers in PyDict_SetItem() later. If it doesn't trigger > in the latter, your code will run faster overall if we removed the > optimization from PyString_InternInPlace (although probably not measurably > faster in this routine; a never-pays anti-optimization in PyDict_SetItem is > a much more serious matter). I only use PyString_InternInPlace() on strings which will be used as dict keys or for string compares in tokenizers and parsers. >>>Ya, while that's evil, it's not affected by indirect interning. >> > >>Cool :-) >> >>If Guido should ever decide to rip this out, > > > He won't, but it's quite likely to either not do you any good, or actually > do you harm, in an alternate implementation of Python (e.g., I doubt-- but > don't know --that Jython bothers with this_). Jaja... as soon as PEP 275 is implemented I won't have to worry any more :-) >>I can always switch to a different technique, e.g. use my own interning >>token type. > > > Or you could call intern() explicitly. That's what I usually do. > > IF_TOKEN, ELSE_TOKEN, ... = map(intern, "if else ...". split()) True, but Python's compiler already does this for me. You right, though, I should make this explicit... 
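
As an aside for present-day readers, the explicit-interning idiom discussed above might look roughly like this in Python 3. It assumes sys.intern(), which is where the intern() builtin now lives; the token names and the classify() helper are invented for illustration and are not taken from any of the packages mentioned in the thread.

```python
import sys

# Intern each token once so later comparisons can be done by identity.
IF_TOKEN, ELSE_TOKEN, WHILE_TOKEN = map(sys.intern, "if else while".split())


def classify(word):
    """Return a label for `word`, comparing interned strings by identity."""
    # Interning the input explicitly makes the `is` tests below reliable,
    # instead of depending on what the compiler happens to intern.
    word = sys.intern(word)
    if word is IF_TOKEN:
        return "conditional"
    if word is ELSE_TOKEN:
        return "alternative"
    if word is WHILE_TOKEN:
        return "loop"
    return "other"
```

The point of interning both sides explicitly is that the identity comparison no longer relies on an implementation detail of one particular interpreter.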
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From bernie@3captus.com Fri Jul 12 21:56:11 2002 From: bernie@3captus.com (Bernard Yue) Date: Fri, 12 Jul 2002 14:56:11 -0600 Subject: [Python-Dev] Minor socket timeout quibble - timeout raises socket.error References: <15639.52525.481846.601961@12-248-8-148.client.attbi.com> <3D188C5D.D519DD90@3captus.com> <200207121637.g6CGbAE12463@pcp02138704pcs.reston01.va.comcast.net> <3D2F2E0D.2C4FD92F@3captus.com> <200207122000.g6CK0Lw13863@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D2F426A.E71B489B@3captus.com> Guido van Rossum wrote: > > To distinguish a timeout error, the caller can check s->sock_timeout > > when a non-blocking mode error occured, or just return an error code > > from internal_select() (I guess you must have your reason to taken it > > out in the first place) > > I don't understand your first suggestion. Not all errors mean that > the timeout triggered! > For example, when accept() fail with error code EAGAIN and s->sock_timeout = 5.0, it indicates timeout. Same for connect() fail with EINPROGRESS. Anyway, on second thought, it is messy. > > > > How about the following (assume we have socket.setDefaultTimeout()): > > > > import socket > > import urllib > > > > socket.setDefaultTimeout(5.0) > > retry = 0 > > url = 'some url' > > > > while retry < 3: > > try: > > file = urllib.urlretrieve(url) > > except socket.TimeoutError: > > if retry == 2: > > print "Server too busy, given up!" > > raise > > else: > > print "Server busy, retry!" > > retry += 1 > > else: > > break > > > > MS IIS behave strangely to http request. When the server is very busy, > > it will randomly drop some requests without disconnecting the client. 
> > So the best approach for the client is to timeout and retry. I guess > > that might be the reason why people needed timeoutsocket in the first > > place. > > One of the reasons (there are lots of reasons why a connect or receive > attempt may be very slow to time out, or even never time out). > > Of course, this stll doesn't distinguish between a timeout from > connect() and one from recv(). > I think you are right on the point. Client might not care if the call is timeouted on connect() or recv(). In this case a timeout error comes handy. > Have you ever written code like this? > Yes I did. > > I am struggling with the test case for the new socket code. The timeout > > test case I've send you works with the old socketmodule.c (attached), > > but not with the lastest version (on linux or windows). It's strange, > > your new implementation looks much cleaner. > > No need to attach copies of old versions -- just give me the CVS > revision number. :-) > socketmodule.c version 1.225 socketmodule.h version 1.7 > > Please bear with me a bit longer for a patch :.( > > OK. > > Anyway, I have no time to play with this right now, so I'm glad you > aren't giving up just yet. :-) > It is very painful indeed (Tim was so right). > --Guido van Rossum (home page: http://www.python.org/~guido/) Bernie From gmcm@hypernet.com Fri Jul 12 22:32:12 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 12 Jul 2002 17:32:12 -0400 Subject: [Python-Dev] python package In-Reply-To: <200207121646.13992.mclay@nist.gov> References: <200207121842.g6CIgQo13399@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D2F129C.12231.63AB67B@localhost> On 12 Jul 2002 at 16:46, Michael McLay wrote: > ... it might be reasonable to move all standard > distribution names into a single top level namespace > and grandfather the existing top level names into > the top level namespace for the remainder of > the 2.x series. 
Getting from import urllib and import urllib to return the same (is, not equals) object will require very delicate surgery on some very difficult code. And without it, most non-trivial scripts will break in very mysterious ways. -- Gordon http://www.mcmillan-inc.com/ From tim.one@comcast.net Fri Jul 12 22:33:18 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 17:33:18 -0400 Subject: [Python-Dev] Alternative implementation of interning, take 2 In-Reply-To: <3D2F444F.8030203@lemburg.com> Message-ID: [MAL] > Sure, but SetItemString() does some extra magic: it interns the > key for me. As a directly interned string. Indirect interning is irrelevant to this benefit. Don't argue about this, run the patched code : it will tell you directly whether ii is doing you any good. > ... > I only use PyString_InternInPlace() on strings which will be > used as dict keys or for string compares in tokenizers and > parsers. Again it doesn't really matter when you call it; if the indirect interning optimization is doing you any good, it will be because of stuff Python is doing under the covers. From tim.one@comcast.net Fri Jul 12 22:38:32 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 17:38:32 -0400 Subject: [Python-Dev] python package In-Reply-To: <200207121646.13992.mclay@nist.gov> Message-ID: [Michael McLay] > ... > I had also read the following message and made the unfortunate > assumption that you were proposing "new" as the name of a new top level > module to contain all the standard python modules. Note that "new" is already the name of a top-level module, and has been for years. That other thread was about drawing useless distinctions between the already-existing "new" and "types" modules with respect to where to house new type names that nobody needs <0.9 wink>. 
From fredrik@pythonware.com Fri Jul 12 22:53:16 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 12 Jul 2002 23:53:16 +0200
Subject: [Python-Dev] Alternative implementation of interning, take 2
References:
Message-ID: <052101c229ee$880465a0$0900a8c0@spiff>

tim wrote:

> It would help if you could get Marc-Andre and /F to pronounce on whether
> their code benefits from it -- they're the most prolific extension authors
> we've got.

no problem here, from what I can tell.  we can live with or
without this change.

From barry@zope.com Sat Jul 13 00:22:01 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 12 Jul 2002 19:22:01 -0400
Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting
References: <20020623181630.GN25927@laranja.org> <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net> <200207121324.21609.mclay@nist.gov> <200207121907.g6CJ76N13511@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15663.25753.999787.858627@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum writes:

    GvR> The real issues are IMO:

I've added these to the PEP, thanks.
-Barry

From tim.one@comcast.net Sat Jul 13 02:57:33 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 21:57:33 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: <200207122041.g6CKfTt14197@pcp02138704pcs.reston01.va.comcast.net>
Message-ID:

[Guido]
> But at least I didn't change my mind. :-)

I would not have pointed out your previous position if you had .

> So let's deprecate buffer().  I also suggest to roll back Raymond's
> changes to make slices more consistent -- there's no point in changing
> something that's only kept for backwards compatibility reasons.

I expect Raymond will be agreeable, but he announced he'll be missing in
action for about another month.  If rollback can wait, I prefer that to
electing me to do it just because I replied <0.9 wink>.
From tim.one@comcast.net Sat Jul 13 03:15:19 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 22:15:19 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: <3D2F41F3.9070805@lemburg.com> Message-ID: [Tim] > Fredrik pressed for details, but we haven't seen any concrete use cases. > In the absence of the latter, it's impossible to guess what would be > backward compatible for MAL's purposes. [M.-A. Lemburg] > For my purposes, the strategy buffer slice returns a buffer > would be more appropriate because it would save the buffer type > information across the slicing operation... I mean, you don't > want to get bananas when you slice an apple in real life either ;-) > > I use buffers to mean: this is a chunk of binary data. The purpose > is to recognize this type of data for pickling via xml-rpc, > soap and other rpc mechanisms etc. How do you use buffers? Do you stick to their C API? Do you use the Python-level buffer() function? If the latter, what do you do in Python code with a buffer object after you get one? The only use I've seen made of a buffer object in Python code is as a way to trick the interpreter into crashing (via recycling the memory the buffer object points to). And from where do you get a buffer? There are darned few types in Python that buffer() accepts as an argument. Do your extension types implement tp_as_buffer? I'm blindly casting for a reason why your appreciation of the buffer object seems unique. > Strings don't provide this information (since they can be a mix of > text and binary data). Buffers are compatible enough with most tools > working on strings that they represent a good alternative to tag data > as being binary while not losing all the nice advantages of > strings. The downside is that most of these tools return their > results as strings :-( > > Now it would be nice if at least the type itself would behave in a > sane way. 
Overall, this reinforces the repeated observation that we don't know why the buffer object exists -- it doesn't appear to do what you really want, but you've found some way to get it to do part of what you want, up until the point you actually use it <0.7 wink>. From tim.one@comcast.net Sat Jul 13 03:23:00 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 12 Jul 2002 22:23:00 -0400 Subject: [Python-Dev] Alternative implementation of interning, take 2 In-Reply-To: <052101c229ee$880465a0$0900a8c0@spiff> Message-ID: [Tim, to Oren] > It would help if you could get Marc-Andre and /F to pronounce on whether > their code benefits from it -- they're the most prolific extension > authors we've got. [/F] > no problem here, from what I can tell. we can live with or > without this change. Note that there are (at least) two parts to Oren's agenda: 1. Removing the possibility for indirect interning. 2. Making interned strings mortal, via the usual refcount rules. In context, I was asking only about #1, and I'm sure your reply was meant to include #1. What I remain unclear about is whether you've also got no fear of #2. I'm also wondering whether we somehow broke indirect interning since it was introduced -- so far nobody has found a program or extension module where it even triggers (not counting the 6 instances in the Python test suite in intern-in-place, since no use of the indirect interning was made in those cases). From tim.one@comcast.net Sat Jul 13 08:12:44 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 13 Jul 2002 03:12:44 -0400 Subject: [Python-Dev] Dict constructor In-Reply-To: <200207121717.g6CHHvr12817@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > ... > Let's do a set module instead. There's only one hurdle to take > for a set module, and that's the issue with using mutable sets as > keys. 
Let's just pick one solution and implement it (my favorite > being that sets simply cannot be used as keys, since it's the > simplest, and matches dicts and lists). I want a set module, but if I finish Greg's abandoned work I want sets of sets too. Sets don't have "keys", they're conceptually collections of values, and it would be as odd not to allow sets containing sets as not to allow lists containing lists, or to ban dicts as dict values. Greg needed sets of sets for his work, and I've often faked them too. I'm not going to be paralyzed by that combining mutable sets with sets of sets requires that some uses of set-as-set-element will be expensive, fragile, and/or hard to explain. If you don't want that pain, don't play that game. If you do want sets of sets, though, and aren't willing to live with a purely functional (immutable) set type, it's non-trivial to implement correctly -- I don't want to leave it as a term project for the reader. There's also the Zope BTrees idea of sets of sets: >>> s1 = OISet() >>> s1 = OISet(range(10)) >>> s1.keys() [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> s2 = OISet([5]) >>> s2.keys() [5] >>> s1.insert(s2) 1 >>> s2 in s1 1 >>> OISet([5]) in s1 0 >>> That is, like sets of sets in Icon too, this is a notion of inclusion by object identity (although Icon does that on purpose, while the BTree-based set mostly inherits it from that BTrees don't implement any comparison slots). That's very easy to implement. It's braindead if you think of sets as collections of values, but that's what taking pain too seriously leads to. From aleax@aleax.it Sat Jul 13 09:51:59 2002 From: aleax@aleax.it (Alex Martelli) Date: Sat, 13 Jul 2002 10:51:59 +0200 Subject: [Python-Dev] Dict constructor In-Reply-To: References: Message-ID: On Saturday 13 July 2002 09:12 am, Tim Peters wrote: > [Guido] > > > ... > > Let's do a set module instead. There's only one hurdle to take > > for a set module, and that's the issue with using mutable sets as > > keys. 
Let's just pick one solution and implement it (my favorite > > being that sets simply cannot be used as keys, since it's the > > simplest, and matches dicts and lists). > > I want a set module, but if I finish Greg's abandoned work I want sets of > sets too. Sets don't have "keys", they're conceptually collections of > values, and it would be as odd not to allow sets containing sets as not to > allow lists containing lists, or to ban dicts as dict values. Greg needed > sets of sets for his work, and I've often faked them too. I'm not going to I agree that having sets without having sets of sets would not be anywhere as useful. > be paralyzed by that combining mutable sets with sets of sets requires that > some uses of set-as-set-element will be expensive, fragile, and/or hard to > explain. If you don't want that pain, don't play that game. If you do What about the following compromise: there are two set types, ImmutableSet and MutableSet, with a common supertype Set. ImmutableSet adds __hash__, while MutableSet adds insert and remove, to the common core of methods inherited from Set, such as __contains__ and __iter__. It's easy to make a MutableSet instance m from an ImmutableSet instance x, such that m == x, either by letting each __init__ accept an argument of the other kind (maybe just a special case of such an __init__ accepting any iterable), or, if that can afford very substantial performance improvements, via ad-hoc methods. The second part of the puzzle is that hash(x) tries to adapt x to the Hashable protocol before calling x.__hash__. Types that are already hashable adapt to Hashable by just returning the same instance, of course. A MutableSet instance adapts to Hashable by returning the equivalent ImmutableSet. 
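The two-type compromise could be sketched roughly as follows. This is only an illustration under assumptions: the dict-backed `data` attribute, the XOR-based hash and the exact method set are choices made here, not a settled design, and the Hashable adaptation step is left out.

```python
class Set(object):
    """Common core: construction from any iterable, membership, iteration."""

    def __init__(self, iterable=()):
        self.data = {}
        for item in iterable:
            self.data[item] = True

    def __contains__(self, item):
        return item in self.data

    def __iter__(self):
        return iter(self.data)

    def __len__(self):
        return len(self.data)

    def __eq__(self, other):
        if not isinstance(other, Set):
            return NotImplemented
        return self.data == other.data

    def __ne__(self, other):
        return not self == other

    __hash__ = None  # not hashable unless a subclass says otherwise


class ImmutableSet(Set):
    def __hash__(self):
        # Order-independent hash over the members (an assumption).
        result = 0
        for item in self.data:
            result ^= hash(item)
        return result


class MutableSet(Set):
    def insert(self, item):
        self.data[item] = True

    def remove(self, item):
        del self.data[item]
```

With this, ImmutableSet(m) gives a hashable copy of a MutableSet m that compares equal to it, so it can itself be inserted into another set.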
Since it's apparently too wild an idea to say "adapt to protocol" when one
means "adapt to protocol", at least for the next few releases (and that, in
the optimistic hypothesis that my future rewrite of the adaptation PEP is
favorably received), there will of course need to arise yet another special
purpose way to express this same general idea, such as:


class MutableSet(Set):
    ...
    def insert(self, item):
        try: item = item.asSetItem()
        except AttributeError: pass
        self.data[item] = True

    def asSetItem(self):
        return ImmutableSet(self)


or the like.


Alex

From martin@v.loewis.de Sat Jul 13 10:25:40 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 13 Jul 2002 11:25:40 +0200
Subject: [Python-Dev] PEP 11: unsupported platforms
Message-ID:

Following a recent discussion of the introduction of new platforms
(AtheOS in this case), I've written a PEP on removing support for
platforms that nobody is interested in.

If you find that specific platforms should be moved to "unsupported"
status as well, please let me know. Likewise, if you think that some
of the platforms I recommend to unsupport should see continued
support, let me know as well. In this case, it would be good if you
could name a user of Python on this platform.

Regards,
Martin

PEP: 11
Title: Unsupported Platforms
Version: $Revision: 1.1 $
Last-Modified: $Date: 2002/07/12 22:31:47 $
Author: martin@v.loewis.de (Martin v. Löwis)
Status: Active
Type: Informational
Created: 07-Jul-2002
Post-History: 07-Jul-2002

Abstract

    This PEP documents operating systems (platforms) which are not
    supported in Python anymore.  For some of these systems,
    supporting code might be still part of Python, but will be removed
    in a future release - unless somebody steps forward as a volunteer
    to maintain this code.

Rationale

    Over time, the Python source code has collected various pieces of
    platform-specific code, which, at some point in time, was
    considered necessary to use Python on a specific platform.
Without access to this platform, it is not possible to determine whether this code is still needed. As a result, this code may either break during the Python evolution, or it may become unnecessary as the platforms evolve as well. The growing amount of these fragments poses the risk of unmaintainability: without having experts for a large number of platforms, it is not possible to determine whether a certain change to the Python source code will work on all supported platforms. To reduce this risk, this PEP proposes a procedure to remove code for platforms with no Python users. Unsupporting platforms If a certain platform that currently has special code in it is deemed to be without Python users, a note must be posted in this PEP that this platform is not longer actively supported. This note must include: - the name of the system - the first release number that does not support this platform anymore, and - the first release where the historical support code is actively removed In some cases, it is not possible to identify the specific list of systems for which some code is used (e.g. when autoconf tests for absence of some feature which is considered present on all supported systems). In this case, the name will give the precise condition (usually a preprocessor symbol) that will become unsupported. At the same time, the Python source code must be changed to produce a build-time error if somebody tries to install Python on this platform. On platforms using autoconf, configure must fail. This gives potential users of the platform a chance to step forward and offer maintenance. Resupporting platforms If a user of a platform wants to see this platform supported again, he may volunteer to maintain the platform support. Such an offer must be recorded in the PEP, and the user can submit patches to remove the build-time errors, and perform any other maintenance work for the platform. 
Unsupported platforms

    Name:             SunOS 4
    Unsupported in:   Python 2.3
    Code removed in:  Python 2.4

    Name:             DYNIX
    Unsupported in:   Python 2.3
    Code removed in:  Python 2.4

    Name:             dgux
    Unsupported in:   Python 2.3
    Code removed in:  Python 2.4

    Name:             Systems defining __d6_pthread_create (configure.in)
    Unsupported in:   Python 2.3
    Code removed in:  Python 2.4

    Name:             Systems defining PY_PTHREAD_D4, PY_PTHREAD_D6,
                      or PY_PTHREAD_D7 in thread_pthread.h
    Unsupported in:   Python 2.3
    Code removed in:  Python 2.4

Copyright

    This document has been placed in the public domain.

From oren-py-d@hishome.net Sat Jul 13 12:04:09 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sat, 13 Jul 2002 07:04:09 -0400
Subject: [Python-Dev] Alternative implementation of interning, take 2
In-Reply-To:
References: <052101c229ee$880465a0$0900a8c0@spiff>
Message-ID: <20020713110409.GA72037@hishome.net>

On Fri, Jul 12, 2002 at 10:23:00PM -0400, Tim Peters wrote:
> Note that there are (at least) two parts to Oren's agenda:
>
> 1. Removing the possibility for indirect interning.
>
> 2. Making interned strings mortal, via the usual refcount rules.

In fact, #1 is only "indirectly" on my agenda.  My goal was making interned
strings mortal and indirectly interned strings kept messing up the reference
counts so I ripped them out after I found out that they're not effective in
the core.

The current version of my patch supports both mortal and immortal
interned strings for backward compatibility.  Anything that is silently
interned by the Python core uses mortal interned strings. Explicit calls
from Python code or extensions get immortal strings because they might
depend on this behavior.

	Oren

From guido@python.org Sat Jul 13 13:27:46 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jul 2002 08:27:46 -0400
Subject: [Python-Dev] PEP 11: unsupported platforms
In-Reply-To: Your message of "Sat, 13 Jul 2002 11:25:40 +0200."
References: Message-ID: <200207131227.g6DCRkk17108@pcp02138704pcs.reston01.va.comcast.net> > Following a recent discussion of the introduction of new platforms > (AtheOS in this case), I've written a PEP on removing support for > platforms that nobody is interested. Did you post this to python-list too? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sat Jul 13 13:34:57 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 13 Jul 2002 08:34:57 -0400 Subject: [Python-Dev] Dict constructor In-Reply-To: Your message of "Sat, 13 Jul 2002 03:12:44 EDT." References: Message-ID: <200207131234.g6DCYvj17144@pcp02138704pcs.reston01.va.comcast.net> > I want a set module, but if I finish Greg's abandoned work I want sets of > sets too. Sets don't have "keys", they're conceptually collections of > values, and it would be as odd not to allow sets containing sets as not to > allow lists containing lists, or to ban dicts as dict values. IMO it's no odder than disallowing dicts as dict keys: it's a hack that allows a much faster implementation. > That is, like sets of sets in Icon too, this is a notion of inclusion by > object identity (although Icon does that on purpose, while the BTree-based > set mostly inherits it from that BTrees don't implement any comparison > slots). That's very easy to implement. It's braindead if you think of sets > as collections of values, but that's what taking pain too seriously leads > to. I don't think it is acceptable to have sets-of-sets but test for membership (in that case) by object identity. If you really think object identity is all that's needed, I suggest we stick to disallowing sets of sets; algorithms needing sets-of-set-object-identities can use id() on the inner sets. 
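A minimal sketch of the id() approach, assuming a hypothetical IdentitySet helper and using a plain dict as a stand-in for a mutable inner set. The helper keeps a reference to each member, since an id() is only guaranteed unique while the object is alive.

```python
class IdentitySet(object):
    """Hypothetical helper: a collection of objects compared by identity."""

    def __init__(self):
        # Maps id(obj) -> obj; holding obj keeps its id from being reused.
        self._members = {}

    def insert(self, obj):
        self._members[id(obj)] = obj

    def __contains__(self, obj):
        return id(obj) in self._members


inner = {5: True}        # a dict standing in for a mutable set
outer = IdentitySet()
outer.insert(inner)
```

Here `inner in outer` holds, while an equal but distinct `{5: True}` is not a member, which matches the identity-based behaviour shown for the BTree-based sets.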
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pinard@iro.umontreal.ca Sat Jul 13 13:35:24 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 13 Jul 2002 08:35:24 -0400
Subject: [Python-Dev] Re: python package
In-Reply-To: <200207122037.g6CKbf514140@pcp02138704pcs.reston01.va.comcast.net>
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net> <3D2D3720.9040100@lemburg.com> <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net> <3D2F1245.1030804@lemburg.com> <200207121736.g6CHact13010@pcp02138704pcs.reston01.va.comcast.net> <3D2F161D.40005@lemburg.com> <200207121754.g6CHsKQ13108@pcp02138704pcs.reston01.va.comcast.net> <3D2F3857.3010700@lemburg.com> <200207122037.g6CKbf514140@pcp02138704pcs.reston01.va.comcast.net>
Message-ID:

[Guido van Rossum]

> And this would be a good time to end this thread. :-)

Agreed.  Yet, allow me a tiny suggestion that could solve the
stated problem at little cost.  It suffices to choose, then announce,
a convention about a set of names which the Python distribution agrees
never to use.  It could be anything.  Like, Python could guarantee
that it will never ever install a standard module with a name starting
with capital `W', say.  If a user wants to make absolutely sure his/her
module does not and will not conflict with a standard module, just
prepend a `W' to its name.

It is likely that people will rarely resort to this convention, but it will
be there for the paranoid, and should be easy to support.  Yet, it will
not solve the paranoia of users about each other's package names.

Had this been many years ago, the convention I would have preferred
is that Python never uses a capital letter as the first letter of a
module name, but it seems to be a little late for this, and I'm not so
sure of the benefit.
:-)

The most Python could get from some `from python import ...' or a `W'
convention is that it gets itself out of the name fight between users;
it does not participate in it.  It does not really solve the problem,
anyway.  I guess you are right, in that whatever the direction taken,
this thread is probably doomed to fall into various dead-ends.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard

From David Abrahams"
Check it out:

    int
    PyList_Insert(PyObject *op, int where, PyObject *newitem)
    {
        if (!PyList_Check(op)) {
            PyErr_BadInternalCall();
            return -1;
        }
        return ins1((PyListObject *)op, where, newitem);
    }

Since the implementation of ins1 gives the subclasses' re-implementation of
insert() no chance to execute, shouldn't this check be changed to
PyList_CheckExact? If not, what needs to be added to the documentation to
make it clear that these functions really do subclass slicing?

-Dave

From guido@python.org Sat Jul 13 14:04:40 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jul 2002 09:04:40 -0400
Subject: [Python-Dev] Dict constructor
In-Reply-To: Your message of "Sat, 13 Jul 2002 10:51:59 +0200."
References:
Message-ID: <200207131304.g6DD4eE17350@pcp02138704pcs.reston01.va.comcast.net>

> What about the following compromise: there are two set types,
> ImmutableSet and MutableSet, with a common supertype Set.
> ImmutableSet adds __hash__, while MutableSet adds insert and remove,
> to the common core of methods inherited from Set, such as
> __contains__ and __iter__.

Reasonable.

> Since it's apparently too wild an idea to say "adapt to protocol" when one
> means "adapt to protocol", at least for the next few releases (and that, in
> the optimistic hypothesis that my future rewrite of the adaptation PEP is
> favorably received), there will of course need to arise yet another special
> purpose way to express this same general idea, such as:
>
>
> class MutableSet(Set):
>     ...
> def insert(self, item): > try: item = item.asSetItem() > except AttributeError: pass > self.data[item] = True > > def asSetItem(self): > return ImmutableSet(self) > > > or the like. This would run into similar problems as the PEP's auto-freeze approach when using "s1 in s2". If s1 is a mutable set, this creates an immutable copy for the test and then throws it away. The PEP's problem is that it's too easy to accidentally freeze a set; the problem with your proposal is "merely" one of performance. Yet I think both are undesirable, although I still prefer your solution. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sat Jul 13 14:34:19 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 13 Jul 2002 09:34:19 -0400 Subject: [Python-Dev] Re: Alternative implementation of string interning In-Reply-To: Your message of "Wed, 03 Jul 2002 07:07:35 EDT." <20020703110735.GA50268@hishome.net> References: <20020703095915.GA43336@hishome.net> <20020703110735.GA50268@hishome.net> Message-ID: <200207131334.g6DDYJD17519@pcp02138704pcs.reston01.va.comcast.net> > The warm fuzzy feeling that you have a real symbol type :-) Doesn't give me a warm fuzzy feeling at all. A symbol type is just another compiler implementation detail IMO. Strings are natural to designate identifiers. --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax@aleax.it Sat Jul 13 14:42:46 2002 From: aleax@aleax.it (Alex Martelli) Date: Sat, 13 Jul 2002 15:42:46 +0200 Subject: [Python-Dev] Dict constructor In-Reply-To: <200207131304.g6DD4eE17350@pcp02138704pcs.reston01.va.comcast.net> References: <200207131304.g6DD4eE17350@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Saturday 13 July 2002 03:04 pm, Guido van Rossum wrote: ... > > What about the following compromise: there are two set types, > > ImmutableSet and MutableSet, with a common supertype Set. 
> > ImmutableSet adds __hash__, while MutableSet adds insert and remove, > > to the common core of methods inherited from Set, such as > > __contains__ and __iter__. > > Reasonable. ... > This would run into similar problems as the PEP's auto-freeze approach > when using "s1 in s2". If s1 is a mutable set, this creates an > immutable copy for the test and then throws it away. The PEP's > problem is that it's too easy to accidentally freeze a set; the > problem with your proposal is "merely" one of performance. Yet I > think both are undesirable, although I still prefer your solution. If performance is a problem (and I can well see it might be!) then Set.__contains__(self, x) needs to use a specialized version of the ad-hoc adaptation code I proposed for insertion: > def insert(self, item): > try: item = item.asSetItem() > except AttributeError: pass > self.data[item] = True One possible route to such optimization is to introduce another class, called _TemporarilyImmutableSet, able to wrap a MutableSet x, have the same hash value that x would have if x were immutable, and compare == to whatever x compares == to. Set would then expose a private method _asTemporarilyImmutable. ImmutableSet._asTemporarilyImmutable would just return self; MutableSet._asTemporarilyImmutable would return _TemporarlyImmutableSet(self). Then: class Set(object): ... def __contains__(self, item): try: item = item._asTemporarilyImmutable() except AttributeError: pass return item in self.data Alex From guido@python.org Sat Jul 13 14:56:06 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 13 Jul 2002 09:56:06 -0400 Subject: [Python-Dev] Dict constructor In-Reply-To: Your message of "Sat, 13 Jul 2002 15:42:46 +0200." References: <200207131304.g6DD4eE17350@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207131356.g6DDu6u17726@pcp02138704pcs.reston01.va.comcast.net> > > This would run into similar problems as the PEP's auto-freeze approach > > when using "s1 in s2". 
If s1 is a mutable set, this creates an > > immutable copy for the test and then throws it away. The PEP's > > problem is that it's too easy to accidentally freeze a set; the > > problem with your proposal is "merely" one of performance. Yet I > > think both are undesirable, although I still prefer your solution. > > If performance is a problem (and I can well see it might be!) then > Set.__contains__(self, x) needs to use a specialized version of > the ad-hoc adaptation code I proposed for insertion: > > > def insert(self, item): > > try: item = item.asSetItem() > > except AttributeError: pass > > self.data[item] = True > > One possible route to such optimization is to introduce another > class, called _TemporarilyImmutableSet, able to wrap a MutableSet x, > have the same hash value that x would have if x were immutable, and > compare == to whatever x compares == to. > > Set would then expose a private method _asTemporarilyImmutable. > ImmutableSet._asTemporarilyImmutable would just return self; > MutableSet._asTemporarilyImmutable would return > _TemporarlyImmutableSet(self). > > Then: > > class Set(object): > ... > def __contains__(self, item): > try: item = item._asTemporarilyImmutable() > except AttributeError: pass > return item in self.data Sounds reasonable. Who's gonna do an implementation? There's Greg Wilson's version, and there's an alternative by Aric Coady that could be used as a comparison. 
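The wrapper idea can be sketched as below. This is one possible reading, not the eventual implementation: the _content_hash() helper, the dict-backed `data` attribute and the XOR hash are assumptions made for the example, and Set.__eq__ returns NotImplemented for non-Set operands so that the wrapper's own __eq__ gets consulted during the dict lookup.

```python
class Set(object):
    def __init__(self, iterable=()):
        self.data = {}
        for item in iterable:
            self.data[item] = True

    def _content_hash(self):
        # Hypothetical helper: order-independent hash of the members.
        result = 0
        for item in self.data:
            result ^= hash(item)
        return result

    def __eq__(self, other):
        if not isinstance(other, Set):
            return NotImplemented  # lets the wrapper's __eq__ take over
        return self.data == other.data

    def __ne__(self, other):
        return not self == other

    __hash__ = None  # unhashable unless a subclass overrides this

    def __contains__(self, item):
        try:
            item = item._asTemporarilyImmutable()
        except AttributeError:
            pass
        return item in self.data


class ImmutableSet(Set):
    def __hash__(self):
        return self._content_hash()

    def _asTemporarilyImmutable(self):
        return self


class _TemporarilyImmutableSet(object):
    """Wraps a mutable set for the duration of one lookup; no copy is made."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __hash__(self):
        return self.wrapped._content_hash()

    def __eq__(self, other):
        return self.wrapped == other

    def __ne__(self, other):
        return not self == other


class MutableSet(Set):
    def insert(self, item):
        self.data[item] = True

    def _asTemporarilyImmutable(self):
        return _TemporarilyImmutableSet(self)
```

The point of the design is that a membership test such as `MutableSet([1, 2]) in outer` neither freezes nor copies the probe set; it stays mutable after the test.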
--Guido van Rossum (home page: http://www.python.org/~guido/) From oren-py-d@hishome.net Sat Jul 13 15:19:31 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Sat, 13 Jul 2002 17:19:31 +0300 Subject: [Python-Dev] Re: Alternative implementation of string interning In-Reply-To: <200207131334.g6DDYJD17519@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Sat, Jul 13, 2002 at 09:34:19AM -0400 References: <20020703095915.GA43336@hishome.net> <20020703110735.GA50268@hishome.net> <200207131334.g6DDYJD17519@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020713171931.A5083@hishome.net> On Sat, Jul 13, 2002 at 09:34:19AM -0400, Guido van Rossum wrote: > > The warm fuzzy feeling that you have a real symbol type :-) > > Doesn't give me a warm fuzzy feeling at all. A symbol type is just > another compiler implementation detail IMO. Strings are natural to > designate identifiers. Making interned strings a type was just idle speculation, don't take it too seriously... Oren From aleax@aleax.it Sat Jul 13 15:58:00 2002 From: aleax@aleax.it (Alex Martelli) Date: Sat, 13 Jul 2002 16:58:00 +0200 Subject: [Python-Dev] Dict constructor In-Reply-To: <200207131356.g6DDu6u17726@pcp02138704pcs.reston01.va.comcast.net> References: <200207131356.g6DDu6u17726@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Saturday 13 July 2002 03:56 pm, Guido van Rossum wrote: ... > Sounds reasonable. Who's gonna do an implementation? There's Greg > Wilson's version, and there's an alternative by Aric Coady > that could be used as a comparison. I'm gonna give it a try, unless somebody more qualified volunteers -- Greg's version's in nondist/sandbox/sets, right? Where's Aric's? What should I do with the modified set.py -- submit it as a patch, or ... ? 
Alex From guido@python.org Sat Jul 13 16:04:32 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 13 Jul 2002 11:04:32 -0400 Subject: [Python-Dev] Dict constructor In-Reply-To: Your message of "Sat, 13 Jul 2002 16:58:00 +0200." References: <200207131356.g6DDu6u17726@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207131504.g6DF4Xl18048@pcp02138704pcs.reston01.va.comcast.net> > > Sounds reasonable. Who's gonna do an implementation? There's Greg > > Wilson's version, and there's an alternative by Aric Coady > > that could be used as a comparison. > > I'm gonna give it a try, unless somebody more qualified volunteers -- Greg's > version's in nondist/sandbox/sets, right? Where's Aric's? http://bent-arrow.com/python > > What should I do with the modified set.py -- submit it as a patch, or ... ? I forget -- do you have SF commit permission? If so, feel free to add a competing version to the sandbox. Otherwise, a SF submission would be good (and post a link to python-dev when you upload it). --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax@aleax.it Sat Jul 13 17:07:26 2002 From: aleax@aleax.it (Alex Martelli) Date: Sat, 13 Jul 2002 18:07:26 +0200 Subject: [Python-Dev] Dict constructor In-Reply-To: <200207131504.g6DF4Xl18048@pcp02138704pcs.reston01.va.comcast.net> References: <200207131504.g6DF4Xl18048@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Saturday 13 July 2002 05:04 pm, Guido van Rossum wrote: ... > > Greg's version's in nondist/sandbox/sets, right? Where's Aric's? > > http://bent-arrow.com/python Ah, a C implementation. It seems premature to me to consider such optimization -- for now, it appears, we're still looking around for the right architecture, and that's much more plastic and faster to experiment with in Python. So, I have not studied set.c in detail, just browsed the readme to get an idea of the interface -- and that seems even more peculiar to me than freeze-on-hashing, although generally similar. 
So, for now, I've stuck to Python, and I think it will be time to move to C once the Python-level part appears good. > > What should I do with the modified set.py -- submit it as a patch, or ... > > ? > > I forget -- do you have SF commit permission? If so, feel free to Nope -- I may be the only PSF member without commit permission, I suspect. > add a competing version to the sandbox. Otherwise, a SF submission > would be good (and post a link to python-dev when you upload it). Done -- it's patch 580995 (not sure how that translates to an URL -- the tracker's resulting URL is quite complicated:-). Alex From fredrik@pythonware.com Sat Jul 13 17:15:35 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sat, 13 Jul 2002 18:15:35 +0200 Subject: [Python-Dev] Dict constructor References: <200207131504.g6DF4Xl18048@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <001501c22a88$87d53150$ced241d5@hagrid> > Done -- it's patch 580995 (not sure how that translates to an URL -- > the tracker's resulting URL is quite complicated:-). just prepend http://python.org/sf/ to the patch/bug identify. the rest is magic (or perhaps barry dealing with 404 log entries in real time): http://python.org/sf/580995 From martin@v.loewis.de Sat Jul 13 18:29:42 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 13 Jul 2002 19:29:42 +0200 Subject: [Python-Dev] PEP 11: unsupported platforms In-Reply-To: <200207131227.g6DCRkk17108@pcp02138704pcs.reston01.va.comcast.net> References: <200207131227.g6DCRkk17108@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > > Following a recent discussion of the introduction of new platforms > > (AtheOS in this case), I've written a PEP on removing support for > > platforms that nobody is interested. > > Did you post this to python-list too? Not yet. I'll post it on python-list when I get no more comments here, then I'll produce a patch to generate the build-time errors for the unsupported platforms. 
Regards,
Martin

From mal@lemburg.com Sat Jul 13 18:58:38 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 13 Jul 2002 19:58:38 +0200
Subject: [Python-Dev] python package
References: <200207121842.g6CIgQo13399@pcp02138704pcs.reston01.va.comcast.net> <3D2F129C.12231.63AB67B@localhost>
Message-ID: <3D306A4E.5050703@lemburg.com>

Gordon McMillan wrote:
> On 12 Jul 2002 at 16:46, Michael McLay wrote:
>
>
>>... it might be reasonable to move all standard
>>distribution names into a single top level namespace
>>and grandfather the existing top level names into
>>the top level namespace for the remainder of
>>the 2.x series.
>
>
> Getting
>    from import urllib
> and
>    import urllib
>
> to return the same (is, not equals) object will
> require very delicate surgery on some very difficult
> code. And without it, most non-trivial scripts will
> break in very mysterious ways.

Not really. The following code does all it takes to
make this work for e.g. having 'import DateTime' and
'from mx import DateTime' provide the same symbols:

# Redirect all imports to the corresponding mx package
def _redirect(mx_subpackage):
    global __path__
    import os,mx
    __path__ = [os.path.join(mx.__path__[0],mx_subpackage)]
_redirect('DateTime')

# Now load all important symbols
from mx.DateTime import *
from mx.DateTime import __version__,_DT,_DTD

The module objects would be different, but that's
just about it.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From gmcm@hypernet.com Sat Jul 13 19:21:07 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Sat, 13 Jul 2002 14:21:07 -0400
Subject: [Python-Dev] python package
In-Reply-To: <3D306A4E.5050703@lemburg.com>
Message-ID: <3D303753.27392.AB222A8@localhost>

On 13 Jul 2002 at 19:58, M.-A.
Lemburg wrote: > Gordon McMillan wrote: > > Getting > > from import urllib > > and > > import urllib > > > > to return the same (is, not equals) object will > > require very delicate surgery on some very difficult > > code. And without it, most non-trivial scripts will > > break in very mysterious ways. > > Not really. The following code does all it takes to > make this work for e.g. having 'import DateTime' and > 'from mx import DateTime' provide the same symbols: [snip hackery] > The module objects would be different, but that's > just about it. Which was exactly my point. Much code that does *not* use "from ... import ..." in fact relies on having the same module object. -- Gordon http://www.mcmillan-inc.com/ From mal@lemburg.com Sat Jul 13 20:07:05 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 13 Jul 2002 21:07:05 +0200 Subject: [Python-Dev] python package References: <3D303753.27392.AB222A8@localhost> Message-ID: <3D307A59.5040707@lemburg.com> Gordon McMillan wrote: > On 13 Jul 2002 at 19:58, M.-A. Lemburg wrote: > > >>Gordon McMillan wrote: > > >>>Getting >>> from import urllib >>>and >>> import urllib >>> >>>to return the same (is, not equals) object will >>>require very delicate surgery on some very difficult >>>code. And without it, most non-trivial scripts will >>>break in very mysterious ways. >> >>Not really. The following code does all it takes to >>make this work for e.g. having 'import DateTime' and >>'from mx import DateTime' provide the same symbols: > > > [snip hackery] > > >>The module objects would be different, but that's >>just about it. > > > Which was exactly my point. Much code that does > *not* use "from ... import ..." in fact relies on > having the same module object. You mean for e.g. hacking the module's globals ? To solve that, you'd probably need to manipulate sys.modules as well... I'm just not sure whether this is possible from within the module implementing the redirection. 
Hmm, running this:

testmodload.py:
import sys, os
sys.modules['testmodload'] = os
print 'worked'

Python 2.1.3 (#1, May 16 2002, 18:59:26)
>>> import testmodload
worked
>>> testmodload
>>>

Looks like this is possible, so you probably don't even
need the 'from mx.DateTime import *' in the code I posted.
A simple 'sys.modules['DateTime'] = mx.DateTime' would give
you an even better solution.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From guido@python.org Sat Jul 13 20:18:36 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jul 2002 15:18:36 -0400
Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings
Message-ID: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>

There's a full implementation for PEP 263.  Martin von Loewis is ready
to commit it.  It's of course possible to let him do this and deal
with the consequences once they're in CVS, I'd like to see if there's
anyone who'd like to review the code before it goes in.  The patch is
at http://python.org/sf/534304.  I like the PEP fine, I just don't
have time to review the patch, and I'm not sure that review by just
Martin and Hisao (the original patch author) is enough.

If nobody comes forward, Martin will commit it.
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.one@comcast.net Sat Jul 13 20:24:55 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 13 Jul 2002 15:24:55 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <9aZX8.47377$n4.11526798@newsc.telia.net>
Message-ID:

A post on c.l.py raises an interesting issue, here illustrated in an
all-Python example:

"""
class C:
    def __init__(self):
        self.i = 0
    def get(self):
        self.i += 1
        if self.i <= 5:
            return self.i
        self.i = 1 / 0

x = iter(C().get, 5)
try:
    while 1:
        print x.next()
except StopIteration:
    pass
print x.next()
"""

That prints 1 thru 4, then dies with a ZeroDivisionError.  This is
because two-arg iter works as documented:

    The iterator created in this case [iter(o, sentinel)] will call o
    with no arguments for each call to its next() method; if the value
    returned is equal to sentinel, StopIteration will be raised,
    otherwise the value will be returned.

The question is whether this is intentional:  for all other iterators
Python supplies, StopIteration is a "sink state":  once an iterator
raises StopIteration, calling its next() method any number of times
again will just continue raising StopIteration.  Python's
calliterobject doesn't arrange for that, though.

PEP 234 doesn't explicitly say what happens if next() is called after
StopIteration has been raised, although it clearly has in mind a model
where iteration eventually "ends".  The use case from which two-arg
iter() got generalized was

    iter(file.readline, "")

and in that case file.readline returns "" forever after hitting EOF
the first time.  So in this specific case, StopIteration acts like a
sink too, but for a reason that sheds no light on the question at hand.

The base question:  does the iteration protocol define what happens if
an iterator's next() method is called after the iterator has raised
StopIteration?  Or is that left up to the discretion of the iterator?
If the answer is that it's the iterator's choice, is 2-argument iter() making the best choice? The rub here is that 2-arg iter was (IMO) introduced to help iteration-ignorant callables fit into the iteration protocol, and *because* they're iteration-ignorant they may do something foolish if called again after their "sentinel" value is seen. From guido@python.org Sat Jul 13 20:37:06 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 13 Jul 2002 15:37:06 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: Your message of "Sat, 13 Jul 2002 15:24:55 EDT." References: Message-ID: <200207131937.g6DJb6K18523@pcp02138704pcs.reston01.va.comcast.net> > The base question: does the iteration protocol define what happens if an > iterator's next() method is called after the iterator has raised > StopIteration? Or is that left up to the discretion of the iterator? The latter. Believe it or not, I thought about this during the design of the protocol, and decided that if someone wanted to create an iterator that could somehow continue after raising StopIteration, that should be their problem. Basically, the effect of calling next() after StopIteration is raised is undefined. > If the answer is that it's the iterator's choice, is 2-argument iter() > making the best choice? The rub here is that 2-arg iter was (IMO) > introduced to help iteration-ignorant callables fit into the iteration > protocol, and *because* they're iteration-ignorant they may do something > foolish if called again after their "sentinel" value is seen. If the caller stops calling next(), nothing's wrong. I don't think the callable-iterator object should grow another state bit. But I'm willing to be convinced by information you withheld. 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fredrik@pythonware.com Sat Jul 13 20:31:53 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 13 Jul 2002 21:31:53 +0200
Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <018901c22aa4$e0f41190$ced241d5@hagrid>

guido wrote:

> There's a full implementation for PEP 263.  Martin von Loewis is ready
> to commit it.  It's of course possible to let him do this and deal
> with the consequences once they're in CVS, I'd like to see if there's
> anyone who'd like to review the code before it goes in.  The patch is
> at http://python.org/sf/534304.  I like the PEP fine, I just don't
> have time to review the patch

hmm.  I'm tempted to think that there's a major
flaw in the PEP, caused by the fact that

    compile(unicode(script, extract_encoding(script)))

will, from what I can tell, not compile to the same
thing as:

    compile(script)

but I've had too many holy [gr]ails [1] tonight to
be sure if that's really a flaw at all...

1) see http://www.blacksheepbrewery.com/

From fredrik@pythonware.com Sat Jul 13 20:38:31 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 13 Jul 2002 21:38:31 +0200
Subject: [Python-Dev] Termination of two-arg iter()
References:
Message-ID: <018a01c22aa4$e1344ee0$ced241d5@hagrid>

tim wrote:

> The question is whether this is intentional:  for all other iterators Python
> supplies, StopIteration is a "sink state":  once an iterator raises
> StopIteration, calling its next() method any number of times again will just
> continue raising StopIteration.

except SRE's finditer method, that is (also reported
on c.l.python)

> Or is that left up to the discretion of the iterator?

if you don't know, it probably is undefined, which means
that SRE's finditer does the best thing possible: accept
a few mistakes, and then punish the poor fool who cannot
follow instructions.
(but to be nice, cut them a bit more slack if they're too
cheap to buy a real operating system ;-)

From tim.one@comcast.net Sat Jul 13 20:50:18 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 13 Jul 2002 15:50:18 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207131937.g6DJb6K18523@pcp02138704pcs.reston01.va.comcast.net>
Message-ID:

[Tim]
>> The base question:  does the iteration protocol define what
>> happens if an iterator's next() method is called after the iterator
>> has raised StopIteration?  Or is that left up to the discretion of the
>> iterator?

[Guido]
> The latter.  Believe it or not, I thought about this during the design
> of the protocol, and decided that if someone wanted to create an
> iterator that could somehow continue after raising StopIteration, that
> should be their problem.

I believe it -- I even vaguely recall discussions about it.
Unfortunately, they don't seem to be recorded anywhere I can find now.

> Basically, the effect of calling next() after StopIteration is raised is
> undefined.

That's consistent with a lawyer's reading of the relevant PEP.

>> If the answer is that it's the iterator's choice, is 2-argument iter()
>> making the best choice?  The rub here is that 2-arg iter was (IMO)
>> introduced to help iteration-ignorant callables fit into the iteration
>> protocol, and *because* they're iteration-ignorant they may do something
>> foolish if called again after their "sentinel" value is seen.

> If the caller stops calling next(), nothing's wrong.  I don't think
> the callable-iterator object should grow another state bit.
>
> But I'm willing to be convinced by information you withheld.

I entered the c.l.py report as a bug against re.  The user provoked re
into hanging via using re's new finditer() method.  The connection to
2-arg iter is buried in re's C implementation, via PyCallIter_New.  I
didn't think it added anything useful here to spell all that out.
It turns out (and unsurprisingly so with hindsight) that re can be
provoked into the same bad behavior without involving the iteration
protocol at all, so in this case I think finditer() just made it easier
to expose a flaw that was present regardless.

I'm happy to leave this be:  the docs match the implementation, I'm
sure *someone* relies on that by now, and the behavior is easy to
explain as-is.

From mal@lemburg.com Sat Jul 13 21:25:23 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 13 Jul 2002 22:25:23 +0200
Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <018901c22aa4$e0f41190$ced241d5@hagrid>
Message-ID: <3D308CB3.4000302@lemburg.com>

Fredrik Lundh wrote:
> guido wrote:
>
>
>
>>There's a full implementation for PEP 263.  Martin von Loewis is ready
>>to commit it.  It's of course possible to let him do this and deal
>>with the consequences once they're in CVS, I'd like to see if there's
>>anyone who'd like to review the code before it goes in.  The patch is
>>at http://python.org/sf/534304.  I like the PEP fine, I just don't
>>have time to review the patch
>
>
> hmm.  I'm tempted to think that there's a major
> flaw in the PEP, caused by the fact that
>
>     compile(unicode(script, extract_encoding(script)))
>
> will, from what I can tell, not compile to the same
> thing as:
>
>     compile(script)
>
> but I've had too many holy [gr]ails [1] tonight to
> be sure if that's really a flaw at all...

Right. The implementation is not a full implementation
of what is defined as step 2 in the PEP.

However, I don't think that we're that far away from that:
all that's needed is to encode a Unicode argument to
compiler() to UTF-8 and then either prepend it with a BOM mark
or a coding spec before passing it to the compiler.

Nice would be to add a new tokenizer API which treats the
input as UTF-8 without looking for the coding comment or
BOM at all.
BTW, the approach mentioned in that PEP is no longer needed (converting the complete tokenizer to using Py_UNICODE internally). I think that the only way to give this code enough testing is by letting Martin check it in and see what happens. Except for the few XXX and CAUTION marks, the code looks OK. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim.one@comcast.net Sat Jul 13 21:32:16 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 13 Jul 2002 16:32:16 -0400 Subject: [Python-Dev] python package In-Reply-To: <3D307A59.5040707@lemburg.com> Message-ID: [MAL] > The module objects would be different, but that's just about it. [Gordon] > Which was exactly my point. Much code that does *not* use > "from ... import ..." in fact relies on having the same module object. [MAL] > You mean for e.g. hacking the module's globals ? If you consider a module maintaining pieces of its own state in its own globals as an instance of hacking the module's globals, yes, that's the main problem. For example (there are many, this isn't stretching), if the user ends up with two distinct copies of the tempfile module, its "global" _tempdir_lock becomes two distinct locks, and the truly global mutual exclusion _tempdir_lock was supposed to supply is lost. Ditto for the lock used internally by tempfile's global _counter object. The system-wide uniqueness of some globals is crucial to some modules' correct functioning. From tim.one@comcast.net Sat Jul 13 21:51:47 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 13 Jul 2002 16:51:47 -0400 Subject: [Python-Dev] Dict constructor In-Reply-To: Message-ID: [Alex Martelli] > Greg's version's in nondist/sandbox/sets, right? Where's Aric's? 
[Guido] > http://bent-arrow.com/python [Alex] > Ah, a C implementation. It seems premature to me to consider such > optimization -- for now, it appears, we're still looking around for the > right architecture, and that's much more plastic and faster to > experiment with in Python. So, I have not studied set.c in detail, I have, and I'm -1 on it: it's largely a copy-paste-small-edit of massive portions of dictobject.c. If it has to be implemented via massive code duplication, there are less maintenance-intense ways to do that. > just browsed the readme to get an idea of the interface -- and that > seems even more peculiar to me than freeze-on-hashing, although > generally similar. freeze-on-hashing was pioneered in the Python world by kjbuckets. I've used it in my own Set code for years without particular pain. Greg Wilson seemed to hate it, though. > So, for now, I've stuck to Python, +1 From gmcm@hypernet.com Sat Jul 13 23:44:59 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Sat, 13 Jul 2002 18:44:59 -0400 Subject: [Python-Dev] python package In-Reply-To: <3D307A59.5040707@lemburg.com> Message-ID: <3D30752B.9129.BA3B5E1@localhost> Marc-Andre, In this thread you have posted: > python.py: > __path__ = ['.'] and > def _redirect(mx_subpackage): > global __path__ > import os,mx > __path__ = \ > [os.path.join(mx.__path__[0],mx_subpackage)] and > testmodload.py: > import sys, os > sys.modules['testmodload'] = os None of these will freeze successfully. Two of them appear to rely on an implementation detail - that __path__ (only defined for imp.PKG_DIRECTORY's) will be followed even in a plain module. The third is exactly what _xmlplus does, and consensus appears to be that that was a mistake. "Clever" does not mean "good". 
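
The module-identity concern running through this thread can be sketched as follows, assuming present-day Python 3. The module names real_mod and alias_mod and the counter attribute are hypothetical; the only mechanism relied on is that an import statement consults sys.modules first, which is what the sys.modules['DateTime'] = mx.DateTime suggestion depends on.

```python
import sys
import types

# Hypothetical stand-in for a module that keeps state in its own globals.
real = types.ModuleType("real_mod")
real.counter = 0
sys.modules["real_mod"] = real

# Register a second name for the very same module object.
sys.modules["alias_mod"] = real

import real_mod
import alias_mod

alias_mod.counter += 1               # a change made through one name ...
same_object = alias_mod is real_mod
shared_counter = real_mod.counter    # ... is visible through the other
```

Two separately executed copies of a module would each have their own counter, which is the failure mode described above for tempfile's locks. Whether aliasing through sys.modules is good practice is a separate question, as the reply above makes clear.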
-- Gordon http://www.mcmillan-inc.com/ From guido@python.org Sun Jul 14 00:07:36 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 13 Jul 2002 19:07:36 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: Your message of "Sat, 13 Jul 2002 15:50:18 EDT." References: Message-ID: <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net> > [Tim] > >> The base question: does the iteration protocol define what > >> happens if an iterator's next() method is called after the iterator > >> has raised StopIteration? Or is that left up to the discretion of the > >> iterator? > > [Guido] > > The latter. Believe it or not, I thought about this during the design > > of the protocol, and decided that if someone wanted to create an > > iterator that could somehow continue after raising StopIteration, that > > should be their problem. [Tim] > I believe it -- I even vaguely recall discussions about it. Unfortunately, > they don't seem to be recorded anywhere I can find now. > > > Basically, the effect of calling next() after StopIteration is raised is > > undefined. > > That's consistent with a lawyer's reading of the relevant PEP . Actually, not. Under "Resolved Issues" the PEP has this: - Once a particular iterator object has raised StopIteration, will it also raise StopIteration on all subsequent next() calls? Some say that it would be useful to require this, others say that it is useful to leave this open to individual iterators. Note that this may require an additional state bit for some iterator implementations (e.g. function-wrapping iterators). Resolution: once StopIteration is raised, calling it.next() continues to raise StopIteration. So I misremembered, and Tim didn't read the PEP closely enough. :-) > I'm happy to leave this be: the docs match the implemenation, I'm > sure *someone* relies on that by now, and the behavior is easy to > explain as-is. Hm. Given what the PEP says, I'm ready to have this fixed (even in 2.2.2). 
I can't call code relying on this sane.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From jepler@unpythonic.net Sun Jul 14 01:04:26 2002
From: jepler@unpythonic.net (jepler@unpythonic.net)
Date: Sat, 13 Jul 2002 19:04:26 -0500
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net>
References: <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020713190420.A2256@unpythonic.net>

On Sat, Jul 13, 2002 at 07:07:36PM -0400, Guido van Rossum wrote:
> Actually, not.  Under "Resolved Issues" the PEP has this:
>
>     - Once a particular iterator object has raised StopIteration, will
>       it also raise StopIteration on all subsequent next() calls?
>       Some say that it would be useful to require this, others say
>       that it is useful to leave this open to individual iterators.
>       Note that this may require an additional state bit for some
>       iterator implementations (e.g. function-wrapping iterators).
>
>       Resolution: once StopIteration is raised, calling it.next()
>       continues to raise StopIteration.
>
> So I misremembered, and Tim didn't read the PEP closely enough. :-)
>
> > I'm happy to leave this be:  the docs match the implementation, I'm
> > sure *someone* relies on that by now, and the behavior is easy to
> > explain as-is.
>
> Hm.  Given what the PEP says, I'm ready to have this fixed (even in
> 2.2.2).  I can't call code relying on this sane.

What about this example?
>>> l = []
>>> li = iter(l)
>>> li.next()
Traceback (most recent call last):
  File "", line 1, in ?
StopIteration
>>> l.extend([1, 2, 3])
>>> li.next()
1

does the list iterator violate the proposed behavior?

Jeff

From guido@python.org Sun Jul 14 01:42:14 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jul 2002 20:42:14 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Sat, 13 Jul 2002 19:04:26 CDT."
<20020713190420.A2256@unpythonic.net> References: <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net> <20020713190420.A2256@unpythonic.net> Message-ID: <200207140042.g6E0gEp19165@pcp02138704pcs.reston01.va.comcast.net> > > So I misremembered, and Tim didn't read the PEP closely enough. :-) > > > > > I'm happy to leave this be: the docs match the implemenation, I'm > > > sure *someone* relies on that by now, and the behavior is easy to > > > explain as-is. > > > > Hm. Given what the PEP says, I'm ready to have this fixed (even in > > 2.2.2). I can't call code relying on this sane. > > What about this example? > >>> l = [] > >>> li = iter(l) > >>> li.next() > Traceback (most recent call last): > File "", line 1, in ? > StopIteration > >>> l.extend([1, 2, 3]) > >>> li.next() > 1 > > does the list iterator violate the proposed behavior? Alternatively, we could change the PEP to make this officially undefined (or at least up to the iterator used). I'm not sure which I like better -- the PEP or reality. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From pinard@iro.umontreal.ca Sun Jul 14 02:49:32 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 13 Jul 2002 21:49:32 -0400 Subject: [Python-Dev] Re: Termination of two-arg iter() In-Reply-To: <200207140042.g6E0gEp19165@pcp02138704pcs.reston01.va.comcast.net> References: <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net> <20020713190420.A2256@unpythonic.net> <200207140042.g6E0gEp19165@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > > What about this example? > > >>> l = [] > > >>> li = iter(l) > > >>> li.next() > > Traceback (most recent call last): > > File "", line 1, in ? > > StopIteration > > >>> l.extend([1, 2, 3]) > > >>> li.next() > > 1 > > > > does the list iterator violate the proposed behavior? > Alternatively, we could change the PEP to make this officially > undefined (or at least up to the iterator used). 
If you change the PEP so the behaviour is undefined in the protocol, then, you will have to separately document the behaviour for all iterators which are produced by the various means available in standard Python, and people will have to remember these differences. Would it be perceived as shocking (or not?) in the example above, having to produce another iterator "li = iter(l)" before reusing it? If not, then I presume regularity and consistency of behaviour should prevail. Are there other problematic cases from the Python distribution itself? Maybe the iteration protocol should invite implementors at returning forever, if it has returned it once by a particular instance of an iterator, only for the sake of consistency with all iterators provided by Python itself, but without making this a hard requirement. So if for some strange application, users want to do differently, they could validly do nevertheless. -- François Pinard http://www.iro.umontreal.ca/~pinard From mhammond@skippinet.com.au Sun Jul 14 03:49:18 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Sun, 14 Jul 2002 12:49:18 +1000 Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net> Message-ID: > It seems we're still in the same boat. It would be saner to change > buffer slices to return buffer objects, except for backward > compatibility. I was hoping to hear from someone who uses buffer > objects and knows that this would break his code. Scott apparently > doesn't have this problem with his own code, so his opinion doesn't > help. :-( There may be some breakage in the Win32 overlapped IO world. A common pattern is: buf = allocate_buffer_somehow(size) Perform_Async_Read(size) # wait for notification of read completing nbytes = Wait_For_Read_Notification() data = buf[:nbytes] Currently, "data" is a string. 
Changing this to a buffer object will presumably break this code once "data" is passed to some other function that truly requires a string. > Maybe the name 'buffer' suggests false expectations? It's not a > buffer, it's an alias for a memory area. This distinction is a little gray. In my example, it is truly a buffer - but also an alias for a memory area. In my example though, it is *not* conceptually an alias for memory owned by another object. > Maybe we should do something stronger, and deprecate the buffer type > altogether. Maybe. However, as you have seen over the years, *something* from all this mess is a real requirement. This example of asynch IO is the only example I have ever used, but IMO, it is a real and reasonable requirement. My example *could* have been done with array() (assuming the array module had a C API exposed which it doesn't/didn't) but that too looks like a square peg in a round hole - my requirements call for a pre-allocated byte buffer, not an array. All that said: if the worst came to the worst, I could ensure that the Win32 extensions are left compatible with the way they are. All such buffers are allocated using a function inside one of my modules. Currently this just returns a buffer() object, but could be changed to a private object with the same semantics as the existing buffer() object. So consider this more a data point than an attempted veto. Mark. From tim.one@comcast.net Sun Jul 14 04:15:50 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 13 Jul 2002 23:15:50 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > Actually, not. Under "Resolved Issues" the PEP has this: > > - Once a particular iterator object has raised StopIteration, will > it also raise StopIteration on all subsequent next() calls? 
> Some say that it would be useful to require this, others say > that it is useful to leave this open to individual iterators. > Note that this may require an additional state bit for some > iterator implementations (e.g. function-wrapping iterators). > > Resolution: once StopIteration is raised, calling it.next() > continues to raise StopIteration. > > So I misremembered, and Tim didn't read the PEP closely enough. :-) Not so. I read the PEP *very* closely, twice even. It's just that both times, I gave up in boredom a few points above that one . I think I used to know it, though, and made sure StopIteration is a sink state for generator-iterators because of it. > ... > Hm. Given what the PEP says, I'm ready to have this fixed (even in > 2.2.2). Well, the PEP proper just doesn't say. In a court of Standard Law, I'm pretty sure the "Resolved Issues" section would be ruled to be in the nature of a non-normative appendix. Now that the PSF has some funds, I'm sure we can buy that decision if need be . > I can't call code relying on this sane. Now that I've seen what it actually does, I think it's kind of cute. Like f = file('somefile') get = iter(f.readline, '\n') while 1: paragraph = list(get) if not paragraph: break # deal with paragraph, a list of lines The only big problem is that once you hit the end of the file, this hangs in an infinite loop inside the list() implementation, accumulating an unbounded number of empty strings. But that just makes it extra cute. Cute enough to be insane, probably. From tim.one@comcast.net Sun Jul 14 04:19:43 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 13 Jul 2002 23:19:43 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <20020713190420.A2256@unpythonic.net> Message-ID: [jepler@unpythonic.net] > What about this example? > >>> l = [] > >>> li = iter(l) > >>> li.next() > Traceback (most recent call last): > File "", line 1, in ? 
> StopIteration > >>> l.extend([1, 2, 3]) > >>> li.next() > 1 > > does the list iterator violate the proposed behavior? Oh yes. OTOH, its current behavior isn't defined well enough anywhere (short of reading the source code) that raising StopIteration on the second call today could be called "a bug" either. From tim.one@comcast.net Sun Jul 14 04:33:21 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 13 Jul 2002 23:33:21 -0400 Subject: [Python-Dev] Re: Termination of two-arg iter() In-Reply-To: Message-ID: [François Pinard] > If you change the PEP so the behaviour is undefined in the protocol, > then, you will have to separately document the behaviour for all > iterators which are produced by the various means available in standard > Python, and people will have to remember these differences. Not necessarily. The standard dodge is to say "undefined" and just leave it at that. This is a way of saying that the language so strongly discourages the practice that it refuses to say anything about what happens if you do it, but that it's not going to stop you if you're determined to do it. If you do it anyway, it's at your own risk (as if anything you do is ever done at someone else's risk ). > Would it be perceived as shocking (or not?) in the example above, having > to produce another iterator "li = iter(l)" before reusing it? Jeff's example was too simple to make "the problem" here clear. If you get a new iterator, you'll start over from the beginning of the list. As is, you continue where the last next() call left off: >>> x = range(2) >>> n = iter(x).next >>> n() 0 >>> n() 1 >>> n() Traceback (most recent call last): File "", line 1, in ? StopIteration >>> x.extend([6, 7]) >>> n() 6 >>> n() 7 >>> n() Traceback (most recent call last): File "", line 1, in ? StopIteration >>> *Some* code out there may be relying on that, despite that the behavior violates what the tail end of the PEP says.
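For readers on a current Python, where `n = iter(x).next` would be spelled with the built-in `next()`, the session above can be approximated with a small index-based iterator. This is only a sketch: `IndexListIter` is a hypothetical class written for illustration, not CPython's list iterator, and it assumes the iterator keeps nothing but a reference to the list and a position.

```python
class IndexListIter:
    """Hypothetical stand-in for an index-based list iterator (illustration only)."""

    def __init__(self, seq):
        self.seq = seq   # the list is referenced, not copied
        self.pos = 0     # index of the next item to hand out

    def __iter__(self):
        return self

    def __next__(self):
        # No "exhausted" flag is kept: the bounds check is redone on every call,
        # so growing the list after StopIteration revives the iterator.
        if self.pos >= len(self.seq):
            raise StopIteration
        item = self.seq[self.pos]
        self.pos += 1
        return item


x = [0, 1]
n = IndexListIter(x)
first_pass = list(n)     # consumes 0 and 1; StopIteration ends list()
x.extend([6, 7])
second_pass = list(n)    # the same iterator resumes at index 2
```

Turning StopIteration into a sink state in this sketch would take one extra attribute, set the first time the bounds check fails. That is presumably the kind of "additional state bit" the PEP text quoted earlier in the thread has in mind.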
thank-god-the-protocol-doesn't-have-three-methods-ly y'rs - tim From tim.one@comcast.net Sun Jul 14 05:00:04 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 14 Jul 2002 00:00:04 -0400 Subject: [Python-Dev] Dict constructor In-Reply-To: <0GZ600EDOJAORQ@mtain01.icomcast.net> Message-ID: [Alex Martelli] > What about the following compromise: there are two set types, > ImmutableSet and MutableSet, with a common supertype Set. > ... This sounds fine to me, except I'd call them Set (mutable) and something else . I'd also check the code into the library now, so lots of people can hack on it before it "becomes real". People just won't play with branches or sandboxes in sufficient numbers to do any collaborative good. If Guido hates what it turns into, we can pull it out again. From tim.one@comcast.net Sun Jul 14 05:17:00 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 14 Jul 2002 00:17:00 -0400 Subject: [Python-Dev] Dict constructor In-Reply-To: <200207131234.g6DCYvj17144@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > IMO it's no odder than disallowing dicts as dict keys: it's a hack > that allows a much faster implementation. Except that sets are an extremely well-developed concept apart from Python, and you can only go a little way using set-based approaches before sets of sets are a screamingly natural occurrence. In that respect, sets that can't contain sets are akin to limiting integer arithmetic to 32 bits (also a hack that allows a much faster implementation, but screaming speed just isn't Python's forte -- this line of argument belongs more in Fortran-Dev). >> That is, like sets of sets in Icon too, this is a notion of inclusion by >> object identity (although Icon does that on purpose, while the >> BTree-based set mostly inherits it from that BTrees don't implement any >> comparison slots). That's very easy to implement.
It's braindead if >> you think of sets as collections of values, but that's what taking pain >> too seriously leads to. > I don't think it is acceptable to have sets-of-sets but test for > membership (in that case) by object identity. > > If you really think object identity is all that's needed, I suggest we > stick to disallowing sets of sets; algorithms needing > sets-of-set-object-identities can use id() on the inner sets. I called the object identity approach "braindead" for those who think of sets as collections of values, and I previously identified myself as one of those suffering the collection-of-values delusion. You can do the modus ponens bit from there . From oren-py-d@hishome.net Sun Jul 14 05:31:13 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Sun, 14 Jul 2002 00:31:13 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net> References: <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020714043113.GA53342@hishome.net> On Sat, Jul 13, 2002 at 07:07:36PM -0400, Guido van Rossum wrote: > Actually, not. Under "Resolved Issues" the PEP has this: > > - Once a particular iterator object has raised StopIteration, will > it also raise StopIteration on all subsequent next() calls? > Some say that it would be useful to require this, others say > that it is useful to leave this open to individual iterators. At the time this was discussed on the list has anyone considered the possibility of raising an exception? Something like 'IteratorExhausted'? If the current definition is ruled to be 'undefined' then an iterator MAY raise an exception in this case. Iterable objects can often serve as a replacement for lists in many places and even passed successfully to a lot of old code that was written before the iteration protocol. 
But an iterable object is not always a suitable replacement for a sequence when the code needs to iterate multiple times and the object is not re-iterable. This will fail in a very nonobvious way without raising an exception because an exhausted iterator looks just like an empty sequence to a for loop. I think this kind of error should not pass silently. Yes, I have been bitten by this. Perhaps this was a result of overzealous use of iterators because I was so excited with them, but it's a real problem, not some contrived example. Oren From tim.one@comcast.net Sun Jul 14 05:44:16 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 14 Jul 2002 00:44:16 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <14fb01c2298c$71a145b0$6601a8c0@boostconsulting.com> Message-ID: [David Abrahams] > Yep, I know about PySequence_Fast(), and we're currently using that. > However I have a bunch of numerics users who will undoubtedly be working > with some kind of array from NumPy or something -- they'll be really > unimpressed with me when PySequence_Fast() copies their huge multi-pass > sequence without individual Python objects for the elements into a tuple > with each double expressed as a separate Python float. Now that you have a concrete use case (to the extent that "some kind of NumPy array or something" can be called concrete ), have you talked to the NumPy people about it? They're very clever about making things run fast (that's the reason for NumPy's existence), and they may want a different approach entirely.
averse-to-generalizing-from-0-examples-ly y'rs - tim From aleax@aleax.it Sun Jul 14 07:09:27 2002 From: aleax@aleax.it (Alex Martelli) Date: Sun, 14 Jul 2002 08:09:27 +0200 Subject: [Python-Dev] Dict constructor In-Reply-To: References: Message-ID: <02071408092701.18713@arthur> On Sunday 14 July 2002 06:00, Tim Peters wrote: > [Alex Martelli] > > > What about the following compromise: there are two set types, > > ImmutableSet and MutableSet, with a common supertype Set. > > ... > > This sounds fine to me, except I'd call them Set (mutable) and something Yes, that's what I did in the submission -- Set is the name of the mutable one, BaseSet the common base type (meant as abstract). Please see http://python.org/sf/580995 -- I'm sure there will be other glitches worth fixing. Alex From martin@v.loewis.de Sun Jul 14 09:02:15 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 14 Jul 2002 10:02:15 +0200 Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings In-Reply-To: <018901c22aa4$e0f41190$ced241d5@hagrid> References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <018901c22aa4$e0f41190$ced241d5@hagrid> Message-ID: "Fredrik Lundh" writes: > hmm. I'm tempted to think that there's a major > flaw in the PEP, caused by the fact that > > compile(unicode(script, extract_encoding(script))) > > will, from what I can tell, not compile to the same > thing as: > > compile(script) Can you elaborate what you think the difference is? I believe the PEP is silent on this specific aspect, but I think what should happen is (in the Unicode case): - compile will convert the script to UTF-8, which is then tokenized. - in the process of parsing, the encoding declaration (that presumably extract_encoding was looking at as well) is recognized, if any. - Unicode literals are left as-is; byte string literals are converted back to the original encoding. So if there is an encoding declaration in script, then I cannot see a difference. 
If there is none, the PEP does not elaborate what should happen. Leaving the byte strings as UTF-8 seems safest, since the only way to get "correct" non-ASCII strings without the encoding comment is to use the UTF-8 signature. In any case, this can't cause backwards compatibility problems. compile accepts Unicode strings today only if they can be converted to a byte string. In the standard installation, this will fail today if there is non-ASCII in script. So allowing Unicode in compile is a pure extension. If its precise meaning is underspecified, it should be clarified before stage 2 is implemented. Regards, Martin From drifty@bigfoot.com Sun Jul 14 09:16:10 2002 From: drifty@bigfoot.com (Brett Cannon) Date: Sun, 14 Jul 2002 01:16:10 -0700 (PDT) Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <20020714043113.GA53342@hishome.net> Message-ID: [Oren Tirosh] > At the time this was discussed on the list has anyone considered the > possibility of raising an exception? Something like 'IteratorExhausted'? > I have no idea whether this was discussed before or not, but I personally don't like the idea of having another exception being raised by iterators. Without reading the PEP this exact second, my gut response is that iterators should have a single exception that signals it has reached its end. It seems like StopIteration is saying "stop please" and IteratorExhausted would be like screaming "STOP CALLING .next()!!!". Either you force them to get the clue the first time or you let them continue being rude; Python shouldn't need to raise its voice and act like an over-bearing parent. If we wanted over-bearing parents we would be yelling for typing of arguments. =) > Iterable objects can often serve as a replacement for lists in many places > and even passed successfully to a lot of old code that was written before > the iteration protocol. 
But an iterable object is not always a suitable > replacement for a sequence when the code needs to iterate multiple times and > the object is not re-iterable. This will fail in a very nonobvious way > without raising an exception because an exhausted iterator looks just like > an empty sequence to a for loop. > After reading this email I felt like not raising StopIteration continuously was like warning that once you hit \0 in a C char array you have reached the end of the string, but keep going if you care to. We all know that ain't a good idea. =) Personally, I say continuously raise StopIteration. I feel that StopIteration says the iterator is done, period. Being able to go beyond the signalled end seems like it is not a true once-through iterator with an actual end but starting to seem like a stream. I thought the point of putting in something like the sentinel was so that you could force the end of an iterator and not just have it be a suggestion. I can also see beginners being bitten by this; if people as experienced as Oren are getting bitten by this we know some person starting out definitely will be. I know I thought that StopIteration was continuously raised until the emails on this subject started. When I blindly read "StopIteration", I don't feel this is a warning that one shouldn't keep going but a notice that the iterator is done, thanks for coming but please don't come again (unless restartable iterators are supported =). In other words, I feel that StopIteration sounds like a notice that the end has occurred and you can't do any more, rather than an advisement that you should stop. > > Oren > -Brett C. From bsder@mail.allcaps.org Sun Jul 14 11:23:25 2002 From: bsder@mail.allcaps.org (Andrew P. Lentvorski) Date: Sun, 14 Jul 2002 03:23:25 -0700 (PDT) Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: Message-ID: <20020714023729.Y79323-100000@mail.allcaps.org> On Sun, 14 Jul 2002, Brett Cannon wrote: > end.
It seems like StopIteration is saying "stop please" and > IteratorExhausted would be like screaming "STOP CALLING .next()!!!". What about raising IndexError by default when someone attempts to call .next() on an iterator already raising StopIteration? In the case of a list, StopIteration signals that the iterator is pointing to just beyond the end of the list. An attempt to call .next() when StopIteration is already true is effectively an attempt to dereference past the end of a list (since .next() normally wants to return a value). List accesses via an index past list end currently raise an IndexError. Doing something similar for iterators would seem to keep things consistent. -a From oren-py-d@hishome.net Sun Jul 14 12:05:13 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Sun, 14 Jul 2002 07:05:13 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: References: <20020714043113.GA53342@hishome.net> Message-ID: <20020714110513.GA2280@hishome.net> On Sun, Jul 14, 2002 at 01:16:10AM -0700, Brett Cannon wrote: > [Oren Tirosh] > > > At the time this was discussed on the list has anyone considered the > > possibility of raising an exception? Something like 'IteratorExhausted'? > > > > I have no idea whether this was discussed before or not, but I personally > don't like the idea of having another exception being raised by iterators. > Without reading the PEP this exact second, my gut response is that > iterators should have a single exception that signals it has reached its > end. It seems like StopIteration is saying "stop please" and > IteratorExhausted would be like screaming "STOP CALLING .next()!!!". > Either you force them to get the clue the first time or you let them > continue being rude; Python shouldn't need to raise its voice and act like > an over-bearing parent. If we wanted over-bearing parents we would be > yelling for typing of arguments. =) This anthropomorphic description has too many irrelevant associations. 
Let's leave the parents out of this. The logic is simple: StopIteration is not an error. It's not even a warning, it's a normal part of program operation. It uses the exception mechanism because it is the most convenient form of out-of-band signalling. The hypothetical IteratorExhausted is an error. The fact that both of them happen to be exceptions is almost a coincidence. Unlike IndexError which is sometimes used to bail out of loops the IteratorExhausted exception is almost guaranteed to be a programmer error. And it's error that would otherwise pass silently and produce strange results. > definitely will be. I know I thought that StopIteration was continuously > raised until the emails on this subject started. For most Python iterators it is. This behavior is OK but it could be changed to something stricter. So far I thought this behavior was mandatory so I didn't raise this proposal. Now I learned that officially it is undefined and that this behavior is just what most Python iterators do so it could be possible to change it to something safer. Oren From oren-py-d@hishome.net Sun Jul 14 12:27:45 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Sun, 14 Jul 2002 07:27:45 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <20020714023729.Y79323-100000@mail.allcaps.org> References: <20020714023729.Y79323-100000@mail.allcaps.org> Message-ID: <20020714112745.GB2280@hishome.net> On Sun, Jul 14, 2002 at 03:23:25AM -0700, Andrew P. Lentvorski wrote: > On Sun, 14 Jul 2002, Brett Cannon wrote: > > > end. It seems like StopIteration is saying "stop please" and > > IteratorExhausted would be like screaming "STOP CALLING .next()!!!". > > What about raising IndexError by default when someone attempts to call > .next() on an iterator already raising StopIteration? +1 IndexError is probably better than inventing a new exception. The description of what actually happened would be in the exception text. 
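The proposal can be sketched as a wrapper. `StrictIterator` is a hypothetical name (nothing like it ships with Python), and the sketch assumes current Python, where the protocol method is `__next__` rather than the `.next()` used in this thread. The first exhaustion still raises StopIteration as usual; only a later call raises IndexError.

```python
class StrictIterator:
    """Hypothetical wrapper: complains if used again after exhaustion."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._exhausted = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._exhausted:
            # The stricter behaviour proposed above: an error, not a silent stop.
            raise IndexError("next() called on an exhausted iterator")
        try:
            return next(self._it)
        except StopIteration:
            self._exhausted = True   # remember that the end was signalled once
            raise


# A plain iterator: the second pass is silently empty.
plain = iter([1, 2])
first = list(plain)
second = list(plain)

# The strict wrapper: the second pass fails loudly.
strict = StrictIterator([1, 2])
strict_first = list(strict)
try:
    list(strict)
    error = None
except IndexError as exc:
    error = exc
```

The contrast between `second` and `error` is the point of the proposal: the plain iterator's second pass is indistinguishable from iterating an empty list, while the wrapper reports the reuse.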
StopIteration means "That was the last item, thank you. Sorry I couldn't tell you my length in advance -- I'm an iterator and I don't even know it myself." This type of IndexError would mean "Hey, I told you it was the last item. This would have been an out-of-bounds index if I were a sequence". Oren From guido@python.org Sun Jul 14 14:20:51 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 14 Jul 2002 09:20:51 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: Your message of "Sun, 14 Jul 2002 07:27:45 EDT." <20020714112745.GB2280@hishome.net> References: <20020714023729.Y79323-100000@mail.allcaps.org> <20020714112745.GB2280@hishome.net> Message-ID: <200207141320.g6EDKpJ27752@pcp02138704pcs.reston01.va.comcast.net> > > What about raising IndexError by default when someone attempts to call > > .next() on an iterator already raising StopIteration? > > +1 > > IndexError is probably better than inventing a new exception. The > description of what actually happened would be in the exception > text. -1. IndexError belongs to sequences. I don't like the idea of raising another exception at all -- we should either keep things the way they are, or continue to raise StopIteration forever once it's been raised. Other suggestions don't make sense to me. --Guido van Rossum (home page: http://www.python.org/~guido/) From oren-py-d@hishome.net Sun Jul 14 14:33:24 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Sun, 14 Jul 2002 09:33:24 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <20020714043113.GA53342@hishome.net> References: <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net> <20020714043113.GA53342@hishome.net> Message-ID: <20020714133324.GA17215@hishome.net> On Sun, Jul 14, 2002 at 12:31:13AM -0400, Oren Tirosh wrote: > On Sat, Jul 13, 2002 at 07:07:36PM -0400, Guido van Rossum wrote: > > Actually, not. 
Under "Resolved Issues" the PEP has this: > > > > - Once a particular iterator object has raised StopIteration, will > > it also raise StopIteration on all subsequent next() calls? > > Some say that it would be useful to require this, others say > > that it is useful to leave this open to individual iterators. > > At the time this was discussed on the list has anyone considered the > possibility of raising an exception? Something like 'IteratorExhausted'? An alternative approach would be to raise an exception when calling iter() on an exhausted iterator. This is orthogonal to whatever .next() does on such an iterator. Suggested implementation: when an iterator raises StopIteration it will immediately clean up and decref any referenced objects and then alter its ob_type field to a special closed iterator type. This is similar to the way closed files are handled - any attempt to perform I/O on them raises an IOError. The behavior of this type's tp_iter and tp_iternext is open for discussion. This will "fix" the behavior of list iterators, for example, that can be revived by extending the list. It's a matter of interpretation whether this is a bug or a feature, though. Oren From mal@lemburg.com Sun Jul 14 15:32:09 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 14 Jul 2002 16:32:09 +0200 Subject: [Python-Dev] python package References: <3D30752B.9129.BA3B5E1@localhost> Message-ID: <3D318B69.4080401@lemburg.com> Gordon McMillan wrote: > Marc-Andre, > > In this thread you have posted: > > >>python.py: >>__path__ = ['.'] > > > and > > >>def _redirect(mx_subpackage): >> global __path__ >> import os,mx >> __path__ = \ >> [os.path.join(mx.__path__[0],mx_subpackage)] > > > and > > >>testmodload.py: >>import sys, os >>sys.modules['testmodload'] = os > > > None of these will freeze successfully. Hmm, then how do you freeze _xmlplus ?
> Two of them appear to rely on an implementation > detail - that __path__ (only defined for > imp.PKG_DIRECTORY's) will be followed even in > a plain module. AFAIK, that's not an implementation detail, but a documented way of finding out whether a module is a package or not. > The third is exactly what _xmlplus does, and > consensus appears to be that that was a > mistake. > > "Clever" does not mean "good". But it works (tm) :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Sun Jul 14 15:32:36 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 14 Jul 2002 10:32:36 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: Your message of "Sun, 14 Jul 2002 09:33:24 EDT." <20020714133324.GA17215@hishome.net> References: <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net> <20020714043113.GA53342@hishome.net> <20020714133324.GA17215@hishome.net> Message-ID: <200207141432.g6EEWa127865@pcp02138704pcs.reston01.va.comcast.net> > An alternative approach would be to raise an exception when calling > iter() on an exhausted iterator. This is orthogonal to whatever > .next() does on such and iterator. This just adds more complicated rules to no avail. iter() on an iterator should return that iterator itself. The state of that iterator is what it is. 
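The rule stated here can be checked directly. A small sketch, using the built-in `next()` of current Python in place of the `.next()` method used in this thread:

```python
numbers = [1, 2, 3]

it = iter(numbers)
same = iter(it)          # iter() on an iterator hands back the iterator itself

first = next(it)
second = next(same)      # shares state with `it`, so this continues at 2

fresh = iter(numbers)    # asking the list again gives a new, independent iterator
```

Only `iter(numbers)` starts over; `iter(it)` never does, whatever state `it` happens to be in.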
--Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@pythoncraft.com Sun Jul 14 15:44:23 2002 From: aahz@pythoncraft.com (Aahz) Date: Sun, 14 Jul 2002 10:44:23 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: References: <20020714043113.GA53342@hishome.net> Message-ID: <20020714144423.GA29033@panix.com> On Sun, Jul 14, 2002, Brett Cannon wrote: > > Personally, I say continuously raise StopIteration. I feel that > StopIteration says the iterator is done, period. Being able to go > beyond the signalled end seems like it is not a true once-through > iterator with an actual end but starting to seem like a stream. +1 -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From barry@zope.com Sun Jul 14 15:58:02 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sun, 14 Jul 2002 10:58:02 -0400 Subject: [Python-Dev] Termination of two-arg iter() References: <20020714023729.Y79323-100000@mail.allcaps.org> <20020714112745.GB2280@hishome.net> <200207141320.g6EDKpJ27752@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15665.37242.446627.141013@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> -1. IndexError belongs to sequences. I don't like the idea GvR> of raising another exception at all -- we should either keep GvR> things the way they are, or continue to raise StopIteration GvR> forever once it's been raised. Other suggestions don't make GvR> sense to me. I think it would be fine to leave the situation as is (i.e. undefined). You can use the PEP to encourage a particular behavior but I'm not sure it needs to be required ("SHOULD" in RFC terms, but not "MUST"). 
-Barry From oren-py-d@hishome.net Sun Jul 14 17:06:11 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Sun, 14 Jul 2002 12:06:11 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <15665.37242.446627.141013@anthem.wooz.org> References: <20020714023729.Y79323-100000@mail.allcaps.org> <20020714112745.GB2280@hishome.net> <200207141320.g6EDKpJ27752@pcp02138704pcs.reston01.va.comcast.net> <15665.37242.446627.141013@anthem.wooz.org> Message-ID: <20020714160611.GA25950@hishome.net> On Sun, Jul 14, 2002 at 10:58:02AM -0400, Barry A. Warsaw wrote: > > >>>>> "GvR" == Guido van Rossum writes: > > GvR> -1. IndexError belongs to sequences. I don't like the idea > GvR> of raising another exception at all -- we should either keep > GvR> things the way they are, or continue to raise StopIteration > GvR> forever once it's been raised. Other suggestions don't make > GvR> sense to me. > > I think it would be fine to leave the situation as is > (i.e. undefined). You can use the PEP to encourage a particular > behavior but I'm not sure it needs to be required ("SHOULD" in RFC > terms, but not "MUST"). I'd like it to stay undefined. The issue is how the iterators of builtin types should actually behave within this undefined space. Iterables are very similar to sequences. A lot of code could use either one without any changes. It's precisely because of this similarity that I hate it when they do behave differently - and don't even report it. Files and pipes are very similar too. A lot of code could work with either one but if this code tries to seek on a pipe it will get an exception. Just imagine what would happen if pipes failed silently if you tried to seek back to the beginning of the file. I have much respect for whatever makes or doesn't make sense to Guido but I have been using iterators and generator functions extensively (obsessively?) for over 8 months now and the current behavior doesn't make sense to me.
I guess the reason I ran into this has to do with my style of interactive use of the Python prompt. I recall a previous command, change the parameters of one of the processing stages in the dataflow and repeat the process. Then I wonder why I get an empty result - one of the temporary results I stored to a variable wasn't re-iterable. Is it too much to expect an exception? "Errors should never pass silently." Oren From mal@lemburg.com Sun Jul 14 17:17:05 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 14 Jul 2002 18:17:05 +0200 Subject: [Python-Dev] python package References: Message-ID: <3D31A401.9000103@lemburg.com> Tim Peters wrote: > [MAL] > >>The module objects would be different, but that's just about it. > > > [Gordon] > >>Which was exactly my point. Much code that does *not* use >>"from ... import ..." in fact relies on having the same module object. > > > [MAL] > >>You mean for e.g. hacking the module's globals ? > > > If you consider a module maintaining pieces of its own state in its own > globals as an instance of hacking the module's globals, yes, that's the main > problem. For example (there are many, this isn't stretching), if the user > ends up with two distinct copies of the tempfile module, its "global" > _tempdir_lock becomes two distinct locks, and the truly global mutual > exclusion _tempdir_lock was supposed to supply is lost. Ditto for the lock > used internally by tempfile's global _counter object. The system-wide > uniqueness of some globals is crucial to some modules' correct functioning. Very true and that's why there is only one module containing the actual code. Globals referenced by the code live in that module. The other module only imports the symbols in the first solution I posted. The second even avoids this extra step -- there's only one module (the packaged one) left in sys.modules which is referenced under two names.
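The second scheme, one module object registered under two names, can be sketched as follows. The names `demo_pkg.demo_mod` and `demo_mod` are hypothetical, and the module is built in memory with `types.ModuleType` only to keep the example self-contained; this illustrates the `sys.modules` aliasing idea and is not the code posted earlier in the thread.

```python
import importlib
import sys
import types

# One module object holding the "real" code and its global state.
packaged = types.ModuleType("demo_pkg.demo_mod")
packaged.counter = 0

# Register the same object under the packaged name and the old top-level name.
sys.modules["demo_pkg.demo_mod"] = packaged
sys.modules["demo_mod"] = packaged

# Both imports are satisfied from sys.modules and return the same object.
old = importlib.import_module("demo_mod")
new = importlib.import_module("demo_pkg.demo_mod")

old.counter += 1   # a change made through one name is visible through the other
```

Because both names resolve to a single object, module-level state such as a lock or a counter exists exactly once, which is the property Tim's tempfile example depends on.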
pickles will gladly unpickle using this scheme while a pickle operation automagically starts using the new packaged name. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Sun Jul 14 17:21:34 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 14 Jul 2002 18:21:34 +0200 Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <018901c22aa4$e0f41190$ced241d5@hagrid> Message-ID: <3D31A50E.4050800@lemburg.com> Martin v. Loewis wrote: > "Fredrik Lundh" writes: > > >>hmm. I'm tempted to think that there's a major >>flaw in the PEP, caused by the fact that >> >> compile(unicode(script, extract_encoding(script))) >> >>will, from what I can tell, not compile to the same >>thing as: >> >> compile(script) > > > Can you elaborate what you think the difference is? I believe the PEP > is silent on this specific aspect, It does mention this as part of phase 2. > but I think what should happen is > (in the Unicode case): > > - compile will convert the script to UTF-8, which is then tokenized. > - in the process of parsing, the encoding declaration (that presumably > extract_encoding was looking at as well) is recognized, if any. > - Unicode literals are left as-is; byte string literals are converted > back to the original encoding. Right. > So if there is an encoding declaration in script, then I cannot see a > difference. If there is none, the PEP does not elaborate what should > happen. Leaving the byte strings as UTF-8 seems safest, since the only > way to get "correct" non-ASCII strings without the encoding comment is > to use the UTF-8 signature. > > In any case, this can't cause backwards compatibility > problems. 
compile accepts Unicode strings today only if they can be > converted to a byte string. In the standard installation, this will > fail today if there is non-ASCII in script. So allowing Unicode in > compile is a pure extension. If its precise meaning is underspecified, > it should be clarified before stage 2 is implemented. No need for this. The PEP already mentions it. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Sun Jul 14 17:29:20 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 14 Jul 2002 18:29:20 +0200 Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings In-Reply-To: <3D31A50E.4050800@lemburg.com> References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <018901c22aa4$e0f41190$ced241d5@hagrid> <3D31A50E.4050800@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > > Can you elaborate what you think the difference is? I believe the PEP > > is silent on this specific aspect, > > It does mention this as part of phase 2. All I can find is The builtin compile() API will be enhanced to accept Unicode as input. That leaves the question open what the compile function *does* beyond merely accepting Unicode strings; it is canonical that it tries to compile it, as it would with a byte string. The unspecified aspect is the treatment of byte strings within the Unicode string. The current compiler treats them "as-is"; this is clearly no option. The reasonable options are: 1. convert to byte string using "ascii" encoding, 2. convert to byte string using "utf-8" encoding, 3. convert to byte string using system default encoding, 4. convert to byte string using encoding declared inside the code string. 
If that route is taken, the question is what happens if no encoding declaration is found. > No need for this. The PEP already mentions it. Can you please quote the precise words in the text of the PEP that answer the question which of the four options above is taken? Regards, Martin From mal@lemburg.com Sun Jul 14 18:02:13 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 14 Jul 2002 19:02:13 +0200 Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <018901c22aa4$e0f41190$ced241d5@hagrid> <3D31A50E.4050800@lemburg.com> Message-ID: <3D31AE95.6070804@lemburg.com> Martin v. Loewis wrote: > "M.-A. Lemburg" writes: > > >>>Can you elaborate what you think the difference is? I believe the PEP >>>is silent on this specific aspect, >> >>It does mention this as part of phase 2. > > > All I can find is > > > The builtin compile() API will be enhanced to accept Unicode as input. > > > That leaves the question open what the compile function *does* beyond > merely accepting Unicode strings; it is canonical that it tries to > compile it, as it would with a byte string. Oh, I thought it would be natural from reading the complete text: """ 2. Change the tokenizer/compiler base string type from char* to Py_UNICODE* and apply the encoding to the complete file. Source files which fail to decode cause an error to be raised during compilation. The builtin compile() API will be enhanced to accept Unicode as input. 8-bit string input is subject to the standard procedure for encoding detection as described above. """ Of course, we no longer need to convert the tokenizer to work on Py_UNICODE, so the updated text should mention that compile() encodes Unicode input to UTF-8 to then continue with the usual processing. (Also see my reply to Fredrik). > The unspecified aspect is the treatment of byte strings within the > Unicode string.
The current compiler treats them "as-is"; this is > clearly no option. The reasonable options are: > > 1. convert to byte string using "ascii" encoding, > 2. convert to byte string using "utf-8" encoding, > 3. convert to byte string using system default encoding, > 4. convert to byte string using encoding declared inside the code > string. If that route is taken, the question is what happens > if no encoding declaration is found. > > >>No need for this. The PEP already mentions it. > > > Can you please quote the precise words in the text of the PEP that > answer the question which of the four options above is taken? Option 2. Ideal would be to have the tokenizer skip the encoding declaration detection and start directly with the UTF-8 string (this also solves the problems you'd run into in case the Unicode source code has a source code encoding comment). Is that possible with the implementation? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From gmcm@hypernet.com Sun Jul 14 18:24:48 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Sun, 14 Jul 2002 13:24:48 -0400 Subject: [Python-Dev] python package In-Reply-To: <3D318B69.4080401@lemburg.com> Message-ID: <3D317BA0.3011.FA4F068@localhost> On 14 Jul 2002 at 16:32, M.-A. Lemburg wrote: > Gordon McMillan wrote: [various cute hacks] > > None of these will freeze successfully. > > Hmm, then how do you freeze _xmlplus? Most people whine publicly until someone comes up with a workaround. Installer has a way of hooking modules & packages that play games like that, but if you're using tools/freeze, you'll probably be told to overlay xml with _xmlplus. If the package uses lots of nasty tricks (e.g., pyopengl), the answer is "you don't".
> > Two of them appear to rely on an implementation > > detail - that __path__ (only defined for > > imp.PKG_DIRECTORY's) will be followed even in > > a plain module. > > AFAIK, that's not an implementation detail, but a > documented way of finding out whether a module is a > package or not. Correct. But stuffing a __path__ attribute into a module does *not* make the module a package. '''Whenever a submodule of a package is loaded, Python makes sure that the package itself is loaded first, loading its __init__.py file if necessary.''' and '''Once loaded, the difference between a package and a module is minimal.''' > But it works (tm) :-) For a sufficiently short-sighted definition of "work". -- Gordon http://www.mcmillan-inc.com/ From mal@lemburg.com Sun Jul 14 18:53:32 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 14 Jul 2002 19:53:32 +0200 Subject: [Python-Dev] python package References: <3D317BA0.3011.FA4F068@localhost> Message-ID: <3D31BA9C.3030309@lemburg.com> Gordon McMillan wrote: >>>Two of them appear to rely on an implementation >>>detail - that __path__ (only defined for >>>imp.PKG_DIRECTORY's) will be followed even in >>>a plain module. >> >>AFAIK, that's not an implementation detail, but a >>documented way of finding out whether a module is a >>package or not. > > > Correct. But stuffing a __path__ attribute into > a module does *not* make the module a package. > > '''Whenever a submodule of a package is loaded, Python makes sure that the package itself is loaded first, loading its __init__.py file if necessary.''' > > and > > '''Once loaded, the difference between a package and a module is minimal.''' Hmm, I know that Python itself uses __path__ to tell whether it has a package or not, so I don't see why a module can't be regarded as package. Moving the module into a directory of the same name and then renaming it to __init__.py has the same effect. And in that case, hacking __path__ is perfectly legal. 
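A minimal sketch of the behaviour under dispute, assuming a current Python 3 interpreter rather than the 2.1/2.2 import code this thread is about: a plain module object that merely carries a __path__ attribute is enough for the import system to look for submodules under it. The names fakepkg, sub and demo_path_hack are hypothetical, made up for the illustration.

```python
import importlib
import os
import sys
import tempfile
import types


def demo_path_hack():
    """Give a plain module a __path__ and import a submodule through it."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # A lone source file; there is no __init__.py anywhere.
        with open(os.path.join(tmpdir, "sub.py"), "w") as f:
            f.write("VALUE = 42\n")

        # 'fakepkg' is a hypothetical name: an ordinary module object
        # that only looks like a package because it carries __path__.
        fake = types.ModuleType("fakepkg")
        fake.__path__ = [tmpdir]
        sys.modules["fakepkg"] = fake
        importlib.invalidate_caches()
        try:
            sub = importlib.import_module("fakepkg.sub")
            return sub.VALUE, sub.__name__
        finally:
            # Leave sys.modules as we found it.
            sys.modules.pop("fakepkg.sub", None)
            sys.modules.pop("fakepkg", None)


if __name__ == "__main__":
    print(demo_path_hack())
```

This only shows that the interpreter follows __path__; whether freezing tools and introspection code cope with such a module is the separate question Gordon raises.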
>>But it works (tm) :-) > > > For a sufficiently short-sighted definition of "work". You haven't commented on the sys.modules trick yet. This one doesn't even use the __path__ hackery :-)

DateTime.py:

    import sys
    import mx.DateTime
    sys.modules[__name__] = mx.DateTime

Python 2.1.3 (#1, May 16 2002, 18:59:26)
>>> import DateTime
>>> DateTime.now()
>>> DateTime
>>> id(DateTime)
135726540
>>> from mx import DateTime
>>> id(DateTime)
135726540

See: it's the same module :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From skip@pobox.com Sun Jul 14 16:50:20 2002 From: skip@pobox.com (Skip Montanaro) Date: Sun, 14 Jul 2002 10:50:20 -0500 Subject: [Python-Dev] Minor socket timeout quibble - timeout raises socket.error In-Reply-To: <200207121637.g6CGbAE12463@pcp02138704pcs.reston01.va.comcast.net> References: <15639.52525.481846.601961@12-248-8-148.client.attbi.com> <3D188C5D.D519DD90@3captus.com> <200207121637.g6CGbAE12463@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15665.40380.978210.579022@localhost.localdomain> >> > I just noticed in the development docs that when a timeout on a >> > socket occurs, socket.error is raised. I rather liked the idea >> > that a different exception was raised for timeouts (I used Tim >> > O'Malley's timeout_socket module). Guido> I'd like to understand the use case better. Why would you want Guido> to make this distinction? In my application that uses Tim O'Malley's timeout_socket module, I do very little different in the two cases other than to generate a different message for the user. As I mentioned in the subject, it is a minor quibble. If it's a pain to modify the code to raise a distinct error, I wouldn't bother.
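For reference, a distinct timeout exception of the kind discussed here did later appear in the standard library as socket.timeout, a subclass of socket.error. A minimal sketch of telling the two cases apart, assuming a current Python 3 and using socket.socketpair() purely as a convenient stand-in for a peer that stays silent; classify_recv is a hypothetical helper name.

```python
import socket


def classify_recv(sock, timeout):
    """Return 'data', 'timeout' or 'error' for one recv() attempt."""
    sock.settimeout(timeout)
    try:
        sock.recv(8192)
    except socket.timeout:
        # The timeout expired. This clause must come before the
        # socket.error one, because socket.timeout is a subclass of it.
        return "timeout"
    except socket.error:
        # Any other socket-level failure.
        return "error"
    return "data"


if __name__ == "__main__":
    a, b = socket.socketpair()
    try:
        print(classify_recv(a, 0.05))   # nothing has been sent yet
        b.sendall(b"ping")
        print(classify_recv(a, 1.0))    # data is waiting now
    finally:
        a.close()
        b.close()
```

Because the timeout exception derives from socket.error, code that only catches socket.error keeps working unchanged, which is the compatibility property asked for in the original suggestion.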
Skip From gmcm@hypernet.com Sun Jul 14 20:19:29 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Sun, 14 Jul 2002 15:19:29 -0400 Subject: [Python-Dev] python package In-Reply-To: <3D31BA9C.3030309@lemburg.com> Message-ID: <3D319681.9530.100DED6A@localhost> On 14 Jul 2002 at 19:53, M.-A. Lemburg wrote: > Gordon McMillan wrote: > > ... But stuffing a __path__ attribute into > > a module does *not* make the module a package. > Hmm, I know that Python itself uses __path__ to > tell whether it has a package or not, so I don't see > why a > module can't be regarded as package. If you put on a Richard M. Nixon mask, you might be mistaken for ("regarded as") Richard M. Nixon. That doesn't make you Richard M. Nixon. Stuffing __path__ into a module means that *most* of Python's runtime will regard your module as a package. It doesn't make it a package. In particular, most introspection tools and most programmers will not recognize your module as a package. > Moving the module into a directory of the same name > and then renaming it to __init__.py has the same > effect. And in that case, hacking __path__ is > perfectly legal. Yes, it now *is* a package. One which violates recommended practice, which is to keep __init__.py simple, but still a package. > You haven't commented on the sys.modules trick yet. > This one doesn't even use the __path__ hackery :-) > > DateTime.py: > import sys > import mx.DateTime > sys.modules[__name__] = mx.DateTime [...] > See: it's the same module :-) Anytime x != sys.modules[x].__name__, someone, sometime will suffer. Installer and (I believe) py2exe have hooks so that this gets analyzed properly. The hook is keyed by "DateTime". If you really find it intolerable to stick your users with making a one line change in their code, you might consider contributing hooks to Installer (or patches to py2exe). Particularly for your non-free packages, since I'm not going to download those and reverse-engineer them. 
Or perhaps you could do as Pmw does, and include a "bundle" script. -- Gordon http://www.mcmillan-inc.com/ From martin@v.loewis.de Sun Jul 14 20:31:27 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 14 Jul 2002 21:31:27 +0200 Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings In-Reply-To: <3D31AE95.6070804@lemburg.com> References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <018901c22aa4$e0f41190$ced241d5@hagrid> <3D31A50E.4050800@lemburg.com> <3D31AE95.6070804@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > Oh, I thought it would be natural from reading the complete > text: It still is not natural from reading the text you quote. > > """ > 2. Change the tokenizer/compiler base string type from char* to > Py_UNICODE* and apply the encoding to the complete file. As you say, this is more conveniently done with UTF-8 char*. > Source files which fail to decode cause an error to be raised > during compilation. In the case of Unicode strings passed to compile(), this is irrelevant; the string is already decoded. > The builtin compile() API will be enhanced to accept Unicode as > input. 8-bit string input is subject to the standard procedure > for encoding detection as described above. > """ That only says that Unicode strings are processed; it still does not say how string literals appearing in the source code are treated. > Of course, we no longer need to convert the tokenizer to > work on Py_UNICODE, so the updated text should mention > that compile() encodes Unicode input to UTF-8 to then continue > with the usual processing. The PEP currently does not say that. > > 2. convert to byte string using "utf-8" encoding, [...] > Option 2. I think this contradicts the current wording of the PEP. It says "5. ...
and creating string objects from the Unicode literal data by first reencoding the UTF-8 data into 8-bit string data using the given file encoding" The phrasing "the given file encoding" is a bit lax, but given the string

u"""
# -*- coding: iso-8859-1 -*-
s = 'some latin-1 text'
"""

I would expect that the encoding "given" is iso-8859-1, not utf-8. Now, I interpret your message to mean that s will be encoded in utf-8. Correct? If so, I think Fredrik is right, and compile(unicode(script, extract_encoding(script))) does indeed do something different from compile(script) as the latter would give the string value assigned to s in its original encoding, i.e. latin-1. > Ideal would be to have the tokenizer skip the encoding declaration > detection and start directly with the UTF-8 string "skip the encoding declaration" can't really work; you have to parse the source code line by line. You can tell the implementation to ignore the encoding declaration, if desired. > (this also solves the problems you'd run into in case the Unicode > source code has a source code encoding comment). Well, that is precisely the issue that I'm trying to address here. I still believe that the resulting behaviour is not specified in the PEP at the moment (which is no big deal, since the current implementation does not touch compile() at all). Regards, Martin From bernie@3captus.com Fri Jul 12 20:29:17 2002 From: bernie@3captus.com (Bernard Yue) Date: Fri, 12 Jul 2002 13:29:17 -0600 Subject: [Python-Dev] Minor socket timeout quibble - timeout raises socket.error References: <15639.52525.481846.601961@12-248-8-148.client.attbi.com> <3D188C5D.D519DD90@3captus.com> <200207121637.g6CGbAE12463@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D2F2E0D.2C4FD92F@3captus.com> This is a multi-part message in MIME format.
--------------67BDF2B7F2453AAB6129FCFC Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Guido van Rossum wrote: > > [Skip Montanaro] > > > I just noticed in the development docs that when a timeout on a socket > > > occurs, socket.error is raised. I rather liked the idea that a different > > > exception was raised for timeouts (I used Tim O'Malley's timeout_socket > > > module). Making a TimeoutError exception a subclass of socket.error would > > > be fine so you can catch it with existing code, but I could see recovering > > > differently for a timeout as opposed to other possible errors:
> > >
> > >     sock.settimeout(5.0)
> > >     try:
> > >         data = sock.recv(8192)
> > >     except socket.TimeoutError:
> > >         # maybe requeue the request
> > >         ...
> > >     except socket.error, codes:
> > >         # some more drastic solution is needed
> > >         ...
> > >
> > [Bernard Yue] > > +1 on your suggestion. Anyway, under Windows, the current > > implementation returns an incorrect socket.error code for timeout. I am > > working on the test suite as well as a fix for the problem found. Once the > > code is bug free maybe we can put the TimeoutError in. > > > > I will leave it to Guido for the approval of the change. When he comes > > back from his holiday. > > The way I restructured the code it is impossible to distinguish a > timeout error from other errors; you simply get the "no data > available" error from the socket operation. This is the same error > you'd get in non-blocking mode. > To distinguish a timeout error, the caller can check s->sock_timeout when a non-blocking mode error occurred, or just return an error code from internal_select() (I guess you must have had your reasons for taking it out in the first place) > Before I recomplicate the code so that it can raise a separate error > when the select fails, I'd like to understand the use case better. > Why would you want to make this distinction?
Requeueing the request > (as in Skip's example) doesn't make sense IMO: you set the timeout for > a reason, and that reason is that you want to give up if it takes too > long. If you really intend to retry you're better off disabling the > timeout! > How about the following (assume we have socket.setDefaultTimeout()):

    import socket
    import urllib

    socket.setDefaultTimeout(5.0)
    retry = 0
    url = 'some url'

    while retry < 3:
        try:
            file = urllib.urlretrieve(url)
        except socket.TimeoutError:
            if retry == 2:
                print "Server too busy, giving up!"
                raise
            else:
                print "Server busy, retry!"
                retry += 1
        else:
            break

MS IIS behaves strangely with HTTP requests. When the server is very busy, it will randomly drop some requests without disconnecting the client. So the best approach for the client is to time out and retry. I guess that might be the reason why people needed timeoutsocket in the first place. > If you really want to, you can already distinguish the timeout case, > because you get an EAGAIN error then (maybe something else on Windows > -- Bernard, if you have a fix for that, please send it to me). > I am struggling with the test case for the new socket code. The timeout test case I've sent you works with the old socketmodule.c (attached), but not with the latest version (on Linux or Windows). It's strange, your new implementation looks much cleaner. Please bear with me a bit longer for a patch :.( > So a -0 unless more evidence is brought forward.
> > --Guido van Rossum (home page: http://www.python.org/~guido/) Bernie --------------67BDF2B7F2453AAB6129FCFC Content-Type: application/vnd.lotus-organizer; name="socketmodule.c.org" Content-Transfer-Encoding: base64 Content-Disposition: inline; filename="socketmodule.c.org" LyogU29ja2V0IG1vZHVsZSAqLwoKLyoKClRoaXMgbW9kdWxlIHByb3ZpZGVzIGFuIGludGVy ZmFjZSB0byBCZXJrZWxleSBzb2NrZXQgSVBDLgoKTGltaXRhdGlvbnM6CgotIE9ubHkgQUZf SU5FVCwgQUZfSU5FVDYgYW5kIEFGX1VOSVggYWRkcmVzcyBmYW1pbGllcyBhcmUgc3VwcG9y dGVkIGluIGEKICBwb3J0YWJsZSBtYW5uZXIsIHRob3VnaCBBRl9QQUNLRVQgaXMgc3VwcG9y dGVkIHVuZGVyIExpbnV4LgotIE5vIHJlYWQvd3JpdGUgb3BlcmF0aW9ucyAodXNlIHNlbmRh bGwvcmVjdiBvciBtYWtlZmlsZSBpbnN0ZWFkKS4KLSBBZGRpdGlvbmFsIHJlc3RyaWN0aW9u cyBhcHBseSBvbiBzb21lIG5vbi1Vbml4IHBsYXRmb3JtcyAoY29tcGVuc2F0ZWQKICBmb3Ig Ynkgc29ja2V0LnB5KS4KCk1vZHVsZSBpbnRlcmZhY2U6CgotIHNvY2tldC5lcnJvcjogZXhj ZXB0aW9uIHJhaXNlZCBmb3Igc29ja2V0IHNwZWNpZmljIGVycm9ycwotIHNvY2tldC5nYWll cnJvcjogZXhjZXB0aW9uIHJhaXNlZCBmb3IgZ2V0YWRkcmluZm8vZ2V0bmFtZWluZm8gZXJy b3JzLAoJYSBzdWJjbGFzcyBvZiBzb2NrZXQuZXJyb3IKLSBzb2NrZXQuaGVycm9yOiBleGNl cHRpb24gcmFpc2VkIGZvciBnZXRob3N0YnkqIGVycm9ycywKCWEgc3ViY2xhc3Mgb2Ygc29j a2V0LmVycm9yCi0gc29ja2V0LmdldGhvc3RieW5hbWUoaG9zdG5hbWUpIC0tPiBob3N0IElQ IGFkZHJlc3MgKHN0cmluZzogJ2RkLmRkLmRkLmRkJykKLSBzb2NrZXQuZ2V0aG9zdGJ5YWRk cihJUCBhZGRyZXNzKSAtLT4gKGhvc3RuYW1lLCBbYWxpYXMsIC4uLl0sIFtJUCBhZGRyLCAu Li5dKQotIHNvY2tldC5nZXRob3N0bmFtZSgpIC0tPiBob3N0IG5hbWUgKHN0cmluZzogJ3Nw YW0nIG9yICdzcGFtLmRvbWFpbi5jb20nKQotIHNvY2tldC5nZXRwcm90b2J5bmFtZShwcm90 b2NvbG5hbWUpIC0tPiBwcm90b2NvbCBudW1iZXIKLSBzb2NrZXQuZ2V0c2VydmJ5bmFtZShz ZXJ2aWNlbmFtZSwgcHJvdG9jb2xuYW1lKSAtLT4gcG9ydCBudW1iZXIKLSBzb2NrZXQuc29j a2V0KGZhbWlseSwgdHlwZSBbLCBwcm90b10pIC0tPiBuZXcgc29ja2V0IG9iamVjdAotIHNv Y2tldC5udG9ocygxNiBiaXQgdmFsdWUpIC0tPiBuZXcgaW50IG9iamVjdAotIHNvY2tldC5u dG9obCgzMiBiaXQgdmFsdWUpIC0tPiBuZXcgaW50IG9iamVjdAotIHNvY2tldC5odG9ucygx NiBiaXQgdmFsdWUpIC0tPiBuZXcgaW50IG9iamVjdAotIHNvY2tldC5odG9ubCgzMiBiaXQg 
dmFsdWUpIC0tPiBuZXcgaW50IG9iamVjdAotIHNvY2tldC5nZXRhZGRyaW5mbyhob3N0LCBw b3J0IFssIGZhbWlseSwgc29ja3R5cGUsIHByb3RvLCBmbGFnc10pCgktLT4gTGlzdCBvZiAo ZmFtaWx5LCBzb2NrdHlwZSwgcHJvdG8sIGNhbm9ubmFtZSwgc29ja2FkZHIpCi0gc29ja2V0 LmdldG5hbWVpbmZvKHNvY2thZGRyLCBmbGFncykgLS0+IChob3N0LCBwb3J0KQotIHNvY2tl dC5BRl9JTkVULCBzb2NrZXQuU09DS19TVFJFQU0sIGV0Yy46IGNvbnN0YW50cyBmcm9tIDxz b2NrZXQuaD4KLSBzb2NrZXQuaW5ldF9hdG9uKElQIGFkZHJlc3MpIC0+IDMyLWJpdCBwYWNr ZWQgSVAgcmVwcmVzZW50YXRpb24KLSBzb2NrZXQuaW5ldF9udG9hKHBhY2tlZCBJUCkgLT4g SVAgYWRkcmVzcyBzdHJpbmcKLSBhbiBJbnRlcm5ldCBzb2NrZXQgYWRkcmVzcyBpcyBhIHBh aXIgKGhvc3RuYW1lLCBwb3J0KQogIHdoZXJlIGhvc3RuYW1lIGNhbiBiZSBhbnl0aGluZyBy ZWNvZ25pemVkIGJ5IGdldGhvc3RieW5hbWUoKQogIChpbmNsdWRpbmcgdGhlIGRkLmRkLmRk LmRkIG5vdGF0aW9uKSBhbmQgcG9ydCBpcyBpbiBob3N0IGJ5dGUgb3JkZXIKLSB3aGVyZSBh IGhvc3RuYW1lIGlzIHJldHVybmVkLCB0aGUgZGQuZGQuZGQuZGQgbm90YXRpb24gaXMgdXNl ZAotIGEgVU5JWCBkb21haW4gc29ja2V0IGFkZHJlc3MgaXMgYSBzdHJpbmcgc3BlY2lmeWlu ZyB0aGUgcGF0aG5hbWUKLSBhbiBBRl9QQUNLRVQgc29ja2V0IGFkZHJlc3MgaXMgYSB0dXBs ZSBjb250YWluaW5nIGEgc3RyaW5nCiAgc3BlY2lmeWluZyB0aGUgZXRoZXJuZXQgaW50ZXJm YWNlIGFuZCBhbiBpbnRlZ2VyIHNwZWNpZnlpbmcKICB0aGUgRXRoZXJuZXQgcHJvdG9jb2wg bnVtYmVyIHRvIGJlIHJlY2VpdmVkLiBGb3IgZXhhbXBsZToKICAoImV0aDAiLDB4MTIzNCku ICBPcHRpb25hbCAzcmQsNHRoLDV0aCBlbGVtZW50cyBpbiB0aGUgdHVwbGUKICBzcGVjaWZ5 IHBhY2tldC10eXBlIGFuZCBoYS10eXBlL2FkZHIgLS0gdGhlc2UgYXJlIGlnbm9yZWQgYnkK ICBuZXR3b3JraW5nIGNvZGUsIGJ1dCBhY2NlcHRlZCBzaW5jZSB0aGV5IGFyZSByZXR1cm5l ZCBieSB0aGUKICBnZXRzb2NrbmFtZSgpIG1ldGhvZC4KCkxvY2FsIG5hbWluZyBjb252ZW50 aW9uczoKCi0gbmFtZXMgc3RhcnRpbmcgd2l0aCBzb2NrXyBhcmUgc29ja2V0IG9iamVjdCBt ZXRob2RzCi0gbmFtZXMgc3RhcnRpbmcgd2l0aCBzb2NrZXRfIGFyZSBtb2R1bGUtbGV2ZWwg ZnVuY3Rpb25zCi0gbmFtZXMgc3RhcnRpbmcgd2l0aCBQeVNvY2tldCBhcmUgZXhwb3J0ZWQg dGhyb3VnaCBzb2NrZXRtb2R1bGUuaAoKKi8KCi8qIFNvY2tldCBvYmplY3QgZG9jdW1lbnRh dGlvbiAqLwpzdGF0aWMgY2hhciBzb2NrX2RvY1tdID0KInNvY2tldChbZmFtaWx5WywgdHlw ZVssIHByb3RvXV1dKSAtPiBzb2NrZXQgb2JqZWN0XG5cClxuXApPcGVuIGEgc29ja2V0IG9m 
IHRoZSBnaXZlbiB0eXBlLiAgVGhlIGZhbWlseSBhcmd1bWVudCBzcGVjaWZpZXMgdGhlXG5c CmFkZHJlc3MgZmFtaWx5OyBpdCBkZWZhdWx0cyB0byBBRl9JTkVULiAgVGhlIHR5cGUgYXJn dW1lbnQgc3BlY2lmaWVzXG5cCndoZXRoZXIgdGhpcyBpcyBhIHN0cmVhbSAoU09DS19TVFJF QU0sIHRoaXMgaXMgdGhlIGRlZmF1bHQpXG5cCm9yIGRhdGFncmFtIChTT0NLX0RHUkFNKSBz b2NrZXQuICBUaGUgcHJvdG9jb2wgYXJndW1lbnQgZGVmYXVsdHMgdG8gMCxcblwKc3BlY2lm eWluZyB0aGUgZGVmYXVsdCBwcm90b2NvbC4gIEtleXdvcmQgYXJndW1lbnRzIGFyZSBhY2Nl cHRlZC5cblwKXG5cCkEgc29ja2V0IG9iamVjdCByZXByZXNlbnRzIG9uZSBlbmRwb2ludCBv ZiBhIG5ldHdvcmsgY29ubmVjdGlvbi5cblwKXG5cCk1ldGhvZHMgb2Ygc29ja2V0IG9iamVj dHMgKGtleXdvcmQgYXJndW1lbnRzIG5vdCBhbGxvd2VkKTpcblwKXG5cCmFjY2VwdCgpIC0t IGFjY2VwdCBhIGNvbm5lY3Rpb24sIHJldHVybmluZyBuZXcgc29ja2V0IGFuZCBjbGllbnQg YWRkcmVzc1xuXApiaW5kKGFkZHIpIC0tIGJpbmQgdGhlIHNvY2tldCB0byBhIGxvY2FsIGFk ZHJlc3NcblwKY2xvc2UoKSAtLSBjbG9zZSB0aGUgc29ja2V0XG5cCmNvbm5lY3QoYWRkcikg LS0gY29ubmVjdCB0aGUgc29ja2V0IHRvIGEgcmVtb3RlIGFkZHJlc3NcblwKY29ubmVjdF9l eChhZGRyKSAtLSBjb25uZWN0LCByZXR1cm4gYW4gZXJyb3IgY29kZSBpbnN0ZWFkIG9mIGFu IGV4Y2VwdGlvblxuXApkdXAoKSAtLSByZXR1cm4gYSBuZXcgc29ja2V0IG9iamVjdCBpZGVu dGljYWwgdG8gdGhlIGN1cnJlbnQgb25lIFsqXVxuXApmaWxlbm8oKSAtLSByZXR1cm4gdW5k ZXJseWluZyBmaWxlIGRlc2NyaXB0b3JcblwKZ2V0cGVlcm5hbWUoKSAtLSByZXR1cm4gcmVt b3RlIGFkZHJlc3MgWypdXG5cCmdldHNvY2tuYW1lKCkgLS0gcmV0dXJuIGxvY2FsIGFkZHJl c3NcblwKZ2V0c29ja29wdChsZXZlbCwgb3B0bmFtZVssIGJ1Zmxlbl0pIC0tIGdldCBzb2Nr ZXQgb3B0aW9uc1xuXApnZXR0aW1lb3V0KCkgLS0gcmV0dXJuIHRpbWVvdXQgb3IgTm9uZVxu XApsaXN0ZW4obikgLS0gc3RhcnQgbGlzdGVuaW5nIGZvciBpbmNvbWluZyBjb25uZWN0aW9u c1xuXAptYWtlZmlsZShbbW9kZSwgW2J1ZnNpemVdXSkgLS0gcmV0dXJuIGEgZmlsZSBvYmpl Y3QgZm9yIHRoZSBzb2NrZXQgWypdXG5cCnJlY3YoYnVmbGVuWywgZmxhZ3NdKSAtLSByZWNl aXZlIGRhdGFcblwKcmVjdmZyb20oYnVmbGVuWywgZmxhZ3NdKSAtLSByZWNlaXZlIGRhdGEg YW5kIHNlbmRlcidzIGFkZHJlc3NcblwKc2VuZGFsbChkYXRhWywgZmxhZ3NdKSAtLSBzZW5k IGFsbCBkYXRhXG5cCnNlbmQoZGF0YVssIGZsYWdzXSkgLS0gc2VuZCBkYXRhLCBtYXkgbm90 IHNlbmQgYWxsIG9mIGl0XG5cCnNlbmR0byhkYXRhWywgZmxhZ3NdLCBhZGRyKSAtLSBzZW5k 
IGRhdGEgdG8gYSBnaXZlbiBhZGRyZXNzXG5cCnNldGJsb2NraW5nKDAgfCAxKSAtLSBzZXQg b3IgY2xlYXIgdGhlIGJsb2NraW5nIEkvTyBmbGFnXG5cCnNldHNvY2tvcHQobGV2ZWwsIG9w dG5hbWUsIHZhbHVlKSAtLSBzZXQgc29ja2V0IG9wdGlvbnNcblwKc2V0dGltZW91dChOb25l IHwgZmxvYXQpIC0tIHNldCBvciBjbGVhciB0aGUgdGltZW91dFxuXApzaHV0ZG93bihob3cp IC0tIHNodXQgZG93biB0cmFmZmljIGluIG9uZSBvciBib3RoIGRpcmVjdGlvbnNcblwKXG5c CiBbKl0gbm90IGF2YWlsYWJsZSBvbiBhbGwgcGxhdGZvcm1zISI7CgojaW5jbHVkZSAiUHl0 aG9uLmgiCgovKiBYWFggVGhpcyBpcyBhIHRlcnJpYmxlIG1lc3Mgb2Ygb2YgcGxhdGZvcm0t ZGVwZW5kZW50IHByZXByb2Nlc3NvciBoYWNrcy4KICAgSSBob3BlIHNvbWUgZGF5IHNvbWVv bmUgY2FuIGNsZWFuIHRoaXMgdXAgcGxlYXNlLi4uICovCgovKiBIYWNrcyBmb3IgZ2V0aG9z dGJ5bmFtZV9yKCkuICBPbiBzb21lIG5vbi1MaW51eCBwbGF0Zm9ybXMsIHRoZSBjb25maWd1 cmUKICAgc2NyaXB0IGRvZXNuJ3QgZ2V0IHRoaXMgcmlnaHQsIHNvIHdlIGhhcmRjb2RlIHNv bWUgcGxhdGZvcm0gY2hlY2tzIGJlbG93LgogICBPbiB0aGUgb3RoZXIgaGFuZCwgbm90IGFs bCBMaW51eCB2ZXJzaW9ucyBhZ3JlZSwgc28gdGhlcmUgdGhlIHNldHRpbmdzCiAgIGNvbXB1 dGVkIGJ5IHRoZSBjb25maWd1cmUgc2NyaXB0IGFyZSBuZWVkZWQhICovCgojaWZuZGVmIGxp bnV4CiMgdW5kZWYgSEFWRV9HRVRIT1NUQllOQU1FX1JfM19BUkcKIyB1bmRlZiBIQVZFX0dF VEhPU1RCWU5BTUVfUl81X0FSRwojIHVuZGVmIEhBVkVfR0VUSE9TVEJZTkFNRV9SXzZfQVJH CiNlbmRpZgoKI2lmbmRlZiBXSVRIX1RIUkVBRAojIHVuZGVmIEhBVkVfR0VUSE9TVEJZTkFN RV9SCiNlbmRpZgoKI2lmZGVmIEhBVkVfR0VUSE9TVEJZTkFNRV9SCiMgaWYgZGVmaW5lZChf QUlYKSB8fCBkZWZpbmVkKF9fb3NmX18pCiMgIGRlZmluZSBIQVZFX0dFVEhPU1RCWU5BTUVf Ul8zX0FSRwojIGVsaWYgZGVmaW5lZChfX3N1bikgfHwgZGVmaW5lZChfX3NnaSkKIyAgZGVm aW5lIEhBVkVfR0VUSE9TVEJZTkFNRV9SXzVfQVJHCiMgZWxpZiBkZWZpbmVkKGxpbnV4KQov KiBSZWx5IG9uIHRoZSBjb25maWd1cmUgc2NyaXB0ICovCiMgZWxzZQojICB1bmRlZiBIQVZF X0dFVEhPU1RCWU5BTUVfUgojIGVuZGlmCiNlbmRpZgoKI2lmICFkZWZpbmVkKEhBVkVfR0VU SE9TVEJZTkFNRV9SKSAmJiBkZWZpbmVkKFdJVEhfVEhSRUFEKSAmJiBcCiAgICAhZGVmaW5l ZChNU19XSU5ET1dTKQojIGRlZmluZSBVU0VfR0VUSE9TVEJZTkFNRV9MT0NLCiNlbmRpZgoK I2lmZGVmIFVTRV9HRVRIT1NUQllOQU1FX0xPQ0sKIyBpbmNsdWRlICJweXRocmVhZC5oIgoj ZW5kaWYKCiNpZiBkZWZpbmVkKFBZQ0NfVkFDUFApCiMgaW5jbHVkZSA8dHlwZXMuaD4KIyBp 
bmNsdWRlIDxpby5oPgojIGluY2x1ZGUgPHN5cy9pb2N0bC5oPgojIGluY2x1ZGUgPHV0aWxz Lmg+CiMgaW5jbHVkZSA8Y3R5cGUuaD4KI2VuZGlmCgojaWYgZGVmaW5lZChQWU9TX09TMikK IyBkZWZpbmUgIElOQ0xfRE9TCiMgZGVmaW5lICBJTkNMX0RPU0VSUk9SUwojIGRlZmluZSAg SU5DTF9OT1BNQVBJCiMgaW5jbHVkZSA8b3MyLmg+CiNlbmRpZgoKLyogR2VuZXJpYyBpbmNs dWRlcyAqLwojaW5jbHVkZSA8c3lzL3R5cGVzLmg+CiNpbmNsdWRlIDxzaWduYWwuaD4KCi8q IEdlbmVyaWMgc29ja2V0IG9iamVjdCBkZWZpbml0aW9ucyBhbmQgaW5jbHVkZXMgKi8KI2Rl ZmluZSBQeVNvY2tldF9CVUlMRElOR19TT0NLRVQKI2luY2x1ZGUgInNvY2tldG1vZHVsZS5o IgoKLyogQWRkcmVzc2luZyBpbmNsdWRlcyAqLwoKI2lmbmRlZiBNU19XSU5ET1dTCgovKiBO b24tTVMgV0lORE9XUyBpbmNsdWRlcyAqLwojIGluY2x1ZGUgPG5ldGRiLmg+CgovKiBIZWFk ZXJzIG5lZWRlZCBmb3IgaW5ldF9udG9hKCkgYW5kIGluZXRfYWRkcigpICovCiMgaWZkZWYg X19CRU9TX18KIyAgaW5jbHVkZSA8bmV0L25ldGRiLmg+CiMgZWxpZiBkZWZpbmVkKFBZT1Nf T1MyKSAmJiBkZWZpbmVkKFBZQ0NfVkFDUFApCiMgIGluY2x1ZGUgPG5ldGRiLmg+CnR5cGVk ZWYgc2l6ZV90IHNvY2tsZW5fdDsKIyBlbHNlCiMgICBpbmNsdWRlIDxhcnBhL2luZXQuaD4K IyBlbmRpZgoKIyBpZm5kZWYgUklTQ09TCiMgIGluY2x1ZGUgPGZjbnRsLmg+CiMgZWxzZQoj ICBpbmNsdWRlIDxzeXMvZmNudGwuaD4KIyAgZGVmaW5lIE5PX0RVUAppbnQgaF9lcnJubzsg Lyogbm90IHVzZWQgKi8KIyBlbmRpZgoKI2Vsc2UKCi8qIE1TX1dJTkRPV1MgaW5jbHVkZXMg Ki8KIyBpbmNsdWRlIDxmY250bC5oPgoKI2VuZGlmCgojaWZkZWYgSEFWRV9TVERERUZfSAoj IGluY2x1ZGUgPHN0ZGRlZi5oPgojZW5kaWYKCiNpZm5kZWYgb2Zmc2V0b2YKIyBkZWZpbmUg b2Zmc2V0b2YodHlwZSwgbWVtYmVyKQkoKHNpemVfdCkoJigodHlwZSAqKTApLT5tZW1iZXIp KQojZW5kaWYKCiNpZm5kZWYgT19OREVMQVkKIyBkZWZpbmUgT19OREVMQVkgT19OT05CTE9D SwkvKiBGb3IgUU5YIG9ubHk/ICovCiNlbmRpZgoKI2luY2x1ZGUgImFkZHJpbmZvLmgiCgoj aWZuZGVmIEhBVkVfSU5FVF9QVE9OCmludCBpbmV0X3B0b24oaW50IGFmLCBjb25zdCBjaGFy ICpzcmMsIHZvaWQgKmRzdCk7CmNvbnN0IGNoYXIgKmluZXRfbnRvcChpbnQgYWYsIGNvbnN0 IHZvaWQgKnNyYywgY2hhciAqZHN0LCBzb2NrbGVuX3Qgc2l6ZSk7CiNlbmRpZgoKI2lmZGVm IF9fQVBQTEVfXwovKiBPbiBPUyBYLCBnZXRhZGRyaW5mbyByZXR1cm5zIG5vIGVycm9yIGlu ZGljYXRpb24gb2YgbG9va3VwCiAgIGZhaWx1cmUsIHNvIHdlIG11c3QgdXNlIHRoZSBlbXVs YXRpb24gaW5zdGVhZCBvZiB0aGUgbGliaW5mbwogICBpbXBsZW1lbnRhdGlvbi4gVW5mb3J0 
dW5hdGVseSwgcGVyZm9ybWluZyBhbiBhdXRvY29uZiB0ZXN0CiAgIGZvciB0aGlzIGJ1ZyB3 b3VsZCByZXF1aXJlIEROUyBhY2Nlc3MgZm9yIHRoZSBtYWNoaW5lIHBlcmZvcm1pbmcKICAg dGhlIGNvbmZpZ3VyYXRpb24sIHdoaWNoIGlzIG5vdCBhY2NlcHRhYmxlLiBUaGVyZWZvcmUs IHdlCiAgIGRldGVybWluZSB0aGUgYnVnIGp1c3QgYnkgY2hlY2tpbmcgZm9yIF9fQVBQTEVf Xy4gSWYgdGhpcyBidWcKICAgZ2V0cyBldmVyIGZpeGVkLCBwZXJoYXBzIGNoZWNraW5nIGZv ciBzeXMvdmVyc2lvbi5oIHdvdWxkIGJlCiAgIGFwcHJvcHJpYXRlLCB3aGljaCBpcyAxMC8w IG9uIHRoZSBzeXN0ZW0gd2l0aCB0aGUgYnVnLiAqLwojdW5kZWYgSEFWRV9HRVRBRERSSU5G TwovKiBhdm9pZCBjbGFzaGVzIHdpdGggdGhlIEMgbGlicmFyeSBkZWZpbml0aW9uIG9mIHRo ZSBzeW1ib2wuICovCiNkZWZpbmUgZ2V0YWRkcmluZm8gZmFrZV9nZXRhZGRyaW5mbwojZW5k aWYKCi8qIEkga25vdyB0aGlzIGlzIGEgYmFkIHByYWN0aWNlLCBidXQgaXQgaXMgdGhlIGVh c2llc3QuLi4gKi8KI2lmICFkZWZpbmVkKEhBVkVfR0VUQUREUklORk8pCiNpbmNsdWRlICJn ZXRhZGRyaW5mby5jIgojZW5kaWYKI2lmICFkZWZpbmVkKEhBVkVfR0VUTkFNRUlORk8pCiNp bmNsdWRlICJnZXRuYW1laW5mby5jIgojZW5kaWYKCiNpZiBkZWZpbmVkKE1TX1dJTkRPV1Mp IHx8IGRlZmluZWQoX19CRU9TX18pCi8qIEJlT1Mgc3VmZmVycyBmcm9tIHRoZSBzYW1lIHNv Y2tldCBkaWNob3RvbXkgYXMgV2luMzIuLi4gLSBbY2poXSAqLwovKiBzZWVtIHRvIGJlIGEg ZmV3IGRpZmZlcmVuY2VzIGluIHRoZSBBUEkgKi8KI2RlZmluZSBTT0NLRVRDTE9TRSBjbG9z ZXNvY2tldAojZGVmaW5lIE5PX0RVUCAvKiBBY3R1YWxseSBpdCBleGlzdHMgb24gTlQgMy41 LCBidXQgd2hhdCB0aGUgaGVjay4uLiAqLwojZW5kaWYKCiNpZmRlZiBNU19XSU4zMgojZGVm aW5lIEVBRk5PU1VQUE9SVCBXU0FFQUZOT1NVUFBPUlQKI2RlZmluZSBzbnByaW50ZiBfc25w cmludGYKI2VuZGlmCgojaWYgZGVmaW5lZChQWU9TX09TMikgJiYgIWRlZmluZWQoUFlDQ19H Q0MpCiNkZWZpbmUgU09DS0VUQ0xPU0Ugc29jbG9zZQojZGVmaW5lIE5PX0RVUCAvKiBTb2Nr ZXRzIGFyZSBOb3QgQWN0dWFsIEZpbGUgSGFuZGxlcyB1bmRlciBPUy8yICovCiNlbmRpZgoK I2lmbmRlZiBTT0NLRVRDTE9TRQojZGVmaW5lIFNPQ0tFVENMT1NFIGNsb3NlCiNlbmRpZgoK LyogWFhYIFRoZXJlJ3MgYSBwcm9ibGVtIGhlcmU6ICpzdGF0aWMqIGZ1bmN0aW9ucyBhcmUg bm90IHN1cHBvc2VkIHRvIGhhdmUKICAgYSBQeSBwcmVmaXggKG9yIHVzZSBDYXBpdGFsaXpl ZFdvcmRzKS4gIExhdGVyLi4uICovCgovKiBHbG9iYWwgdmFyaWFibGUgaG9sZGluZyB0aGUg ZXhjZXB0aW9uIHR5cGUgZm9yIGVycm9ycyBkZXRlY3RlZAogICBieSB0aGlzIG1vZHVsZSAo 
YnV0IG5vdCBhcmd1bWVudCB0eXBlIG9yIG1lbW9yeSBlcnJvcnMsIGV0Yy4pLiAqLwpzdGF0 aWMgUHlPYmplY3QgKnNvY2tldF9lcnJvcjsKc3RhdGljIFB5T2JqZWN0ICpzb2NrZXRfaGVy cm9yOwpzdGF0aWMgUHlPYmplY3QgKnNvY2tldF9nYWllcnJvcjsKCiNpZmRlZiBSSVNDT1MK LyogR2xvYmFsIHZhcmlhYmxlIHdoaWNoIGlzICE9MCBpZiBQeXRob24gaXMgcnVubmluZyBp biBhIFJJU0MgT1MgdGFza3dpbmRvdyAqLwpzdGF0aWMgaW50IHRhc2t3aW5kb3c7CiNlbmRp ZgoKLyogQSBmb3J3YXJkIHJlZmVyZW5jZSB0byB0aGUgc29ja2V0IHR5cGUgb2JqZWN0Lgog ICBUaGUgc29ja190eXBlIHZhcmlhYmxlIGNvbnRhaW5zIHBvaW50ZXJzIHRvIHZhcmlvdXMg ZnVuY3Rpb25zLAogICBzb21lIG9mIHdoaWNoIGNhbGwgbmV3X3NvY2tvYmplY3QoKSwgd2hp Y2ggdXNlcyBzb2NrX3R5cGUsIHNvCiAgIHRoZXJlIGhhcyB0byBiZSBhIGNpcmN1bGFyIHJl ZmVyZW5jZS4gKi8Kc3RhdGljZm9yd2FyZCBQeVR5cGVPYmplY3Qgc29ja190eXBlOwoKLyog Q29udmVuaWVuY2UgZnVuY3Rpb24gdG8gcmFpc2UgYW4gZXJyb3IgYWNjb3JkaW5nIHRvIGVy cm5vCiAgIGFuZCByZXR1cm4gYSBOVUxMIHBvaW50ZXIgZnJvbSBhIGZ1bmN0aW9uLiAqLwoK c3RhdGljIFB5T2JqZWN0ICoKc2V0X2Vycm9yKHZvaWQpCnsKI2lmZGVmIE1TX1dJTkRPV1MK CWludCBlcnJfbm8gPSBXU0FHZXRMYXN0RXJyb3IoKTsKCXN0YXRpYyBzdHJ1Y3QgewoJCWlu dCBubzsKCQljb25zdCBjaGFyICptc2c7Cgl9ICptc2dwLCBtc2dzW10gPSB7CgkJe1dTQUVJ TlRSLCAiSW50ZXJydXB0ZWQgc3lzdGVtIGNhbGwifSwKCQl7V1NBRUJBREYsICJCYWQgZmls ZSBkZXNjcmlwdG9yIn0sCgkJe1dTQUVBQ0NFUywgIlBlcm1pc3Npb24gZGVuaWVkIn0sCgkJ e1dTQUVGQVVMVCwgIkJhZCBhZGRyZXNzIn0sCgkJe1dTQUVJTlZBTCwgIkludmFsaWQgYXJn dW1lbnQifSwKCQl7V1NBRU1GSUxFLCAiVG9vIG1hbnkgb3BlbiBmaWxlcyJ9LAoJCXtXU0FF V09VTERCTE9DSywKCQkgICJUaGUgc29ja2V0IG9wZXJhdGlvbiBjb3VsZCBub3QgY29tcGxl dGUgIgoJCSAgIndpdGhvdXQgYmxvY2tpbmcifSwKCQl7V1NBRUlOUFJPR1JFU1MsICJPcGVy YXRpb24gbm93IGluIHByb2dyZXNzIn0sCgkJe1dTQUVBTFJFQURZLCAiT3BlcmF0aW9uIGFs cmVhZHkgaW4gcHJvZ3Jlc3MifSwKCQl7V1NBRU5PVFNPQ0ssICJTb2NrZXQgb3BlcmF0aW9u IG9uIG5vbi1zb2NrZXQifSwKCQl7V1NBRURFU1RBRERSUkVRLCAiRGVzdGluYXRpb24gYWRk cmVzcyByZXF1aXJlZCJ9LAoJCXtXU0FFTVNHU0laRSwgIk1lc3NhZ2UgdG9vIGxvbmcifSwK CQl7V1NBRVBST1RPVFlQRSwgIlByb3RvY29sIHdyb25nIHR5cGUgZm9yIHNvY2tldCJ9LAoJ CXtXU0FFTk9QUk9UT09QVCwgIlByb3RvY29sIG5vdCBhdmFpbGFibGUifSwKCQl7V1NBRVBS 
T1RPTk9TVVBQT1JULCAiUHJvdG9jb2wgbm90IHN1cHBvcnRlZCJ9LAoJCXtXU0FFU09DS1RO T1NVUFBPUlQsICJTb2NrZXQgdHlwZSBub3Qgc3VwcG9ydGVkIn0sCgkJe1dTQUVPUE5PVFNV UFAsICJPcGVyYXRpb24gbm90IHN1cHBvcnRlZCJ9LAoJCXtXU0FFUEZOT1NVUFBPUlQsICJQ cm90b2NvbCBmYW1pbHkgbm90IHN1cHBvcnRlZCJ9LAoJCXtXU0FFQUZOT1NVUFBPUlQsICJB ZGRyZXNzIGZhbWlseSBub3Qgc3VwcG9ydGVkIn0sCgkJe1dTQUVBRERSSU5VU0UsICJBZGRy ZXNzIGFscmVhZHkgaW4gdXNlIn0sCgkJe1dTQUVBRERSTk9UQVZBSUwsICJDYW4ndCBhc3Np Z24gcmVxdWVzdGVkIGFkZHJlc3MifSwKCQl7V1NBRU5FVERPV04sICJOZXR3b3JrIGlzIGRv d24ifSwKCQl7V1NBRU5FVFVOUkVBQ0gsICJOZXR3b3JrIGlzIHVucmVhY2hhYmxlIn0sCgkJ e1dTQUVORVRSRVNFVCwgIk5ldHdvcmsgZHJvcHBlZCBjb25uZWN0aW9uIG9uIHJlc2V0In0s CgkJe1dTQUVDT05OQUJPUlRFRCwgIlNvZnR3YXJlIGNhdXNlZCBjb25uZWN0aW9uIGFib3J0 In0sCgkJe1dTQUVDT05OUkVTRVQsICJDb25uZWN0aW9uIHJlc2V0IGJ5IHBlZXIifSwKCQl7 V1NBRU5PQlVGUywgIk5vIGJ1ZmZlciBzcGFjZSBhdmFpbGFibGUifSwKCQl7V1NBRUlTQ09O TiwgIlNvY2tldCBpcyBhbHJlYWR5IGNvbm5lY3RlZCJ9LAoJCXtXU0FFTk9UQ09OTiwgIlNv Y2tldCBpcyBub3QgY29ubmVjdGVkIn0sCgkJe1dTQUVTSFVURE9XTiwgIkNhbid0IHNlbmQg YWZ0ZXIgc29ja2V0IHNodXRkb3duIn0sCgkJe1dTQUVUT09NQU5ZUkVGUywgIlRvbyBtYW55 IHJlZmVyZW5jZXM6IGNhbid0IHNwbGljZSJ9LAoJCXtXU0FFVElNRURPVVQsICJPcGVyYXRp b24gdGltZWQgb3V0In0sCgkJe1dTQUVDT05OUkVGVVNFRCwgIkNvbm5lY3Rpb24gcmVmdXNl ZCJ9LAoJCXtXU0FFTE9PUCwgIlRvbyBtYW55IGxldmVscyBvZiBzeW1ib2xpYyBsaW5rcyJ9 LAoJCXtXU0FFTkFNRVRPT0xPTkcsICJGaWxlIG5hbWUgdG9vIGxvbmcifSwKCQl7V1NBRUhP U1RET1dOLCAiSG9zdCBpcyBkb3duIn0sCgkJe1dTQUVIT1NUVU5SRUFDSCwgIk5vIHJvdXRl IHRvIGhvc3QifSwKCQl7V1NBRU5PVEVNUFRZLCAiRGlyZWN0b3J5IG5vdCBlbXB0eSJ9LAoJ CXtXU0FFUFJPQ0xJTSwgIlRvbyBtYW55IHByb2Nlc3NlcyJ9LAoJCXtXU0FFVVNFUlMsICJU b28gbWFueSB1c2VycyJ9LAoJCXtXU0FFRFFVT1QsICJEaXNjIHF1b3RhIGV4Y2VlZGVkIn0s CgkJe1dTQUVTVEFMRSwgIlN0YWxlIE5GUyBmaWxlIGhhbmRsZSJ9LAoJCXtXU0FFUkVNT1RF LCAiVG9vIG1hbnkgbGV2ZWxzIG9mIHJlbW90ZSBpbiBwYXRoIn0sCgkJe1dTQVNZU05PVFJF QURZLCAiTmV0d29yayBzdWJzeXN0ZW0gaXMgdW52YWlsYWJsZSJ9LAoJCXtXU0FWRVJOT1RT VVBQT1JURUQsICJXaW5Tb2NrIHZlcnNpb24gaXMgbm90IHN1cHBvcnRlZCJ9LAoJCXtXU0FO 
T1RJTklUSUFMSVNFRCwKCQkgICJTdWNjZXNzZnVsIFdTQVN0YXJ0dXAoKSBub3QgeWV0IHBl cmZvcm1lZCJ9LAoJCXtXU0FFRElTQ09OLCAiR3JhY2VmdWwgc2h1dGRvd24gaW4gcHJvZ3Jl c3MifSwKCQkvKiBSZXNvbHZlciBlcnJvcnMgKi8KCQl7V1NBSE9TVF9OT1RfRk9VTkQsICJO byBzdWNoIGhvc3QgaXMga25vd24ifSwKCQl7V1NBVFJZX0FHQUlOLCAiSG9zdCBub3QgZm91 bmQsIG9yIHNlcnZlciBmYWlsZWQifSwKCQl7V1NBTk9fUkVDT1ZFUlksICJVbmV4cGVjdGVk IHNlcnZlciBlcnJvciBlbmNvdW50ZXJlZCJ9LAoJCXtXU0FOT19EQVRBLCAiVmFsaWQgbmFt ZSB3aXRob3V0IHJlcXVlc3RlZCBkYXRhIn0sCgkJe1dTQU5PX0FERFJFU1MsICJObyBhZGRy ZXNzLCBsb29rIGZvciBNWCByZWNvcmQifSwKCQl7MCwgTlVMTH0KCX07CglpZiAoZXJyX25v KSB7CgkJUHlPYmplY3QgKnY7CgkJY29uc3QgY2hhciAqbXNnID0gIndpbnNvY2sgZXJyb3Ii OwoKCQlmb3IgKG1zZ3AgPSBtc2dzOyBtc2dwLT5tc2c7IG1zZ3ArKykgewoJCQlpZiAoZXJy X25vID09IG1zZ3AtPm5vKSB7CgkJCQltc2cgPSBtc2dwLT5tc2c7CgkJCQlicmVhazsKCQkJ fQoJCX0KCgkJdiA9IFB5X0J1aWxkVmFsdWUoIihpcykiLCBlcnJfbm8sIG1zZyk7CgkJaWYg KHYgIT0gTlVMTCkgewoJCQlQeUVycl9TZXRPYmplY3Qoc29ja2V0X2Vycm9yLCB2KTsKCQkJ UHlfREVDUkVGKHYpOwoJCX0KCQlyZXR1cm4gTlVMTDsKCX0KCWVsc2UKI2VuZGlmCgojaWYg ZGVmaW5lZChQWU9TX09TMikgJiYgIWRlZmluZWQoUFlDQ19HQ0MpCglpZiAoc29ja19lcnJu bygpICE9IE5PX0VSUk9SKSB7CgkJQVBJUkVUIHJjOwoJCVVMT05HICBtc2dsZW47CgkJY2hh ciBvdXRidWZbMTAwXTsKCQlpbnQgbXllcnJvcmNvZGUgPSBzb2NrX2Vycm5vKCk7CgoJCS8q IFJldHJpZXZlIHNvY2tldC1yZWxhdGVkIGVycm9yIG1lc3NhZ2UgZnJvbSBNUFROLk1TRyBm aWxlICovCgkJcmMgPSBEb3NHZXRNZXNzYWdlKE5VTEwsIDAsIG91dGJ1Ziwgc2l6ZW9mKG91 dGJ1ZiksCgkJCQkgICBteWVycm9yY29kZSAtIFNPQ0JBU0VFUlIgKyAyNiwKCQkJCSAgICJt cHRuLm1zZyIsCgkJCQkgICAmbXNnbGVuKTsKCQlpZiAocmMgPT0gTk9fRVJST1IpIHsKCQkJ UHlPYmplY3QgKnY7CgoJCQkvKiBPUy8yIGRvZXNuJ3QgZ3VhcmFudGVlIGEgdGVybWluYXRv ciAqLwoJCQlvdXRidWZbbXNnbGVuXSA9ICdcMCc7CgkJCWlmIChzdHJsZW4ob3V0YnVmKSA+ IDApIHsKCQkJCS8qIElmIG5vbi1lbXB0eSBtc2csIHRyaW0gQ1JMRiAqLwoJCQkJY2hhciAq bGFzdGMgPSAmb3V0YnVmWyBzdHJsZW4ob3V0YnVmKS0xIF07CgkJCQl3aGlsZSAobGFzdGMg PiBvdXRidWYgJiYgaXNzcGFjZSgqbGFzdGMpKSB7CgkJCQkJLyogVHJpbSB0cmFpbGluZyB3 aGl0ZXNwYWNlIChDUkxGKSAqLwoJCQkJCSpsYXN0Yy0tID0gJ1wwJzsKCQkJCX0KCQkJfQoJ 
CQl2ID0gUHlfQnVpbGRWYWx1ZSgiKGlzKSIsIG15ZXJyb3Jjb2RlLCBvdXRidWYpOwoJCQlp ZiAodiAhPSBOVUxMKSB7CgkJCQlQeUVycl9TZXRPYmplY3Qoc29ja2V0X2Vycm9yLCB2KTsK CQkJCVB5X0RFQ1JFRih2KTsKCQkJfQoJCQlyZXR1cm4gTlVMTDsKCQl9Cgl9CiNlbmRpZgoK CXJldHVybiBQeUVycl9TZXRGcm9tRXJybm8oc29ja2V0X2Vycm9yKTsKfQoKCnN0YXRpYyBQ eU9iamVjdCAqCnNldF9oZXJyb3IoaW50IGhfZXJyb3IpCnsKCVB5T2JqZWN0ICp2OwoKI2lm ZGVmIEhBVkVfSFNUUkVSUk9SCgl2ID0gUHlfQnVpbGRWYWx1ZSgiKGlzKSIsIGhfZXJyb3Is IChjaGFyICopaHN0cmVycm9yKGhfZXJyb3IpKTsKI2Vsc2UKCXYgPSBQeV9CdWlsZFZhbHVl KCIoaXMpIiwgaF9lcnJvciwgImhvc3Qgbm90IGZvdW5kIik7CiNlbmRpZgoJaWYgKHYgIT0g TlVMTCkgewoJCVB5RXJyX1NldE9iamVjdChzb2NrZXRfaGVycm9yLCB2KTsKCQlQeV9ERUNS RUYodik7Cgl9CgoJcmV0dXJuIE5VTEw7Cn0KCgpzdGF0aWMgUHlPYmplY3QgKgpzZXRfZ2Fp ZXJyb3IoaW50IGVycm9yKQp7CglQeU9iamVjdCAqdjsKCiNpZmRlZiBFQUlfU1lTVEVNCgkv KiBFQUlfU1lTVEVNIGlzIG5vdCBhdmFpbGFibGUgb24gV2luZG93cyBYUC4gKi8KCWlmIChl cnJvciA9PSBFQUlfU1lTVEVNKQoJCXJldHVybiBzZXRfZXJyb3IoKTsKI2VuZGlmCgojaWZk ZWYgSEFWRV9HQUlfU1RSRVJST1IKCXYgPSBQeV9CdWlsZFZhbHVlKCIoaXMpIiwgZXJyb3Is IGdhaV9zdHJlcnJvcihlcnJvcikpOwojZWxzZQoJdiA9IFB5X0J1aWxkVmFsdWUoIihpcyki LCBlcnJvciwgImdldGFkZHJpbmZvIGZhaWxlZCIpOwojZW5kaWYKCWlmICh2ICE9IE5VTEwp IHsKCQlQeUVycl9TZXRPYmplY3Qoc29ja2V0X2dhaWVycm9yLCB2KTsKCQlQeV9ERUNSRUYo dik7Cgl9CgoJcmV0dXJuIE5VTEw7Cn0KCi8qIEZvciB0aW1lb3V0IGVycm9ycyAqLwpzdGF0 aWMgUHlPYmplY3QgKgp0aW1lb3V0X2Vycih2b2lkKQp7CglQeU9iamVjdCAqdjsKCiNpZmRl ZiBNU19XSU5ET1dTCgl2ID0gUHlfQnVpbGRWYWx1ZSgiKGlzKSIsIFdTQUVUSU1FRE9VVCwg IlNvY2tldCBvcGVyYXRpb24gdGltZWQgb3V0Iik7CiNlbHNlCgl2ID0gUHlfQnVpbGRWYWx1 ZSgiKGlzKSIsIEVUSU1FRE9VVCwgIlNvY2tldCBvcGVyYXRpb24gdGltZWQgb3V0Iik7CiNl bmRpZgoKCWlmICh2ICE9IE5VTEwpIHsKCQlQeUVycl9TZXRPYmplY3Qoc29ja2V0X2Vycm9y LCB2KTsKCQlQeV9ERUNSRUYodik7Cgl9CgoJcmV0dXJuIE5VTEw7Cn0KCi8qIEZ1bmN0aW9u IHRvIHBlcmZvcm0gdGhlIHNldHRpbmcgb2Ygc29ja2V0IGJsb2NraW5nIG1vZGUKICAgaW50 ZXJuYWxseS4gYmxvY2sgPSAoMSB8IDApLiAqLwpzdGF0aWMgaW50CmludGVybmFsX3NldGJs b2NraW5nKFB5U29ja2V0U29ja09iamVjdCAqcywgaW50IGJsb2NrKQp7CiNpZm5kZWYgUklT 
Q09TCiNpZm5kZWYgTVNfV0lORE9XUwoJaW50IGRlbGF5X2ZsYWc7CiNlbmRpZgojZW5kaWYK CglQeV9CRUdJTl9BTExPV19USFJFQURTCiNpZmRlZiBfX0JFT1NfXwoJYmxvY2sgPSAhYmxv Y2s7CglzZXRzb2Nrb3B0KHMtPnNvY2tfZmQsIFNPTF9TT0NLRVQsIFNPX05PTkJMT0NLLAoJ CSAgICh2b2lkICopKCZibG9jayksIHNpemVvZihpbnQpKTsKI2Vsc2UKI2lmbmRlZiBSSVND T1MKI2lmbmRlZiBNU19XSU5ET1dTCiNpZiBkZWZpbmVkKFBZT1NfT1MyKSAmJiAhZGVmaW5l ZChQWUNDX0dDQykKCWJsb2NrID0gIWJsb2NrOwoJaW9jdGwocy0+c29ja19mZCwgRklPTkJJ TywgKGNhZGRyX3QpJmJsb2NrLCBzaXplb2YoYmxvY2spKTsKI2Vsc2UgLyogIVBZT1NfT1My ICovCglkZWxheV9mbGFnID0gZmNudGwocy0+c29ja19mZCwgRl9HRVRGTCwgMCk7CglpZiAo YmxvY2spCgkJZGVsYXlfZmxhZyAmPSAofk9fTkRFTEFZKTsKCWVsc2UKCQlkZWxheV9mbGFn IHw9IE9fTkRFTEFZOwoJZmNudGwocy0+c29ja19mZCwgRl9TRVRGTCwgZGVsYXlfZmxhZyk7 CiNlbmRpZiAvKiAhUFlPU19PUzIgKi8KI2Vsc2UgLyogTVNfV0lORE9XUyAqLwoJYmxvY2sg PSAhYmxvY2s7Cglpb2N0bHNvY2tldChzLT5zb2NrX2ZkLCBGSU9OQklPLCAodV9sb25nKikm YmxvY2spOwojZW5kaWYgLyogTVNfV0lORE9XUyAqLwojZW5kaWYgLyogX19CRU9TX18gKi8K I2VuZGlmIC8qIFJJU0NPUyAqLwoJUHlfRU5EX0FMTE9XX1RIUkVBRFMKCgkvKiBTaW5jZSB0 aGVzZSBkb24ndCByZXR1cm4gYW55dGhpbmcgKi8KCXJldHVybiAxOwp9CgovKiBGb3IgYWNj ZXNzIHRvIHRoZSBzZWxlY3QgbW9kdWxlIHRvIHBvbGwgdGhlIHNvY2tldCBmb3IgdGltZW91 dAogICBmdW5jdGlvbmFsaXR5LiB3cml0aW5nIGlzIDEgZm9yIHdyaXRpbmcsIDAgZm9yIHJl YWRpbmcuCiAgIFJldHVybiB2YWx1ZTogLTEgaWYgZXJyb3IsIDAgaWYgbm90IHJlYWR5LCA+ PSAxIGlmIHJlYWR5LgogICBBbiBleGNlcHRpb24gaXMgc2V0IHdoZW4gdGhlIHJldHVybiB2 YWx1ZSBpcyA8PSAwICghKS4gKi8Kc3RhdGljIGludAppbnRlcm5hbF9zZWxlY3QoUHlTb2Nr ZXRTb2NrT2JqZWN0ICpzLCBpbnQgd3JpdGluZykKewoJZmRfc2V0IGZkczsKCXN0cnVjdCB0 aW1ldmFsIHR2OwoJaW50IGNvdW50OwoKCS8qIENvbnN0cnVjdCB0aGUgYXJndW1lbnRzIHRv IHNlbGVjdCAqLwoJdHYudHZfc2VjID0gKGludClzLT5zb2NrX3RpbWVvdXQ7Cgl0di50dl91 c2VjID0gKGludCkoKHMtPnNvY2tfdGltZW91dCAtIHR2LnR2X3NlYykgKiAxZTYpOwoJRkRf WkVSTygmZmRzKTsKCUZEX1NFVChzLT5zb2NrX2ZkLCAmZmRzKTsKCgkvKiBTZWUgaWYgdGhl IHNvY2tldCBpcyByZWFkeSAqLwoJaWYgKHdyaXRpbmcpCgkJY291bnQgPSBzZWxlY3Qocy0+ c29ja19mZCsxLCBOVUxMLCAmZmRzLCBOVUxMLCAmdHYpOwoJZWxzZQoJCWNvdW50ID0gc2Vs 
ZWN0KHMtPnNvY2tfZmQrMSwgJmZkcywgTlVMTCwgTlVMTCwgJnR2KTsKCgkvKiBDaGVjayBm b3IgZXJyb3JzICovCglpZiAoY291bnQgPCAwKSB7CgkJcy0+ZXJyb3JoYW5kbGVyKCk7CgkJ cmV0dXJuIC0xOwoJfQoKCS8qIFNldCB0aGUgZXJyb3IgaWYgdGhlIHRpbWVvdXQgaGFzIGVs YXBzZWQsIGkuZSwgd2Ugd2VyZSBub3QKCSAgcG9sbGVkLiAqLwoJaWYgKGNvdW50ID09IDAp CgkJdGltZW91dF9lcnIoKTsKCglyZXR1cm4gY291bnQ7Cn0KCi8qIEluaXRpYWxpemUgYSBu ZXcgc29ja2V0IG9iamVjdC4gKi8KCnN0YXRpYyB2b2lkCmluaXRfc29ja29iamVjdChQeVNv Y2tldFNvY2tPYmplY3QgKnMsCgkJU09DS0VUX1QgZmQsIGludCBmYW1pbHksIGludCB0eXBl LCBpbnQgcHJvdG8pCnsKI2lmZGVmIFJJU0NPUwoJaW50IGJsb2NrID0gMTsKI2VuZGlmCglz LT5zb2NrX2ZkID0gZmQ7CglzLT5zb2NrX2ZhbWlseSA9IGZhbWlseTsKCXMtPnNvY2tfdHlw ZSA9IHR5cGU7CglzLT5zb2NrX3Byb3RvID0gcHJvdG87CglzLT5zb2NrX2Jsb2NraW5nID0g MTsgLyogU3RhcnQgaW4gYmxvY2tpbmcgbW9kZSAqLwoJcy0+c29ja190aW1lb3V0ID0gLTEu MDsgLyogU3RhcnQgd2l0aG91dCB0aW1lb3V0ICovCgoJcy0+ZXJyb3JoYW5kbGVyID0gJnNl dF9lcnJvcjsKI2lmZGVmIFJJU0NPUwoJaWYgKHRhc2t3aW5kb3cpCgkJc29ja2V0aW9jdGwo cy0+c29ja19mZCwgMHg4MDA0NjY3OSwgKHVfbG9uZyopJmJsb2NrKTsKI2VuZGlmCn0KCgov KiBDcmVhdGUgYSBuZXcgc29ja2V0IG9iamVjdC4KICAgVGhpcyBqdXN0IGNyZWF0ZXMgdGhl IG9iamVjdCBhbmQgaW5pdGlhbGl6ZXMgaXQuCiAgIElmIHRoZSBjcmVhdGlvbiBmYWlscywg cmV0dXJuIE5VTEwgYW5kIHNldCBhbiBleGNlcHRpb24gKGltcGxpY2l0CiAgIGluIE5FV09C SigpKS4gKi8KCnN0YXRpYyBQeVNvY2tldFNvY2tPYmplY3QgKgpuZXdfc29ja29iamVjdChT T0NLRVRfVCBmZCwgaW50IGZhbWlseSwgaW50IHR5cGUsIGludCBwcm90bykKewoJUHlTb2Nr ZXRTb2NrT2JqZWN0ICpzOwoJcyA9IChQeVNvY2tldFNvY2tPYmplY3QgKikKCQlQeVR5cGVf R2VuZXJpY05ldygmc29ja190eXBlLCBOVUxMLCBOVUxMKTsKCWlmIChzICE9IE5VTEwpCgkJ aW5pdF9zb2Nrb2JqZWN0KHMsIGZkLCBmYW1pbHksIHR5cGUsIHByb3RvKTsKCXJldHVybiBz Owp9CgoKLyogTG9jayB0byBhbGxvdyBweXRob24gaW50ZXJwcmV0ZXIgdG8gY29udGludWUs IGJ1dCBvbmx5IGFsbG93IG9uZQogICB0aHJlYWQgdG8gYmUgaW4gZ2V0aG9zdGJ5bmFtZSAq LwojaWZkZWYgVVNFX0dFVEhPU1RCWU5BTUVfTE9DSwpQeVRocmVhZF90eXBlX2xvY2sgZ2V0 aG9zdGJ5bmFtZV9sb2NrOwojZW5kaWYKCgovKiBDb252ZXJ0IGEgc3RyaW5nIHNwZWNpZnlp bmcgYSBob3N0IG5hbWUgb3Igb25lIG9mIGEgZmV3IHN5bWJvbGljCiAgIG5hbWVzIHRvIGEg 
bnVtZXJpYyBJUCBhZGRyZXNzLiAgVGhpcyB1c3VhbGx5IGNhbGxzIGdldGhvc3RieW5hbWUo KQogICB0byBkbyB0aGUgd29yazsgdGhlIG5hbWVzICIiIGFuZCAiPGJyb2FkY2FzdD4iIGFy ZSBzcGVjaWFsLgogICBSZXR1cm4gdGhlIGxlbmd0aCAoSVB2NCBzaG91bGQgYmUgNCBieXRl cyksIG9yIG5lZ2F0aXZlIGlmCiAgIGFuIGVycm9yIG9jY3VycmVkOyB0aGVuIGFuIGV4Y2Vw dGlvbiBpcyByYWlzZWQuICovCgpzdGF0aWMgaW50CnNldGlwYWRkcihjaGFyICpuYW1lLCBz dHJ1Y3Qgc29ja2FkZHIgKmFkZHJfcmV0LCBpbnQgYWYpCnsKCXN0cnVjdCBhZGRyaW5mbyBo aW50cywgKnJlczsKCWludCBlcnJvcjsKCgltZW1zZXQoKHZvaWQgKikgYWRkcl9yZXQsICdc MCcsIHNpemVvZigqYWRkcl9yZXQpKTsKCWlmIChuYW1lWzBdID09ICdcMCcpIHsKCQlpbnQg c2l6OwoJCW1lbXNldCgmaGludHMsIDAsIHNpemVvZihoaW50cykpOwoJCWhpbnRzLmFpX2Zh bWlseSA9IGFmOwoJCWhpbnRzLmFpX3NvY2t0eXBlID0gU09DS19ER1JBTTsJLypkdW1teSov CgkJaGludHMuYWlfZmxhZ3MgPSBBSV9QQVNTSVZFOwoJCWVycm9yID0gZ2V0YWRkcmluZm8o TlVMTCwgIjAiLCAmaGludHMsICZyZXMpOwoJCWlmIChlcnJvcikgewoJCQlzZXRfZ2FpZXJy b3IoZXJyb3IpOwoJCQlyZXR1cm4gLTE7CgkJfQoJCXN3aXRjaCAocmVzLT5haV9mYW1pbHkp IHsKCQljYXNlIEFGX0lORVQ6CgkJCXNpeiA9IDQ7CgkJCWJyZWFrOwojaWZkZWYgRU5BQkxF X0lQVjYKCQljYXNlIEFGX0lORVQ2OgoJCQlzaXogPSAxNjsKCQkJYnJlYWs7CiNlbmRpZgoJ CWRlZmF1bHQ6CgkJCWZyZWVhZGRyaW5mbyhyZXMpOwoJCQlQeUVycl9TZXRTdHJpbmcoc29j a2V0X2Vycm9yLAoJCQkJInVuc3VwcG9ydGVkIGFkZHJlc3MgZmFtaWx5Iik7CgkJCXJldHVy biAtMTsKCQl9CgkJaWYgKHJlcy0+YWlfbmV4dCkgewoJCQlmcmVlYWRkcmluZm8ocmVzKTsK CQkJUHlFcnJfU2V0U3RyaW5nKHNvY2tldF9lcnJvciwKCQkJCSJ3aWxkY2FyZCByZXNvbHZl ZCB0byBtdWx0aXBsZSBhZGRyZXNzIik7CgkJCXJldHVybiAtMTsKCQl9CgkJbWVtY3B5KGFk ZHJfcmV0LCByZXMtPmFpX2FkZHIsIHJlcy0+YWlfYWRkcmxlbik7CgkJZnJlZWFkZHJpbmZv KHJlcyk7CgkJcmV0dXJuIHNpejsKCX0KCWlmIChuYW1lWzBdID09ICc8JyAmJiBzdHJjbXAo bmFtZSwgIjxicm9hZGNhc3Q+IikgPT0gMCkgewoJCXN0cnVjdCBzb2NrYWRkcl9pbiAqc2lu OwoJCWlmIChhZiAhPSBQRl9JTkVUICYmIGFmICE9IFBGX1VOU1BFQykgewoJCQlQeUVycl9T ZXRTdHJpbmcoc29ja2V0X2Vycm9yLAoJCQkJImFkZHJlc3MgZmFtaWx5IG1pc21hdGNoZWQi KTsKCQkJcmV0dXJuIC0xOwoJCX0KCQlzaW4gPSAoc3RydWN0IHNvY2thZGRyX2luICopYWRk cl9yZXQ7CgkJbWVtc2V0KCh2b2lkICopIHNpbiwgJ1wwJywgc2l6ZW9mKCpzaW4pKTsKCQlz 
aW4tPnNpbl9mYW1pbHkgPSBBRl9JTkVUOwojaWZkZWYgSEFWRV9TT0NLQUREUl9TQV9MRU4K CQlzaW4tPnNpbl9sZW4gPSBzaXplb2YoKnNpbik7CiNlbmRpZgoJCXNpbi0+c2luX2FkZHIu c19hZGRyID0gSU5BRERSX0JST0FEQ0FTVDsKCQlyZXR1cm4gc2l6ZW9mKHNpbi0+c2luX2Fk ZHIpOwoJfQoJbWVtc2V0KCZoaW50cywgMCwgc2l6ZW9mKGhpbnRzKSk7CgloaW50cy5haV9m YW1pbHkgPSBhZjsKCWVycm9yID0gZ2V0YWRkcmluZm8obmFtZSwgTlVMTCwgJmhpbnRzLCAm cmVzKTsKI2lmIGRlZmluZWQoX19kaWdpdGFsX18pICYmIGRlZmluZWQoX191bml4X18pCglp ZiAoZXJyb3IgPT0gRUFJX05PTkFNRSAmJiBhZiA9PSBBRl9VTlNQRUMpIHsKCQkvKiBPbiBU cnU2NCBWNS4xLCBudW1lcmljLXRvLWFkZHIgY29udmVyc2lvbiBmYWlscwoJCSAgIGlmIG5v IGFkZHJlc3MgZmFtaWx5IGlzIGdpdmVuLiBBc3N1bWUgSVB2NCBmb3Igbm93LiovCgkJaGlu dHMuYWlfZmFtaWx5ID0gQUZfSU5FVDsKCQllcnJvciA9IGdldGFkZHJpbmZvKG5hbWUsIE5V TEwsICZoaW50cywgJnJlcyk7Cgl9CiNlbmRpZgoJaWYgKGVycm9yKSB7CgkJc2V0X2dhaWVy cm9yKGVycm9yKTsKCQlyZXR1cm4gLTE7Cgl9CgltZW1jcHkoKGNoYXIgKikgYWRkcl9yZXQs IHJlcy0+YWlfYWRkciwgcmVzLT5haV9hZGRybGVuKTsKCWZyZWVhZGRyaW5mbyhyZXMpOwoJ c3dpdGNoIChhZGRyX3JldC0+c2FfZmFtaWx5KSB7CgljYXNlIEFGX0lORVQ6CgkJcmV0dXJu IDQ7CiNpZmRlZiBFTkFCTEVfSVBWNgoJY2FzZSBBRl9JTkVUNjoKCQlyZXR1cm4gMTY7CiNl bmRpZgoJZGVmYXVsdDoKCQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vycm9yLCAidW5rbm93 biBhZGRyZXNzIGZhbWlseSIpOwoJCXJldHVybiAtMTsKCX0KfQoKCi8qIENyZWF0ZSBhIHN0 cmluZyBvYmplY3QgcmVwcmVzZW50aW5nIGFuIElQIGFkZHJlc3MuCiAgIFRoaXMgaXMgYWx3 YXlzIGEgc3RyaW5nIG9mIHRoZSBmb3JtICdkZC5kZC5kZC5kZCcgKHdpdGggdmFyaWFibGUK ICAgc2l6ZSBudW1iZXJzKS4gKi8KCnN0YXRpYyBQeU9iamVjdCAqCm1ha2VpcGFkZHIoc3Ry dWN0IHNvY2thZGRyICphZGRyLCBpbnQgYWRkcmxlbikKewoJY2hhciBidWZbTklfTUFYSE9T VF07CglpbnQgZXJyb3I7CgoJZXJyb3IgPSBnZXRuYW1laW5mbyhhZGRyLCBhZGRybGVuLCBi dWYsIHNpemVvZihidWYpLCBOVUxMLCAwLAoJCU5JX05VTUVSSUNIT1NUKTsKCWlmIChlcnJv cikgewoJCXNldF9nYWllcnJvcihlcnJvcik7CgkJcmV0dXJuIE5VTEw7Cgl9CglyZXR1cm4g UHlTdHJpbmdfRnJvbVN0cmluZyhidWYpOwp9CgoKLyogQ3JlYXRlIGFuIG9iamVjdCByZXBy ZXNlbnRpbmcgdGhlIGdpdmVuIHNvY2tldCBhZGRyZXNzLAogICBzdWl0YWJsZSBmb3IgcGFz c2luZyBpdCBiYWNrIHRvIGJpbmQoKSwgY29ubmVjdCgpIGV0Yy4KICAgVGhlIGZhbWlseSBm 
aWVsZCBvZiB0aGUgc29ja2FkZHIgc3RydWN0dXJlIGlzIGluc3BlY3RlZAogICB0byBkZXRl cm1pbmUgd2hhdCBraW5kIG9mIGFkZHJlc3MgaXQgcmVhbGx5IGlzLiAqLwoKLypBUkdTVVNF RCovCnN0YXRpYyBQeU9iamVjdCAqCm1ha2Vzb2NrYWRkcihpbnQgc29ja2ZkLCBzdHJ1Y3Qg c29ja2FkZHIgKmFkZHIsIGludCBhZGRybGVuKQp7CglpZiAoYWRkcmxlbiA9PSAwKSB7CgkJ LyogTm8gYWRkcmVzcyAtLSBtYXkgYmUgcmVjdmZyb20oKSBmcm9tIGtub3duIHNvY2tldCAq LwoJCVB5X0lOQ1JFRihQeV9Ob25lKTsKCQlyZXR1cm4gUHlfTm9uZTsKCX0KCiNpZmRlZiBf X0JFT1NfXwoJLyogWFhYOiBCZU9TIHZlcnNpb24gb2YgYWNjZXB0KCkgZG9lc24ndCBzZXQg ZmFtaWx5IGNvcnJlY3RseSAqLwoJYWRkci0+c2FfZmFtaWx5ID0gQUZfSU5FVDsKI2VuZGlm CgoJc3dpdGNoIChhZGRyLT5zYV9mYW1pbHkpIHsKCgljYXNlIEFGX0lORVQ6Cgl7CgkJc3Ry dWN0IHNvY2thZGRyX2luICphOwoJCVB5T2JqZWN0ICphZGRyb2JqID0gbWFrZWlwYWRkcihh ZGRyLCBzaXplb2YoKmEpKTsKCQlQeU9iamVjdCAqcmV0ID0gTlVMTDsKCQlpZiAoYWRkcm9i aikgewoJCQlhID0gKHN0cnVjdCBzb2NrYWRkcl9pbiAqKWFkZHI7CgkJCXJldCA9IFB5X0J1 aWxkVmFsdWUoIk9pIiwgYWRkcm9iaiwgbnRvaHMoYS0+c2luX3BvcnQpKTsKCQkJUHlfREVD UkVGKGFkZHJvYmopOwoJCX0KCQlyZXR1cm4gcmV0OwoJfQoKI2lmZGVmIEFGX1VOSVgKCWNh c2UgQUZfVU5JWDoKCXsKCQlzdHJ1Y3Qgc29ja2FkZHJfdW4gKmEgPSAoc3RydWN0IHNvY2th ZGRyX3VuICopIGFkZHI7CgkJcmV0dXJuIFB5U3RyaW5nX0Zyb21TdHJpbmcoYS0+c3VuX3Bh dGgpOwoJfQojZW5kaWYgLyogQUZfVU5JWCAqLwoKI2lmZGVmIEVOQUJMRV9JUFY2CgljYXNl IEFGX0lORVQ2OgoJewoJCXN0cnVjdCBzb2NrYWRkcl9pbjYgKmE7CgkJUHlPYmplY3QgKmFk ZHJvYmogPSBtYWtlaXBhZGRyKGFkZHIsIHNpemVvZigqYSkpOwoJCVB5T2JqZWN0ICpyZXQg PSBOVUxMOwoJCWlmIChhZGRyb2JqKSB7CgkJCWEgPSAoc3RydWN0IHNvY2thZGRyX2luNiAq KWFkZHI7CgkJCXJldCA9IFB5X0J1aWxkVmFsdWUoIk9paWkiLAoJCQkJCSAgICBhZGRyb2Jq LAoJCQkJCSAgICBudG9ocyhhLT5zaW42X3BvcnQpLAoJCQkJCSAgICBhLT5zaW42X2Zsb3dp bmZvLAoJCQkJCSAgICBhLT5zaW42X3Njb3BlX2lkKTsKCQkJUHlfREVDUkVGKGFkZHJvYmop OwoJCX0KCQlyZXR1cm4gcmV0OwoJfQojZW5kaWYKCiNpZmRlZiBIQVZFX05FVFBBQ0tFVF9Q QUNLRVRfSAoJY2FzZSBBRl9QQUNLRVQ6Cgl7CgkJc3RydWN0IHNvY2thZGRyX2xsICphID0g KHN0cnVjdCBzb2NrYWRkcl9sbCAqKWFkZHI7CgkJY2hhciAqaWZuYW1lID0gIiI7CgkJc3Ry dWN0IGlmcmVxIGlmcjsKCQkvKiBuZWVkIHRvIGxvb2sgdXAgaW50ZXJmYWNlIG5hbWUgZ2l2 
ZSBpbmRleCAqLwoJCWlmIChhLT5zbGxfaWZpbmRleCkgewoJCQlpZnIuaWZyX2lmaW5kZXgg PSBhLT5zbGxfaWZpbmRleDsKCQkJaWYgKGlvY3RsKHNvY2tmZCwgU0lPQ0dJRk5BTUUsICZp ZnIpID09IDApCgkJCQlpZm5hbWUgPSBpZnIuaWZyX25hbWU7CgkJfQoJCXJldHVybiBQeV9C dWlsZFZhbHVlKCJzaGJocyMiLAoJCQkJICAgICBpZm5hbWUsCgkJCQkgICAgIG50b2hzKGEt PnNsbF9wcm90b2NvbCksCgkJCQkgICAgIGEtPnNsbF9wa3R0eXBlLAoJCQkJICAgICBhLT5z bGxfaGF0eXBlLAoJCQkJICAgICBhLT5zbGxfYWRkciwKCQkJCSAgICAgYS0+c2xsX2hhbGVu KTsKCX0KI2VuZGlmCgoJLyogTW9yZSBjYXNlcyBoZXJlLi4uICovCgoJZGVmYXVsdDoKCQkv KiBJZiB3ZSBkb24ndCBrbm93IHRoZSBhZGRyZXNzIGZhbWlseSwgZG9uJ3QgcmFpc2UgYW4K CQkgICBleGNlcHRpb24gLS0gcmV0dXJuIGl0IGFzIGEgdHVwbGUuICovCgkJcmV0dXJuIFB5 X0J1aWxkVmFsdWUoImlzIyIsCgkJCQkgICAgIGFkZHItPnNhX2ZhbWlseSwKCQkJCSAgICAg YWRkci0+c2FfZGF0YSwKCQkJCSAgICAgc2l6ZW9mKGFkZHItPnNhX2RhdGEpKTsKCgl9Cn0K CgovKiBQYXJzZSBhIHNvY2tldCBhZGRyZXNzIGFyZ3VtZW50IGFjY29yZGluZyB0byB0aGUg c29ja2V0IG9iamVjdCdzCiAgIGFkZHJlc3MgZmFtaWx5LiAgUmV0dXJuIDEgaWYgdGhlIGFk ZHJlc3Mgd2FzIGluIHRoZSBwcm9wZXIgZm9ybWF0LAogICAwIG9mIG5vdC4gIFRoZSBhZGRy ZXNzIGlzIHJldHVybmVkIHRocm91Z2ggYWRkcl9yZXQsIGl0cyBsZW5ndGgKICAgdGhyb3Vn aCBsZW5fcmV0LiAqLwoKc3RhdGljIGludApnZXRzb2NrYWRkcmFyZyhQeVNvY2tldFNvY2tP YmplY3QgKnMsIFB5T2JqZWN0ICphcmdzLAoJICAgICAgIHN0cnVjdCBzb2NrYWRkciAqKmFk ZHJfcmV0LCBpbnQgKmxlbl9yZXQpCnsKCXN3aXRjaCAocy0+c29ja19mYW1pbHkpIHsKCiNp ZmRlZiBBRl9VTklYCgljYXNlIEFGX1VOSVg6Cgl7CgkJc3RydWN0IHNvY2thZGRyX3VuKiBh ZGRyOwoJCWNoYXIgKnBhdGg7CgkJaW50IGxlbjsKCQlhZGRyID0gKHN0cnVjdCBzb2NrYWRk cl91biopJihzLT5zb2NrX2FkZHIpLnVuOwoJCWlmICghUHlBcmdfUGFyc2UoYXJncywgInQj IiwgJnBhdGgsICZsZW4pKQoJCQlyZXR1cm4gMDsKCQlpZiAobGVuID4gc2l6ZW9mIGFkZHIt PnN1bl9wYXRoKSB7CgkJCVB5RXJyX1NldFN0cmluZyhzb2NrZXRfZXJyb3IsCgkJCQkJIkFG X1VOSVggcGF0aCB0b28gbG9uZyIpOwoJCQlyZXR1cm4gMDsKCQl9CgkJYWRkci0+c3VuX2Zh bWlseSA9IHMtPnNvY2tfZmFtaWx5OwoJCW1lbWNweShhZGRyLT5zdW5fcGF0aCwgcGF0aCwg bGVuKTsKCQlhZGRyLT5zdW5fcGF0aFtsZW5dID0gMDsKCQkqYWRkcl9yZXQgPSAoc3RydWN0 IHNvY2thZGRyICopIGFkZHI7CgkJKmxlbl9yZXQgPSBsZW4gKyBzaXplb2YoKmFkZHIpIC0g 
c2l6ZW9mKGFkZHItPnN1bl9wYXRoKTsKCQlyZXR1cm4gMTsKCX0KI2VuZGlmIC8qIEFGX1VO SVggKi8KCgljYXNlIEFGX0lORVQ6Cgl7CgkJc3RydWN0IHNvY2thZGRyX2luKiBhZGRyOwoJ CWNoYXIgKmhvc3Q7CgkJaW50IHBvcnQ7CiAJCWFkZHI9KHN0cnVjdCBzb2NrYWRkcl9pbiop JihzLT5zb2NrX2FkZHIpLmluOwoJCWlmICghUHlUdXBsZV9DaGVjayhhcmdzKSkgewoJCQlQ eUVycl9Gb3JtYXQoCgkJCQlQeUV4Y19UeXBlRXJyb3IsCgkJCQkiZ2V0c29ja2FkZHJhcmc6 ICIKCQkJCSJBRl9JTkVUIGFkZHJlc3MgbXVzdCBiZSB0dXBsZSwgbm90ICUuNTAwcyIsCgkJ CQlhcmdzLT5vYl90eXBlLT50cF9uYW1lKTsKCQkJcmV0dXJuIDA7CgkJfQoJCWlmICghUHlB cmdfUGFyc2VUdXBsZShhcmdzLCAic2k6Z2V0c29ja2FkZHJhcmciLCAmaG9zdCwgJnBvcnQp KQoJCQlyZXR1cm4gMDsKCQlpZiAoc2V0aXBhZGRyKGhvc3QsIChzdHJ1Y3Qgc29ja2FkZHIg KilhZGRyLCBBRl9JTkVUKSA8IDApCgkJCXJldHVybiAwOwoJCWFkZHItPnNpbl9mYW1pbHkg PSBBRl9JTkVUOwoJCWFkZHItPnNpbl9wb3J0ID0gaHRvbnMoKHNob3J0KXBvcnQpOwoJCSph ZGRyX3JldCA9IChzdHJ1Y3Qgc29ja2FkZHIgKikgYWRkcjsKCQkqbGVuX3JldCA9IHNpemVv ZiAqYWRkcjsKCQlyZXR1cm4gMTsKCX0KCiNpZmRlZiBFTkFCTEVfSVBWNgoJY2FzZSBBRl9J TkVUNjoKCXsKCQlzdHJ1Y3Qgc29ja2FkZHJfaW42KiBhZGRyOwoJCWNoYXIgKmhvc3Q7CgkJ aW50IHBvcnQsIGZsb3dpbmZvLCBzY29wZV9pZDsKIAkJYWRkciA9IChzdHJ1Y3Qgc29ja2Fk ZHJfaW42KikmKHMtPnNvY2tfYWRkcikuaW42OwoJCWZsb3dpbmZvID0gc2NvcGVfaWQgPSAw OwoJCWlmICghUHlBcmdfUGFyc2VUdXBsZShhcmdzLCAic2l8aWkiLCAmaG9zdCwgJnBvcnQs ICZmbG93aW5mbywKCQkJCSAgICAgICZzY29wZV9pZCkpIHsKCQkJcmV0dXJuIDA7CgkJfQoJ CWlmIChzZXRpcGFkZHIoaG9zdCwgKHN0cnVjdCBzb2NrYWRkciAqKWFkZHIsIEFGX0lORVQ2 KSA8IDApCgkJCXJldHVybiAwOwoJCWFkZHItPnNpbjZfZmFtaWx5ID0gcy0+c29ja19mYW1p bHk7CgkJYWRkci0+c2luNl9wb3J0ID0gaHRvbnMoKHNob3J0KXBvcnQpOwoJCWFkZHItPnNp bjZfZmxvd2luZm8gPSBmbG93aW5mbzsKCQlhZGRyLT5zaW42X3Njb3BlX2lkID0gc2NvcGVf aWQ7CgkJKmFkZHJfcmV0ID0gKHN0cnVjdCBzb2NrYWRkciAqKSBhZGRyOwoJCSpsZW5fcmV0 ID0gc2l6ZW9mICphZGRyOwoJCXJldHVybiAxOwoJfQojZW5kaWYKCiNpZmRlZiBIQVZFX05F VFBBQ0tFVF9QQUNLRVRfSAoJY2FzZSBBRl9QQUNLRVQ6Cgl7CgkJc3RydWN0IHNvY2thZGRy X2xsKiBhZGRyOwoJCXN0cnVjdCBpZnJlcSBpZnI7CgkJY2hhciAqaW50ZXJmYWNlTmFtZTsK CQlpbnQgcHJvdG9OdW1iZXI7CgkJaW50IGhhdHlwZSA9IDA7CgkJaW50IHBrdHR5cGUgPSAw 
OwoJCWNoYXIgKmhhZGRyOwoKCQlpZiAoIVB5QXJnX1BhcnNlVHVwbGUoYXJncywgInNpfGlp cyIsICZpbnRlcmZhY2VOYW1lLAoJCQkJICAgICAgJnByb3RvTnVtYmVyLCAmcGt0dHlwZSwg JmhhdHlwZSwgJmhhZGRyKSkKCQkJcmV0dXJuIDA7CgkJc3RybmNweShpZnIuaWZyX25hbWUs IGludGVyZmFjZU5hbWUsIHNpemVvZihpZnIuaWZyX25hbWUpKTsKCQlpZnIuaWZyX25hbWVb KHNpemVvZihpZnIuaWZyX25hbWUpKS0xXSA9ICdcMCc7CgkJaWYgKGlvY3RsKHMtPnNvY2tf ZmQsIFNJT0NHSUZJTkRFWCwgJmlmcikgPCAwKSB7CgkJICAgICAgICBzLT5lcnJvcmhhbmRs ZXIoKTsKCQkJcmV0dXJuIDA7CgkJfQoJCWFkZHIgPSAmKHMtPnNvY2tfYWRkci5sbCk7CgkJ YWRkci0+c2xsX2ZhbWlseSA9IEFGX1BBQ0tFVDsKCQlhZGRyLT5zbGxfcHJvdG9jb2wgPSBo dG9ucygoc2hvcnQpcHJvdG9OdW1iZXIpOwoJCWFkZHItPnNsbF9pZmluZGV4ID0gaWZyLmlm cl9pZmluZGV4OwoJCWFkZHItPnNsbF9wa3R0eXBlID0gcGt0dHlwZTsKCQlhZGRyLT5zbGxf aGF0eXBlID0gaGF0eXBlOwoJCSphZGRyX3JldCA9IChzdHJ1Y3Qgc29ja2FkZHIgKikgYWRk cjsKCQkqbGVuX3JldCA9IHNpemVvZiAqYWRkcjsKCQlyZXR1cm4gMTsKCX0KI2VuZGlmCgoJ LyogTW9yZSBjYXNlcyBoZXJlLi4uICovCgoJZGVmYXVsdDoKCQlQeUVycl9TZXRTdHJpbmco c29ja2V0X2Vycm9yLCAiZ2V0c29ja2FkZHJhcmc6IGJhZCBmYW1pbHkiKTsKCQlyZXR1cm4g MDsKCgl9Cn0KCgovKiBHZXQgdGhlIGFkZHJlc3MgbGVuZ3RoIGFjY29yZGluZyB0byB0aGUg c29ja2V0IG9iamVjdCdzIGFkZHJlc3MgZmFtaWx5LgogICBSZXR1cm4gMSBpZiB0aGUgZmFt aWx5IGlzIGtub3duLCAwIG90aGVyd2lzZS4gIFRoZSBsZW5ndGggaXMgcmV0dXJuZWQKICAg dGhyb3VnaCBsZW5fcmV0LiAqLwoKc3RhdGljIGludApnZXRzb2NrYWRkcmxlbihQeVNvY2tl dFNvY2tPYmplY3QgKnMsIHNvY2tsZW5fdCAqbGVuX3JldCkKewoJc3dpdGNoIChzLT5zb2Nr X2ZhbWlseSkgewoKI2lmZGVmIEFGX1VOSVgKCWNhc2UgQUZfVU5JWDoKCXsKCQkqbGVuX3Jl dCA9IHNpemVvZiAoc3RydWN0IHNvY2thZGRyX3VuKTsKCQlyZXR1cm4gMTsKCX0KI2VuZGlm IC8qIEFGX1VOSVggKi8KCgljYXNlIEFGX0lORVQ6Cgl7CgkJKmxlbl9yZXQgPSBzaXplb2Yg KHN0cnVjdCBzb2NrYWRkcl9pbik7CgkJcmV0dXJuIDE7Cgl9CgojaWZkZWYgRU5BQkxFX0lQ VjYKCWNhc2UgQUZfSU5FVDY6Cgl7CgkJKmxlbl9yZXQgPSBzaXplb2YgKHN0cnVjdCBzb2Nr YWRkcl9pbjYpOwoJCXJldHVybiAxOwoJfQojZW5kaWYKCiNpZmRlZiBIQVZFX05FVFBBQ0tF VF9QQUNLRVRfSAoJY2FzZSBBRl9QQUNLRVQ6Cgl7CgkJKmxlbl9yZXQgPSBzaXplb2YgKHN0 cnVjdCBzb2NrYWRkcl9sbCk7CgkJcmV0dXJuIDE7Cgl9CiNlbmRpZgoKCS8qIE1vcmUgY2Fz 
ZXMgaGVyZS4uLiAqLwoKCWRlZmF1bHQ6CgkJUHlFcnJfU2V0U3RyaW5nKHNvY2tldF9lcnJv ciwgImdldHNvY2thZGRybGVuOiBiYWQgZmFtaWx5Iik7CgkJcmV0dXJuIDA7CgoJfQp9CgoK Lyogcy5hY2NlcHQoKSBtZXRob2QgKi8KCnN0YXRpYyBQeU9iamVjdCAqCnNvY2tfYWNjZXB0 KFB5U29ja2V0U29ja09iamVjdCAqcykKewoJY2hhciBhZGRyYnVmWzI1Nl07CglTT0NLRVRf VCBuZXdmZDsKCXNvY2tsZW5fdCBhZGRybGVuOwoJUHlPYmplY3QgKnNvY2sgPSBOVUxMOwoJ UHlPYmplY3QgKmFkZHIgPSBOVUxMOwoJUHlPYmplY3QgKnJlcyA9IE5VTEw7CgoJaWYgKCFn ZXRzb2NrYWRkcmxlbihzLCAmYWRkcmxlbikpCgkJcmV0dXJuIE5VTEw7CgltZW1zZXQoYWRk cmJ1ZiwgMCwgYWRkcmxlbik7CgoJZXJybm8gPSAwOyAvKiBSZXNldCBpbmRpY2F0b3IgZm9y IHVzZSB3aXRoIHRpbWVvdXQgYmVoYXZpb3IgKi8KCglQeV9CRUdJTl9BTExPV19USFJFQURT CgluZXdmZCA9IGFjY2VwdChzLT5zb2NrX2ZkLCAoc3RydWN0IHNvY2thZGRyICopIGFkZHJi dWYsICZhZGRybGVuKTsKCVB5X0VORF9BTExPV19USFJFQURTCgoJaWYgKHMtPnNvY2tfdGlt ZW91dCA+PSAwLjApIHsKI2lmZGVmIE1TX1dJTkRPV1MKCQlpZiAobmV3ZmQgPT0gSU5WQUxJ RF9TT0NLRVQpCgkJCWlmICghcy0+c29ja19ibG9ja2luZykKCQkJCXJldHVybiBzLT5lcnJv cmhhbmRsZXIoKTsKCQkJLyogQ2hlY2sgaWYgd2UgaGF2ZSBhIHRydWUgZmFpbHVyZQoJCQkg ICBmb3IgYSBibG9ja2luZyBzb2NrZXQgKi8KCQkJaWYgKGVycm5vICE9IFdTQUVXT1VMREJM T0NLKQoJCQkJcmV0dXJuIHMtPmVycm9yaGFuZGxlcigpOwojZWxzZQoJCWlmIChuZXdmZCA8 IDApIHsKCQkJaWYgKCFzLT5zb2NrX2Jsb2NraW5nKQoJCQkJcmV0dXJuIHMtPmVycm9yaGFu ZGxlcigpOwoJCQkvKiBDaGVjayBpZiB3ZSBoYXZlIGEgdHJ1ZSBmYWlsdXJlCgkJCSAgIGZv ciBhIGJsb2NraW5nIHNvY2tldCAqLwoJCQlpZiAoZXJybm8gIT0gRUFHQUlOICYmIGVycm5v ICE9IEVXT1VMREJMT0NLKQoJCQkJcmV0dXJuIHMtPmVycm9yaGFuZGxlcigpOwoJCX0KI2Vu ZGlmCgoJCS8qIHRyeSB3YWl0aW5nIHRoZSB0aW1lb3V0IHBlcmlvZCAqLwoJCWlmIChpbnRl cm5hbF9zZWxlY3QocywgMCkgPD0gMCkKCQkJcmV0dXJuIE5VTEw7CgoJCVB5X0JFR0lOX0FM TE9XX1RIUkVBRFMKCQluZXdmZCA9IGFjY2VwdChzLT5zb2NrX2ZkLAoJCQkgICAgICAgKHN0 cnVjdCBzb2NrYWRkciAqKWFkZHJidWYsCgkJCSAgICAgICAmYWRkcmxlbik7CgkJUHlfRU5E X0FMTE9XX1RIUkVBRFMKCX0KCgkvKiBBdCB0aGlzIHBvaW50LCB3ZSByZWFsbHkgaGF2ZSBh biBlcnJvciwgd2hldGhlciB1c2luZyB0aW1lb3V0CgkgICBiZWhhdmlvciBvciByZWd1bGFy IHNvY2tldCBiZWhhdmlvciAqLwojaWZkZWYgTVNfV0lORE9XUwoJaWYgKG5ld2ZkID09IElO 
VkFMSURfU09DS0VUKQojZWxzZQoJaWYgKG5ld2ZkIDwgMCkKI2VuZGlmCgkJcmV0dXJuIHMt PmVycm9yaGFuZGxlcigpOwoKCS8qIENyZWF0ZSB0aGUgbmV3IG9iamVjdCB3aXRoIHVuc3Bl Y2lmaWVkIGZhbWlseSwKCSAgIHRvIGF2b2lkIGNhbGxzIHRvIGJpbmQoKSBldGMuIG9uIGl0 LiAqLwoJc29jayA9IChQeU9iamVjdCAqKSBuZXdfc29ja29iamVjdChuZXdmZCwKCQkJCQkg ICBzLT5zb2NrX2ZhbWlseSwKCQkJCQkgICBzLT5zb2NrX3R5cGUsCgkJCQkJICAgcy0+c29j a19wcm90byk7CgoJaWYgKHNvY2sgPT0gTlVMTCkgewoJCVNPQ0tFVENMT1NFKG5ld2ZkKTsK CQlnb3RvIGZpbmFsbHk7Cgl9CglhZGRyID0gbWFrZXNvY2thZGRyKHMtPnNvY2tfZmQsIChz dHJ1Y3Qgc29ja2FkZHIgKilhZGRyYnVmLAoJCQkgICAgYWRkcmxlbik7CglpZiAoYWRkciA9 PSBOVUxMKQoJCWdvdG8gZmluYWxseTsKCglyZXMgPSBQeV9CdWlsZFZhbHVlKCJPTyIsIHNv Y2ssIGFkZHIpOwoKZmluYWxseToKCVB5X1hERUNSRUYoc29jayk7CglQeV9YREVDUkVGKGFk ZHIpOwoJcmV0dXJuIHJlczsKfQoKc3RhdGljIGNoYXIgYWNjZXB0X2RvY1tdID0KImFjY2Vw dCgpIC0+IChzb2NrZXQgb2JqZWN0LCBhZGRyZXNzIGluZm8pXG5cClxuXApXYWl0IGZvciBh biBpbmNvbWluZyBjb25uZWN0aW9uLiAgUmV0dXJuIGEgbmV3IHNvY2tldCByZXByZXNlbnRp bmcgdGhlXG5cCmNvbm5lY3Rpb24sIGFuZCB0aGUgYWRkcmVzcyBvZiB0aGUgY2xpZW50LiAg Rm9yIElQIHNvY2tldHMsIHRoZSBhZGRyZXNzXG5cCmluZm8gaXMgYSBwYWlyIChob3N0YWRk ciwgcG9ydCkuIjsKCi8qIHMuc2V0YmxvY2tpbmcoMSB8IDApIG1ldGhvZCAqLwoKc3RhdGlj IFB5T2JqZWN0ICoKc29ja19zZXRibG9ja2luZyhQeVNvY2tldFNvY2tPYmplY3QgKnMsIFB5 T2JqZWN0ICphcmcpCnsKCWludCBibG9jazsKCglibG9jayA9IFB5SW50X0FzTG9uZyhhcmcp OwoJaWYgKGJsb2NrID09IC0xICYmIFB5RXJyX09jY3VycmVkKCkpCgkJcmV0dXJuIE5VTEw7 CgoJcy0+c29ja19ibG9ja2luZyA9IGJsb2NrOwoJcy0+c29ja190aW1lb3V0ID0gLTEuMDsg LyogQWx3YXlzIGNsZWFyIHRoZSB0aW1lb3V0ICovCglpbnRlcm5hbF9zZXRibG9ja2luZyhz LCBibG9jayk7CgoJUHlfSU5DUkVGKFB5X05vbmUpOwoJcmV0dXJuIFB5X05vbmU7Cn0KCnN0 YXRpYyBjaGFyIHNldGJsb2NraW5nX2RvY1tdID0KInNldGJsb2NraW5nKGZsYWcpXG5cClxu XApTZXQgdGhlIHNvY2tldCB0byBibG9ja2luZyAoZmxhZyBpcyB0cnVlKSBvciBub24tYmxv Y2tpbmcgKGZhbHNlKS5cblwKVGhpcyB1c2VzIHRoZSBGSU9OQklPIGlvY3RsIHdpdGggdGhl IE9fTkRFTEFZIGZsYWcuIjsKCi8qIHMuc2V0dGltZW91dChOb25lIHwgZmxvYXQpIG1ldGhv ZC4KICAgQ2F1c2VzIGFuIGV4Y2VwdGlvbiB0byBiZSByYWlzZWQgd2hlbiB0aGUgZ2l2ZW4g 
dGltZSBoYXMKICAgZWxhcHNlZCB3aGVuIHBlcmZvcm1pbmcgYSBibG9ja2luZyBzb2NrZXQg b3BlcmF0aW9uLiAqLwpzdGF0aWMgUHlPYmplY3QgKgpzb2NrX3NldHRpbWVvdXQoUHlTb2Nr ZXRTb2NrT2JqZWN0ICpzLCBQeU9iamVjdCAqYXJnKQp7Cglkb3VibGUgdmFsdWU7CgoJaWYg KGFyZyA9PSBQeV9Ob25lKQoJCXZhbHVlID0gLTEuMDsKCWVsc2UgewoJCXZhbHVlID0gUHlG bG9hdF9Bc0RvdWJsZShhcmcpOwoJCWlmICh2YWx1ZSA8IDAuMCkgewoJCQlpZiAoIVB5RXJy X09jY3VycmVkKCkpCgkJCQlQeUVycl9TZXRTdHJpbmcoUHlFeGNfVmFsdWVFcnJvciwKCQkJ CQkJIkludmFsaWQgdGltZW91dCB2YWx1ZSIpOwoJCQlyZXR1cm4gTlVMTDsKCQl9Cgl9CgoJ cy0+c29ja190aW1lb3V0ID0gdmFsdWU7CgoJLyogVGhlIHNlbWFudGljcyBvZiBzZXR0aW5n IHNvY2tldCB0aW1lb3V0cyBhcmU6CgkgICBJZiB5b3Ugc2V0dGltZW91dCghPU5vbmUpOgoJ ICAgICAgIFRoZSBhY3R1YWwgc29ja2V0IGdldHMgcHV0IGluIG5vbi1ibG9ja2luZyBtb2Rl IGFuZCB0aGUgc2VsZWN0CgkgICAgICAgaXMgdXNlZCB0byBjb250cm9sIHRpbWVvdXRzLgoJ ICAgRWxzZSBpZiB5b3Ugc2V0dGltZW91dChOb25lKSBbdGhlbiB2YWx1ZSBpcyAtMS4wXToK CSAgICAgICBUaGUgb2xkIGJlaGF2aW9yIGlzIHVzZWQgQU5EIGF1dG9tYXRpY2FsbHksIHRo ZSBzb2NrZXQgaXMgc2V0CgkgICAgICAgdG8gYmxvY2tpbmcgbW9kZS4gVGhhdCBtZWFucyB0 aGF0IHNvbWVvbmUgd2hvIHdhcyBkb2luZwoJICAgICAgIG5vbi1ibG9ja2luZyBzdHVmZiBi ZWZvcmUsIHNldHMgYSB0aW1lb3V0LCBhbmQgdGhlbiB1bnNldHMKCSAgICAgICBvbmUsIHdp bGwgaGF2ZSB0byBjYWxsIHNldGJsb2NraW5nKDApIGFnYWluIGlmIGhlIHdhbnRzCgkgICAg ICAgbm9uLWJsb2NraW5nIHN0dWZmLiBUaGlzIG1ha2VzIHNlbnNlIGJlY2F1c2UgdGltZW91 dCBzdHVmZiBpcwoJICAgICAgIGJsb2NraW5nIGJ5IG5hdHVyZS4gKi8KCWludGVybmFsX3Nl dGJsb2NraW5nKHMsIHZhbHVlIDwgMC4wKTsKCglzLT5zb2NrX2Jsb2NraW5nID0gMTsgLyog QWx3YXlzIG5lZ2F0ZSBzZXRibG9ja2luZygpICovCgoJUHlfSU5DUkVGKFB5X05vbmUpOwoJ cmV0dXJuIFB5X05vbmU7Cn0KCnN0YXRpYyBjaGFyIHNldHRpbWVvdXRfZG9jW10gPQoic2V0 dGltZW91dCh0aW1lb3V0KVxuXApcblwKU2V0IGEgdGltZW91dCBvbiBibG9ja2luZyBzb2Nr ZXQgb3BlcmF0aW9ucy4gICd0aW1lb3V0JyBjYW4gYmUgYSBmbG9hdCxcblwKZ2l2aW5nIHNl Y29uZHMsIG9yIE5vbmUuICBTZXR0aW5nIGEgdGltZW91dCBvZiBOb25lIGRpc2FibGVzIHRp bWVvdXQuIjsKCi8qIHMuZ2V0dGltZW91dCgpIG1ldGhvZC4KICAgUmV0dXJucyB0aGUgdGlt ZW91dCBhc3NvY2lhdGVkIHdpdGggYSBzb2NrZXQuICovCnN0YXRpYyBQeU9iamVjdCAqCnNv 
Y2tfZ2V0dGltZW91dChQeVNvY2tldFNvY2tPYmplY3QgKnMpCnsKCWlmIChzLT5zb2NrX3Rp bWVvdXQgPCAwLjApIHsKCQlQeV9JTkNSRUYoUHlfTm9uZSk7CgkJcmV0dXJuIFB5X05vbmU7 Cgl9CgllbHNlCgkJcmV0dXJuIFB5RmxvYXRfRnJvbURvdWJsZShzLT5zb2NrX3RpbWVvdXQp Owp9CgpzdGF0aWMgY2hhciBnZXR0aW1lb3V0X2RvY1tdID0KImdldHRpbWVvdXQoKVxuXApc blwKUmV0dXJucyB0aGUgdGltZW91dCBpbiBmbG9hdGluZyBzZWNvbmRzIGFzc29jaWF0ZWQg d2l0aCBzb2NrZXQgXG5cCm9wZXJhdGlvbnMuIEEgdGltZW91dCBvZiBOb25lIGluZGljYXRl cyB0aGF0IHRpbWVvdXRzIG9uIHNvY2tldCBcblwKb3BlcmF0aW9ucyBhcmUgZGlzYWJsZWQu IjsKCiNpZmRlZiBSSVNDT1MKLyogcy5zbGVlcHRhc2t3KDEgfCAwKSBtZXRob2QgKi8KCnN0 YXRpYyBQeU9iamVjdCAqCnNvY2tfc2xlZXB0YXNrdyhQeVNvY2tldFNvY2tPYmplY3QgKnMs UHlPYmplY3QgKmFyZ3MpCnsKCWludCBibG9jazsKCWludCBkZWxheV9mbGFnOwoJaWYgKCFQ eUFyZ19QYXJzZShhcmdzLCAiaSIsICZibG9jaykpCgkJcmV0dXJuIE5VTEw7CglQeV9CRUdJ Tl9BTExPV19USFJFQURTCglzb2NrZXRpb2N0bChzLT5zb2NrX2ZkLCAweDgwMDQ2Njc5LCAo dV9sb25nKikmYmxvY2spOwoJUHlfRU5EX0FMTE9XX1RIUkVBRFMKCglQeV9JTkNSRUYoUHlf Tm9uZSk7CglyZXR1cm4gUHlfTm9uZTsKfQpzdGF0aWMgY2hhciBzbGVlcHRhc2t3X2RvY1td ID0KInNsZWVwdGFza3coZmxhZylcblwKXG5cCkFsbG93IHNsZWVwcyBpbiB0YXNrd2luZG93 cy4iOwojZW5kaWYKCgovKiBzLnNldHNvY2tvcHQoKSBtZXRob2QuCiAgIFdpdGggYW4gaW50 ZWdlciB0aGlyZCBhcmd1bWVudCwgc2V0cyBhbiBpbnRlZ2VyIG9wdGlvbi4KICAgV2l0aCBh IHN0cmluZyB0aGlyZCBhcmd1bWVudCwgc2V0cyBhbiBvcHRpb24gZnJvbSBhIGJ1ZmZlcjsK ICAgdXNlIG9wdGlvbmFsIGJ1aWx0LWluIG1vZHVsZSAnc3RydWN0JyB0byBlbmNvZGUgdGhl IHN0cmluZy4gKi8KCnN0YXRpYyBQeU9iamVjdCAqCnNvY2tfc2V0c29ja29wdChQeVNvY2tl dFNvY2tPYmplY3QgKnMsIFB5T2JqZWN0ICphcmdzKQp7CglpbnQgbGV2ZWw7CglpbnQgb3B0 bmFtZTsKCWludCByZXM7CgljaGFyICpidWY7CglpbnQgYnVmbGVuOwoJaW50IGZsYWc7CgoJ aWYgKFB5QXJnX1BhcnNlVHVwbGUoYXJncywgImlpaTpzZXRzb2Nrb3B0IiwKCQkJICAgICAm bGV2ZWwsICZvcHRuYW1lLCAmZmxhZykpIHsKCQlidWYgPSAoY2hhciAqKSAmZmxhZzsKCQli dWZsZW4gPSBzaXplb2YgZmxhZzsKCX0KCWVsc2UgewoJCVB5RXJyX0NsZWFyKCk7CgkJaWYg KCFQeUFyZ19QYXJzZVR1cGxlKGFyZ3MsICJpaXMjOnNldHNvY2tvcHQiLAoJCQkJICAgICAg JmxldmVsLCAmb3B0bmFtZSwgJmJ1ZiwgJmJ1ZmxlbikpCgkJCXJldHVybiBOVUxMOwoJfQoJ 
cmVzID0gc2V0c29ja29wdChzLT5zb2NrX2ZkLCBsZXZlbCwgb3B0bmFtZSwgKHZvaWQgKili dWYsIGJ1Zmxlbik7CglpZiAocmVzIDwgMCkKCQlyZXR1cm4gcy0+ZXJyb3JoYW5kbGVyKCk7 CglQeV9JTkNSRUYoUHlfTm9uZSk7CglyZXR1cm4gUHlfTm9uZTsKfQoKc3RhdGljIGNoYXIg c2V0c29ja29wdF9kb2NbXSA9CiJzZXRzb2Nrb3B0KGxldmVsLCBvcHRpb24sIHZhbHVlKVxu XApcblwKU2V0IGEgc29ja2V0IG9wdGlvbi4gIFNlZSB0aGUgVW5peCBtYW51YWwgZm9yIGxl dmVsIGFuZCBvcHRpb24uXG5cClRoZSB2YWx1ZSBhcmd1bWVudCBjYW4gZWl0aGVyIGJlIGFu IGludGVnZXIgb3IgYSBzdHJpbmcuIjsKCgovKiBzLmdldHNvY2tvcHQoKSBtZXRob2QuCiAg IFdpdGggdHdvIGFyZ3VtZW50cywgcmV0cmlldmVzIGFuIGludGVnZXIgb3B0aW9uLgogICBX aXRoIGEgdGhpcmQgaW50ZWdlciBhcmd1bWVudCwgcmV0cmlldmVzIGEgc3RyaW5nIGJ1ZmZl ciBvZiB0aGF0IHNpemU7CiAgIHVzZSBvcHRpb25hbCBidWlsdC1pbiBtb2R1bGUgJ3N0cnVj dCcgdG8gZGVjb2RlIHRoZSBzdHJpbmcuICovCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrX2dl dHNvY2tvcHQoUHlTb2NrZXRTb2NrT2JqZWN0ICpzLCBQeU9iamVjdCAqYXJncykKewoJaW50 IGxldmVsOwoJaW50IG9wdG5hbWU7CglpbnQgcmVzOwoJUHlPYmplY3QgKmJ1ZjsKCXNvY2ts ZW5fdCBidWZsZW4gPSAwOwoKI2lmZGVmIF9fQkVPU19fCgkvKiBXZSBoYXZlIGluY29tcGxl dGUgc29ja2V0IHN1cHBvcnQuICovCglQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vycm9yLCAi Z2V0c29ja29wdCBub3Qgc3VwcG9ydGVkIik7CglyZXR1cm4gTlVMTDsKI2Vsc2UKCglpZiAo IVB5QXJnX1BhcnNlVHVwbGUoYXJncywgImlpfGk6Z2V0c29ja29wdCIsCgkJCSAgICAgICZs ZXZlbCwgJm9wdG5hbWUsICZidWZsZW4pKQoJCXJldHVybiBOVUxMOwoKCWlmIChidWZsZW4g PT0gMCkgewoJCWludCBmbGFnID0gMDsKCQlzb2NrbGVuX3QgZmxhZ3NpemUgPSBzaXplb2Yg ZmxhZzsKCQlyZXMgPSBnZXRzb2Nrb3B0KHMtPnNvY2tfZmQsIGxldmVsLCBvcHRuYW1lLAoJ CQkJICh2b2lkICopJmZsYWcsICZmbGFnc2l6ZSk7CgkJaWYgKHJlcyA8IDApCgkJCXJldHVy biBzLT5lcnJvcmhhbmRsZXIoKTsKCQlyZXR1cm4gUHlJbnRfRnJvbUxvbmcoZmxhZyk7Cgl9 CglpZiAoYnVmbGVuIDw9IDAgfHwgYnVmbGVuID4gMTAyNCkgewoJCVB5RXJyX1NldFN0cmlu Zyhzb2NrZXRfZXJyb3IsCgkJCQkiZ2V0c29ja29wdCBidWZsZW4gb3V0IG9mIHJhbmdlIik7 CgkJcmV0dXJuIE5VTEw7Cgl9CglidWYgPSBQeVN0cmluZ19Gcm9tU3RyaW5nQW5kU2l6ZSgo Y2hhciAqKU5VTEwsIGJ1Zmxlbik7CglpZiAoYnVmID09IE5VTEwpCgkJcmV0dXJuIE5VTEw7 CglyZXMgPSBnZXRzb2Nrb3B0KHMtPnNvY2tfZmQsIGxldmVsLCBvcHRuYW1lLAoJCQkgKHZv 
aWQgKilQeVN0cmluZ19BU19TVFJJTkcoYnVmKSwgJmJ1Zmxlbik7CglpZiAocmVzIDwgMCkg ewoJCVB5X0RFQ1JFRihidWYpOwoJCXJldHVybiBzLT5lcnJvcmhhbmRsZXIoKTsKCX0KCV9Q eVN0cmluZ19SZXNpemUoJmJ1ZiwgYnVmbGVuKTsKCXJldHVybiBidWY7CiNlbmRpZiAvKiBf X0JFT1NfXyAqLwp9CgpzdGF0aWMgY2hhciBnZXRzb2Nrb3B0X2RvY1tdID0KImdldHNvY2tv cHQobGV2ZWwsIG9wdGlvblssIGJ1ZmZlcnNpemVdKSAtPiB2YWx1ZVxuXApcblwKR2V0IGEg c29ja2V0IG9wdGlvbi4gIFNlZSB0aGUgVW5peCBtYW51YWwgZm9yIGxldmVsIGFuZCBvcHRp b24uXG5cCklmIGEgbm9uemVybyBidWZmZXJzaXplIGFyZ3VtZW50IGlzIGdpdmVuLCB0aGUg cmV0dXJuIHZhbHVlIGlzIGFcblwKc3RyaW5nIG9mIHRoYXQgbGVuZ3RoOyBvdGhlcndpc2Ug aXQgaXMgYW4gaW50ZWdlci4iOwoKCi8qIHMuYmluZChzb2NrYWRkcikgbWV0aG9kICovCgpz dGF0aWMgUHlPYmplY3QgKgpzb2NrX2JpbmQoUHlTb2NrZXRTb2NrT2JqZWN0ICpzLCBQeU9i amVjdCAqYWRkcm8pCnsKCXN0cnVjdCBzb2NrYWRkciAqYWRkcjsKCWludCBhZGRybGVuOwoJ aW50IHJlczsKCglpZiAoIWdldHNvY2thZGRyYXJnKHMsIGFkZHJvLCAmYWRkciwgJmFkZHJs ZW4pKQoJCXJldHVybiBOVUxMOwoJUHlfQkVHSU5fQUxMT1dfVEhSRUFEUwoJcmVzID0gYmlu ZChzLT5zb2NrX2ZkLCBhZGRyLCBhZGRybGVuKTsKCVB5X0VORF9BTExPV19USFJFQURTCglp ZiAocmVzIDwgMCkKCQlyZXR1cm4gcy0+ZXJyb3JoYW5kbGVyKCk7CglQeV9JTkNSRUYoUHlf Tm9uZSk7CglyZXR1cm4gUHlfTm9uZTsKfQoKc3RhdGljIGNoYXIgYmluZF9kb2NbXSA9CiJi aW5kKGFkZHJlc3MpXG5cClxuXApCaW5kIHRoZSBzb2NrZXQgdG8gYSBsb2NhbCBhZGRyZXNz LiAgRm9yIElQIHNvY2tldHMsIHRoZSBhZGRyZXNzIGlzIGFcblwKcGFpciAoaG9zdCwgcG9y dCk7IHRoZSBob3N0IG11c3QgcmVmZXIgdG8gdGhlIGxvY2FsIGhvc3QuIEZvciByYXcgcGFj a2V0XG5cCnNvY2tldHMgdGhlIGFkZHJlc3MgaXMgYSB0dXBsZSAoaWZuYW1lLCBwcm90byBb LHBrdHR5cGUgWyxoYXR5cGVdXSkiOwoKCi8qIHMuY2xvc2UoKSBtZXRob2QuCiAgIFNldCB0 aGUgZmlsZSBkZXNjcmlwdG9yIHRvIC0xIHNvIG9wZXJhdGlvbnMgdHJpZWQgc3Vic2VxdWVu dGx5CiAgIHdpbGwgc3VyZWx5IGZhaWwuICovCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrX2Ns b3NlKFB5U29ja2V0U29ja09iamVjdCAqcykKewoJU09DS0VUX1QgZmQ7CgoJaWYgKChmZCA9 IHMtPnNvY2tfZmQpICE9IC0xKSB7CgkJcy0+c29ja19mZCA9IC0xOwoJCVB5X0JFR0lOX0FM TE9XX1RIUkVBRFMKCQkodm9pZCkgU09DS0VUQ0xPU0UoZmQpOwoJCVB5X0VORF9BTExPV19U SFJFQURTCgl9CglQeV9JTkNSRUYoUHlfTm9uZSk7CglyZXR1cm4gUHlfTm9uZTsKfQoKc3Rh 
dGljIGNoYXIgY2xvc2VfZG9jW10gPQoiY2xvc2UoKVxuXApcblwKQ2xvc2UgdGhlIHNvY2tl dC4gIEl0IGNhbm5vdCBiZSB1c2VkIGFmdGVyIHRoaXMgY2FsbC4iOwoKCi8qIHMuY29ubmVj dChzb2NrYWRkcikgbWV0aG9kICovCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrX2Nvbm5lY3Qo UHlTb2NrZXRTb2NrT2JqZWN0ICpzLCBQeU9iamVjdCAqYWRkcm8pCnsKCXN0cnVjdCBzb2Nr YWRkciAqYWRkcjsKCWludCBhZGRybGVuOwoJaW50IHJlczsKCglpZiAoIWdldHNvY2thZGRy YXJnKHMsIGFkZHJvLCAmYWRkciwgJmFkZHJsZW4pKQoJCXJldHVybiBOVUxMOwoKCWVycm5v ID0gMDsgLyogUmVzZXQgdGhlIGVyciBpbmRpY2F0b3IgZm9yIHVzZSB3aXRoIHRpbWVvdXRz ICovCgoJUHlfQkVHSU5fQUxMT1dfVEhSRUFEUwoJcmVzID0gY29ubmVjdChzLT5zb2NrX2Zk LCBhZGRyLCBhZGRybGVuKTsKCVB5X0VORF9BTExPV19USFJFQURTCgoJaWYgKHMtPnNvY2tf dGltZW91dCA+PSAwLjApIHsKCQlpZiAocmVzIDwgMCkgewoJCQkvKiBSZXR1cm4gaWYgd2Un cmUgYWxyZWFkeSBjb25uZWN0ZWQgKi8KI2lmZGVmIE1TX1dJTkRPV1MKCQkJaWYgKGVycm5v ID09IFdTQUVJTlZBTCB8fCBlcnJubyA9PSBXU0FFSVNDT05OKQojZWxzZQoJCQlpZiAoZXJy bm8gPT0gRUlTQ09OTikKI2VuZGlmCgkJCQlnb3RvIGNvbm5lY3RlZDsKCgkJCS8qIENoZWNr IGlmIHdlIGhhdmUgYW4gZXJyb3IgKi8KCQkJaWYgKCFzLT5zb2NrX2Jsb2NraW5nKQoJCQkJ cmV0dXJuIHMtPmVycm9yaGFuZGxlcigpOwoJCQkvKiBDaGVjayBpZiB3ZSBoYXZlIGEgdHJ1 ZSBmYWlsdXJlCgkJCSAgIGZvciBhIGJsb2NraW5nIHNvY2tldCAqLwojaWZkZWYgTVNfV0lO RE9XUwoJCQlpZiAoZXJybm8gIT0gV1NBRVdPVUxEQkxPQ0spCiNlbHNlCgkJCWlmIChlcnJu byAhPSBFSU5QUk9HUkVTUyAmJiBlcnJubyAhPSBFQUxSRUFEWSAmJgoJCQkgICAgZXJybm8g IT0gRVdPVUxEQkxPQ0spCiNlbmRpZgoJCQkJcmV0dXJuIHMtPmVycm9yaGFuZGxlcigpOwoJ CX0KCgkJLyogQ2hlY2sgaWYgd2UncmUgcmVhZHkgZm9yIHRoZSBjb25uZWN0IHZpYSBzZWxl Y3QgKi8KCQlpZiAoaW50ZXJuYWxfc2VsZWN0KHMsIDEpIDw9IDApCgkJCXJldHVybiBOVUxM OwoKCQkvKiBDb21wbGV0ZSB0aGUgY29ubmVjdGlvbiBub3cgKi8KCQlQeV9CRUdJTl9BTExP V19USFJFQURTCgkJcmVzID0gY29ubmVjdChzLT5zb2NrX2ZkLCBhZGRyLCBhZGRybGVuKTsK CQlQeV9FTkRfQUxMT1dfVEhSRUFEUwoJfQoKCWlmIChyZXMgPCAwKQoJCXJldHVybiBzLT5l cnJvcmhhbmRsZXIoKTsKCmNvbm5lY3RlZDoKCVB5X0lOQ1JFRihQeV9Ob25lKTsKCXJldHVy biBQeV9Ob25lOwp9CgpzdGF0aWMgY2hhciBjb25uZWN0X2RvY1tdID0KImNvbm5lY3QoYWRk cmVzcylcblwKXG5cCkNvbm5lY3QgdGhlIHNvY2tldCB0byBhIHJlbW90ZSBhZGRyZXNzLiAg 
Rm9yIElQIHNvY2tldHMsIHRoZSBhZGRyZXNzXG5cCmlzIGEgcGFpciAoaG9zdCwgcG9ydCku IjsKCgovKiBzLmNvbm5lY3RfZXgoc29ja2FkZHIpIG1ldGhvZCAqLwoKc3RhdGljIFB5T2Jq ZWN0ICoKc29ja19jb25uZWN0X2V4KFB5U29ja2V0U29ja09iamVjdCAqcywgUHlPYmplY3Qg KmFkZHJvKQp7CglzdHJ1Y3Qgc29ja2FkZHIgKmFkZHI7CglpbnQgYWRkcmxlbjsKCWludCBy ZXM7CgoJaWYgKCFnZXRzb2NrYWRkcmFyZyhzLCBhZGRybywgJmFkZHIsICZhZGRybGVuKSkK CQlyZXR1cm4gTlVMTDsKCgllcnJubyA9IDA7IC8qIFJlc2V0IHRoZSBlcnIgaW5kaWNhdG9y IGZvciB1c2Ugd2l0aCB0aW1lb3V0cyAqLwoKCVB5X0JFR0lOX0FMTE9XX1RIUkVBRFMKCXJl cyA9IGNvbm5lY3Qocy0+c29ja19mZCwgYWRkciwgYWRkcmxlbik7CglQeV9FTkRfQUxMT1df VEhSRUFEUwoKCWlmIChzLT5zb2NrX3RpbWVvdXQgPj0gMC4wKSB7CgkJaWYgKHJlcyA8IDAp IHsKCQkJLyogUmV0dXJuIGlmIHdlJ3JlIGFscmVhZHkgY29ubmVjdGVkICovCiNpZmRlZiBN U19XSU5ET1dTCgkJCWlmIChlcnJubyA9PSBXU0FFSU5WQUwgfHwgZXJybm8gPT0gV1NBRUlT Q09OTikKI2Vsc2UKCQkJaWYgKGVycm5vID09IEVJU0NPTk4pCiNlbmRpZgoJCQkJZ290byBj b25leF9maW5hbGx5OwoKCQkJLyogQ2hlY2sgaWYgd2UgaGF2ZSBhbiBlcnJvciAqLwoJCQlp ZiAoIXMtPnNvY2tfYmxvY2tpbmcpCgkJCQlnb3RvIGNvbmV4X2ZpbmFsbHk7CgkJCS8qIENo ZWNrIGlmIHdlIGhhdmUgYSB0cnVlIGZhaWx1cmUKCQkJICAgZm9yIGEgYmxvY2tpbmcgc29j a2V0ICovCiNpZmRlZiBNU19XSU5ET1dTCgkJCWlmIChlcnJubyAhPSBXU0FFV09VTERCTE9D SykKI2Vsc2UKCQkJaWYgKGVycm5vICE9IEVJTlBST0dSRVNTICYmIGVycm5vICE9IEVBTFJF QURZICYmCgkJCSAgICBlcnJubyAhPSBFV09VTERCTE9DSykKI2VuZGlmCgkJCQlnb3RvIGNv bmV4X2ZpbmFsbHk7CgkJfQoKCQkvKiBDaGVjayBpZiB3ZSdyZSByZWFkeSBmb3IgdGhlIGNv bm5lY3QgdmlhIHNlbGVjdCAqLwoJCWlmIChpbnRlcm5hbF9zZWxlY3QocywgMSkgPD0gMCkK CQkJcmV0dXJuIE5VTEw7CgoJCS8qIENvbXBsZXRlIHRoZSBjb25uZWN0aW9uIG5vdyAqLwoJ CVB5X0JFR0lOX0FMTE9XX1RIUkVBRFMKCQlyZXMgPSBjb25uZWN0KHMtPnNvY2tfZmQsIGFk ZHIsIGFkZHJsZW4pOwoJCVB5X0VORF9BTExPV19USFJFQURTCgl9CgoJaWYgKHJlcyAhPSAw KSB7CiNpZmRlZiBNU19XSU5ET1dTCgkJcmVzID0gV1NBR2V0TGFzdEVycm9yKCk7CiNlbHNl CgkJcmVzID0gZXJybm87CiNlbmRpZgoJfQoKY29uZXhfZmluYWxseToKCXJldHVybiBQeUlu dF9Gcm9tTG9uZygobG9uZykgcmVzKTsKfQoKc3RhdGljIGNoYXIgY29ubmVjdF9leF9kb2Nb XSA9CiJjb25uZWN0X2V4KGFkZHJlc3MpXG5cClxuXApUaGlzIGlzIGxpa2UgY29ubmVjdChh 
ZGRyZXNzKSwgYnV0IHJldHVybnMgYW4gZXJyb3IgY29kZSAodGhlIGVycm5vIHZhbHVlKVxu XAppbnN0ZWFkIG9mIHJhaXNpbmcgYW4gZXhjZXB0aW9uIHdoZW4gYW4gZXJyb3Igb2NjdXJz LiI7CgoKLyogcy5maWxlbm8oKSBtZXRob2QgKi8KCnN0YXRpYyBQeU9iamVjdCAqCnNvY2tf ZmlsZW5vKFB5U29ja2V0U29ja09iamVjdCAqcykKewojaWYgU0laRU9GX1NPQ0tFVF9UIDw9 IFNJWkVPRl9MT05HCglyZXR1cm4gUHlJbnRfRnJvbUxvbmcoKGxvbmcpIHMtPnNvY2tfZmQp OwojZWxzZQoJcmV0dXJuIFB5TG9uZ19Gcm9tTG9uZ0xvbmcoKExPTkdfTE9ORylzLT5zb2Nr X2ZkKTsKI2VuZGlmCn0KCnN0YXRpYyBjaGFyIGZpbGVub19kb2NbXSA9CiJmaWxlbm8oKSAt PiBpbnRlZ2VyXG5cClxuXApSZXR1cm4gdGhlIGludGVnZXIgZmlsZSBkZXNjcmlwdG9yIG9m IHRoZSBzb2NrZXQuIjsKCgojaWZuZGVmIE5PX0RVUAovKiBzLmR1cCgpIG1ldGhvZCAqLwoK c3RhdGljIFB5T2JqZWN0ICoKc29ja19kdXAoUHlTb2NrZXRTb2NrT2JqZWN0ICpzKQp7CglT T0NLRVRfVCBuZXdmZDsKCVB5T2JqZWN0ICpzb2NrOwoKCW5ld2ZkID0gZHVwKHMtPnNvY2tf ZmQpOwoJaWYgKG5ld2ZkIDwgMCkKCQlyZXR1cm4gcy0+ZXJyb3JoYW5kbGVyKCk7Cglzb2Nr ID0gKFB5T2JqZWN0ICopIG5ld19zb2Nrb2JqZWN0KG5ld2ZkLAoJCQkJCSAgIHMtPnNvY2tf ZmFtaWx5LAoJCQkJCSAgIHMtPnNvY2tfdHlwZSwKCQkJCQkgICBzLT5zb2NrX3Byb3RvKTsK CWlmIChzb2NrID09IE5VTEwpCgkJU09DS0VUQ0xPU0UobmV3ZmQpOwoJcmV0dXJuIHNvY2s7 Cn0KCnN0YXRpYyBjaGFyIGR1cF9kb2NbXSA9CiJkdXAoKSAtPiBzb2NrZXQgb2JqZWN0XG5c ClxuXApSZXR1cm4gYSBuZXcgc29ja2V0IG9iamVjdCBjb25uZWN0ZWQgdG8gdGhlIHNhbWUg c3lzdGVtIHJlc291cmNlLiI7CgojZW5kaWYKCgovKiBzLmdldHNvY2tuYW1lKCkgbWV0aG9k ICovCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrX2dldHNvY2tuYW1lKFB5U29ja2V0U29ja09i amVjdCAqcykKewoJY2hhciBhZGRyYnVmWzI1Nl07CglpbnQgcmVzOwoJc29ja2xlbl90IGFk ZHJsZW47CgoJaWYgKCFnZXRzb2NrYWRkcmxlbihzLCAmYWRkcmxlbikpCgkJcmV0dXJuIE5V TEw7CgltZW1zZXQoYWRkcmJ1ZiwgMCwgYWRkcmxlbik7CglQeV9CRUdJTl9BTExPV19USFJF QURTCglyZXMgPSBnZXRzb2NrbmFtZShzLT5zb2NrX2ZkLCAoc3RydWN0IHNvY2thZGRyICop IGFkZHJidWYsICZhZGRybGVuKTsKCVB5X0VORF9BTExPV19USFJFQURTCglpZiAocmVzIDwg MCkKCQlyZXR1cm4gcy0+ZXJyb3JoYW5kbGVyKCk7CglyZXR1cm4gbWFrZXNvY2thZGRyKHMt PnNvY2tfZmQsIChzdHJ1Y3Qgc29ja2FkZHIgKikgYWRkcmJ1ZiwgYWRkcmxlbik7Cn0KCnN0 YXRpYyBjaGFyIGdldHNvY2tuYW1lX2RvY1tdID0KImdldHNvY2tuYW1lKCkgLT4gYWRkcmVz 
cyBpbmZvXG5cClxuXApSZXR1cm4gdGhlIGFkZHJlc3Mgb2YgdGhlIGxvY2FsIGVuZHBvaW50 LiAgRm9yIElQIHNvY2tldHMsIHRoZSBhZGRyZXNzXG5cCmluZm8gaXMgYSBwYWlyIChob3N0 YWRkciwgcG9ydCkuIjsKCgojaWZkZWYgSEFWRV9HRVRQRUVSTkFNRQkJLyogQ3JheSBBUFAg ZG9lc24ndCBoYXZlIHRoaXMgOi0oICovCi8qIHMuZ2V0cGVlcm5hbWUoKSBtZXRob2QgKi8K CnN0YXRpYyBQeU9iamVjdCAqCnNvY2tfZ2V0cGVlcm5hbWUoUHlTb2NrZXRTb2NrT2JqZWN0 ICpzKQp7CgljaGFyIGFkZHJidWZbMjU2XTsKCWludCByZXM7Cglzb2NrbGVuX3QgYWRkcmxl bjsKCglpZiAoIWdldHNvY2thZGRybGVuKHMsICZhZGRybGVuKSkKCQlyZXR1cm4gTlVMTDsK CW1lbXNldChhZGRyYnVmLCAwLCBhZGRybGVuKTsKCVB5X0JFR0lOX0FMTE9XX1RIUkVBRFMK CXJlcyA9IGdldHBlZXJuYW1lKHMtPnNvY2tfZmQsIChzdHJ1Y3Qgc29ja2FkZHIgKikgYWRk cmJ1ZiwgJmFkZHJsZW4pOwoJUHlfRU5EX0FMTE9XX1RIUkVBRFMKCWlmIChyZXMgPCAwKQoJ CXJldHVybiBzLT5lcnJvcmhhbmRsZXIoKTsKCXJldHVybiBtYWtlc29ja2FkZHIocy0+c29j a19mZCwgKHN0cnVjdCBzb2NrYWRkciAqKSBhZGRyYnVmLCBhZGRybGVuKTsKfQoKc3RhdGlj IGNoYXIgZ2V0cGVlcm5hbWVfZG9jW10gPQoiZ2V0cGVlcm5hbWUoKSAtPiBhZGRyZXNzIGlu Zm9cblwKXG5cClJldHVybiB0aGUgYWRkcmVzcyBvZiB0aGUgcmVtb3RlIGVuZHBvaW50LiAg Rm9yIElQIHNvY2tldHMsIHRoZSBhZGRyZXNzXG5cCmluZm8gaXMgYSBwYWlyIChob3N0YWRk ciwgcG9ydCkuIjsKCiNlbmRpZiAvKiBIQVZFX0dFVFBFRVJOQU1FICovCgoKLyogcy5saXN0 ZW4obikgbWV0aG9kICovCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrX2xpc3RlbihQeVNvY2tl dFNvY2tPYmplY3QgKnMsIFB5T2JqZWN0ICphcmcpCnsKCWludCBiYWNrbG9nOwoJaW50IHJl czsKCgliYWNrbG9nID0gUHlJbnRfQXNMb25nKGFyZyk7CglpZiAoYmFja2xvZyA9PSAtMSAm JiBQeUVycl9PY2N1cnJlZCgpKQoJCXJldHVybiBOVUxMOwoJUHlfQkVHSU5fQUxMT1dfVEhS RUFEUwoJaWYgKGJhY2tsb2cgPCAxKQoJCWJhY2tsb2cgPSAxOwoJcmVzID0gbGlzdGVuKHMt PnNvY2tfZmQsIGJhY2tsb2cpOwoJUHlfRU5EX0FMTE9XX1RIUkVBRFMKCWlmIChyZXMgPCAw KQoJCXJldHVybiBzLT5lcnJvcmhhbmRsZXIoKTsKCVB5X0lOQ1JFRihQeV9Ob25lKTsKCXJl dHVybiBQeV9Ob25lOwp9CgpzdGF0aWMgY2hhciBsaXN0ZW5fZG9jW10gPQoibGlzdGVuKGJh Y2tsb2cpXG5cClxuXApFbmFibGUgYSBzZXJ2ZXIgdG8gYWNjZXB0IGNvbm5lY3Rpb25zLiAg VGhlIGJhY2tsb2cgYXJndW1lbnQgbXVzdCBiZSBhdFxuXApsZWFzdCAxOyBpdCBzcGVjaWZp ZXMgdGhlIG51bWJlciBvZiB1bmFjY2VwdGVkIGNvbm5lY3Rpb24gdGhhdCB0aGUgc3lzdGVt 
XG5cCndpbGwgYWxsb3cgYmVmb3JlIHJlZnVzaW5nIG5ldyBjb25uZWN0aW9ucy4iOwoKCiNp Zm5kZWYgTk9fRFVQCi8qIHMubWFrZWZpbGUobW9kZSkgbWV0aG9kLgogICBDcmVhdGUgYSBu ZXcgb3BlbiBmaWxlIG9iamVjdCByZWZlcnJpbmcgdG8gYSBkdXBwZWQgdmVyc2lvbiBvZgog ICB0aGUgc29ja2V0J3MgZmlsZSBkZXNjcmlwdG9yLiAgKFRoZSBkdXAoKSBjYWxsIGlzIG5l Y2Vzc2FyeSBzbwogICB0aGF0IHRoZSBvcGVuIGZpbGUgYW5kIHNvY2tldCBvYmplY3RzIG1h eSBiZSBjbG9zZWQgaW5kZXBlbmRlbnQKICAgb2YgZWFjaCBvdGhlci4pCiAgIFRoZSBtb2Rl IGFyZ3VtZW50IHNwZWNpZmllcyAncicgb3IgJ3cnIHBhc3NlZCB0byBmZG9wZW4oKS4gKi8K CnN0YXRpYyBQeU9iamVjdCAqCnNvY2tfbWFrZWZpbGUoUHlTb2NrZXRTb2NrT2JqZWN0ICpz LCBQeU9iamVjdCAqYXJncykKewoJZXh0ZXJuIGludCBmY2xvc2UoRklMRSAqKTsKCWNoYXIg Km1vZGUgPSAiciI7CglpbnQgYnVmc2l6ZSA9IC0xOwojaWZkZWYgTVNfV0lOMzIKCVB5X2lu dHB0cl90IGZkOwojZWxzZQoJaW50IGZkOwojZW5kaWYKCUZJTEUgKmZwOwoJUHlPYmplY3Qg KmY7CgoJaWYgKCFQeUFyZ19QYXJzZVR1cGxlKGFyZ3MsICJ8c2k6bWFrZWZpbGUiLCAmbW9k ZSwgJmJ1ZnNpemUpKQoJCXJldHVybiBOVUxMOwojaWZkZWYgTVNfV0lOMzIKCWlmICgoKGZk ID0gX29wZW5fb3NmaGFuZGxlKHMtPnNvY2tfZmQsIF9PX0JJTkFSWSkpIDwgMCkgfHwKCSAg ICAoKGZkID0gZHVwKGZkKSkgPCAwKSB8fCAoKGZwID0gZmRvcGVuKGZkLCBtb2RlKSkgPT0g TlVMTCkpCiNlbHNlCglpZiAoKGZkID0gZHVwKHMtPnNvY2tfZmQpKSA8IDAgfHwgKGZwID0g ZmRvcGVuKGZkLCBtb2RlKSkgPT0gTlVMTCkKI2VuZGlmCgl7CgkJaWYgKGZkID49IDApCgkJ CVNPQ0tFVENMT1NFKGZkKTsKCQlyZXR1cm4gcy0+ZXJyb3JoYW5kbGVyKCk7Cgl9CiNpZmRl ZiBVU0VfR1VTSTIKCS8qIFdvcmthcm91bmQgZm9yIGJ1ZyBpbiBNZXRyb3dlcmtzIE1TTCB2 cy4gR1VTSSBJL08gbGlicmFyeSAqLwoJaWYgKHN0cmNocihtb2RlLCAnYicpICE9IE5VTEwp CgkJYnVmc2l6ZSA9IDA7CiNlbmRpZgoJZiA9IFB5RmlsZV9Gcm9tRmlsZShmcCwgIjxzb2Nr ZXQ+IiwgbW9kZSwgZmNsb3NlKTsKCWlmIChmICE9IE5VTEwpCgkJUHlGaWxlX1NldEJ1ZlNp emUoZiwgYnVmc2l6ZSk7CglyZXR1cm4gZjsKfQoKc3RhdGljIGNoYXIgbWFrZWZpbGVfZG9j W10gPQoibWFrZWZpbGUoW21vZGVbLCBidWZmZXJzaXplXV0pIC0+IGZpbGUgb2JqZWN0XG5c ClxuXApSZXR1cm4gYSByZWd1bGFyIGZpbGUgb2JqZWN0IGNvcnJlc3BvbmRpbmcgdG8gdGhl IHNvY2tldC5cblwKVGhlIG1vZGUgYW5kIGJ1ZmZlcnNpemUgYXJndW1lbnRzIGFyZSBhcyBm b3IgdGhlIGJ1aWx0LWluIG9wZW4oKSBmdW5jdGlvbi4iOwoKI2VuZGlmIC8qIE5PX0RVUCAq 
LwoKCi8qIHMucmVjdihuYnl0ZXMgWyxmbGFnc10pIG1ldGhvZCAqLwoKc3RhdGljIFB5T2Jq ZWN0ICoKc29ja19yZWN2KFB5U29ja2V0U29ja09iamVjdCAqcywgUHlPYmplY3QgKmFyZ3Mp CnsKCWludCBsZW4sIG4sIGZsYWdzID0gMDsKCVB5T2JqZWN0ICpidWY7CgoJaWYgKCFQeUFy Z19QYXJzZVR1cGxlKGFyZ3MsICJpfGk6cmVjdiIsICZsZW4sICZmbGFncykpCgkJcmV0dXJu IE5VTEw7CgoJaWYgKGxlbiA8IDApIHsKCQlQeUVycl9TZXRTdHJpbmcoUHlFeGNfVmFsdWVF cnJvciwKCQkJCSJuZWdhdGl2ZSBidWZmZXJzaXplIGluIGNvbm5lY3QiKTsKCQlyZXR1cm4g TlVMTDsKCX0KCglidWYgPSBQeVN0cmluZ19Gcm9tU3RyaW5nQW5kU2l6ZSgoY2hhciAqKSAw LCBsZW4pOwoJaWYgKGJ1ZiA9PSBOVUxMKQoJCXJldHVybiBOVUxMOwoKCWlmIChzLT5zb2Nr X3RpbWVvdXQgPj0gMC4wKSB7CgkJaWYgKHMtPnNvY2tfYmxvY2tpbmcpIHsKCQkJaWYgKGlu dGVybmFsX3NlbGVjdChzLCAwKSA8PSAwKQoJCQkJcmV0dXJuIE5VTEw7CgkJfQoJfQoKCVB5 X0JFR0lOX0FMTE9XX1RIUkVBRFMKCW4gPSByZWN2KHMtPnNvY2tfZmQsIFB5U3RyaW5nX0FT X1NUUklORyhidWYpLCBsZW4sIGZsYWdzKTsKCVB5X0VORF9BTExPV19USFJFQURTCgoJaWYg KG4gPCAwKSB7CgkJUHlfREVDUkVGKGJ1Zik7CgkJcmV0dXJuIHMtPmVycm9yaGFuZGxlcigp OwoJfQoJaWYgKG4gIT0gbGVuKQoJCV9QeVN0cmluZ19SZXNpemUoJmJ1Ziwgbik7CglyZXR1 cm4gYnVmOwp9CgpzdGF0aWMgY2hhciByZWN2X2RvY1tdID0KInJlY3YoYnVmZmVyc2l6ZVss IGZsYWdzXSkgLT4gZGF0YVxuXApcblwKUmVjZWl2ZSB1cCB0byBidWZmZXJzaXplIGJ5dGVz IGZyb20gdGhlIHNvY2tldC4gIEZvciB0aGUgb3B0aW9uYWwgZmxhZ3NcblwKYXJndW1lbnQs IHNlZSB0aGUgVW5peCBtYW51YWwuICBXaGVuIG5vIGRhdGEgaXMgYXZhaWxhYmxlLCBibG9j ayB1bnRpbFxuXAphdCBsZWFzdCBvbmUgYnl0ZSBpcyBhdmFpbGFibGUgb3IgdW50aWwgdGhl IHJlbW90ZSBlbmQgaXMgY2xvc2VkLiAgV2hlblxuXAp0aGUgcmVtb3RlIGVuZCBpcyBjbG9z ZWQgYW5kIGFsbCBkYXRhIGlzIHJlYWQsIHJldHVybiB0aGUgZW1wdHkgc3RyaW5nLiI7CgoK Lyogcy5yZWN2ZnJvbShuYnl0ZXMgWyxmbGFnc10pIG1ldGhvZCAqLwoKc3RhdGljIFB5T2Jq ZWN0ICoKc29ja19yZWN2ZnJvbShQeVNvY2tldFNvY2tPYmplY3QgKnMsIFB5T2JqZWN0ICph cmdzKQp7CgljaGFyIGFkZHJidWZbMjU2XTsKCVB5T2JqZWN0ICpidWYgPSBOVUxMOwoJUHlP YmplY3QgKmFkZHIgPSBOVUxMOwoJUHlPYmplY3QgKnJldCA9IE5VTEw7CglpbnQgbGVuLCBu LCBmbGFncyA9IDA7Cglzb2NrbGVuX3QgYWRkcmxlbjsKCglpZiAoIVB5QXJnX1BhcnNlVHVw bGUoYXJncywgIml8aTpyZWN2ZnJvbSIsICZsZW4sICZmbGFncykpCgkJcmV0dXJuIE5VTEw7 
CgoJaWYgKCFnZXRzb2NrYWRkcmxlbihzLCAmYWRkcmxlbikpCgkJcmV0dXJuIE5VTEw7Cgli dWYgPSBQeVN0cmluZ19Gcm9tU3RyaW5nQW5kU2l6ZSgoY2hhciAqKSAwLCBsZW4pOwoJaWYg KGJ1ZiA9PSBOVUxMKQoJCXJldHVybiBOVUxMOwoKCWlmIChzLT5zb2NrX3RpbWVvdXQgPj0g MC4wKSB7CgkJaWYgKHMtPnNvY2tfYmxvY2tpbmcpIHsKCQkJaWYgKGludGVybmFsX3NlbGVj dChzLCAwKSA8PSAwKQoJCQkJcmV0dXJuIE5VTEw7CgkJfQoJfQoKCVB5X0JFR0lOX0FMTE9X X1RIUkVBRFMKCW1lbXNldChhZGRyYnVmLCAwLCBhZGRybGVuKTsKCW4gPSByZWN2ZnJvbShz LT5zb2NrX2ZkLCBQeVN0cmluZ19BU19TVFJJTkcoYnVmKSwgbGVuLCBmbGFncywKI2lmbmRl ZiBNU19XSU5ET1dTCiNpZiBkZWZpbmVkKFBZT1NfT1MyKSAmJiAhZGVmaW5lZChQWUNDX0dD QykKCQkgICAgIChzdHJ1Y3Qgc29ja2FkZHIgKilhZGRyYnVmLCAmYWRkcmxlbgojZWxzZQoJ CSAgICAgKHZvaWQgKilhZGRyYnVmLCAmYWRkcmxlbgojZW5kaWYKI2Vsc2UKCQkgICAgIChz dHJ1Y3Qgc29ja2FkZHIgKilhZGRyYnVmLCAmYWRkcmxlbgojZW5kaWYKCQkgICAgICk7CglQ eV9FTkRfQUxMT1dfVEhSRUFEUwoKCWlmIChuIDwgMCkgewoJCVB5X0RFQ1JFRihidWYpOwoJ CXJldHVybiBzLT5lcnJvcmhhbmRsZXIoKTsKCX0KCglpZiAobiAhPSBsZW4gJiYgX1B5U3Ry aW5nX1Jlc2l6ZSgmYnVmLCBuKSA8IDApCgkJcmV0dXJuIE5VTEw7CgoJaWYgKCEoYWRkciA9 IG1ha2Vzb2NrYWRkcihzLT5zb2NrX2ZkLCAoc3RydWN0IHNvY2thZGRyICopYWRkcmJ1ZiwK CQkJCSAgYWRkcmxlbikpKQoJCWdvdG8gZmluYWxseTsKCglyZXQgPSBQeV9CdWlsZFZhbHVl KCJPTyIsIGJ1ZiwgYWRkcik7CgpmaW5hbGx5OgoJUHlfWERFQ1JFRihhZGRyKTsKCVB5X1hE RUNSRUYoYnVmKTsKCXJldHVybiByZXQ7Cn0KCnN0YXRpYyBjaGFyIHJlY3Zmcm9tX2RvY1td ID0KInJlY3Zmcm9tKGJ1ZmZlcnNpemVbLCBmbGFnc10pIC0+IChkYXRhLCBhZGRyZXNzIGlu Zm8pXG5cClxuXApMaWtlIHJlY3YoYnVmZmVyc2l6ZSwgZmxhZ3MpIGJ1dCBhbHNvIHJldHVy biB0aGUgc2VuZGVyJ3MgYWRkcmVzcyBpbmZvLiI7CgovKiBzLnNlbmQoZGF0YSBbLGZsYWdz XSkgbWV0aG9kICovCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrX3NlbmQoUHlTb2NrZXRTb2Nr T2JqZWN0ICpzLCBQeU9iamVjdCAqYXJncykKewoJY2hhciAqYnVmOwoJaW50IGxlbiwgbiwg ZmxhZ3MgPSAwOwoKCWlmICghUHlBcmdfUGFyc2VUdXBsZShhcmdzLCAicyN8aTpzZW5kIiwg JmJ1ZiwgJmxlbiwgJmZsYWdzKSkKCQlyZXR1cm4gTlVMTDsKCglpZiAocy0+c29ja190aW1l b3V0ID49IDAuMCkgewoJCWlmIChzLT5zb2NrX2Jsb2NraW5nKSB7CgkJCWlmIChpbnRlcm5h bF9zZWxlY3QocywgMSkgPD0gMCkKCQkJCXJldHVybiBOVUxMOwoJCX0KCX0KCglQeV9CRUdJ 
Tl9BTExPV19USFJFQURTCgluID0gc2VuZChzLT5zb2NrX2ZkLCBidWYsIGxlbiwgZmxhZ3Mp OwoJUHlfRU5EX0FMTE9XX1RIUkVBRFMKCglpZiAobiA8IDApCgkJcmV0dXJuIHMtPmVycm9y aGFuZGxlcigpOwoJcmV0dXJuIFB5SW50X0Zyb21Mb25nKChsb25nKW4pOwp9CgpzdGF0aWMg Y2hhciBzZW5kX2RvY1tdID0KInNlbmQoZGF0YVssIGZsYWdzXSkgLT4gY291bnRcblwKXG5c ClNlbmQgYSBkYXRhIHN0cmluZyB0byB0aGUgc29ja2V0LiAgRm9yIHRoZSBvcHRpb25hbCBm bGFnc1xuXAphcmd1bWVudCwgc2VlIHRoZSBVbml4IG1hbnVhbC4gIFJldHVybiB0aGUgbnVt YmVyIG9mIGJ5dGVzXG5cCnNlbnQ7IHRoaXMgbWF5IGJlIGxlc3MgdGhhbiBsZW4oZGF0YSkg aWYgdGhlIG5ldHdvcmsgaXMgYnVzeS4iOwoKCi8qIHMuc2VuZGFsbChkYXRhIFssZmxhZ3Nd KSBtZXRob2QgKi8KCnN0YXRpYyBQeU9iamVjdCAqCnNvY2tfc2VuZGFsbChQeVNvY2tldFNv Y2tPYmplY3QgKnMsIFB5T2JqZWN0ICphcmdzKQp7CgljaGFyICpidWY7CglpbnQgbGVuLCBu LCBmbGFncyA9IDA7CgoJaWYgKCFQeUFyZ19QYXJzZVR1cGxlKGFyZ3MsICJzI3xpOnNlbmRh bGwiLCAmYnVmLCAmbGVuLCAmZmxhZ3MpKQoJCXJldHVybiBOVUxMOwoKCWlmIChzLT5zb2Nr X3RpbWVvdXQgPj0gMC4wKSB7CgkJaWYgKHMtPnNvY2tfYmxvY2tpbmcpIHsKCQkJaWYgKGlu dGVybmFsX3NlbGVjdChzLCAxKSA8PSAwKQoJCQkJcmV0dXJuIE5VTEw7CgkJfQoJfQoKCVB5 X0JFR0lOX0FMTE9XX1RIUkVBRFMKCWRvIHsKCQluID0gc2VuZChzLT5zb2NrX2ZkLCBidWYs IGxlbiwgZmxhZ3MpOwoJCWlmIChuIDwgMCkKCQkJYnJlYWs7CgkJYnVmICs9IG47CgkJbGVu IC09IG47Cgl9IHdoaWxlIChsZW4gPiAwKTsKCVB5X0VORF9BTExPV19USFJFQURTCgoJaWYg KG4gPCAwKQoJCXJldHVybiBzLT5lcnJvcmhhbmRsZXIoKTsKCglQeV9JTkNSRUYoUHlfTm9u ZSk7CglyZXR1cm4gUHlfTm9uZTsKfQoKc3RhdGljIGNoYXIgc2VuZGFsbF9kb2NbXSA9CiJz ZW5kYWxsKGRhdGFbLCBmbGFnc10pXG5cClxuXApTZW5kIGEgZGF0YSBzdHJpbmcgdG8gdGhl IHNvY2tldC4gIEZvciB0aGUgb3B0aW9uYWwgZmxhZ3NcblwKYXJndW1lbnQsIHNlZSB0aGUg VW5peCBtYW51YWwuICBUaGlzIGNhbGxzIHNlbmQoKSByZXBlYXRlZGx5XG5cCnVudGlsIGFs bCBkYXRhIGlzIHNlbnQuICBJZiBhbiBlcnJvciBvY2N1cnMsIGl0J3MgaW1wb3NzaWJsZVxu XAp0byB0ZWxsIGhvdyBtdWNoIGRhdGEgaGFzIGJlZW4gc2VudC4iOwoKCi8qIHMuc2VuZHRv KGRhdGEsIFtmbGFncyxdIHNvY2thZGRyKSBtZXRob2QgKi8KCnN0YXRpYyBQeU9iamVjdCAq CnNvY2tfc2VuZHRvKFB5U29ja2V0U29ja09iamVjdCAqcywgUHlPYmplY3QgKmFyZ3MpCnsK CVB5T2JqZWN0ICphZGRybzsKCWNoYXIgKmJ1ZjsKCXN0cnVjdCBzb2NrYWRkciAqYWRkcjsK 
CWludCBhZGRybGVuLCBsZW4sIG4sIGZsYWdzOwoKCWZsYWdzID0gMDsKCWlmICghUHlBcmdf UGFyc2VUdXBsZShhcmdzLCAicyNPOnNlbmR0byIsICZidWYsICZsZW4sICZhZGRybykpIHsK CQlQeUVycl9DbGVhcigpOwoJCWlmICghUHlBcmdfUGFyc2VUdXBsZShhcmdzLCAicyNpTzpz ZW5kdG8iLAoJCQkJICAgICAgJmJ1ZiwgJmxlbiwgJmZsYWdzLCAmYWRkcm8pKQoJCQlyZXR1 cm4gTlVMTDsKCX0KCglpZiAoIWdldHNvY2thZGRyYXJnKHMsIGFkZHJvLCAmYWRkciwgJmFk ZHJsZW4pKQoJCXJldHVybiBOVUxMOwoKCWlmIChzLT5zb2NrX3RpbWVvdXQgPj0gMC4wKSB7 CgkJaWYgKHMtPnNvY2tfYmxvY2tpbmcpIHsKCQkJaWYgKGludGVybmFsX3NlbGVjdChzLCAx KSA8PSAwKQoJCQkJcmV0dXJuIE5VTEw7CgkJfQoJfQoKCVB5X0JFR0lOX0FMTE9XX1RIUkVB RFMKCW4gPSBzZW5kdG8ocy0+c29ja19mZCwgYnVmLCBsZW4sIGZsYWdzLCBhZGRyLCBhZGRy bGVuKTsKCVB5X0VORF9BTExPV19USFJFQURTCgoJaWYgKG4gPCAwKQoJCXJldHVybiBzLT5l cnJvcmhhbmRsZXIoKTsKCXJldHVybiBQeUludF9Gcm9tTG9uZygobG9uZyluKTsKfQoKc3Rh dGljIGNoYXIgc2VuZHRvX2RvY1tdID0KInNlbmR0byhkYXRhWywgZmxhZ3NdLCBhZGRyZXNz KVxuXApcblwKTGlrZSBzZW5kKGRhdGEsIGZsYWdzKSBidXQgYWxsb3dzIHNwZWNpZnlpbmcg dGhlIGRlc3RpbmF0aW9uIGFkZHJlc3MuXG5cCkZvciBJUCBzb2NrZXRzLCB0aGUgYWRkcmVz cyBpcyBhIHBhaXIgKGhvc3RhZGRyLCBwb3J0KS4iOwoKCi8qIHMuc2h1dGRvd24oaG93KSBt ZXRob2QgKi8KCnN0YXRpYyBQeU9iamVjdCAqCnNvY2tfc2h1dGRvd24oUHlTb2NrZXRTb2Nr T2JqZWN0ICpzLCBQeU9iamVjdCAqYXJnKQp7CglpbnQgaG93OwoJaW50IHJlczsKCglob3cg PSBQeUludF9Bc0xvbmcoYXJnKTsKCWlmIChob3cgPT0gLTEgJiYgUHlFcnJfT2NjdXJyZWQo KSkKCQlyZXR1cm4gTlVMTDsKCVB5X0JFR0lOX0FMTE9XX1RIUkVBRFMKCXJlcyA9IHNodXRk b3duKHMtPnNvY2tfZmQsIGhvdyk7CglQeV9FTkRfQUxMT1dfVEhSRUFEUwoJaWYgKHJlcyA8 IDApCgkJcmV0dXJuIHMtPmVycm9yaGFuZGxlcigpOwoJUHlfSU5DUkVGKFB5X05vbmUpOwoJ cmV0dXJuIFB5X05vbmU7Cn0KCnN0YXRpYyBjaGFyIHNodXRkb3duX2RvY1tdID0KInNodXRk b3duKGZsYWcpXG5cClxuXApTaHV0IGRvd24gdGhlIHJlYWRpbmcgc2lkZSBvZiB0aGUgc29j a2V0IChmbGFnID09IDApLCB0aGUgd3JpdGluZyBzaWRlXG5cCm9mIHRoZSBzb2NrZXQgKGZs YWcgPT0gMSksIG9yIGJvdGggZW5kcyAoZmxhZyA9PSAyKS4iOwoKCi8qIExpc3Qgb2YgbWV0 aG9kcyBmb3Igc29ja2V0IG9iamVjdHMgKi8KCnN0YXRpYyBQeU1ldGhvZERlZiBzb2NrX21l dGhvZHNbXSA9IHsKCXsiYWNjZXB0IiwJKFB5Q0Z1bmN0aW9uKXNvY2tfYWNjZXB0LCBNRVRI 
X05PQVJHUywKCQkJYWNjZXB0X2RvY30sCgl7ImJpbmQiLAkoUHlDRnVuY3Rpb24pc29ja19i aW5kLCBNRVRIX08sCgkJCWJpbmRfZG9jfSwKCXsiY2xvc2UiLAkoUHlDRnVuY3Rpb24pc29j a19jbG9zZSwgTUVUSF9OT0FSR1MsCgkJCWNsb3NlX2RvY30sCgl7ImNvbm5lY3QiLAkoUHlD RnVuY3Rpb24pc29ja19jb25uZWN0LCBNRVRIX08sCgkJCWNvbm5lY3RfZG9jfSwKCXsiY29u bmVjdF9leCIsCShQeUNGdW5jdGlvbilzb2NrX2Nvbm5lY3RfZXgsIE1FVEhfTywKCQkJY29u bmVjdF9leF9kb2N9LAojaWZuZGVmIE5PX0RVUAoJeyJkdXAiLAkJKFB5Q0Z1bmN0aW9uKXNv Y2tfZHVwLCBNRVRIX05PQVJHUywKCQkJZHVwX2RvY30sCiNlbmRpZgoJeyJmaWxlbm8iLAko UHlDRnVuY3Rpb24pc29ja19maWxlbm8sIE1FVEhfTk9BUkdTLAoJCQlmaWxlbm9fZG9jfSwK I2lmZGVmIEhBVkVfR0VUUEVFUk5BTUUKCXsiZ2V0cGVlcm5hbWUiLAkoUHlDRnVuY3Rpb24p c29ja19nZXRwZWVybmFtZSwKCQkJTUVUSF9OT0FSR1MsIGdldHBlZXJuYW1lX2RvY30sCiNl bmRpZgoJeyJnZXRzb2NrbmFtZSIsCShQeUNGdW5jdGlvbilzb2NrX2dldHNvY2tuYW1lLAoJ CQlNRVRIX05PQVJHUywgZ2V0c29ja25hbWVfZG9jfSwKCXsiZ2V0c29ja29wdCIsCShQeUNG dW5jdGlvbilzb2NrX2dldHNvY2tvcHQsIE1FVEhfVkFSQVJHUywKCQkJZ2V0c29ja29wdF9k b2N9LAoJeyJsaXN0ZW4iLAkoUHlDRnVuY3Rpb24pc29ja19saXN0ZW4sIE1FVEhfTywKCQkJ bGlzdGVuX2RvY30sCiNpZm5kZWYgTk9fRFVQCgl7Im1ha2VmaWxlIiwJKFB5Q0Z1bmN0aW9u KXNvY2tfbWFrZWZpbGUsIE1FVEhfVkFSQVJHUywKCQkJbWFrZWZpbGVfZG9jfSwKI2VuZGlm Cgl7InJlY3YiLAkoUHlDRnVuY3Rpb24pc29ja19yZWN2LCBNRVRIX1ZBUkFSR1MsCgkJCXJl Y3ZfZG9jfSwKCXsicmVjdmZyb20iLAkoUHlDRnVuY3Rpb24pc29ja19yZWN2ZnJvbSwgTUVU SF9WQVJBUkdTLAoJCQlyZWN2ZnJvbV9kb2N9LAoJeyJzZW5kIiwJKFB5Q0Z1bmN0aW9uKXNv Y2tfc2VuZCwgTUVUSF9WQVJBUkdTLAoJCQlzZW5kX2RvY30sCgl7InNlbmRhbGwiLAkoUHlD RnVuY3Rpb24pc29ja19zZW5kYWxsLCBNRVRIX1ZBUkFSR1MsCgkJCXNlbmRhbGxfZG9jfSwK CXsic2VuZHRvIiwJKFB5Q0Z1bmN0aW9uKXNvY2tfc2VuZHRvLCBNRVRIX1ZBUkFSR1MsCgkJ CXNlbmR0b19kb2N9LAoJeyJzZXRibG9ja2luZyIsCShQeUNGdW5jdGlvbilzb2NrX3NldGJs b2NraW5nLCBNRVRIX08sCgkJCXNldGJsb2NraW5nX2RvY30sCgl7InNldHRpbWVvdXQiLCAo UHlDRnVuY3Rpb24pc29ja19zZXR0aW1lb3V0LCBNRVRIX08sCgkJCXNldHRpbWVvdXRfZG9j fSwKCXsiZ2V0dGltZW91dCIsIChQeUNGdW5jdGlvbilzb2NrX2dldHRpbWVvdXQsIE1FVEhf Tk9BUkdTLAoJCQlnZXR0aW1lb3V0X2RvY30sCgl7InNldHNvY2tvcHQiLAkoUHlDRnVuY3Rp 
b24pc29ja19zZXRzb2Nrb3B0LCBNRVRIX1ZBUkFSR1MsCgkJCXNldHNvY2tvcHRfZG9jfSwK CXsic2h1dGRvd24iLAkoUHlDRnVuY3Rpb24pc29ja19zaHV0ZG93biwgTUVUSF9PLAoJCQlz aHV0ZG93bl9kb2N9LAojaWZkZWYgUklTQ09TCgl7InNsZWVwdGFza3ciLAkoUHlDRnVuY3Rp b24pc29ja19zbGVlcHRhc2t3LCBNRVRIX1ZBUkFSR1MsCgkgCQlzbGVlcHRhc2t3X2RvY30s CiNlbmRpZgoJe05VTEwsCQkJTlVMTH0JCS8qIHNlbnRpbmVsICovCn07CgoKLyogRGVhbGxv Y2F0ZSBhIHNvY2tldCBvYmplY3QgaW4gcmVzcG9uc2UgdG8gdGhlIGxhc3QgUHlfREVDUkVG KCkuCiAgIEZpcnN0IGNsb3NlIHRoZSBmaWxlIGRlc2NyaXB0aW9uLiAqLwoKc3RhdGljIHZv aWQKc29ja19kZWFsbG9jKFB5U29ja2V0U29ja09iamVjdCAqcykKewoJaWYgKHMtPnNvY2tf ZmQgIT0gLTEpCgkJKHZvaWQpIFNPQ0tFVENMT1NFKHMtPnNvY2tfZmQpOwoJcy0+b2JfdHlw ZS0+dHBfZnJlZSgoUHlPYmplY3QgKilzKTsKfQoKCnN0YXRpYyBQeU9iamVjdCAqCnNvY2tf cmVwcihQeVNvY2tldFNvY2tPYmplY3QgKnMpCnsKCWNoYXIgYnVmWzUxMl07CiNpZiBTSVpF T0ZfU09DS0VUX1QgPiBTSVpFT0ZfTE9ORwoJaWYgKHMtPnNvY2tfZmQgPiBMT05HX01BWCkg ewoJCS8qIHRoaXMgY2FuIG9jY3VyIG9uIFdpbjY0LCBhbmQgYWN0dWFsbHkgdGhlcmUgaXMg YSBzcGVjaWFsCgkJICAgdWdseSBwcmludGYgZm9ybWF0dGVyIGZvciBkZWNpbWFsIHBvaW50 ZXIgbGVuZ3RoIGludGVnZXIKCQkgICBwcmludGluZywgb25seSBib3RoZXIgaWYgbmVjZXNz YXJ5Ki8KCQlQeUVycl9TZXRTdHJpbmcoUHlFeGNfT3ZlcmZsb3dFcnJvciwKCQkJCSJubyBw cmludGYgZm9ybWF0dGVyIHRvIGRpc3BsYXkgIgoJCQkJInRoZSBzb2NrZXQgZGVzY3JpcHRv ciBpbiBkZWNpbWFsIik7CgkJcmV0dXJuIE5VTEw7Cgl9CiNlbmRpZgoJUHlPU19zbnByaW50 ZigKCQlidWYsIHNpemVvZihidWYpLAoJCSI8c29ja2V0IG9iamVjdCwgZmQ9JWxkLCBmYW1p bHk9JWQsIHR5cGU9JWQsIHByb3RvY29sPSVkPiIsCgkJKGxvbmcpcy0+c29ja19mZCwgcy0+ c29ja19mYW1pbHksCgkJcy0+c29ja190eXBlLAoJCXMtPnNvY2tfcHJvdG8pOwoJcmV0dXJu IFB5U3RyaW5nX0Zyb21TdHJpbmcoYnVmKTsKfQoKCi8qIENyZWF0ZSBhIG5ldywgdW5pbml0 aWFsaXplZCBzb2NrZXQgb2JqZWN0LiAqLwoKc3RhdGljIFB5T2JqZWN0ICoKc29ja19uZXco UHlUeXBlT2JqZWN0ICp0eXBlLCBQeU9iamVjdCAqYXJncywgUHlPYmplY3QgKmt3ZHMpCnsK CVB5T2JqZWN0ICpuZXc7CgoJbmV3ID0gdHlwZS0+dHBfYWxsb2ModHlwZSwgMCk7CglpZiAo bmV3ICE9IE5VTEwpIHsKCQkoKFB5U29ja2V0U29ja09iamVjdCAqKW5ldyktPnNvY2tfZmQg PSAtMTsKCQkoKFB5U29ja2V0U29ja09iamVjdCAqKW5ldyktPnNvY2tfdGltZW91dCA9IC0x 
LjA7CgkJKChQeVNvY2tldFNvY2tPYmplY3QgKiluZXcpLT5lcnJvcmhhbmRsZXIgPSAmc2V0 X2Vycm9yOwoJfQoJcmV0dXJuIG5ldzsKfQoKCi8qIEluaXRpYWxpemUgYSBuZXcgc29ja2V0 IG9iamVjdC4gKi8KCi8qQVJHU1VTRUQqLwpzdGF0aWMgaW50CnNvY2tfaW5pdChQeU9iamVj dCAqc2VsZiwgUHlPYmplY3QgKmFyZ3MsIFB5T2JqZWN0ICprd2RzKQp7CglQeVNvY2tldFNv Y2tPYmplY3QgKnMgPSAoUHlTb2NrZXRTb2NrT2JqZWN0ICopc2VsZjsKCVNPQ0tFVF9UIGZk OwoJaW50IGZhbWlseSA9IEFGX0lORVQsIHR5cGUgPSBTT0NLX1NUUkVBTSwgcHJvdG8gPSAw OwoJc3RhdGljIGNoYXIgKmtleXdvcmRzW10gPSB7ImZhbWlseSIsICJ0eXBlIiwgInByb3Rv IiwgMH07CgoJaWYgKCFQeUFyZ19QYXJzZVR1cGxlQW5kS2V5d29yZHMoYXJncywga3dkcywK CQkJCQkgInxpaWk6c29ja2V0Iiwga2V5d29yZHMsCgkJCQkJICZmYW1pbHksICZ0eXBlLCAm cHJvdG8pKQoJCXJldHVybiAtMTsKCglQeV9CRUdJTl9BTExPV19USFJFQURTCglmZCA9IHNv Y2tldChmYW1pbHksIHR5cGUsIHByb3RvKTsKCVB5X0VORF9BTExPV19USFJFQURTCgojaWZk ZWYgTVNfV0lORE9XUwoJaWYgKGZkID09IElOVkFMSURfU09DS0VUKQojZWxzZQoJaWYgKGZk IDwgMCkKI2VuZGlmCgl7CgkJc2V0X2Vycm9yKCk7CgkJcmV0dXJuIC0xOwoJfQoJaW5pdF9z b2Nrb2JqZWN0KHMsIGZkLCBmYW1pbHksIHR5cGUsIHByb3RvKTsKCS8qIEZyb20gbm93IG9u LCBpZ25vcmUgU0lHUElQRSBhbmQgbGV0IHRoZSBlcnJvciBjaGVja2luZwoJICAgZG8gdGhl IHdvcmsuICovCiNpZmRlZiBTSUdQSVBFCgkodm9pZCkgc2lnbmFsKFNJR1BJUEUsIFNJR19J R04pOwojZW5kaWYKCglyZXR1cm4gMDsKCn0KCgovKiBUeXBlIG9iamVjdCBmb3Igc29ja2V0 IG9iamVjdHMuICovCgpzdGF0aWMgUHlUeXBlT2JqZWN0IHNvY2tfdHlwZSA9IHsKCVB5T2Jq ZWN0X0hFQURfSU5JVCgwKQkvKiBNdXN0IGZpbGwgaW4gdHlwZSB2YWx1ZSBsYXRlciAqLwoJ MCwJCQkJCS8qIG9iX3NpemUgKi8KCSJfc29ja2V0LnNvY2tldCIsCQkJLyogdHBfbmFtZSAq LwoJc2l6ZW9mKFB5U29ja2V0U29ja09iamVjdCksCQkvKiB0cF9iYXNpY3NpemUgKi8KCTAs CQkJCQkvKiB0cF9pdGVtc2l6ZSAqLwoJKGRlc3RydWN0b3Ipc29ja19kZWFsbG9jLAkJLyog dHBfZGVhbGxvYyAqLwoJMCwJCQkJCS8qIHRwX3ByaW50ICovCgkwLAkJCQkJLyogdHBfZ2V0 YXR0ciAqLwoJMCwJCQkJCS8qIHRwX3NldGF0dHIgKi8KCTAsCQkJCQkvKiB0cF9jb21wYXJl ICovCgkocmVwcmZ1bmMpc29ja19yZXByLAkJCS8qIHRwX3JlcHIgKi8KCTAsCQkJCQkvKiB0 cF9hc19udW1iZXIgKi8KCTAsCQkJCQkvKiB0cF9hc19zZXF1ZW5jZSAqLwoJMCwJCQkJCS8q IHRwX2FzX21hcHBpbmcgKi8KCTAsCQkJCQkvKiB0cF9oYXNoICovCgkwLAkJCQkJLyogdHBf 
Y2FsbCAqLwoJMCwJCQkJCS8qIHRwX3N0ciAqLwoJMCwJLyogc2V0IGJlbG93ICovCQkJLyog dHBfZ2V0YXR0cm8gKi8KCTAsCQkJCQkvKiB0cF9zZXRhdHRybyAqLwoJMCwJCQkJCS8qIHRw X2FzX2J1ZmZlciAqLwoJUHlfVFBGTEFHU19ERUZBVUxUIHwgUHlfVFBGTEFHU19CQVNFVFlQ RSwgLyogdHBfZmxhZ3MgKi8KCXNvY2tfZG9jLAkJCQkvKiB0cF9kb2MgKi8KCTAsCQkJCQkv KiB0cF90cmF2ZXJzZSAqLwoJMCwJCQkJCS8qIHRwX2NsZWFyICovCgkwLAkJCQkJLyogdHBf cmljaGNvbXBhcmUgKi8KCTAsCQkJCQkvKiB0cF93ZWFrbGlzdG9mZnNldCAqLwoJMCwJCQkJ CS8qIHRwX2l0ZXIgKi8KCTAsCQkJCQkvKiB0cF9pdGVybmV4dCAqLwoJc29ja19tZXRob2Rz LAkJCQkvKiB0cF9tZXRob2RzICovCgkwLAkJCQkJLyogdHBfbWVtYmVycyAqLwoJMCwJCQkJ CS8qIHRwX2dldHNldCAqLwoJMCwJCQkJCS8qIHRwX2Jhc2UgKi8KCTAsCQkJCQkvKiB0cF9k aWN0ICovCgkwLAkJCQkJLyogdHBfZGVzY3JfZ2V0ICovCgkwLAkJCQkJLyogdHBfZGVzY3Jf c2V0ICovCgkwLAkJCQkJLyogdHBfZGljdG9mZnNldCAqLwoJc29ja19pbml0LAkJCQkvKiB0 cF9pbml0ICovCgkwLAkvKiBzZXQgYmVsb3cgKi8JCQkvKiB0cF9hbGxvYyAqLwoJc29ja19u ZXcsCQkJCS8qIHRwX25ldyAqLwoJMCwJLyogc2V0IGJlbG93ICovCQkJLyogdHBfZnJlZSAq Lwp9OwoKCi8qIFB5dGhvbiBpbnRlcmZhY2UgdG8gZ2V0aG9zdG5hbWUoKS4gKi8KCi8qQVJH U1VTRUQqLwpzdGF0aWMgUHlPYmplY3QgKgpzb2NrZXRfZ2V0aG9zdG5hbWUoUHlPYmplY3Qg KnNlbGYsIFB5T2JqZWN0ICphcmdzKQp7CgljaGFyIGJ1ZlsxMDI0XTsKCWludCByZXM7Cglp ZiAoIVB5QXJnX1BhcnNlVHVwbGUoYXJncywgIjpnZXRob3N0bmFtZSIpKQoJCXJldHVybiBO VUxMOwoJUHlfQkVHSU5fQUxMT1dfVEhSRUFEUwoJcmVzID0gZ2V0aG9zdG5hbWUoYnVmLCAo aW50KSBzaXplb2YgYnVmIC0gMSk7CglQeV9FTkRfQUxMT1dfVEhSRUFEUwoJaWYgKHJlcyA8 IDApCgkJcmV0dXJuIHNldF9lcnJvcigpOwoJYnVmW3NpemVvZiBidWYgLSAxXSA9ICdcMCc7 CglyZXR1cm4gUHlTdHJpbmdfRnJvbVN0cmluZyhidWYpOwp9CgpzdGF0aWMgY2hhciBnZXRo b3N0bmFtZV9kb2NbXSA9CiJnZXRob3N0bmFtZSgpIC0+IHN0cmluZ1xuXApcblwKUmV0dXJu IHRoZSBjdXJyZW50IGhvc3QgbmFtZS4iOwoKCi8qIFB5dGhvbiBpbnRlcmZhY2UgdG8gZ2V0 aG9zdGJ5bmFtZShuYW1lKS4gKi8KCi8qQVJHU1VTRUQqLwpzdGF0aWMgUHlPYmplY3QgKgpz b2NrZXRfZ2V0aG9zdGJ5bmFtZShQeU9iamVjdCAqc2VsZiwgUHlPYmplY3QgKmFyZ3MpCnsK CWNoYXIgKm5hbWU7CglzdHJ1Y3Qgc29ja2FkZHJfc3RvcmFnZSBhZGRyYnVmOwoKCWlmICgh UHlBcmdfUGFyc2VUdXBsZShhcmdzLCAiczpnZXRob3N0YnluYW1lIiwgJm5hbWUpKQoJCXJl 
dHVybiBOVUxMOwoJaWYgKHNldGlwYWRkcihuYW1lLCAoc3RydWN0IHNvY2thZGRyICopJmFk ZHJidWYsIEFGX0lORVQpIDwgMCkKCQlyZXR1cm4gTlVMTDsKCXJldHVybiBtYWtlaXBhZGRy KChzdHJ1Y3Qgc29ja2FkZHIgKikmYWRkcmJ1ZiwKCQlzaXplb2Yoc3RydWN0IHNvY2thZGRy X2luKSk7Cn0KCnN0YXRpYyBjaGFyIGdldGhvc3RieW5hbWVfZG9jW10gPQoiZ2V0aG9zdGJ5 bmFtZShob3N0KSAtPiBhZGRyZXNzXG5cClxuXApSZXR1cm4gdGhlIElQIGFkZHJlc3MgKGEg c3RyaW5nIG9mIHRoZSBmb3JtICcyNTUuMjU1LjI1NS4yNTUnKSBmb3IgYSBob3N0LiI7CgoK LyogQ29udmVuaWVuY2UgZnVuY3Rpb24gY29tbW9uIHRvIGdldGhvc3RieW5hbWVfZXggYW5k IGdldGhvc3RieWFkZHIgKi8KCnN0YXRpYyBQeU9iamVjdCAqCmdldGhvc3RfY29tbW9uKHN0 cnVjdCBob3N0ZW50ICpoLCBzdHJ1Y3Qgc29ja2FkZHIgKmFkZHIsIGludCBhbGVuLCBpbnQg YWYpCnsKCWNoYXIgKipwY2g7CglQeU9iamVjdCAqcnRuX3R1cGxlID0gKFB5T2JqZWN0ICop TlVMTDsKCVB5T2JqZWN0ICpuYW1lX2xpc3QgPSAoUHlPYmplY3QgKilOVUxMOwoJUHlPYmpl Y3QgKmFkZHJfbGlzdCA9IChQeU9iamVjdCAqKU5VTEw7CglQeU9iamVjdCAqdG1wOwoKCWlm IChoID09IE5VTEwpIHsKCQkvKiBMZXQncyBnZXQgcmVhbCBlcnJvciBtZXNzYWdlIHRvIHJl dHVybiAqLwojaWZuZGVmIFJJU0NPUwoJCXNldF9oZXJyb3IoaF9lcnJubyk7CiNlbHNlCgkJ UHlFcnJfU2V0U3RyaW5nKHNvY2tldF9lcnJvciwgImhvc3Qgbm90IGZvdW5kIik7CiNlbmRp ZgoJCXJldHVybiBOVUxMOwoJfQoKCWlmIChoLT5oX2FkZHJ0eXBlICE9IGFmKSB7CiNpZmRl ZiBIQVZFX1NUUkVSUk9SCgkJLyogTGV0J3MgZ2V0IHJlYWwgZXJyb3IgbWVzc2FnZSB0byBy ZXR1cm4gKi8KCQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vycm9yLAoJCQkJKGNoYXIgKilz dHJlcnJvcihFQUZOT1NVUFBPUlQpKTsKI2Vsc2UKCQlQeUVycl9TZXRTdHJpbmcoCgkJCXNv Y2tldF9lcnJvciwKCQkJIkFkZHJlc3MgZmFtaWx5IG5vdCBzdXBwb3J0ZWQgYnkgcHJvdG9j b2wgZmFtaWx5Iik7CiNlbmRpZgoJCXJldHVybiBOVUxMOwoJfQoKCXN3aXRjaCAoYWYpIHsK CgljYXNlIEFGX0lORVQ6CgkJaWYgKGFsZW4gPCBzaXplb2Yoc3RydWN0IHNvY2thZGRyX2lu KSkKCQkJcmV0dXJuIE5VTEw7CgkJYnJlYWs7CgojaWZkZWYgRU5BQkxFX0lQVjYKCWNhc2Ug QUZfSU5FVDY6CgkJaWYgKGFsZW4gPCBzaXplb2Yoc3RydWN0IHNvY2thZGRyX2luNikpCgkJ CXJldHVybiBOVUxMOwoJCWJyZWFrOwojZW5kaWYKCgl9CgoJaWYgKChuYW1lX2xpc3QgPSBQ eUxpc3RfTmV3KDApKSA9PSBOVUxMKQoJCWdvdG8gZXJyOwoKCWlmICgoYWRkcl9saXN0ID0g UHlMaXN0X05ldygwKSkgPT0gTlVMTCkKCQlnb3RvIGVycjsKCglmb3IgKHBjaCA9IGgtPmhf 
YWxpYXNlczsgKnBjaCAhPSBOVUxMOyBwY2grKykgewoJCWludCBzdGF0dXM7CgkJdG1wID0g UHlTdHJpbmdfRnJvbVN0cmluZygqcGNoKTsKCQlpZiAodG1wID09IE5VTEwpCgkJCWdvdG8g ZXJyOwoKCQlzdGF0dXMgPSBQeUxpc3RfQXBwZW5kKG5hbWVfbGlzdCwgdG1wKTsKCQlQeV9E RUNSRUYodG1wKTsKCgkJaWYgKHN0YXR1cykKCQkJZ290byBlcnI7Cgl9CgoJZm9yIChwY2gg PSBoLT5oX2FkZHJfbGlzdDsgKnBjaCAhPSBOVUxMOyBwY2grKykgewoJCWludCBzdGF0dXM7 CgoJCXN3aXRjaCAoYWYpIHsKCgkJY2FzZSBBRl9JTkVUOgoJCSAgICB7CgkJCXN0cnVjdCBz b2NrYWRkcl9pbiBzaW47CgkJCW1lbXNldCgmc2luLCAwLCBzaXplb2Yoc2luKSk7CgkJCXNp bi5zaW5fZmFtaWx5ID0gYWY7CiNpZmRlZiBIQVZFX1NPQ0tBRERSX1NBX0xFTgoJCQlzaW4u c2luX2xlbiA9IHNpemVvZihzaW4pOwojZW5kaWYKCQkJbWVtY3B5KCZzaW4uc2luX2FkZHIs ICpwY2gsIHNpemVvZihzaW4uc2luX2FkZHIpKTsKCQkJdG1wID0gbWFrZWlwYWRkcigoc3Ry dWN0IHNvY2thZGRyICopJnNpbiwgc2l6ZW9mKHNpbikpOwoKCQkJaWYgKHBjaCA9PSBoLT5o X2FkZHJfbGlzdCAmJiBhbGVuID49IHNpemVvZihzaW4pKQoJCQkJbWVtY3B5KChjaGFyICop IGFkZHIsICZzaW4sIHNpemVvZihzaW4pKTsKCQkJYnJlYWs7CgkJICAgIH0KCiNpZmRlZiBF TkFCTEVfSVBWNgoJCWNhc2UgQUZfSU5FVDY6CgkJICAgIHsKCQkJc3RydWN0IHNvY2thZGRy X2luNiBzaW42OwoJCQltZW1zZXQoJnNpbjYsIDAsIHNpemVvZihzaW42KSk7CgkJCXNpbjYu c2luNl9mYW1pbHkgPSBhZjsKI2lmZGVmIEhBVkVfU09DS0FERFJfU0FfTEVOCgkJCXNpbjYu c2luNl9sZW4gPSBzaXplb2Yoc2luNik7CiNlbmRpZgoJCQltZW1jcHkoJnNpbjYuc2luNl9h ZGRyLCAqcGNoLCBzaXplb2Yoc2luNi5zaW42X2FkZHIpKTsKCQkJdG1wID0gbWFrZWlwYWRk cigoc3RydWN0IHNvY2thZGRyICopJnNpbjYsCgkJCQlzaXplb2Yoc2luNikpOwoKCQkJaWYg KHBjaCA9PSBoLT5oX2FkZHJfbGlzdCAmJiBhbGVuID49IHNpemVvZihzaW42KSkKCQkJCW1l bWNweSgoY2hhciAqKSBhZGRyLCAmc2luNiwgc2l6ZW9mKHNpbjYpKTsKCQkJYnJlYWs7CgkJ ICAgIH0KI2VuZGlmCgoJCWRlZmF1bHQ6CS8qIGNhbid0IGhhcHBlbiAqLwoJCQlQeUVycl9T ZXRTdHJpbmcoc29ja2V0X2Vycm9yLAoJCQkJCSJ1bnN1cHBvcnRlZCBhZGRyZXNzIGZhbWls eSIpOwoJCQlyZXR1cm4gTlVMTDsKCQl9CgoJCWlmICh0bXAgPT0gTlVMTCkKCQkJZ290byBl cnI7CgoJCXN0YXR1cyA9IFB5TGlzdF9BcHBlbmQoYWRkcl9saXN0LCB0bXApOwoJCVB5X0RF Q1JFRih0bXApOwoKCQlpZiAoc3RhdHVzKQoJCQlnb3RvIGVycjsKCX0KCglydG5fdHVwbGUg PSBQeV9CdWlsZFZhbHVlKCJzT08iLCBoLT5oX25hbWUsIG5hbWVfbGlzdCwgYWRkcl9saXN0 
KTsKCiBlcnI6CglQeV9YREVDUkVGKG5hbWVfbGlzdCk7CglQeV9YREVDUkVGKGFkZHJfbGlz dCk7CglyZXR1cm4gcnRuX3R1cGxlOwp9CgoKLyogUHl0aG9uIGludGVyZmFjZSB0byBnZXRo b3N0YnluYW1lX2V4KG5hbWUpLiAqLwoKLypBUkdTVVNFRCovCnN0YXRpYyBQeU9iamVjdCAq CnNvY2tldF9nZXRob3N0YnluYW1lX2V4KFB5T2JqZWN0ICpzZWxmLCBQeU9iamVjdCAqYXJn cykKewoJY2hhciAqbmFtZTsKCXN0cnVjdCBob3N0ZW50ICpoOwoJc3RydWN0IHNvY2thZGRy X3N0b3JhZ2UgYWRkcjsKCXN0cnVjdCBzb2NrYWRkciAqc2E7CglQeU9iamVjdCAqcmV0Owoj aWZkZWYgSEFWRV9HRVRIT1NUQllOQU1FX1IKCXN0cnVjdCBob3N0ZW50IGhwX2FsbG9jYXRl ZDsKI2lmZGVmIEhBVkVfR0VUSE9TVEJZTkFNRV9SXzNfQVJHCglzdHJ1Y3QgaG9zdGVudF9k YXRhIGRhdGE7CiNlbHNlCgljaGFyIGJ1ZlsxNjM4NF07CglpbnQgYnVmX2xlbiA9IChzaXpl b2YgYnVmKSAtIDE7CglpbnQgZXJybm9wOwojZW5kaWYKI2lmIGRlZmluZWQoSEFWRV9HRVRI T1NUQllOQU1FX1JfM19BUkcpIHx8IGRlZmluZWQoSEFWRV9HRVRIT1NUQllOQU1FX1JfNl9B UkcpCglpbnQgcmVzdWx0OwojZW5kaWYKI2VuZGlmIC8qIEhBVkVfR0VUSE9TVEJZTkFNRV9S ICovCgoJaWYgKCFQeUFyZ19QYXJzZVR1cGxlKGFyZ3MsICJzOmdldGhvc3RieW5hbWVfZXgi LCAmbmFtZSkpCgkJcmV0dXJuIE5VTEw7CglpZiAoc2V0aXBhZGRyKG5hbWUsIChzdHJ1Y3Qg c29ja2FkZHIgKikmYWRkciwgUEZfSU5FVCkgPCAwKQoJCXJldHVybiBOVUxMOwoJUHlfQkVH SU5fQUxMT1dfVEhSRUFEUwojaWZkZWYgSEFWRV9HRVRIT1NUQllOQU1FX1IKI2lmICAgZGVm aW5lZChIQVZFX0dFVEhPU1RCWU5BTUVfUl82X0FSRykKCXJlc3VsdCA9IGdldGhvc3RieW5h bWVfcihuYW1lLCAmaHBfYWxsb2NhdGVkLCBidWYsIGJ1Zl9sZW4sCgkJCQkgJmgsICZlcnJu b3ApOwojZWxpZiBkZWZpbmVkKEhBVkVfR0VUSE9TVEJZTkFNRV9SXzVfQVJHKQoJaCA9IGdl dGhvc3RieW5hbWVfcihuYW1lLCAmaHBfYWxsb2NhdGVkLCBidWYsIGJ1Zl9sZW4sICZlcnJu b3ApOwojZWxzZSAvKiBIQVZFX0dFVEhPU1RCWU5BTUVfUl8zX0FSRyAqLwoJbWVtc2V0KCh2 b2lkICopICZkYXRhLCAnXDAnLCBzaXplb2YoZGF0YSkpOwoJcmVzdWx0ID0gZ2V0aG9zdGJ5 bmFtZV9yKG5hbWUsICZocF9hbGxvY2F0ZWQsICZkYXRhKTsKCWggPSAocmVzdWx0ICE9IDAp ID8gTlVMTCA6ICZocF9hbGxvY2F0ZWQ7CiNlbmRpZgojZWxzZSAvKiBub3QgSEFWRV9HRVRI T1NUQllOQU1FX1IgKi8KI2lmZGVmIFVTRV9HRVRIT1NUQllOQU1FX0xPQ0sKCVB5VGhyZWFk X2FjcXVpcmVfbG9jayhnZXRob3N0YnluYW1lX2xvY2ssIDEpOwojZW5kaWYKCWggPSBnZXRo b3N0YnluYW1lKG5hbWUpOwojZW5kaWYgLyogSEFWRV9HRVRIT1NUQllOQU1FX1IgKi8KCVB5 
X0VORF9BTExPV19USFJFQURTCgkvKiBTb21lIEMgbGlicmFyaWVzIHdvdWxkIHJlcXVpcmUg YWRkci5fX3NzX2ZhbWlseSBpbnN0ZWFkIG9mCgkgICBhZGRyLnNzX2ZhbWlseS4KCSAgIFRo ZXJlZm9yZSwgd2UgY2FzdCB0aGUgc29ja2FkZHJfc3RvcmFnZSBpbnRvIHNvY2thZGRyIHRv CgkgICBhY2Nlc3Mgc2FfZmFtaWx5LiAqLwoJc2EgPSAoc3RydWN0IHNvY2thZGRyKikmYWRk cjsKCXJldCA9IGdldGhvc3RfY29tbW9uKGgsIChzdHJ1Y3Qgc29ja2FkZHIgKikmYWRkciwg c2l6ZW9mKGFkZHIpLAoJCQkgICAgIHNhLT5zYV9mYW1pbHkpOwojaWZkZWYgVVNFX0dFVEhP U1RCWU5BTUVfTE9DSwoJUHlUaHJlYWRfcmVsZWFzZV9sb2NrKGdldGhvc3RieW5hbWVfbG9j ayk7CiNlbmRpZgoJcmV0dXJuIHJldDsKfQoKc3RhdGljIGNoYXIgZ2hibl9leF9kb2NbXSA9 CiJnZXRob3N0YnluYW1lX2V4KGhvc3QpIC0+IChuYW1lLCBhbGlhc2xpc3QsIGFkZHJlc3Ns aXN0KVxuXApcblwKUmV0dXJuIHRoZSB0cnVlIGhvc3QgbmFtZSwgYSBsaXN0IG9mIGFsaWFz ZXMsIGFuZCBhIGxpc3Qgb2YgSVAgYWRkcmVzc2VzLFxuXApmb3IgYSBob3N0LiAgVGhlIGhv c3QgYXJndW1lbnQgaXMgYSBzdHJpbmcgZ2l2aW5nIGEgaG9zdCBuYW1lIG9yIElQIG51bWJl ci4iOwoKCi8qIFB5dGhvbiBpbnRlcmZhY2UgdG8gZ2V0aG9zdGJ5YWRkcihJUCkuICovCgov KkFSR1NVU0VEKi8Kc3RhdGljIFB5T2JqZWN0ICoKc29ja2V0X2dldGhvc3RieWFkZHIoUHlP YmplY3QgKnNlbGYsIFB5T2JqZWN0ICphcmdzKQp7CiNpZmRlZiBFTkFCTEVfSVBWNgoJc3Ry dWN0IHNvY2thZGRyX3N0b3JhZ2UgYWRkcjsKI2Vsc2UKCXN0cnVjdCBzb2NrYWRkcl9pbiBh ZGRyOwojZW5kaWYKCXN0cnVjdCBzb2NrYWRkciAqc2EgPSAoc3RydWN0IHNvY2thZGRyICop JmFkZHI7CgljaGFyICppcF9udW07CglzdHJ1Y3QgaG9zdGVudCAqaDsKCVB5T2JqZWN0ICpy ZXQ7CiNpZmRlZiBIQVZFX0dFVEhPU1RCWU5BTUVfUgoJc3RydWN0IGhvc3RlbnQgaHBfYWxs b2NhdGVkOwojaWZkZWYgSEFWRV9HRVRIT1NUQllOQU1FX1JfM19BUkcKCXN0cnVjdCBob3N0 ZW50X2RhdGEgZGF0YTsKI2Vsc2UKCWNoYXIgYnVmWzE2Mzg0XTsKCWludCBidWZfbGVuID0g KHNpemVvZiBidWYpIC0gMTsKCWludCBlcnJub3A7CiNlbmRpZgojaWYgZGVmaW5lZChIQVZF X0dFVEhPU1RCWU5BTUVfUl8zX0FSRykgfHwgZGVmaW5lZChIQVZFX0dFVEhPU1RCWU5BTUVf Ul82X0FSRykKCWludCByZXN1bHQ7CiNlbmRpZgojZW5kaWYgLyogSEFWRV9HRVRIT1NUQllO QU1FX1IgKi8KCWNoYXIgKmFwOwoJaW50IGFsOwoJaW50IGFmOwoKCWlmICghUHlBcmdfUGFy c2VUdXBsZShhcmdzLCAiczpnZXRob3N0YnlhZGRyIiwgJmlwX251bSkpCgkJcmV0dXJuIE5V TEw7CglhZiA9IFBGX1VOU1BFQzsKCWlmIChzZXRpcGFkZHIoaXBfbnVtLCBzYSwgYWYpIDwg 
MCkKCQlyZXR1cm4gTlVMTDsKCWFmID0gc2EtPnNhX2ZhbWlseTsKCWFwID0gTlVMTDsKCWFs ID0gMDsKCXN3aXRjaCAoYWYpIHsKCWNhc2UgQUZfSU5FVDoKCQlhcCA9IChjaGFyICopJigo c3RydWN0IHNvY2thZGRyX2luICopc2EpLT5zaW5fYWRkcjsKCQlhbCA9IHNpemVvZigoKHN0 cnVjdCBzb2NrYWRkcl9pbiAqKXNhKS0+c2luX2FkZHIpOwoJCWJyZWFrOwojaWZkZWYgRU5B QkxFX0lQVjYKCWNhc2UgQUZfSU5FVDY6CgkJYXAgPSAoY2hhciAqKSYoKHN0cnVjdCBzb2Nr YWRkcl9pbjYgKilzYSktPnNpbjZfYWRkcjsKCQlhbCA9IHNpemVvZigoKHN0cnVjdCBzb2Nr YWRkcl9pbjYgKilzYSktPnNpbjZfYWRkcik7CgkJYnJlYWs7CiNlbmRpZgoJZGVmYXVsdDoK CQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vycm9yLCAidW5zdXBwb3J0ZWQgYWRkcmVzcyBm YW1pbHkiKTsKCQlyZXR1cm4gTlVMTDsKCX0KCVB5X0JFR0lOX0FMTE9XX1RIUkVBRFMKI2lm ZGVmIEhBVkVfR0VUSE9TVEJZTkFNRV9SCiNpZiAgIGRlZmluZWQoSEFWRV9HRVRIT1NUQllO QU1FX1JfNl9BUkcpCglyZXN1bHQgPSBnZXRob3N0YnlhZGRyX3IoYXAsIGFsLCBhZiwKCQkm aHBfYWxsb2NhdGVkLCBidWYsIGJ1Zl9sZW4sCgkJJmgsICZlcnJub3ApOwojZWxpZiBkZWZp bmVkKEhBVkVfR0VUSE9TVEJZTkFNRV9SXzVfQVJHKQoJaCA9IGdldGhvc3RieWFkZHJfcihh cCwgYWwsIGFmLAoJCQkgICAgJmhwX2FsbG9jYXRlZCwgYnVmLCBidWZfbGVuLCAmZXJybm9w KTsKI2Vsc2UgLyogSEFWRV9HRVRIT1NUQllOQU1FX1JfM19BUkcgKi8KCW1lbXNldCgodm9p ZCAqKSAmZGF0YSwgJ1wwJywgc2l6ZW9mKGRhdGEpKTsKCXJlc3VsdCA9IGdldGhvc3RieWFk ZHJfcihhcCwgYWwsIGFmLCAmaHBfYWxsb2NhdGVkLCAmZGF0YSk7CgloID0gKHJlc3VsdCAh PSAwKSA/IE5VTEwgOiAmaHBfYWxsb2NhdGVkOwojZW5kaWYKI2Vsc2UgLyogbm90IEhBVkVf R0VUSE9TVEJZTkFNRV9SICovCiNpZmRlZiBVU0VfR0VUSE9TVEJZTkFNRV9MT0NLCglQeVRo cmVhZF9hY3F1aXJlX2xvY2soZ2V0aG9zdGJ5bmFtZV9sb2NrLCAxKTsKI2VuZGlmCgloID0g Z2V0aG9zdGJ5YWRkcihhcCwgYWwsIGFmKTsKI2VuZGlmIC8qIEhBVkVfR0VUSE9TVEJZTkFN RV9SICovCglQeV9FTkRfQUxMT1dfVEhSRUFEUwoJcmV0ID0gZ2V0aG9zdF9jb21tb24oaCwg KHN0cnVjdCBzb2NrYWRkciAqKSZhZGRyLCBzaXplb2YoYWRkciksIGFmKTsKI2lmZGVmIFVT RV9HRVRIT1NUQllOQU1FX0xPQ0sKCVB5VGhyZWFkX3JlbGVhc2VfbG9jayhnZXRob3N0Ynlu YW1lX2xvY2spOwojZW5kaWYKCXJldHVybiByZXQ7Cn0KCnN0YXRpYyBjaGFyIGdldGhvc3Ri eWFkZHJfZG9jW10gPQoiZ2V0aG9zdGJ5YWRkcihob3N0KSAtPiAobmFtZSwgYWxpYXNsaXN0 LCBhZGRyZXNzbGlzdClcblwKXG5cClJldHVybiB0aGUgdHJ1ZSBob3N0IG5hbWUsIGEgbGlz 
dCBvZiBhbGlhc2VzLCBhbmQgYSBsaXN0IG9mIElQIGFkZHJlc3NlcyxcblwKZm9yIGEgaG9z dC4gIFRoZSBob3N0IGFyZ3VtZW50IGlzIGEgc3RyaW5nIGdpdmluZyBhIGhvc3QgbmFtZSBv ciBJUCBudW1iZXIuIjsKCgovKiBQeXRob24gaW50ZXJmYWNlIHRvIGdldHNlcnZieW5hbWUo bmFtZSkuCiAgIFRoaXMgb25seSByZXR1cm5zIHRoZSBwb3J0IG51bWJlciwgc2luY2UgdGhl IG90aGVyIGluZm8gaXMgYWxyZWFkeQogICBrbm93biBvciBub3QgdXNlZnVsIChsaWtlIHRo ZSBsaXN0IG9mIGFsaWFzZXMpLiAqLwoKLypBUkdTVVNFRCovCnN0YXRpYyBQeU9iamVjdCAq CnNvY2tldF9nZXRzZXJ2YnluYW1lKFB5T2JqZWN0ICpzZWxmLCBQeU9iamVjdCAqYXJncykK ewoJY2hhciAqbmFtZSwgKnByb3RvOwoJc3RydWN0IHNlcnZlbnQgKnNwOwoJaWYgKCFQeUFy Z19QYXJzZVR1cGxlKGFyZ3MsICJzczpnZXRzZXJ2YnluYW1lIiwgJm5hbWUsICZwcm90bykp CgkJcmV0dXJuIE5VTEw7CglQeV9CRUdJTl9BTExPV19USFJFQURTCglzcCA9IGdldHNlcnZi eW5hbWUobmFtZSwgcHJvdG8pOwoJUHlfRU5EX0FMTE9XX1RIUkVBRFMKCWlmIChzcCA9PSBO VUxMKSB7CgkJUHlFcnJfU2V0U3RyaW5nKHNvY2tldF9lcnJvciwgInNlcnZpY2UvcHJvdG8g bm90IGZvdW5kIik7CgkJcmV0dXJuIE5VTEw7Cgl9CglyZXR1cm4gUHlJbnRfRnJvbUxvbmco KGxvbmcpIG50b2hzKHNwLT5zX3BvcnQpKTsKfQoKc3RhdGljIGNoYXIgZ2V0c2VydmJ5bmFt ZV9kb2NbXSA9CiJnZXRzZXJ2YnluYW1lKHNlcnZpY2VuYW1lLCBwcm90b2NvbG5hbWUpIC0+ IGludGVnZXJcblwKXG5cClJldHVybiBhIHBvcnQgbnVtYmVyIGZyb20gYSBzZXJ2aWNlIG5h bWUgYW5kIHByb3RvY29sIG5hbWUuXG5cClRoZSBwcm90b2NvbCBuYW1lIHNob3VsZCBiZSAn dGNwJyBvciAndWRwJy4iOwoKCi8qIFB5dGhvbiBpbnRlcmZhY2UgdG8gZ2V0cHJvdG9ieW5h bWUobmFtZSkuCiAgIFRoaXMgb25seSByZXR1cm5zIHRoZSBwcm90b2NvbCBudW1iZXIsIHNp bmNlIHRoZSBvdGhlciBpbmZvIGlzCiAgIGFscmVhZHkga25vd24gb3Igbm90IHVzZWZ1bCAo bGlrZSB0aGUgbGlzdCBvZiBhbGlhc2VzKS4gKi8KCi8qQVJHU1VTRUQqLwpzdGF0aWMgUHlP YmplY3QgKgpzb2NrZXRfZ2V0cHJvdG9ieW5hbWUoUHlPYmplY3QgKnNlbGYsIFB5T2JqZWN0 ICphcmdzKQp7CgljaGFyICpuYW1lOwoJc3RydWN0IHByb3RvZW50ICpzcDsKI2lmZGVmIF9f QkVPU19fCi8qIE5vdCBhdmFpbGFibGUgaW4gQmVPUyB5ZXQuIC0gW2NqaF0gKi8KCVB5RXJy X1NldFN0cmluZyhzb2NrZXRfZXJyb3IsICJnZXRwcm90b2J5bmFtZSBub3Qgc3VwcG9ydGVk Iik7CglyZXR1cm4gTlVMTDsKI2Vsc2UKCWlmICghUHlBcmdfUGFyc2VUdXBsZShhcmdzLCAi czpnZXRwcm90b2J5bmFtZSIsICZuYW1lKSkKCQlyZXR1cm4gTlVMTDsKCVB5X0JFR0lOX0FM 
TE9XX1RIUkVBRFMKCXNwID0gZ2V0cHJvdG9ieW5hbWUobmFtZSk7CglQeV9FTkRfQUxMT1df VEhSRUFEUwoJaWYgKHNwID09IE5VTEwpIHsKCQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vy cm9yLCAicHJvdG9jb2wgbm90IGZvdW5kIik7CgkJcmV0dXJuIE5VTEw7Cgl9CglyZXR1cm4g UHlJbnRfRnJvbUxvbmcoKGxvbmcpIHNwLT5wX3Byb3RvKTsKI2VuZGlmCn0KCnN0YXRpYyBj aGFyIGdldHByb3RvYnluYW1lX2RvY1tdID0KImdldHByb3RvYnluYW1lKG5hbWUpIC0+IGlu dGVnZXJcblwKXG5cClJldHVybiB0aGUgcHJvdG9jb2wgbnVtYmVyIGZvciB0aGUgbmFtZWQg cHJvdG9jb2wuICAoUmFyZWx5IHVzZWQuKSI7CgoKI2lmbmRlZiBOT19EVVAKLyogQ3JlYXRl IGEgc29ja2V0IG9iamVjdCBmcm9tIGEgbnVtZXJpYyBmaWxlIGRlc2NyaXB0aW9uLgogICBV c2VmdWwgZS5nLiBpZiBzdGRpbiBpcyBhIHNvY2tldC4KICAgQWRkaXRpb25hbCBhcmd1bWVu dHMgYXMgZm9yIHNvY2tldCgpLiAqLwoKLypBUkdTVVNFRCovCnN0YXRpYyBQeU9iamVjdCAq CnNvY2tldF9mcm9tZmQoUHlPYmplY3QgKnNlbGYsIFB5T2JqZWN0ICphcmdzKQp7CglQeVNv Y2tldFNvY2tPYmplY3QgKnM7CglTT0NLRVRfVCBmZDsKCWludCBmYW1pbHksIHR5cGUsIHBy b3RvID0gMDsKCWlmICghUHlBcmdfUGFyc2VUdXBsZShhcmdzLCAiaWlpfGk6ZnJvbWZkIiwK CQkJICAgICAgJmZkLCAmZmFtaWx5LCAmdHlwZSwgJnByb3RvKSkKCQlyZXR1cm4gTlVMTDsK CS8qIER1cCB0aGUgZmQgc28gaXQgYW5kIHRoZSBzb2NrZXQgY2FuIGJlIGNsb3NlZCBpbmRl cGVuZGVudGx5ICovCglmZCA9IGR1cChmZCk7CglpZiAoZmQgPCAwKQoJCXJldHVybiBzZXRf ZXJyb3IoKTsKCXMgPSBuZXdfc29ja29iamVjdChmZCwgZmFtaWx5LCB0eXBlLCBwcm90byk7 CgkvKiBGcm9tIG5vdyBvbiwgaWdub3JlIFNJR1BJUEUgYW5kIGxldCB0aGUgZXJyb3IgY2hl Y2tpbmcKCSAgIGRvIHRoZSB3b3JrLiAqLwojaWZkZWYgU0lHUElQRQoJKHZvaWQpIHNpZ25h bChTSUdQSVBFLCBTSUdfSUdOKTsKI2VuZGlmCglyZXR1cm4gKFB5T2JqZWN0ICopIHM7Cn0K CnN0YXRpYyBjaGFyIGZyb21mZF9kb2NbXSA9CiJmcm9tZmQoZmQsIGZhbWlseSwgdHlwZVss IHByb3RvXSkgLT4gc29ja2V0IG9iamVjdFxuXApcblwKQ3JlYXRlIGEgc29ja2V0IG9iamVj dCBmcm9tIHRoZSBnaXZlbiBmaWxlIGRlc2NyaXB0b3IuXG5cClRoZSByZW1haW5pbmcgYXJn dW1lbnRzIGFyZSB0aGUgc2FtZSBhcyBmb3Igc29ja2V0KCkuIjsKCiNlbmRpZiAvKiBOT19E VVAgKi8KCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrZXRfbnRvaHMoUHlPYmplY3QgKnNlbGYs IFB5T2JqZWN0ICphcmdzKQp7CglpbnQgeDEsIHgyOwoKCWlmICghUHlBcmdfUGFyc2VUdXBs ZShhcmdzLCAiaTpudG9ocyIsICZ4MSkpIHsKCQlyZXR1cm4gTlVMTDsKCX0KCXgyID0gKGlu 
dCludG9ocygoc2hvcnQpeDEpOwoJcmV0dXJuIFB5SW50X0Zyb21Mb25nKHgyKTsKfQoKc3Rh dGljIGNoYXIgbnRvaHNfZG9jW10gPQoibnRvaHMoaW50ZWdlcikgLT4gaW50ZWdlclxuXApc blwKQ29udmVydCBhIDE2LWJpdCBpbnRlZ2VyIGZyb20gbmV0d29yayB0byBob3N0IGJ5dGUg b3JkZXIuIjsKCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrZXRfbnRvaGwoUHlPYmplY3QgKnNl bGYsIFB5T2JqZWN0ICphcmdzKQp7CglpbnQgeDEsIHgyOwoKCWlmICghUHlBcmdfUGFyc2VU dXBsZShhcmdzLCAiaTpudG9obCIsICZ4MSkpIHsKCQlyZXR1cm4gTlVMTDsKCX0KCXgyID0g bnRvaGwoeDEpOwoJcmV0dXJuIFB5SW50X0Zyb21Mb25nKHgyKTsKfQoKc3RhdGljIGNoYXIg bnRvaGxfZG9jW10gPQoibnRvaGwoaW50ZWdlcikgLT4gaW50ZWdlclxuXApcblwKQ29udmVy dCBhIDMyLWJpdCBpbnRlZ2VyIGZyb20gbmV0d29yayB0byBob3N0IGJ5dGUgb3JkZXIuIjsK CgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrZXRfaHRvbnMoUHlPYmplY3QgKnNlbGYsIFB5T2Jq ZWN0ICphcmdzKQp7CglpbnQgeDEsIHgyOwoKCWlmICghUHlBcmdfUGFyc2VUdXBsZShhcmdz LCAiaTpodG9ucyIsICZ4MSkpIHsKCQlyZXR1cm4gTlVMTDsKCX0KCXgyID0gKGludClodG9u cygoc2hvcnQpeDEpOwoJcmV0dXJuIFB5SW50X0Zyb21Mb25nKHgyKTsKfQoKc3RhdGljIGNo YXIgaHRvbnNfZG9jW10gPQoiaHRvbnMoaW50ZWdlcikgLT4gaW50ZWdlclxuXApcblwKQ29u dmVydCBhIDE2LWJpdCBpbnRlZ2VyIGZyb20gaG9zdCB0byBuZXR3b3JrIGJ5dGUgb3JkZXIu IjsKCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrZXRfaHRvbmwoUHlPYmplY3QgKnNlbGYsIFB5 T2JqZWN0ICphcmdzKQp7CglpbnQgeDEsIHgyOwoKCWlmICghUHlBcmdfUGFyc2VUdXBsZShh cmdzLCAiaTpodG9ubCIsICZ4MSkpIHsKCQlyZXR1cm4gTlVMTDsKCX0KCXgyID0gaHRvbmwo eDEpOwoJcmV0dXJuIFB5SW50X0Zyb21Mb25nKHgyKTsKfQoKc3RhdGljIGNoYXIgaHRvbmxf ZG9jW10gPQoiaHRvbmwoaW50ZWdlcikgLT4gaW50ZWdlclxuXApcblwKQ29udmVydCBhIDMy LWJpdCBpbnRlZ2VyIGZyb20gaG9zdCB0byBuZXR3b3JrIGJ5dGUgb3JkZXIuIjsKCi8qIHNv Y2tldC5pbmV0X2F0b24oKSBhbmQgc29ja2V0LmluZXRfbnRvYSgpIGZ1bmN0aW9ucy4gKi8K CnN0YXRpYyBjaGFyIGluZXRfYXRvbl9kb2NbXSA9CiJpbmV0X2F0b24oc3RyaW5nKSAtPiBw YWNrZWQgMzItYml0IElQIHJlcHJlc2VudGF0aW9uXG5cClxuXApDb252ZXJ0IGFuIElQIGFk ZHJlc3MgaW4gc3RyaW5nIGZvcm1hdCAoMTIzLjQ1LjY3Ljg5KSB0byB0aGUgMzItYml0IHBh Y2tlZFxuXApiaW5hcnkgZm9ybWF0IHVzZWQgaW4gbG93LWxldmVsIG5ldHdvcmsgZnVuY3Rp b25zLiI7CgpzdGF0aWMgUHlPYmplY3QqCnNvY2tldF9pbmV0X2F0b24oUHlPYmplY3QgKnNl 
bGYsIFB5T2JqZWN0ICphcmdzKQp7CiNpZm5kZWYgSU5BRERSX05PTkUKI2RlZmluZSBJTkFE RFJfTk9ORSAoLTEpCiNlbmRpZgoKCS8qIEhhdmUgdG8gdXNlIGluZXRfYWRkcigpIGluc3Rl YWQgKi8KCWNoYXIgKmlwX2FkZHI7Cgl1bnNpZ25lZCBsb25nIHBhY2tlZF9hZGRyOwoKCWlm ICghUHlBcmdfUGFyc2VUdXBsZShhcmdzLCAiczppbmV0X2F0b24iLCAmaXBfYWRkcikpIHsK CQlyZXR1cm4gTlVMTDsKCX0KCXBhY2tlZF9hZGRyID0gaW5ldF9hZGRyKGlwX2FkZHIpOwoK CWlmIChwYWNrZWRfYWRkciA9PSBJTkFERFJfTk9ORSkgewkvKiBpbnZhbGlkIGFkZHJlc3Mg Ki8KCQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vycm9yLAoJCQkiaWxsZWdhbCBJUCBhZGRy ZXNzIHN0cmluZyBwYXNzZWQgdG8gaW5ldF9hdG9uIik7CgkJcmV0dXJuIE5VTEw7Cgl9CgoJ cmV0dXJuIFB5U3RyaW5nX0Zyb21TdHJpbmdBbmRTaXplKChjaGFyICopICZwYWNrZWRfYWRk ciwKCQkJCQkgIHNpemVvZihwYWNrZWRfYWRkcikpOwp9CgpzdGF0aWMgY2hhciBpbmV0X250 b2FfZG9jW10gPQoiaW5ldF9udG9hKHBhY2tlZF9pcCkgLT4gaXBfYWRkcmVzc19zdHJpbmdc blwKXG5cCkNvbnZlcnQgYW4gSVAgYWRkcmVzcyBmcm9tIDMyLWJpdCBwYWNrZWQgYmluYXJ5 IGZvcm1hdCB0byBzdHJpbmcgZm9ybWF0IjsKCnN0YXRpYyBQeU9iamVjdCoKc29ja2V0X2lu ZXRfbnRvYShQeU9iamVjdCAqc2VsZiwgUHlPYmplY3QgKmFyZ3MpCnsKCWNoYXIgKnBhY2tl ZF9zdHI7CglpbnQgYWRkcl9sZW47CglzdHJ1Y3QgaW5fYWRkciBwYWNrZWRfYWRkcjsKCglp ZiAoIVB5QXJnX1BhcnNlVHVwbGUoYXJncywgInMjOmluZXRfbnRvYSIsICZwYWNrZWRfc3Ry LCAmYWRkcl9sZW4pKSB7CgkJcmV0dXJuIE5VTEw7Cgl9CgoJaWYgKGFkZHJfbGVuICE9IHNp emVvZihwYWNrZWRfYWRkcikpIHsKCQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vycm9yLAoJ CQkicGFja2VkIElQIHdyb25nIGxlbmd0aCBmb3IgaW5ldF9udG9hIik7CgkJcmV0dXJuIE5V TEw7Cgl9CgoJbWVtY3B5KCZwYWNrZWRfYWRkciwgcGFja2VkX3N0ciwgYWRkcl9sZW4pOwoK CXJldHVybiBQeVN0cmluZ19Gcm9tU3RyaW5nKGluZXRfbnRvYShwYWNrZWRfYWRkcikpOwp9 CgovKiBQeXRob24gaW50ZXJmYWNlIHRvIGdldGFkZHJpbmZvKGhvc3QsIHBvcnQpLiAqLwoK LypBUkdTVVNFRCovCnN0YXRpYyBQeU9iamVjdCAqCnNvY2tldF9nZXRhZGRyaW5mbyhQeU9i amVjdCAqc2VsZiwgUHlPYmplY3QgKmFyZ3MpCnsKCXN0cnVjdCBhZGRyaW5mbyBoaW50cywg KnJlczsKCXN0cnVjdCBhZGRyaW5mbyAqcmVzMCA9IE5VTEw7CglQeU9iamVjdCAqcG9iaiA9 IChQeU9iamVjdCAqKU5VTEw7CgljaGFyIHBidWZbMzBdOwoJY2hhciAqaHB0ciwgKnBwdHI7 CglpbnQgZmFtaWx5LCBzb2NrdHlwZSwgcHJvdG9jb2wsIGZsYWdzOwoJaW50IGVycm9yOwoJ 
UHlPYmplY3QgKmFsbCA9IChQeU9iamVjdCAqKU5VTEw7CglQeU9iamVjdCAqc2luZ2xlID0g KFB5T2JqZWN0ICopTlVMTDsKCglmYW1pbHkgPSBzb2NrdHlwZSA9IHByb3RvY29sID0gZmxh Z3MgPSAwOwoJZmFtaWx5ID0gUEZfVU5TUEVDOwoJaWYgKCFQeUFyZ19QYXJzZVR1cGxlKGFy Z3MsICJ6T3xpaWlpOmdldGFkZHJpbmZvIiwKCSAgICAmaHB0ciwgJnBvYmosICZmYW1pbHks ICZzb2NrdHlwZSwKCQkJJnByb3RvY29sLCAmZmxhZ3MpKSB7CgkJcmV0dXJuIE5VTEw7Cgl9 CglpZiAoUHlJbnRfQ2hlY2socG9iaikpIHsKCQlQeU9TX3NucHJpbnRmKHBidWYsIHNpemVv ZihwYnVmKSwgIiVsZCIsIFB5SW50X0FzTG9uZyhwb2JqKSk7CgkJcHB0ciA9IHBidWY7Cgl9 IGVsc2UgaWYgKFB5U3RyaW5nX0NoZWNrKHBvYmopKSB7CgkJcHB0ciA9IFB5U3RyaW5nX0Fz U3RyaW5nKHBvYmopOwoJfSBlbHNlIGlmIChwb2JqID09IFB5X05vbmUpIHsKCQlwcHRyID0g KGNoYXIgKilOVUxMOwoJfSBlbHNlIHsKCQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vycm9y LCAiSW50IG9yIFN0cmluZyBleHBlY3RlZCIpOwoJCXJldHVybiBOVUxMOwoJfQoJbWVtc2V0 KCZoaW50cywgMCwgc2l6ZW9mKGhpbnRzKSk7CgloaW50cy5haV9mYW1pbHkgPSBmYW1pbHk7 CgloaW50cy5haV9zb2NrdHlwZSA9IHNvY2t0eXBlOwoJaGludHMuYWlfcHJvdG9jb2wgPSBw cm90b2NvbDsKCWhpbnRzLmFpX2ZsYWdzID0gZmxhZ3M7CgllcnJvciA9IGdldGFkZHJpbmZv KGhwdHIsIHBwdHIsICZoaW50cywgJnJlczApOwoJaWYgKGVycm9yKSB7CgkJc2V0X2dhaWVy cm9yKGVycm9yKTsKCQlyZXR1cm4gTlVMTDsKCX0KCglpZiAoKGFsbCA9IFB5TGlzdF9OZXco MCkpID09IE5VTEwpCgkJZ290byBlcnI7Cglmb3IgKHJlcyA9IHJlczA7IHJlczsgcmVzID0g cmVzLT5haV9uZXh0KSB7CgkJUHlPYmplY3QgKmFkZHIgPQoJCQltYWtlc29ja2FkZHIoLTEs IHJlcy0+YWlfYWRkciwgcmVzLT5haV9hZGRybGVuKTsKCQlpZiAoYWRkciA9PSBOVUxMKQoJ CQlnb3RvIGVycjsKCQlzaW5nbGUgPSBQeV9CdWlsZFZhbHVlKCJpaWlzTyIsIHJlcy0+YWlf ZmFtaWx5LAoJCQlyZXMtPmFpX3NvY2t0eXBlLCByZXMtPmFpX3Byb3RvY29sLAoJCQlyZXMt PmFpX2Nhbm9ubmFtZSA/IHJlcy0+YWlfY2Fub25uYW1lIDogIiIsCgkJCWFkZHIpOwoJCVB5 X0RFQ1JFRihhZGRyKTsKCQlpZiAoc2luZ2xlID09IE5VTEwpCgkJCWdvdG8gZXJyOwoKCQlp ZiAoUHlMaXN0X0FwcGVuZChhbGwsIHNpbmdsZSkpCgkJCWdvdG8gZXJyOwoJCVB5X1hERUNS RUYoc2luZ2xlKTsKCX0KCXJldHVybiBhbGw7CiBlcnI6CglQeV9YREVDUkVGKHNpbmdsZSk7 CglQeV9YREVDUkVGKGFsbCk7CglpZiAocmVzMCkKCQlmcmVlYWRkcmluZm8ocmVzMCk7Cgly ZXR1cm4gKFB5T2JqZWN0ICopTlVMTDsKfQoKc3RhdGljIGNoYXIgZ2V0YWRkcmluZm9fZG9j 
W10gPQoic29ja2V0LmdldGFkZHJpbmZvKGhvc3QsIHBvcnQgWywgZmFtaWx5LCBzb2NrdHlw ZSwgcHJvdG8sIGZsYWdzXSlcblwKCS0tPiBMaXN0IG9mIChmYW1pbHksIHNvY2t0eXBlLCBw cm90bywgY2Fub25uYW1lLCBzb2NrYWRkcilcblwKXG5cClJlc29sdmUgaG9zdCBhbmQgcG9y dCBpbnRvIGFkZHJpbmZvIHN0cnVjdC4iOwoKLyogUHl0aG9uIGludGVyZmFjZSB0byBnZXRu YW1laW5mbyhzYSwgZmxhZ3MpLiAqLwoKLypBUkdTVVNFRCovCnN0YXRpYyBQeU9iamVjdCAq CnNvY2tldF9nZXRuYW1laW5mbyhQeU9iamVjdCAqc2VsZiwgUHlPYmplY3QgKmFyZ3MpCnsK CVB5T2JqZWN0ICpzYSA9IChQeU9iamVjdCAqKU5VTEw7CglpbnQgZmxhZ3M7CgljaGFyICpo b3N0cDsKCWludCBwb3J0LCBmbG93aW5mbywgc2NvcGVfaWQ7CgljaGFyIGhidWZbTklfTUFY SE9TVF0sIHBidWZbTklfTUFYU0VSVl07CglzdHJ1Y3QgYWRkcmluZm8gaGludHMsICpyZXMg PSBOVUxMOwoJaW50IGVycm9yOwoJUHlPYmplY3QgKnJldCA9IChQeU9iamVjdCAqKU5VTEw7 CgoJZmxhZ3MgPSBmbG93aW5mbyA9IHNjb3BlX2lkID0gMDsKCWlmICghUHlBcmdfUGFyc2VU dXBsZShhcmdzLCAiT2k6Z2V0bmFtZWluZm8iLCAmc2EsICZmbGFncykpCgkJcmV0dXJuIE5V TEw7CglpZiAgKCFQeUFyZ19QYXJzZVR1cGxlKHNhLCAic2l8aWkiLAoJCQkgICAgICAgJmhv c3RwLCAmcG9ydCwgJmZsb3dpbmZvLCAmc2NvcGVfaWQpKQoJCXJldHVybiBOVUxMOwoJUHlP U19zbnByaW50ZihwYnVmLCBzaXplb2YocGJ1ZiksICIlZCIsIHBvcnQpOwoJbWVtc2V0KCZo aW50cywgMCwgc2l6ZW9mKGhpbnRzKSk7CgloaW50cy5haV9mYW1pbHkgPSBQRl9VTlNQRUM7 CgloaW50cy5haV9zb2NrdHlwZSA9IFNPQ0tfREdSQU07CS8qIG1ha2UgbnVtZXJpYyBwb3J0 IGhhcHB5ICovCgllcnJvciA9IGdldGFkZHJpbmZvKGhvc3RwLCBwYnVmLCAmaGludHMsICZy ZXMpOwoJaWYgKGVycm9yKSB7CgkJc2V0X2dhaWVycm9yKGVycm9yKTsKCQlnb3RvIGZhaWw7 Cgl9CglpZiAocmVzLT5haV9uZXh0KSB7CgkJUHlFcnJfU2V0U3RyaW5nKHNvY2tldF9lcnJv ciwKCQkJInNvY2thZGRyIHJlc29sdmVkIHRvIG11bHRpcGxlIGFkZHJlc3NlcyIpOwoJCWdv dG8gZmFpbDsKCX0KCXN3aXRjaCAocmVzLT5haV9mYW1pbHkpIHsKCWNhc2UgQUZfSU5FVDoK CSAgICB7CgkJY2hhciAqdDE7CgkJaW50IHQyOwoJCWlmIChQeUFyZ19QYXJzZVR1cGxlKHNh LCAic2kiLCAmdDEsICZ0MikgPT0gMCkgewoJCQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vy cm9yLAoJCQkJIklQdjQgc29ja2FkZHIgbXVzdCBiZSAyIHR1cGxlIik7CgkJCWdvdG8gZmFp bDsKCQl9CgkJYnJlYWs7CgkgICAgfQojaWZkZWYgRU5BQkxFX0lQVjYKCWNhc2UgQUZfSU5F VDY6CgkgICAgewoJCXN0cnVjdCBzb2NrYWRkcl9pbjYgKnNpbjY7CgkJc2luNiA9IChzdHJ1 
Y3Qgc29ja2FkZHJfaW42ICopcmVzLT5haV9hZGRyOwoJCXNpbjYtPnNpbjZfZmxvd2luZm8g PSBmbG93aW5mbzsKCQlzaW42LT5zaW42X3Njb3BlX2lkID0gc2NvcGVfaWQ7CgkJYnJlYWs7 CgkgICAgfQojZW5kaWYKCX0KCWVycm9yID0gZ2V0bmFtZWluZm8ocmVzLT5haV9hZGRyLCBy ZXMtPmFpX2FkZHJsZW4sCgkJCWhidWYsIHNpemVvZihoYnVmKSwgcGJ1Ziwgc2l6ZW9mKHBi dWYpLCBmbGFncyk7CglpZiAoZXJyb3IpIHsKCQlzZXRfZ2FpZXJyb3IoZXJyb3IpOwoJCWdv dG8gZmFpbDsKCX0KCXJldCA9IFB5X0J1aWxkVmFsdWUoInNzIiwgaGJ1ZiwgcGJ1Zik7Cgpm YWlsOgoJaWYgKHJlcykKCQlmcmVlYWRkcmluZm8ocmVzKTsKCXJldHVybiByZXQ7Cn0KCnN0 YXRpYyBjaGFyIGdldG5hbWVpbmZvX2RvY1tdID0KInNvY2tldC5nZXRuYW1laW5mbyhzb2Nr YWRkciwgZmxhZ3MpIC0tPiAoaG9zdCwgcG9ydClcblwKXG5cCkdldCBob3N0IGFuZCBwb3J0 IGZvciBhIHNvY2thZGRyLiI7CgovKiBMaXN0IG9mIGZ1bmN0aW9ucyBleHBvcnRlZCBieSB0 aGlzIG1vZHVsZS4gKi8KCnN0YXRpYyBQeU1ldGhvZERlZiBzb2NrZXRfbWV0aG9kc1tdID0g ewoJeyJnZXRob3N0YnluYW1lIiwJc29ja2V0X2dldGhvc3RieW5hbWUsCgkgTUVUSF9WQVJB UkdTLCBnZXRob3N0YnluYW1lX2RvY30sCgl7ImdldGhvc3RieW5hbWVfZXgiLAlzb2NrZXRf Z2V0aG9zdGJ5bmFtZV9leCwKCSBNRVRIX1ZBUkFSR1MsIGdoYm5fZXhfZG9jfSwKCXsiZ2V0 aG9zdGJ5YWRkciIsCXNvY2tldF9nZXRob3N0YnlhZGRyLAoJIE1FVEhfVkFSQVJHUywgZ2V0 aG9zdGJ5YWRkcl9kb2N9LAoJeyJnZXRob3N0bmFtZSIsCQlzb2NrZXRfZ2V0aG9zdG5hbWUs CgkgTUVUSF9WQVJBUkdTLCBnZXRob3N0bmFtZV9kb2N9LAoJeyJnZXRzZXJ2YnluYW1lIiwJ c29ja2V0X2dldHNlcnZieW5hbWUsCgkgTUVUSF9WQVJBUkdTLCBnZXRzZXJ2YnluYW1lX2Rv Y30sCgl7ImdldHByb3RvYnluYW1lIiwJc29ja2V0X2dldHByb3RvYnluYW1lLAoJIE1FVEhf VkFSQVJHUyxnZXRwcm90b2J5bmFtZV9kb2N9LAojaWZuZGVmIE5PX0RVUAoJeyJmcm9tZmQi LAkJc29ja2V0X2Zyb21mZCwKCSBNRVRIX1ZBUkFSR1MsIGZyb21mZF9kb2N9LAojZW5kaWYK CXsibnRvaHMiLAkJc29ja2V0X250b2hzLAoJIE1FVEhfVkFSQVJHUywgbnRvaHNfZG9jfSwK CXsibnRvaGwiLAkJc29ja2V0X250b2hsLAoJIE1FVEhfVkFSQVJHUywgbnRvaGxfZG9jfSwK CXsiaHRvbnMiLAkJc29ja2V0X2h0b25zLAoJIE1FVEhfVkFSQVJHUywgaHRvbnNfZG9jfSwK CXsiaHRvbmwiLAkJc29ja2V0X2h0b25sLAoJIE1FVEhfVkFSQVJHUywgaHRvbmxfZG9jfSwK CXsiaW5ldF9hdG9uIiwJCXNvY2tldF9pbmV0X2F0b24sCgkgTUVUSF9WQVJBUkdTLCBpbmV0 X2F0b25fZG9jfSwKCXsiaW5ldF9udG9hIiwJCXNvY2tldF9pbmV0X250b2EsCgkgTUVUSF9W 
QVJBUkdTLCBpbmV0X250b2FfZG9jfSwKCXsiZ2V0YWRkcmluZm8iLAkJc29ja2V0X2dldGFk ZHJpbmZvLAoJIE1FVEhfVkFSQVJHUywgZ2V0YWRkcmluZm9fZG9jfSwKCXsiZ2V0bmFtZWlu Zm8iLAkJc29ja2V0X2dldG5hbWVpbmZvLAoJIE1FVEhfVkFSQVJHUywgZ2V0bmFtZWluZm9f ZG9jfSwKCXtOVUxMLAkJCU5VTEx9CQkgLyogU2VudGluZWwgKi8KfTsKCgojaWZkZWYgUklT Q09TCiNkZWZpbmUgT1NfSU5JVF9ERUZJTkVECgpzdGF0aWMgaW50Cm9zX2luaXQodm9pZCkK ewoJX2tlcm5lbF9zd2lfcmVncyByOwoKCXIuclswXSA9IDA7Cglfa2VybmVsX3N3aSgweDQz MzgwLCAmciwgJnIpOwoJdGFza3dpbmRvdyA9IHIuclswXTsKCglyZXR1cm4gMDsKfQoKI2Vu ZGlmIC8qIFJJU0NPUyAqLwoKCiNpZmRlZiBNU19XSU5ET1dTCiNkZWZpbmUgT1NfSU5JVF9E RUZJTkVECgovKiBBZGRpdGlvbmFsIGluaXRpYWxpemF0aW9uIGFuZCBjbGVhbnVwIGZvciBX aW5kb3dzICovCgpzdGF0aWMgdm9pZApvc19jbGVhbnVwKHZvaWQpCnsKCVdTQUNsZWFudXAo KTsKfQoKc3RhdGljIGludApvc19pbml0KHZvaWQpCnsKCVdTQURBVEEgV1NBRGF0YTsKCWlu dCByZXQ7CgljaGFyIGJ1ZlsxMDBdOwoJcmV0ID0gV1NBU3RhcnR1cCgweDAxMDEsICZXU0FE YXRhKTsKCXN3aXRjaCAocmV0KSB7CgljYXNlIDA6CS8qIE5vIGVycm9yICovCgkJYXRleGl0 KG9zX2NsZWFudXApOwoJCXJldHVybiAxOyAvKiBTdWNjZXNzICovCgljYXNlIFdTQVNZU05P VFJFQURZOgoJCVB5RXJyX1NldFN0cmluZyhQeUV4Y19JbXBvcnRFcnJvciwKCQkJCSJXU0FT dGFydHVwIGZhaWxlZDogbmV0d29yayBub3QgcmVhZHkiKTsKCQlicmVhazsKCWNhc2UgV1NB VkVSTk9UU1VQUE9SVEVEOgoJY2FzZSBXU0FFSU5WQUw6CgkJUHlFcnJfU2V0U3RyaW5nKAoJ CQlQeUV4Y19JbXBvcnRFcnJvciwKCQkJIldTQVN0YXJ0dXAgZmFpbGVkOiByZXF1ZXN0ZWQg dmVyc2lvbiBub3Qgc3VwcG9ydGVkIik7CgkJYnJlYWs7CglkZWZhdWx0OgoJCVB5T1Nfc25w cmludGYoYnVmLCBzaXplb2YoYnVmKSwKCQkJICAgICAgIldTQVN0YXJ0dXAgZmFpbGVkOiBl cnJvciBjb2RlICVkIiwgcmV0KTsKCQlQeUVycl9TZXRTdHJpbmcoUHlFeGNfSW1wb3J0RXJy b3IsIGJ1Zik7CgkJYnJlYWs7Cgl9CglyZXR1cm4gMDsgLyogRmFpbHVyZSAqLwp9CgojZW5k aWYgLyogTVNfV0lORE9XUyAqLwoKCiNpZmRlZiBQWU9TX09TMgojZGVmaW5lIE9TX0lOSVRf REVGSU5FRAoKLyogQWRkaXRpb25hbCBpbml0aWFsaXphdGlvbiBmb3IgT1MvMiAqLwoKc3Rh dGljIGludApvc19pbml0KHZvaWQpCnsKI2lmbmRlZiBQWUNDX0dDQwoJY2hhciByZWFzb25b NjRdOwoJaW50IHJjID0gc29ja19pbml0KCk7CgoJaWYgKHJjID09IDApIHsKCQlyZXR1cm4g MTsgLyogU3VjY2VzcyAqLwoJfQoKCVB5T1Nfc25wcmludGYocmVhc29uLCBzaXplb2YocmVh 
c29uKSwKCQkgICAgICAiT1MvMiBUQ1AvSVAgRXJyb3IjICVkIiwgc29ja19lcnJubygpKTsK CVB5RXJyX1NldFN0cmluZyhQeUV4Y19JbXBvcnRFcnJvciwgcmVhc29uKTsKCglyZXR1cm4g MDsgIC8qIEZhaWx1cmUgKi8KI2Vsc2UKCS8qIE5vIG5lZWQgdG8gaW5pdGlhbGlzZSBzb2Nr ZXRzIHdpdGggR0NDL0VNWCAqLwoJcmV0dXJuIDE7IC8qIFN1Y2Nlc3MgKi8KI2VuZGlmCn0K CiNlbmRpZiAvKiBQWU9TX09TMiAqLwoKCiNpZm5kZWYgT1NfSU5JVF9ERUZJTkVECnN0YXRp YyBpbnQKb3NfaW5pdCh2b2lkKQp7CglyZXR1cm4gMTsgLyogU3VjY2VzcyAqLwp9CiNlbmRp ZgoKCi8qIEMgQVBJIHRhYmxlIC0gYWx3YXlzIGFkZCBuZXcgdGhpbmdzIHRvIHRoZSBlbmQg Zm9yIGJpbmFyeQogICBjb21wYXRpYmlsaXR5LiAqLwpzdGF0aWMKUHlTb2NrZXRNb2R1bGVf QVBJT2JqZWN0IFB5U29ja2V0TW9kdWxlQVBJID0KewoJJnNvY2tfdHlwZSwKfTsKCgovKiBJ bml0aWFsaXplIHRoZSBfc29ja2V0IG1vZHVsZS4KCiAgIFRoaXMgbW9kdWxlIGlzIGFjdHVh bGx5IGNhbGxlZCAiX3NvY2tldCIsIGFuZCB0aGVyZSdzIGEgd3JhcHBlcgogICAic29ja2V0 LnB5IiB3aGljaCBpbXBsZW1lbnRzIHNvbWUgYWRkaXRpb25hbCBmdW5jdGlvbmFsaXR5LiAg T24gc29tZQogICBwbGF0Zm9ybXMgKGUuZy4gV2luZG93cyBhbmQgT1MvMiksIHNvY2tldC5w eSBhbHNvIGltcGxlbWVudHMgYQogICB3cmFwcGVyIGZvciB0aGUgc29ja2V0IHR5cGUgdGhh dCBwcm92aWRlcyBtaXNzaW5nIGZ1bmN0aW9uYWxpdHkgc3VjaAogICBhcyBtYWtlZmlsZSgp LCBkdXAoKSBhbmQgZnJvbWZkKCkuICBUaGUgaW1wb3J0IG9mICJfc29ja2V0IiBtYXkgZmFp bAogICB3aXRoIGFuIEltcG9ydEVycm9yIGV4Y2VwdGlvbiBpZiBvcy1zcGVjaWZpYyBpbml0 aWFsaXphdGlvbiBmYWlscy4KICAgT24gV2luZG93cywgdGhpcyBkb2VzIFdJTlNPQ0sgaW5p dGlhbGl6YXRpb24uICBXaGVuIFdJTlNPQ0sgaXMKICAgaW5pdGlhbGl6ZWQgc3VjY2VzZnVs bHksIGEgY2FsbCB0byBXU0FDbGVhbnVwKCkgaXMgc2NoZWR1bGVkIHRvIGJlCiAgIG1hZGUg YXQgZXhpdCB0aW1lLgoqLwoKc3RhdGljIGNoYXIgc29ja2V0X2RvY1tdID0KIkltcGxlbWVu dGF0aW9uIG1vZHVsZSBmb3Igc29ja2V0IG9wZXJhdGlvbnMuICBTZWUgdGhlIHNvY2tldCBt b2R1bGVcblwKZm9yIGRvY3VtZW50YXRpb24uIjsKCkRMX0VYUE9SVCh2b2lkKQppbml0X3Nv Y2tldCh2b2lkKQp7CglQeU9iamVjdCAqbTsKCglpZiAoIW9zX2luaXQoKSkKCQlyZXR1cm47 CgoJc29ja190eXBlLm9iX3R5cGUgPSAmUHlUeXBlX1R5cGU7Cglzb2NrX3R5cGUudHBfZ2V0 YXR0cm8gPSBQeU9iamVjdF9HZW5lcmljR2V0QXR0cjsKCXNvY2tfdHlwZS50cF9hbGxvYyA9 IFB5VHlwZV9HZW5lcmljQWxsb2M7Cglzb2NrX3R5cGUudHBfZnJlZSA9IFB5T2JqZWN0X0Rl 
bDsKCW0gPSBQeV9Jbml0TW9kdWxlMyhQeVNvY2tldF9NT0RVTEVfTkFNRSwKCQkJICAgc29j a2V0X21ldGhvZHMsCgkJCSAgIHNvY2tldF9kb2MpOwoKCXNvY2tldF9lcnJvciA9IFB5RXJy X05ld0V4Y2VwdGlvbigic29ja2V0LmVycm9yIiwgTlVMTCwgTlVMTCk7CglpZiAoc29ja2V0 X2Vycm9yID09IE5VTEwpCgkJcmV0dXJuOwoJUHlfSU5DUkVGKHNvY2tldF9lcnJvcik7CglQ eU1vZHVsZV9BZGRPYmplY3QobSwgImVycm9yIiwgc29ja2V0X2Vycm9yKTsKCXNvY2tldF9o ZXJyb3IgPSBQeUVycl9OZXdFeGNlcHRpb24oInNvY2tldC5oZXJyb3IiLAoJCQkJCSAgIHNv Y2tldF9lcnJvciwgTlVMTCk7CglpZiAoc29ja2V0X2hlcnJvciA9PSBOVUxMKQoJCXJldHVy bjsKCVB5X0lOQ1JFRihzb2NrZXRfaGVycm9yKTsKCVB5TW9kdWxlX0FkZE9iamVjdChtLCAi aGVycm9yIiwgc29ja2V0X2hlcnJvcik7Cglzb2NrZXRfZ2FpZXJyb3IgPSBQeUVycl9OZXdF eGNlcHRpb24oInNvY2tldC5nYWllcnJvciIsIHNvY2tldF9lcnJvciwKCSAgICBOVUxMKTsK CWlmIChzb2NrZXRfZ2FpZXJyb3IgPT0gTlVMTCkKCQlyZXR1cm47CglQeV9JTkNSRUYoc29j a2V0X2dhaWVycm9yKTsKCVB5TW9kdWxlX0FkZE9iamVjdChtLCAiZ2FpZXJyb3IiLCBzb2Nr ZXRfZ2FpZXJyb3IpOwoJUHlfSU5DUkVGKChQeU9iamVjdCAqKSZzb2NrX3R5cGUpOwoJaWYg KFB5TW9kdWxlX0FkZE9iamVjdChtLCAiU29ja2V0VHlwZSIsCgkJCSAgICAgICAoUHlPYmpl Y3QgKikmc29ja190eXBlKSAhPSAwKQoJCXJldHVybjsKCVB5X0lOQ1JFRigoUHlPYmplY3Qg Kikmc29ja190eXBlKTsKCWlmIChQeU1vZHVsZV9BZGRPYmplY3QobSwgInNvY2tldCIsCgkJ CSAgICAgICAoUHlPYmplY3QgKikmc29ja190eXBlKSAhPSAwKQoJCXJldHVybjsKCgkvKiBF eHBvcnQgQyBBUEkgKi8KCWlmIChQeU1vZHVsZV9BZGRPYmplY3QobSwgUHlTb2NrZXRfQ0FQ SV9OQU1FLAoJICAgICAgIFB5Q09iamVjdF9Gcm9tVm9pZFB0cigodm9pZCAqKSZQeVNvY2tl dE1vZHVsZUFQSSwgTlVMTCkKCQkJCSApICE9IDApCgkJcmV0dXJuOwoKCS8qIEFkZHJlc3Mg ZmFtaWxpZXMgKHdlIG9ubHkgc3VwcG9ydCBBRl9JTkVUIGFuZCBBRl9VTklYKSAqLwojaWZk ZWYgQUZfVU5TUEVDCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiQUZfVU5TUEVDIiwg QUZfVU5TUEVDKTsKI2VuZGlmCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiQUZfSU5F VCIsIEFGX0lORVQpOwojaWZkZWYgQUZfSU5FVDYKCVB5TW9kdWxlX0FkZEludENvbnN0YW50 KG0sICJBRl9JTkVUNiIsIEFGX0lORVQ2KTsKI2VuZGlmIC8qIEFGX0lORVQ2ICovCiNpZmRl ZiBBRl9VTklYCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiQUZfVU5JWCIsIEFGX1VO SVgpOwojZW5kaWYgLyogQUZfVU5JWCAqLwojaWZkZWYgQUZfQVgyNQoJLyogQW1hdGV1ciBS 
YWRpbyBBWC4yNSAqLwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIkFGX0FYMjUiLCBB Rl9BWDI1KTsKI2VuZGlmCiNpZmRlZiBBRl9JUFgKCVB5TW9kdWxlX0FkZEludENvbnN0YW50 KG0sICJBRl9JUFgiLCBBRl9JUFgpOyAvKiBOb3ZlbGwgSVBYICovCiNlbmRpZgojaWZkZWYg QUZfQVBQTEVUQUxLCgkvKiBBcHBsZXRhbGsgRERQICovCglQeU1vZHVsZV9BZGRJbnRDb25z dGFudChtLCAiQUZfQVBQTEVUQUxLIiwgQUZfQVBQTEVUQUxLKTsKI2VuZGlmCiNpZmRlZiBB Rl9ORVRST00KCS8qIEFtYXRldXIgcmFkaW8gTmV0Uk9NICovCglQeU1vZHVsZV9BZGRJbnRD b25zdGFudChtLCAiQUZfTkVUUk9NIiwgQUZfTkVUUk9NKTsKI2VuZGlmCiNpZmRlZiBBRl9C UklER0UKCS8qIE11bHRpcHJvdG9jb2wgYnJpZGdlICovCglQeU1vZHVsZV9BZGRJbnRDb25z dGFudChtLCAiQUZfQlJJREdFIiwgQUZfQlJJREdFKTsKI2VuZGlmCiNpZmRlZiBBRl9BQUw1 CgkvKiBSZXNlcnZlZCBmb3IgV2VybmVyJ3MgQVRNICovCglQeU1vZHVsZV9BZGRJbnRDb25z dGFudChtLCAiQUZfQUFMNSIsIEFGX0FBTDUpOwojZW5kaWYKI2lmZGVmIEFGX1gyNQoJLyog UmVzZXJ2ZWQgZm9yIFguMjUgcHJvamVjdCAqLwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQo bSwgIkFGX1gyNSIsIEFGX1gyNSk7CiNlbmRpZgojaWZkZWYgQUZfSU5FVDYKCVB5TW9kdWxl X0FkZEludENvbnN0YW50KG0sICJBRl9JTkVUNiIsIEFGX0lORVQ2KTsgLyogSVAgdmVyc2lv biA2ICovCiNlbmRpZgojaWZkZWYgQUZfUk9TRQoJLyogQW1hdGV1ciBSYWRpbyBYLjI1IFBM UCAqLwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIkFGX1JPU0UiLCBBRl9ST1NFKTsK I2VuZGlmCiNpZmRlZiBIQVZFX05FVFBBQ0tFVF9QQUNLRVRfSAoJUHlNb2R1bGVfQWRkSW50 Q29uc3RhbnQobSwgIkFGX1BBQ0tFVCIsIEFGX1BBQ0tFVCk7CglQeU1vZHVsZV9BZGRJbnRD b25zdGFudChtLCAiUEZfUEFDS0VUIiwgUEZfUEFDS0VUKTsKCVB5TW9kdWxlX0FkZEludENv bnN0YW50KG0sICJQQUNLRVRfSE9TVCIsIFBBQ0tFVF9IT1NUKTsKCVB5TW9kdWxlX0FkZElu dENvbnN0YW50KG0sICJQQUNLRVRfQlJPQURDQVNUIiwgUEFDS0VUX0JST0FEQ0FTVCk7CglQ eU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiUEFDS0VUX01VTFRJQ0FTVCIsIFBBQ0tFVF9N VUxUSUNBU1QpOwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlBBQ0tFVF9PVEhFUkhP U1QiLCBQQUNLRVRfT1RIRVJIT1NUKTsKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJQ QUNLRVRfT1VUR09JTkciLCBQQUNLRVRfT1VUR09JTkcpOwoJUHlNb2R1bGVfQWRkSW50Q29u c3RhbnQobSwgIlBBQ0tFVF9MT09QQkFDSyIsIFBBQ0tFVF9MT09QQkFDSyk7CglQeU1vZHVs ZV9BZGRJbnRDb25zdGFudChtLCAiUEFDS0VUX0ZBU1RST1VURSIsIFBBQ0tFVF9GQVNUUk9V 
VEUpOwojZW5kaWYKCgkvKiBTb2NrZXQgdHlwZXMgKi8KCVB5TW9kdWxlX0FkZEludENvbnN0 YW50KG0sICJTT0NLX1NUUkVBTSIsIFNPQ0tfU1RSRUFNKTsKCVB5TW9kdWxlX0FkZEludENv bnN0YW50KG0sICJTT0NLX0RHUkFNIiwgU09DS19ER1JBTSk7CiNpZm5kZWYgX19CRU9TX18K LyogV2UgaGF2ZSBpbmNvbXBsZXRlIHNvY2tldCBzdXBwb3J0LiAqLwoJUHlNb2R1bGVfQWRk SW50Q29uc3RhbnQobSwgIlNPQ0tfUkFXIiwgU09DS19SQVcpOwoJUHlNb2R1bGVfQWRkSW50 Q29uc3RhbnQobSwgIlNPQ0tfU0VRUEFDS0VUIiwgU09DS19TRVFQQUNLRVQpOwoJUHlNb2R1 bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPQ0tfUkRNIiwgU09DS19SRE0pOwojZW5kaWYKCiNp ZmRlZglTT19ERUJVRwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPX0RFQlVHIiwg U09fREVCVUcpOwojZW5kaWYKI2lmZGVmCVNPX0FDQ0VQVENPTk4KCVB5TW9kdWxlX0FkZElu dENvbnN0YW50KG0sICJTT19BQ0NFUFRDT05OIiwgU09fQUNDRVBUQ09OTik7CiNlbmRpZgoj aWZkZWYJU09fUkVVU0VBRERSCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09fUkVV U0VBRERSIiwgU09fUkVVU0VBRERSKTsKI2VuZGlmCiNpZmRlZglTT19LRUVQQUxJVkUKCVB5 TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJTT19LRUVQQUxJVkUiLCBTT19LRUVQQUxJVkUp OwojZW5kaWYKI2lmZGVmCVNPX0RPTlRST1VURQoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQo bSwgIlNPX0RPTlRST1VURSIsIFNPX0RPTlRST1VURSk7CiNlbmRpZgojaWZkZWYJU09fQlJP QURDQVNUCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09fQlJPQURDQVNUIiwgU09f QlJPQURDQVNUKTsKI2VuZGlmCiNpZmRlZglTT19VU0VMT09QQkFDSwoJUHlNb2R1bGVfQWRk SW50Q29uc3RhbnQobSwgIlNPX1VTRUxPT1BCQUNLIiwgU09fVVNFTE9PUEJBQ0spOwojZW5k aWYKI2lmZGVmCVNPX0xJTkdFUgoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPX0xJ TkdFUiIsIFNPX0xJTkdFUik7CiNlbmRpZgojaWZkZWYJU09fT09CSU5MSU5FCglQeU1vZHVs ZV9BZGRJbnRDb25zdGFudChtLCAiU09fT09CSU5MSU5FIiwgU09fT09CSU5MSU5FKTsKI2Vu ZGlmCiNpZmRlZglTT19SRVVTRVBPUlQKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJT T19SRVVTRVBPUlQiLCBTT19SRVVTRVBPUlQpOwojZW5kaWYKI2lmZGVmCVNPX1NOREJVRgoJ UHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPX1NOREJVRiIsIFNPX1NOREJVRik7CiNl bmRpZgojaWZkZWYJU09fUkNWQlVGCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09f UkNWQlVGIiwgU09fUkNWQlVGKTsKI2VuZGlmCiNpZmRlZglTT19TTkRMT1dBVAoJUHlNb2R1 bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPX1NORExPV0FUIiwgU09fU05ETE9XQVQpOwojZW5k 
aWYKI2lmZGVmCVNPX1JDVkxPV0FUCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09f UkNWTE9XQVQiLCBTT19SQ1ZMT1dBVCk7CiNlbmRpZgojaWZkZWYJU09fU05EVElNRU8KCVB5 TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJTT19TTkRUSU1FTyIsIFNPX1NORFRJTUVPKTsK I2VuZGlmCiNpZmRlZglTT19SQ1ZUSU1FTwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwg IlNPX1JDVlRJTUVPIiwgU09fUkNWVElNRU8pOwojZW5kaWYKI2lmZGVmCVNPX0VSUk9SCglQ eU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09fRVJST1IiLCBTT19FUlJPUik7CiNlbmRp ZgojaWZkZWYJU09fVFlQRQoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPX1RZUEUi LCBTT19UWVBFKTsKI2VuZGlmCgoJLyogTWF4aW11bSBudW1iZXIgb2YgY29ubmVjdGlvbnMg Zm9yICJsaXN0ZW4iICovCiNpZmRlZglTT01BWENPTk4KCVB5TW9kdWxlX0FkZEludENvbnN0 YW50KG0sICJTT01BWENPTk4iLCBTT01BWENPTk4pOwojZWxzZQoJUHlNb2R1bGVfQWRkSW50 Q29uc3RhbnQobSwgIlNPTUFYQ09OTiIsIDUpOyAvKiBDb21tb24gdmFsdWUgKi8KI2VuZGlm CgoJLyogRmxhZ3MgZm9yIHNlbmQsIHJlY3YgKi8KI2lmZGVmCU1TR19PT0IKCVB5TW9kdWxl X0FkZEludENvbnN0YW50KG0sICJNU0dfT09CIiwgTVNHX09PQik7CiNlbmRpZgojaWZkZWYJ TVNHX1BFRUsKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJNU0dfUEVFSyIsIE1TR19Q RUVLKTsKI2VuZGlmCiNpZmRlZglNU0dfRE9OVFJPVVRFCglQeU1vZHVsZV9BZGRJbnRDb25z dGFudChtLCAiTVNHX0RPTlRST1VURSIsIE1TR19ET05UUk9VVEUpOwojZW5kaWYKI2lmZGVm CU1TR19ET05UV0FJVAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIk1TR19ET05UV0FJ VCIsIE1TR19ET05UV0FJVCk7CiNlbmRpZgojaWZkZWYJTVNHX0VPUgoJUHlNb2R1bGVfQWRk SW50Q29uc3RhbnQobSwgIk1TR19FT1IiLCBNU0dfRU9SKTsKI2VuZGlmCiNpZmRlZglNU0df VFJVTkMKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJNU0dfVFJVTkMiLCBNU0dfVFJV TkMpOwojZW5kaWYKI2lmZGVmCU1TR19DVFJVTkMKCVB5TW9kdWxlX0FkZEludENvbnN0YW50 KG0sICJNU0dfQ1RSVU5DIiwgTVNHX0NUUlVOQyk7CiNlbmRpZgojaWZkZWYJTVNHX1dBSVRB TEwKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJNU0dfV0FJVEFMTCIsIE1TR19XQUlU QUxMKTsKI2VuZGlmCiNpZmRlZglNU0dfQlRBRwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQo bSwgIk1TR19CVEFHIiwgTVNHX0JUQUcpOwojZW5kaWYKI2lmZGVmCU1TR19FVEFHCglQeU1v ZHVsZV9BZGRJbnRDb25zdGFudChtLCAiTVNHX0VUQUciLCBNU0dfRVRBRyk7CiNlbmRpZgoK CS8qIFByb3RvY29sIGxldmVsIGFuZCBudW1iZXJzLCB1c2FibGUgZm9yIFtnc11ldHNvY2tv 
cHQgKi8KI2lmZGVmCVNPTF9TT0NLRVQKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJT T0xfU09DS0VUIiwgU09MX1NPQ0tFVCk7CiNlbmRpZgojaWZkZWYJU09MX0lQCglQeU1vZHVs ZV9BZGRJbnRDb25zdGFudChtLCAiU09MX0lQIiwgU09MX0lQKTsKI2Vsc2UKCVB5TW9kdWxl X0FkZEludENvbnN0YW50KG0sICJTT0xfSVAiLCAwKTsKI2VuZGlmCiNpZmRlZglTT0xfSVBY CglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09MX0lQWCIsIFNPTF9JUFgpOwojZW5k aWYKI2lmZGVmCVNPTF9BWDI1CglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09MX0FY MjUiLCBTT0xfQVgyNSk7CiNlbmRpZgojaWZkZWYJU09MX0FUQUxLCglQeU1vZHVsZV9BZGRJ bnRDb25zdGFudChtLCAiU09MX0FUQUxLIiwgU09MX0FUQUxLKTsKI2VuZGlmCiNpZmRlZglT T0xfTkVUUk9NCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09MX05FVFJPTSIsIFNP TF9ORVRST00pOwojZW5kaWYKI2lmZGVmCVNPTF9ST1NFCglQeU1vZHVsZV9BZGRJbnRDb25z dGFudChtLCAiU09MX1JPU0UiLCBTT0xfUk9TRSk7CiNlbmRpZgojaWZkZWYJU09MX1RDUAoJ UHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPTF9UQ1AiLCBTT0xfVENQKTsKI2Vsc2UK CVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJTT0xfVENQIiwgNik7CiNlbmRpZgojaWZk ZWYJU09MX1VEUAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPTF9VRFAiLCBTT0xf VURQKTsKI2Vsc2UKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJTT0xfVURQIiwgMTcp OwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fSVAKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0s ICJJUFBST1RPX0lQIiwgSVBQUk9UT19JUCk7CiNlbHNlCglQeU1vZHVsZV9BZGRJbnRDb25z dGFudChtLCAiSVBQUk9UT19JUCIsIDApOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fSE9QT1BU UwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fSE9QT1BUUyIsIElQUFJP VE9fSE9QT1BUUyk7CiNlbmRpZgojaWZkZWYJSVBQUk9UT19JQ01QCglQeU1vZHVsZV9BZGRJ bnRDb25zdGFudChtLCAiSVBQUk9UT19JQ01QIiwgSVBQUk9UT19JQ01QKTsKI2Vsc2UKCVB5 TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RPX0lDTVAiLCAxKTsKI2VuZGlmCiNp ZmRlZglJUFBST1RPX0lHTVAKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RP X0lHTVAiLCBJUFBST1RPX0lHTVApOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fR0dQCglQeU1v ZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBQUk9UT19HR1AiLCBJUFBST1RPX0dHUCk7CiNl bmRpZgojaWZkZWYJSVBQUk9UT19JUFY0CglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAi SVBQUk9UT19JUFY0IiwgSVBQUk9UT19JUFY0KTsKI2VuZGlmCiNpZmRlZglJUFBST1RPX0lQ 
SVAKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RPX0lQSVAiLCBJUFBST1RP X0lQSVApOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fVENQCglQeU1vZHVsZV9BZGRJbnRDb25z dGFudChtLCAiSVBQUk9UT19UQ1AiLCBJUFBST1RPX1RDUCk7CiNlbHNlCglQeU1vZHVsZV9B ZGRJbnRDb25zdGFudChtLCAiSVBQUk9UT19UQ1AiLCA2KTsKI2VuZGlmCiNpZmRlZglJUFBS T1RPX0VHUAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fRUdQIiwgSVBQ Uk9UT19FR1ApOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fUFVQCglQeU1vZHVsZV9BZGRJbnRD b25zdGFudChtLCAiSVBQUk9UT19QVVAiLCBJUFBST1RPX1BVUCk7CiNlbmRpZgojaWZkZWYJ SVBQUk9UT19VRFAKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RPX1VEUCIs IElQUFJPVE9fVURQKTsKI2Vsc2UKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBS T1RPX1VEUCIsIDE3KTsKI2VuZGlmCiNpZmRlZglJUFBST1RPX0lEUAoJUHlNb2R1bGVfQWRk SW50Q29uc3RhbnQobSwgIklQUFJPVE9fSURQIiwgSVBQUk9UT19JRFApOwojZW5kaWYKI2lm ZGVmCUlQUFJPVE9fSEVMTE8KCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RP X0hFTExPIiwgSVBQUk9UT19IRUxMTyk7CiNlbmRpZgojaWZkZWYJSVBQUk9UT19ORAoJUHlN b2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fTkQiLCBJUFBST1RPX05EKTsKI2Vu ZGlmCiNpZmRlZglJUFBST1RPX1RQCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBQ Uk9UT19UUCIsIElQUFJPVE9fVFApOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fSVBWNgoJUHlN b2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fSVBWNiIsIElQUFJPVE9fSVBWNik7 CiNlbmRpZgojaWZkZWYJSVBQUk9UT19ST1VUSU5HCglQeU1vZHVsZV9BZGRJbnRDb25zdGFu dChtLCAiSVBQUk9UT19ST1VUSU5HIiwgSVBQUk9UT19ST1VUSU5HKTsKI2VuZGlmCiNpZmRl ZglJUFBST1RPX0ZSQUdNRU5UCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBQUk9U T19GUkFHTUVOVCIsIElQUFJPVE9fRlJBR01FTlQpOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9f UlNWUAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fUlNWUCIsIElQUFJP VE9fUlNWUCk7CiNlbmRpZgojaWZkZWYJSVBQUk9UT19HUkUKCVB5TW9kdWxlX0FkZEludENv bnN0YW50KG0sICJJUFBST1RPX0dSRSIsIElQUFJPVE9fR1JFKTsKI2VuZGlmCiNpZmRlZglJ UFBST1RPX0VTUAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fRVNQIiwg SVBQUk9UT19FU1ApOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fQUgKCVB5TW9kdWxlX0FkZElu dENvbnN0YW50KG0sICJJUFBST1RPX0FIIiwgSVBQUk9UT19BSCk7CiNlbmRpZgojaWZkZWYJ 
SVBQUk9UT19NT0JJTEUKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RPX01P QklMRSIsIElQUFJPVE9fTU9CSUxFKTsKI2VuZGlmCiNpZmRlZglJUFBST1RPX0lDTVBWNgoJ UHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fSUNNUFY2IiwgSVBQUk9UT19J Q01QVjYpOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fTk9ORQoJUHlNb2R1bGVfQWRkSW50Q29u c3RhbnQobSwgIklQUFJPVE9fTk9ORSIsIElQUFJPVE9fTk9ORSk7CiNlbmRpZgojaWZkZWYJ SVBQUk9UT19EU1RPUFRTCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBQUk9UT19E U1RPUFRTIiwgSVBQUk9UT19EU1RPUFRTKTsKI2VuZGlmCiNpZmRlZglJUFBST1RPX1hUUAoJ UHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fWFRQIiwgSVBQUk9UT19YVFAp OwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fRU9OCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudCht LCAiSVBQUk9UT19FT04iLCBJUFBST1RPX0VPTik7CiNlbmRpZgojaWZkZWYJSVBQUk9UT19Q SU0KCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RPX1BJTSIsIElQUFJPVE9f UElNKTsKI2VuZGlmCiNpZmRlZglJUFBST1RPX0lQQ09NUAoJUHlNb2R1bGVfQWRkSW50Q29u c3RhbnQobSwgIklQUFJPVE9fSVBDT01QIiwgSVBQUk9UT19JUENPTVApOwojZW5kaWYKI2lm ZGVmCUlQUFJPVE9fVlJSUAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9f VlJSUCIsIElQUFJPVE9fVlJSUCk7CiNlbmRpZgojaWZkZWYJSVBQUk9UT19CSVAKCVB5TW9k dWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RPX0JJUCIsIElQUFJPVE9fQklQKTsKI2Vu ZGlmCi8qKi8KI2lmZGVmCUlQUFJPVE9fUkFXCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudCht LCAiSVBQUk9UT19SQVciLCBJUFBST1RPX1JBVyk7CiNlbHNlCglQeU1vZHVsZV9BZGRJbnRD b25zdGFudChtLCAiSVBQUk9UT19SQVciLCAyNTUpOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9f TUFYCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBQUk9UT19NQVgiLCBJUFBST1RP X01BWCk7CiNlbmRpZgoKCS8qIFNvbWUgcG9ydCBjb25maWd1cmF0aW9uICovCiNpZmRlZglJ UFBPUlRfUkVTRVJWRUQKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBPUlRfUkVT RVJWRUQiLCBJUFBPUlRfUkVTRVJWRUQpOwojZWxzZQoJUHlNb2R1bGVfQWRkSW50Q29uc3Rh bnQobSwgIklQUE9SVF9SRVNFUlZFRCIsIDEwMjQpOwojZW5kaWYKI2lmZGVmCUlQUE9SVF9V U0VSUkVTRVJWRUQKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBPUlRfVVNFUlJF U0VSVkVEIiwgSVBQT1JUX1VTRVJSRVNFUlZFRCk7CiNlbHNlCglQeU1vZHVsZV9BZGRJbnRD b25zdGFudChtLCAiSVBQT1JUX1VTRVJSRVNFUlZFRCIsIDUwMDApOwojZW5kaWYKCgkvKiBT 
b21lIHJlc2VydmVkIElQIHYuNCBhZGRyZXNzZXMgKi8KI2lmZGVmCUlOQUREUl9BTlkKCVB5 TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJTkFERFJfQU5ZIiwgSU5BRERSX0FOWSk7CiNl bHNlCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSU5BRERSX0FOWSIsIDB4MDAwMDAw MDApOwojZW5kaWYKI2lmZGVmCUlOQUREUl9CUk9BRENBU1QKCVB5TW9kdWxlX0FkZEludENv bnN0YW50KG0sICJJTkFERFJfQlJPQURDQVNUIiwgSU5BRERSX0JST0FEQ0FTVCk7CiNlbHNl CglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSU5BRERSX0JST0FEQ0FTVCIsIDB4ZmZm ZmZmZmYpOwojZW5kaWYKI2lmZGVmCUlOQUREUl9MT09QQkFDSwoJUHlNb2R1bGVfQWRkSW50 Q29uc3RhbnQobSwgIklOQUREUl9MT09QQkFDSyIsIElOQUREUl9MT09QQkFDSyk7CiNlbHNl CglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSU5BRERSX0xPT1BCQUNLIiwgMHg3RjAw MDAwMSk7CiNlbmRpZgojaWZkZWYJSU5BRERSX1VOU1BFQ19HUk9VUAoJUHlNb2R1bGVfQWRk SW50Q29uc3RhbnQobSwgIklOQUREUl9VTlNQRUNfR1JPVVAiLCBJTkFERFJfVU5TUEVDX0dS T1VQKTsKI2Vsc2UKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJTkFERFJfVU5TUEVD X0dST1VQIiwgMHhlMDAwMDAwMCk7CiNlbmRpZgojaWZkZWYJSU5BRERSX0FMTEhPU1RTX0dS T1VQCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSU5BRERSX0FMTEhPU1RTX0dST1VQ IiwKCQkJCUlOQUREUl9BTExIT1NUU19HUk9VUCk7CiNlbHNlCglQeU1vZHVsZV9BZGRJbnRD b25zdGFudChtLCAiSU5BRERSX0FMTEhPU1RTX0dST1VQIiwgMHhlMDAwMDAwMSk7CiNlbmRp ZgojaWZkZWYJSU5BRERSX01BWF9MT0NBTF9HUk9VUAoJUHlNb2R1bGVfQWRkSW50Q29uc3Rh bnQobSwgIklOQUREUl9NQVhfTE9DQUxfR1JPVVAiLAoJCQkJSU5BRERSX01BWF9MT0NBTF9H Uk9VUCk7CiNlbHNlCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSU5BRERSX01BWF9M T0NBTF9HUk9VUCIsIDB4ZTAwMDAwZmYpOwojZW5kaWYKI2lmZGVmCUlOQUREUl9OT05FCglQ eU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSU5BRERSX05PTkUiLCBJTkFERFJfTk9ORSk7 CiNlbHNlCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSU5BRERSX05PTkUiLCAweGZm ZmZmZmZmKTsKI2VuZGlmCgoJLyogSVB2NCBbZ3NdZXRzb2Nrb3B0IG9wdGlvbnMgKi8KI2lm ZGVmCUlQX09QVElPTlMKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUF9PUFRJT05T IiwgSVBfT1BUSU9OUyk7CiNlbmRpZgojaWZkZWYJSVBfSERSSU5DTAoJUHlNb2R1bGVfQWRk SW50Q29uc3RhbnQobSwgIklQX0hEUklOQ0wiLCBJUF9IRFJJTkNMKTsKI2VuZGlmCiNpZmRl ZglJUF9UT1MKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUF9UT1MiLCBJUF9UT1Mp 
OwojZW5kaWYKI2lmZGVmCUlQX1RUTAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQ X1RUTCIsIElQX1RUTCk7CiNlbmRpZgojaWZkZWYJSVBfUkVDVk9QVFMKCVB5TW9kdWxlX0Fk ZEludENvbnN0YW50KG0sICJJUF9SRUNWT1BUUyIsIElQX1JFQ1ZPUFRTKTsKI2VuZGlmCiNp ZmRlZglJUF9SRUNWUkVUT1BUUwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQX1JF Q1ZSRVRPUFRTIiwgSVBfUkVDVlJFVE9QVFMpOwojZW5kaWYKI2lmZGVmCUlQX1JFQ1ZEU1RB RERSCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBfUkVDVkRTVEFERFIiLCBJUF9S RUNWRFNUQUREUik7CiNlbmRpZgojaWZkZWYJSVBfUkVUT1BUUwoJUHlNb2R1bGVfQWRkSW50 Q29uc3RhbnQobSwgIklQX1JFVE9QVFMiLCBJUF9SRVRPUFRTKTsKI2VuZGlmCiNpZmRlZglJ UF9NVUxUSUNBU1RfSUYKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUF9NVUxUSUNB U1RfSUYiLCBJUF9NVUxUSUNBU1RfSUYpOwojZW5kaWYKI2lmZGVmCUlQX01VTFRJQ0FTVF9U VEwKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUF9NVUxUSUNBU1RfVFRMIiwgSVBf TVVMVElDQVNUX1RUTCk7CiNlbmRpZgojaWZkZWYJSVBfTVVMVElDQVNUX0xPT1AKCVB5TW9k dWxlX0FkZEludENvbnN0YW50KG0sICJJUF9NVUxUSUNBU1RfTE9PUCIsIElQX01VTFRJQ0FT VF9MT09QKTsKI2VuZGlmCiNpZmRlZglJUF9BRERfTUVNQkVSU0hJUAoJUHlNb2R1bGVfQWRk SW50Q29uc3RhbnQobSwgIklQX0FERF9NRU1CRVJTSElQIiwgSVBfQUREX01FTUJFUlNISVAp OwojZW5kaWYKI2lmZGVmCUlQX0RST1BfTUVNQkVSU0hJUAoJUHlNb2R1bGVfQWRkSW50Q29u c3RhbnQobSwgIklQX0RST1BfTUVNQkVSU0hJUCIsIElQX0RST1BfTUVNQkVSU0hJUCk7CiNl bmRpZgojaWZkZWYJSVBfREVGQVVMVF9NVUxUSUNBU1RfVFRMCglQeU1vZHVsZV9BZGRJbnRD b25zdGFudChtLCAiSVBfREVGQVVMVF9NVUxUSUNBU1RfVFRMIiwKCQkJCUlQX0RFRkFVTFRf TVVMVElDQVNUX1RUTCk7CiNlbmRpZgojaWZkZWYJSVBfREVGQVVMVF9NVUxUSUNBU1RfTE9P UAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQX0RFRkFVTFRfTVVMVElDQVNUX0xP T1AiLAoJCQkJSVBfREVGQVVMVF9NVUxUSUNBU1RfTE9PUCk7CiNlbmRpZgojaWZkZWYJSVBf TUFYX01FTUJFUlNISVBTCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBfTUFYX01F TUJFUlNISVBTIiwgSVBfTUFYX01FTUJFUlNISVBTKTsKI2VuZGlmCgoJLyogSVB2NiBbZ3Nd ZXRzb2Nrb3B0IG9wdGlvbnMsIGRlZmluZWQgaW4gUkZDMjU1MyAqLwojaWZkZWYJSVBWNl9K T0lOX0dST1VQCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBWNl9KT0lOX0dST1VQ IiwgSVBWNl9KT0lOX0dST1VQKTsKI2VuZGlmCiNpZmRlZglJUFY2X0xFQVZFX0dST1VQCglQ 
eU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBWNl9MRUFWRV9HUk9VUCIsIElQVjZfTEVB VkVfR1JPVVApOwojZW5kaWYKI2lmZGVmCUlQVjZfTVVMVElDQVNUX0hPUFMKCVB5TW9kdWxl X0FkZEludENvbnN0YW50KG0sICJJUFY2X01VTFRJQ0FTVF9IT1BTIiwgSVBWNl9NVUxUSUNB U1RfSE9QUyk7CiNlbmRpZgojaWZkZWYJSVBWNl9NVUxUSUNBU1RfSUYKCVB5TW9kdWxlX0Fk ZEludENvbnN0YW50KG0sICJJUFY2X01VTFRJQ0FTVF9JRiIsIElQVjZfTVVMVElDQVNUX0lG KTsKI2VuZGlmCiNpZmRlZglJUFY2X01VTFRJQ0FTVF9MT09QCglQeU1vZHVsZV9BZGRJbnRD b25zdGFudChtLCAiSVBWNl9NVUxUSUNBU1RfTE9PUCIsIElQVjZfTVVMVElDQVNUX0xPT1Ap OwojZW5kaWYKI2lmZGVmCUlQVjZfVU5JQ0FTVF9IT1BTCglQeU1vZHVsZV9BZGRJbnRDb25z dGFudChtLCAiSVBWNl9VTklDQVNUX0hPUFMiLCBJUFY2X1VOSUNBU1RfSE9QUyk7CiNlbmRp ZgoKCS8qIFRDUCBvcHRpb25zICovCiNpZmRlZglUQ1BfTk9ERUxBWQoJUHlNb2R1bGVfQWRk SW50Q29uc3RhbnQobSwgIlRDUF9OT0RFTEFZIiwgVENQX05PREVMQVkpOwojZW5kaWYKI2lm ZGVmCVRDUF9NQVhTRUcKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJUQ1BfTUFYU0VH IiwgVENQX01BWFNFRyk7CiNlbmRpZgojaWZkZWYJVENQX0NPUksKCVB5TW9kdWxlX0FkZElu dENvbnN0YW50KG0sICJUQ1BfQ09SSyIsIFRDUF9DT1JLKTsKI2VuZGlmCiNpZmRlZglUQ1Bf S0VFUElETEUKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJUQ1BfS0VFUElETEUiLCBU Q1BfS0VFUElETEUpOwojZW5kaWYKI2lmZGVmCVRDUF9LRUVQSU5UVkwKCVB5TW9kdWxlX0Fk ZEludENvbnN0YW50KG0sICJUQ1BfS0VFUElOVFZMIiwgVENQX0tFRVBJTlRWTCk7CiNlbmRp ZgojaWZkZWYJVENQX0tFRVBDTlQKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJUQ1Bf S0VFUENOVCIsIFRDUF9LRUVQQ05UKTsKI2VuZGlmCiNpZmRlZglUQ1BfU1lOQ05UCglQeU1v ZHVsZV9BZGRJbnRDb25zdGFudChtLCAiVENQX1NZTkNOVCIsIFRDUF9TWU5DTlQpOwojZW5k aWYKI2lmZGVmCVRDUF9MSU5HRVIyCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiVENQ X0xJTkdFUjIiLCBUQ1BfTElOR0VSMik7CiNlbmRpZgojaWZkZWYJVENQX0RFRkVSX0FDQ0VQ VAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlRDUF9ERUZFUl9BQ0NFUFQiLCBUQ1Bf REVGRVJfQUNDRVBUKTsKI2VuZGlmCiNpZmRlZglUQ1BfV0lORE9XX0NMQU1QCglQeU1vZHVs ZV9BZGRJbnRDb25zdGFudChtLCAiVENQX1dJTkRPV19DTEFNUCIsIFRDUF9XSU5ET1dfQ0xB TVApOwojZW5kaWYKI2lmZGVmCVRDUF9JTkZPCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudCht LCAiVENQX0lORk8iLCBUQ1BfSU5GTyk7CiNlbmRpZgojaWZkZWYJVENQX1FVSUNLQUNLCglQ 
eU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiVENQX1FVSUNLQUNLIiwgVENQX1FVSUNLQUNL KTsKI2VuZGlmCgoKCS8qIElQWCBvcHRpb25zICovCiNpZmRlZglJUFhfVFlQRQoJUHlNb2R1 bGVfQWRkSW50Q29uc3RhbnQobSwgIklQWF9UWVBFIiwgSVBYX1RZUEUpOwojZW5kaWYKCgkv KiBnZXR7YWRkcixuYW1lfWluZm8gcGFyYW1ldGVycyAqLwojaWZkZWYgRUFJX0FERFJGQU1J TFkKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJFQUlfQUREUkZBTUlMWSIsIEVBSV9B RERSRkFNSUxZKTsKI2VuZGlmCiNpZmRlZiBFQUlfQUdBSU4KCVB5TW9kdWxlX0FkZEludENv bnN0YW50KG0sICJFQUlfQUdBSU4iLCBFQUlfQUdBSU4pOwojZW5kaWYKI2lmZGVmIEVBSV9C QURGTEFHUwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIkVBSV9CQURGTEFHUyIsIEVB SV9CQURGTEFHUyk7CiNlbmRpZgojaWZkZWYgRUFJX0ZBSUwKCVB5TW9kdWxlX0FkZEludENv bnN0YW50KG0sICJFQUlfRkFJTCIsIEVBSV9GQUlMKTsKI2VuZGlmCiNpZmRlZiBFQUlfRkFN SUxZCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiRUFJX0ZBTUlMWSIsIEVBSV9GQU1J TFkpOwojZW5kaWYKI2lmZGVmIEVBSV9NRU1PUlkKCVB5TW9kdWxlX0FkZEludENvbnN0YW50 KG0sICJFQUlfTUVNT1JZIiwgRUFJX01FTU9SWSk7CiNlbmRpZgojaWZkZWYgRUFJX05PREFU QQoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIkVBSV9OT0RBVEEiLCBFQUlfTk9EQVRB KTsKI2VuZGlmCiNpZmRlZiBFQUlfTk9OQU1FCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudCht LCAiRUFJX05PTkFNRSIsIEVBSV9OT05BTUUpOwojZW5kaWYKI2lmZGVmIEVBSV9TRVJWSUNF CglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiRUFJX1NFUlZJQ0UiLCBFQUlfU0VSVklD RSk7CiNlbmRpZgojaWZkZWYgRUFJX1NPQ0tUWVBFCglQeU1vZHVsZV9BZGRJbnRDb25zdGFu dChtLCAiRUFJX1NPQ0tUWVBFIiwgRUFJX1NPQ0tUWVBFKTsKI2VuZGlmCiNpZmRlZiBFQUlf U1lTVEVNCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiRUFJX1NZU1RFTSIsIEVBSV9T WVNURU0pOwojZW5kaWYKI2lmZGVmIEVBSV9CQURISU5UUwoJUHlNb2R1bGVfQWRkSW50Q29u c3RhbnQobSwgIkVBSV9CQURISU5UUyIsIEVBSV9CQURISU5UUyk7CiNlbmRpZgojaWZkZWYg RUFJX1BST1RPQ09MCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiRUFJX1BST1RPQ09M IiwgRUFJX1BST1RPQ09MKTsKI2VuZGlmCiNpZmRlZiBFQUlfTUFYCglQeU1vZHVsZV9BZGRJ bnRDb25zdGFudChtLCAiRUFJX01BWCIsIEVBSV9NQVgpOwojZW5kaWYKI2lmZGVmIEFJX1BB U1NJVkUKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJBSV9QQVNTSVZFIiwgQUlfUEFT U0lWRSk7CiNlbmRpZgojaWZkZWYgQUlfQ0FOT05OQU1FCglQeU1vZHVsZV9BZGRJbnRDb25z 
dGFudChtLCAiQUlfQ0FOT05OQU1FIiwgQUlfQ0FOT05OQU1FKTsKI2VuZGlmCiNpZmRlZiBB SV9OVU1FUklDSE9TVAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIkFJX05VTUVSSUNI T1NUIiwgQUlfTlVNRVJJQ0hPU1QpOwojZW5kaWYKI2lmZGVmIEFJX01BU0sKCVB5TW9kdWxl X0FkZEludENvbnN0YW50KG0sICJBSV9NQVNLIiwgQUlfTUFTSyk7CiNlbmRpZgojaWZkZWYg QUlfQUxMCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiQUlfQUxMIiwgQUlfQUxMKTsK I2VuZGlmCiNpZmRlZiBBSV9WNE1BUFBFRF9DRkcKCVB5TW9kdWxlX0FkZEludENvbnN0YW50 KG0sICJBSV9WNE1BUFBFRF9DRkciLCBBSV9WNE1BUFBFRF9DRkcpOwojZW5kaWYKI2lmZGVm IEFJX0FERFJDT05GSUcKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJBSV9BRERSQ09O RklHIiwgQUlfQUREUkNPTkZJRyk7CiNlbmRpZgojaWZkZWYgQUlfVjRNQVBQRUQKCVB5TW9k dWxlX0FkZEludENvbnN0YW50KG0sICJBSV9WNE1BUFBFRCIsIEFJX1Y0TUFQUEVEKTsKI2Vu ZGlmCiNpZmRlZiBBSV9ERUZBVUxUCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiQUlf REVGQVVMVCIsIEFJX0RFRkFVTFQpOwojZW5kaWYKI2lmZGVmIE5JX01BWEhPU1QKCVB5TW9k dWxlX0FkZEludENvbnN0YW50KG0sICJOSV9NQVhIT1NUIiwgTklfTUFYSE9TVCk7CiNlbmRp ZgojaWZkZWYgTklfTUFYU0VSVgoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIk5JX01B WFNFUlYiLCBOSV9NQVhTRVJWKTsKI2VuZGlmCiNpZmRlZiBOSV9OT0ZRRE4KCVB5TW9kdWxl X0FkZEludENvbnN0YW50KG0sICJOSV9OT0ZRRE4iLCBOSV9OT0ZRRE4pOwojZW5kaWYKI2lm ZGVmIE5JX05VTUVSSUNIT1NUCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiTklfTlVN RVJJQ0hPU1QiLCBOSV9OVU1FUklDSE9TVCk7CiNlbmRpZgojaWZkZWYgTklfTkFNRVJFUUQK CVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJOSV9OQU1FUkVRRCIsIE5JX05BTUVSRVFE KTsKI2VuZGlmCiNpZmRlZiBOSV9OVU1FUklDU0VSVgoJUHlNb2R1bGVfQWRkSW50Q29uc3Rh bnQobSwgIk5JX05VTUVSSUNTRVJWIiwgTklfTlVNRVJJQ1NFUlYpOwojZW5kaWYKI2lmZGVm IE5JX0RHUkFNCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiTklfREdSQU0iLCBOSV9E R1JBTSk7CiNlbmRpZgoKCS8qIEluaXRpYWxpemUgZ2V0aG9zdGJ5bmFtZSBsb2NrICovCiNp ZmRlZiBVU0VfR0VUSE9TVEJZTkFNRV9MT0NLCglnZXRob3N0YnluYW1lX2xvY2sgPSBQeVRo cmVhZF9hbGxvY2F0ZV9sb2NrKCk7CiNlbmRpZgp9CgoKI2lmbmRlZiBIQVZFX0lORVRfUFRP TgoKLyogU2ltcGxpc3RpYyBlbXVsYXRpb24gY29kZSBmb3IgaW5ldF9wdG9uIHRoYXQgb25s eSB3b3JrcyBmb3IgSVB2NCAqLwoKaW50CmluZXRfcHRvbihpbnQgYWYsIGNvbnN0IGNoYXIg 
KnNyYywgdm9pZCAqZHN0KQp7CglpZiAoYWYgPT0gQUZfSU5FVCkgewoJCWxvbmcgcGFja2Vk X2FkZHI7CgkJcGFja2VkX2FkZHIgPSBpbmV0X2FkZHIoc3JjKTsKCQlpZiAocGFja2VkX2Fk ZHIgPT0gSU5BRERSX05PTkUpCgkJCXJldHVybiAwOwoJCW1lbWNweShkc3QsICZwYWNrZWRf YWRkciwgNCk7CgkJcmV0dXJuIDE7Cgl9CgkvKiBTaG91bGQgc2V0IGVycm5vIHRvIEVBRk5P U1VQUE9SVCAqLwoJcmV0dXJuIC0xOwp9Cgpjb25zdCBjaGFyICoKaW5ldF9udG9wKGludCBh ZiwgY29uc3Qgdm9pZCAqc3JjLCBjaGFyICpkc3QsIHNvY2tsZW5fdCBzaXplKQp7CglpZiAo YWYgPT0gQUZfSU5FVCkgewoJCXN0cnVjdCBpbl9hZGRyIHBhY2tlZF9hZGRyOwoJCWlmIChz aXplIDwgMTYpCgkJCS8qIFNob3VsZCBzZXQgZXJybm8gdG8gRU5PU1BDLiAqLwoJCQlyZXR1 cm4gTlVMTDsKCQltZW1jcHkoJnBhY2tlZF9hZGRyLCBzcmMsIHNpemVvZihwYWNrZWRfYWRk cikpOwoJCXJldHVybiBzdHJuY3B5KGRzdCwgaW5ldF9udG9hKHBhY2tlZF9hZGRyKSwgc2l6 ZSk7Cgl9CgkvKiBTaG91bGQgc2V0IGVycm5vIHRvIEVBRk5PU1VQUE9SVCAqLwoJcmV0dXJu IE5VTEw7Cn0KCiNlbmRpZgo= --------------67BDF2B7F2453AAB6129FCFC-- From drifty@bigfoot.com Sun Jul 14 20:57:31 2002 From: drifty@bigfoot.com (Brett Cannon) Date: Sun, 14 Jul 2002 12:57:31 -0700 (PDT) Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <20020714110513.GA2280@hishome.net> Message-ID: [Oren Tirosh] > On Sun, Jul 14, 2002 at 01:16:10AM -0700, Brett Cannon wrote: > > [Oren Tirosh] > > This anthropomorphic description has too many irrelevant associations. Let's > leave the parents out of this. =) OK. > The logic is simple: StopIteration is not an error. It's not even a warning, > it's a normal part of program operation. It uses the exception mechanism > because it is the most convenient form of out-of-band signalling. The > hypothetical IteratorExhausted is an error. The fact that both of them > happen to be exceptions is almost a coincidence. > It still doesn't "feel" right. I completely understand this explanation for what StopIteration is supposed to be viewed as, but I just don't naturally view it that way. And it isn't because it is an exception. 
I am sure all of us have used exceptions to break out of some deeply
nested loops or pull off some other fancy control flow.  I think my view
stems from what it is saying: "I have reached what I believe to be the end
or what you have requested to be the end".  I don't think that should be
some notice.

> > definitely will be.  I know I thought that StopIteration was continuously
> > raised until the emails on this subject started.
>
> For most Python iterators it is. This behavior is OK but it could be changed

And if most are already like that, then maybe it should be the default
behavior.  Unless my understanding is faulty, two-arg iterators are a
convenience to make an iterator out of a callable function by specifying
where the iterator will raise StopIteration.  My view is that if you don't
want to stop where the iterator says you said to signal, then an explicit
'if' would be better.  I mean you don't see iterators on lists raising
IndexError if you keep calling .next().

I am obviously +1 on forcing StopIteration to be permanently raised.  If
you need to go beyond the signalled end, you can do it the old-fashioned
way, as we did before we had iterators.  I say make iterators so that they
have the least chance of causing errors.  They are supposed to simplify our
lives, not cause us to have a new possible bug to watch out for in code.

Anyway, since I am not about to come up with some clever code chunk that
will show why the current state of affairs is bad beyond my opinion, I will
leave it up to Guido to make a choice.  At least, based on the tone of
these emails on this topic, this is not going to be a decision that is
going to ruffle any feathers.

-Brett C.

From tim.one@comcast.net Sun Jul 14 21:42:13 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 14 Jul 2002 16:42:13 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <15665.37242.446627.141013@anthem.wooz.org>
Message-ID:

[Barry]
> I think it would be fine to leave the situation as is
> (i.e. undefined).
As is, the PEP makes promises the implementation doesn't keep, so that part
isn't fine.  If we don't want to change the implementation(s), then that
part of the PEP should be changed to, e.g.,

    Resolution:  once StopIteration is raised, the effect of calling
    it.next() again isn't defined by the iteration protocol.  Code
    manipulating arbitrary iterators must therefore not rely on any
    particular behavior in this case.  For example, a given iterator
    may choose to raise StopIteration again, raise some other exception,
    return a value, play the theme music for Monty Python's Flying
    Circus, or decline to define the effect.

That's a start.  Then another round of decisions needs to be made for each
iterator Python supplies:  should it define the effect or not, and if so
what is it, or if not should it be explicit about not defining it?
Generators already do:

    If an unhandled exception-- including, but not limited to,
    StopIteration --is raised by, or passes through, a generator
    function, then the exception is passed on to the caller in the
    usual way, and subsequent attempts to resume the generator function
    raise StopIteration.

The current docs for two-argument iter() also tell the truth about what
happens.  I'm not sure anything else does, unless we take the absence of
docs as implying that an iterator explicitly refuses to define what
happens.  So from Fred's POV, it would be easier to change the
implementations to match the current PEP wording.

I'll note one pragmatic concern.  This idiom is becoming mildly popular:

    for x in someiterator:
        if is_boundary_marker(x):
            break
        else:
            do_something_with(x)

followed by (in time, not necessarily in a physically distinct loop):

    for x in someiterator:
        # and we expect this to pick up where the last loop left off

If StopIteration isn't a sink state, this falls under the "code
manipulating arbitrary iterators must therefore not rely on any particular
behavior in this case" warning in the reworded docs.
That is, if the first loop terminated via iterator exhaustion, the obvious
intent is that the second loop never enter its body.  This is reliably true
if and only if StopIteration is guaranteed to be a sink state.  The more I
ponder that, the more I'm inclined to believe that the PEP made the right
decision the first time:  guaranteeing *something* makes it possible to
write a larger class of generic code.

From mal@lemburg.com Sun Jul 14 21:44:26 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 14 Jul 2002 22:44:26 +0200
Subject: [Python-Dev] python package
References: <3D319681.9530.100DED6A@localhost>
Message-ID: <3D31E2AA.4030804@lemburg.com>

Gordon McMillan wrote:
>>You haven't commented on the sys.modules trick yet.
>>This one doesn't even use the __path__ hackery :-)
>>
>>DateTime.py:
>>import sys
>>import mx.DateTime
>>sys.modules[__name__] = mx.DateTime
>
> [...]
>
>>See: it's the same module :-)
>
>
> Anytime x != sys.modules[x].__name__,
> someone, sometime will suffer.
>
> Installer and (I believe) py2exe have hooks
> so that this gets analyzed properly. The hook
> is keyed by "DateTime".
>
> If you really find it intolerable to stick your
> users with making a one line change in their
> code, you might consider contributing hooks
> to Installer (or patches to py2exe).

I don't. I'm just using my package series as an example of
how moving a set of top-level modules/packages to a single
package can be accomplished. That's all.

I told my users to upgrade their applications from 1.x to
2.0 by switching from 'import DateTime' to 'from mx import DateTime'
when I made the move and indeed, only one user complained --
which is why I provided him with a backwards compatibility
package along the lines of what I've posted here. He only needed
it to be able to read back pickled data, BTW.

> Particularly for your non-free packages, since
> I'm not going to download those and reverse-engineer
> them.

Hmm, I don't understand this comment.
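The alias trick quoted above can be tried without the mx packages. What
follows is only a minimal sketch in present-day Python, not code from this
thread: the names mx_demo.DateTime and DateTime_demo are made up for
illustration, and an in-memory module stands in for the real mx.DateTime.
It shows that the old name resolves to the very same module object, and
also that the module's __name__ no longer matches the name it was imported
under, which is the mismatch Gordon warns about.

```python
import sys
import types

# Hypothetical stand-in for the relocated module (plays the role of mx.DateTime).
real = types.ModuleType("mx_demo.DateTime")
real.EPOCH_YEAR = 1970
sys.modules["mx_demo.DateTime"] = real

# The one-line shim: bind the old top-level name to the same module object.
sys.modules["DateTime_demo"] = real

# The import system consults sys.modules first, so no file lookup happens.
import DateTime_demo

same_object = DateTime_demo is real
imported_name = DateTime_demo.__name__  # still the new, dotted name
```

Because imported_name is the dotted name rather than "DateTime_demo", any
tool that maps import names to module names has to special-case the alias.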
> Or perhaps you could do like Pmw, and
> include a "bundle" script.

py2exe works just fine with the mx stuff. I suppose your
installer does too.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/

From mal@lemburg.com Sun Jul 14 22:09:12 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 14 Jul 2002 23:09:12 +0200
Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <018901c22aa4$e0f41190$ced241d5@hagrid> <3D31A50E.4050800@lemburg.com> <3D31AE95.6070804@lemburg.com>
Message-ID: <3D31E878.3020304@lemburg.com>

Martin v. Loewis wrote:
> "M.-A. Lemburg" writes:
>>Of course, we no longer need to convert the tokenizer to
>>work on Py_UNICODE, so the updated text should mention
>>that compile() encodes Unicode input to UTF-8 to then continue
>>with the usual processing.
>
>
> The PEP currently does not say that.

I know, it should be updated to the solution found by Hisao.

>>>2. convert to byte string using "utf-8" encoding,
>>
> [...]
>
>>Option 2.
>
>
> I think this contradicts the current wording of the PEP. It says
>
> "5. ... and creating string objects from the Unicode literal data by
> first reencoding the UTF-8 data into 8-bit string data using the given
> file encoding"
>
> The phrasing "the given file encoding" is a bit lax, but given the
> string
>
> u"""
> # -*- coding: iso-8859-1 -*-
> s = 'some latin-1 text'
> """
>
> I would expect that the encoding "given" is iso-8859-1, not utf-8.
> Now, I interpret your message to mean that s will be encoded in
> utf-8. Correct?

Hmm, good point. 8-bit string literals will have to be reencoded
using the encoding stated in the coding comment...
skipping that comment for Unicode argument to compile() would break this.

> If so, I think Fredrik is right, and
>
> compile(unicode(script, extract_encoding(script)))
>
> does indeed something different than
>
> compile(script)
>
> as the latter would give the string value assigned to s in its
> original encoding, i.e. latin-1.

Right. We don't want that.

    compile(unicode(script, extract_encoding(script)))

should be the same as

    compile(script)

>>Ideal would be to have the tokenizer skip the encoding declaration
>>detection and start directly with the UTF-8 string
>
>
> "skip the encoding declaration" can't really work; you have to parse
> the source code line by line. You can tell the implementation to
> ignore the encoding declaration, if desired.

No, this wouldn't be right. I withdraw that comment :-)

>>(this also solves the problems you'd run into in case the Unicode
>>source code has a source code encoding comment).
>
>
> Well, that is precisely the issue that I'm trying to address here. I
> still believe that the resulting behaviour is not specified in the PEP
> at the moment (which is no big deal, since the current implementation
> does not touch compile() at all).

I'll try to come up with a proper wording tomorrow.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/

From gward@python.net Sun Jul 14 22:26:20 2002
From: gward@python.net (Greg Ward)
Date: Sun, 14 Jul 2002 17:26:20 -0400
Subject: [Python-Dev] PEP 11: unsupported platforms
In-Reply-To:
References:
Message-ID: <20020714212620.GA3192@cthulhu.gerg.ca>

On 13 July 2002, Martin v.
Loewis said:
> PEP: 11
> Title: Unsupported Platforms

The only feedback I have is to consider changing the name to "Removing
Support for Obsolete Platforms", since that's what most of the PEP is
about.  However, since it also includes a list of those obsolete
platforms, your title is not without merit.

        Greg
--
Greg Ward - Linux nerd                                  gward@python.net
http://starship.python.net/~gward/
I love ROCK 'N ROLL!  I memorized all the WORDS to "WIPE-OUT" in 1965!!

From guido@python.org Sun Jul 14 23:19:36 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 14 Jul 2002 18:19:36 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Sun, 14 Jul 2002 16:42:13 EDT."
References:
Message-ID: <200207142219.g6EMJaJ28788@pcp02138704pcs.reston01.va.comcast.net>

> I'll note one pragmatic concern.  This idiom is becoming mildly popular:
>
>     for x in someiterator:
>         if is_boundary_marker(x):
>             break
>         else:
>             do_something_with(x)
>
> followed by (in time, not necessarily in a physically distinct loop):
>
>     for x in someiterator:
>         # and we expect this to pick up where the last loop left off
>
> If StopIteration isn't a sink state, this falls under the "code
> manipulating arbitrary iterators must therefore not rely on any
> particular behavior in this case" warning in the reworded docs.
> That is, if the first loop terminated via iterator exhaustion, the
> obvious intent is that the second loop never enter its body.  This
> is reliably true if and only if StopIteration is guaranteed to be a
> sink state.  The more I ponder that, the more I'm inclined to
> believe that the PEP made the right decision the first time:
> guaranteeing *something* makes it possible to write a larger class
> of generic code.

But if you fall through the end of the first loop, i.e. you exhaust
the iterator prematurely, you should do something else in your logic.
An else clause on the for loop might be a good place to do something
appropriate.
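The else-clause suggestion can be sketched as follows. This is only an
illustration in present-day Python under stated assumptions: the helper
name split_at_marker and its is_marker parameter are invented here and do
not appear in the thread. The else branch of the first loop runs only when
the iterator was exhausted without hitting a boundary, so the code never
touches the iterator again after StopIteration.

```python
def split_at_marker(items, is_marker):
    """Return (head, tail) around the first marker; tail is None if no marker."""
    it = iter(items)
    head = []
    for x in it:
        if is_marker(x):
            break
        head.append(x)
    else:
        # The loop ran off the end: no boundary was found, so we do not
        # resume iteration and do not depend on post-exhaustion behavior.
        return head, None
    # Resume the same iterator where the first loop left off.
    tail = list(it)
    return head, tail
```

With this shape the second pass is entered only after an explicit break,
at the cost of one extra branch per loop that may hit end of input.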
I haven't used this idiom often enough to know whether that places an
undue burden on the programmer.  I think the reported cases fall
mostly in the category "I didn't know it could do that and it took me
a long time to track it down."

I also note that even if the PEP specifies that StopIteration is a
sink state and we fix all built-in iterators to make it so, it's easy
for an iterator implementation to do the wrong thing (especially since
often an extra state bit is necessary to implement the sinkstate
property).

The question is, should we place the burden on iterator users to avoid
calling next() after the first StopIteration, or should we place the
burden on iterator implementations?

Since by far the most common iterator use case is still a single for
loop, which already does the right thing, it's not at all clear to me
which is worse.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tdelaney@avaya.com Sun Jul 14 23:46:01 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Mon, 15 Jul 2002 08:46:01 +1000
Subject: [Python-Dev] Single- vs. Multi-pass iterability
Message-ID:

> From: David Abrahams [mailto:david.abrahams@rcn.com]
>
> One other possibility: if x.__iter__() is x, it's a
> single-pass sequence. I
> realize this involves actually invoking the __iter__ method
> and conjuring
> up a new iterator, but that's generally a lightweight operation...

Definitely not reliable - it will fail for a file object ... (even with the
changes currently going in).

What would be more reliable (but still not infallible) would be:

    if iter(x) == iter(x):
        # this is *definitely* a single-pass iterable

All iterators are by definition single-pass iterables, and with the changes
being made to the file object, the above code would work for all builtin
iterable types as well.

Tim Delaney

From tdelaney@avaya.com Sun Jul 14 23:51:36 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Mon, 15 Jul 2002 08:51:36 +1000
Subject: [Python-Dev] Single- vs. Multi-pass iterability
Message-ID:

> From: Delaney, Timothy [mailto:tdelaney@avaya.com]
>
> if iter(x) == iter(x):
>     # this is *definitely* a single-pass iterable

Of course, that should have been

    if iter(x) is iter(x):
        # this is *definitely* a single-pass iterable

Too much damned Java at the moment :(

Tim Delaney

From tim.one@comcast.net Mon Jul 15 01:03:33 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 14 Jul 2002 20:03:33 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207142219.g6EMJaJ28788@pcp02138704pcs.reston01.va.comcast.net>
Message-ID:

[Guido]
> But if you fall through the end of the first loop, i.e. you exhaust
> the iterator prematurely, you should do something else in your logic.

I'm not clear on why falling through should be considered premature
termination of the iterator.  If you're looking for a boundary, it may be
normal for it not to be there.  For example, here's something that
suppresses #if 0 blocks, copying everything else to stdout; there's really
nothing special about an input file that doesn't have any #if 0 blocks.

"""
f = file("some_file")
get = iter(f.readline, "")
depth = 0
while True:
    # Copy lines until #if 0.
    for line in get:
        if line == "#if 0\n":
            depth = 1
            break
        else:
            print line,

    # Ignore lines through matching #endif.
    for line in get:
        if line.startswith("#if "):
            depth += 1
        elif line == "#endif\n":
            depth -= 1
            if depth == 0:
                break
    else:
        break

if depth:
    raise SyntaxError("%d unclosed #if blocks" % depth)
"""

This is quite natural -- even elegant.

> An else clause on the for loop might be a good place to do something
> appropriate.

It is, but doing it on more than one of the loops is clutter provided that
StopIteration is sticky (if it is, either loop can detect EOF, and there's
no need for both to).

> I haven't used this idiom often enough to know whether that places an
> undue burden on the programmer.
> I think the reported cases fall
> mostly in the category "I didn't know it could do that and it took me
> a long time to track it down."

If you can't guess what .next() might do after raising StopIteration the
first time, that can't make things easier to track down.

> I also note that even if the PEP specifies that StopIteration is a
> sink state and we fix all built-in iterators to make it so, it's easy
> for an iterator implementation to do the wrong thing (especially since
> often an extra state bit is necessary to implement the sinkstate
> property).

I agree, although if the docs are clear about the requirement, it's not
beyond ordinary skill to implement it.

> The question is, should we place the burden on iterator users to avoid
> calling next() after the first StopIteration, or should we place the
> burden on iterator implementations?

I don't think that's the real choice.  If it's left undefined by the
protocol, then some iterators will be deliberately defined to "do something
useful" if called after their first StopIteration.  Then the burden isn't
on the user to avoid it, but to keep track of which iterators do and don't
"do something useful" after they said they stopped.

> Since by far the most common iterator use case is still a single for
> loop, which already does the right thing, it's not at all clear to me
> which is worse.

Well, there are more users of iterators than implementers.  Or if there
aren't, we screwed up.

From greg@cosc.canterbury.ac.nz Mon Jul 15 01:45:11 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 15 Jul 2002 12:45:11 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <20020712171626.A2253@hishome.net>
Message-ID: <200207150045.g6F0jBb11587@oma.cosc.canterbury.ac.nz>

Oren Tirosh:

> A file object is an iterator pretending to be a container. For historical
> reasons it uses 'readline' instead of 'next'

I think it's more complicated than that.
If the file object were to become an object obeying the iterator
protocol, its next() method should really return the next *byte* of
the file. Then you'd still want methods like read(), readline()
etc. for reading in larger chunks.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From greg@cosc.canterbury.ac.nz Mon Jul 15 02:40:41 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 15 Jul 2002 13:40:41 +1200 (NZST)
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <20020714110513.GA2280@hishome.net>
Message-ID: <200207150140.g6F1efK11783@oma.cosc.canterbury.ac.nz>

Oren Tirosh:

> The hypothetical IteratorExhausted is an error.

Calling it IteratorExhaustedError would make this clearer.

But I'm not sure it would be a good idea to complexify
the iterator protocol any more than absolutely necessary,
and thus place an extra burden on all iterator implementors.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From greg@cosc.canterbury.ac.nz Mon Jul 15 02:51:56 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 15 Jul 2002 13:51:56 +1200 (NZST)
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207140042.g6E0gEp19165@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207150151.g6F1puv11830@oma.cosc.canterbury.ac.nz>

> What about this example?
> >>> l = []
> >>> li = iter(l)
> >>> li.next()
> Traceback (most recent call last):
>   File "", line 1, in ?
> StopIteration
> >>> l.extend([1, 2, 3])
> >>> li.next()
> 1
>
> does the list iterator violate the proposed behavior?
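The "extra state bit" mentioned elsewhere in this thread can be made
concrete with a small wrapper. The sketch below is written for present-day
Python (which spells the method __next__ and the call next(), where the
2.2 session above uses .next()); the class names Sticky and Resumable are
invented for illustration and are not part of any library. Resumable
imitates the quoted list-iterator behavior, and Sticky forces StopIteration
to be a sink state around any iterator.

```python
class Resumable:
    """Deliberately non-sticky: stops at the current end of the list,
    but yields again if items are appended later."""

    def __init__(self, items):
        self.items = items
        self.pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.pos >= len(self.items):
            raise StopIteration
        value = self.items[self.pos]
        self.pos += 1
        return value


class Sticky:
    """Wrap an iterator so that, once exhausted, it stays exhausted."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._done = False  # the extra state bit

    def __iter__(self):
        return self

    def __next__(self):
        if self._done:
            raise StopIteration
        try:
            return next(self._it)
        except StopIteration:
            self._done = True
            raise


data = []
loose = Resumable(data)
strict = Sticky(Resumable(data))

# Both report exhaustion on the empty list.
first_loose = next(loose, "stopped")
first_strict = next(strict, "stopped")

data.extend([1, 2, 3])

# After the list grows, only the unwrapped iterator resumes.
second_loose = next(loose, "stopped")
second_strict = next(strict, "stopped")
```

The wrapper costs one boolean and one extra check per call, which gives a
rough idea of the burden on implementers that the thread is weighing.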
Perhaps the docs should say something like "The next() method
raises StopIteration if there are no more items remaining in
the sequence at the time of the call."

This would both imply the repeated raising of StopIteration
in the case where the sequence hasn't been modified in the
meantime, and also allow the above behaviour (which seems
entirely logical, to my way of thinking).

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From greg@cosc.canterbury.ac.nz Mon Jul 15 02:05:10 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 15 Jul 2002 13:05:10 +1200 (NZST)
Subject: [Python-Dev] python package
In-Reply-To: <200207121842.g6CIgQo13399@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207150105.g6F15Au11634@oma.cosc.canterbury.ac.nz>

Guido:

> [Michael]
> > I've read the entire thread and still do not understand why you are
> > suggesting the new standard package hierarchy should be named
> > "new".
>
> Uh?  Who is proposing to name it "new"?

Maybe he's getting it mixed up with the thread about
the "new" module?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From greg@cosc.canterbury.ac.nz Mon Jul 15 02:12:41 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 15 Jul 2002 13:12:41 +1200 (NZST)
Subject: Further suggestion (RE: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting)
In-Reply-To:
Message-ID: <200207150112.g6F1Cfv11646@oma.cosc.canterbury.ac.nz>

Tim:

> [Aahz]
> > I've used "%20s" * 5 frequently enough in the past to do crude tables.
> > That's not a feature I'd like to lose.
>
> So has Guido -- he'll remember that before it's too late.  Ditto "-"
> to switch string justification.

Addendum to my suggestion: The "{...}" plays the
role of the "s" in a normal string format, so that
you can do

   %-20{foo}

etc.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From greg@cosc.canterbury.ac.nz Mon Jul 15 01:48:44 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 15 Jul 2002 12:48:44 +1200 (NZST)
Subject: Suggestion for fixing %(foo)s (Re: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting)
In-Reply-To: <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207150048.g6F0mi111593@oma.cosc.canterbury.ac.nz>

Guido:

> Somebody else:
> > Guido, can you please, for our enlightenment, tell us what are the
> > reasons you feel %(foo)s was a mistake?
>
> Because of the trailing 's'.  It's very easy to leave it out by
> mistake

How about introducing a new format

   %{foo}

which is defined to be the same as %(foo)s.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From greg@cosc.canterbury.ac.nz Mon Jul 15 02:00:03 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 15 Jul 2002 13:00:03 +1200 (NZST)
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: <20020712194151.A6406@hishome.net>
Message-ID: <200207150100.g6F103f11614@oma.cosc.canterbury.ac.nz>

Oren Tirosh:

> The primary objection was that the documentation for the types module
> says that names exported by future versions will all end in "Type".

Suggestion: Introduce a new module called "newtypes".

You can interpret this name two ways: the module
containing all the new names for the types, and
the module you use when you want to create a
new instance of a type!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From barry@zope.com Mon Jul 15 03:41:16 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sun, 14 Jul 2002 22:41:16 -0400
Subject: [Python-Dev] Termination of two-arg iter()
References: <20020714023729.Y79323-100000@mail.allcaps.org> <20020714112745.GB2280@hishome.net> <200207141320.g6EDKpJ27752@pcp02138704pcs.reston01.va.comcast.net> <15665.37242.446627.141013@anthem.wooz.org> <20020714160611.GA25950@hishome.net>
Message-ID: <15666.13900.581653.909094@anthem.wooz.org>

The problem that Jeff Epler brought up (extending the list after
StopIteration was raised, and having a subsequent .next() not give
StopIteration) has a precedent in dict iterators:

-------------------- snip snip --------------------
>>> d = {1:2, 3:4}
>>> it = iter(d)
>>> for x in d: print x
...
1
3
>>> d[5] = 6
>>> it.next()
Traceback (most recent call last):
  File "", line 1, in ?
RuntimeError: dictionary changed size during iteration -------------------- snip snip -------------------- So why doesn't that last .next() also return StopIterator? . StopIterator is a sink state for dict iterators if I don't change the size of the dict. Shouldn't list and dict iterators should behave similarly for mutation (or at least resizing) between .next() calls? -Barry From cce@clarkevans.com Mon Jul 15 05:06:51 2002 From: cce@clarkevans.com (Clark C . Evans) Date: Mon, 15 Jul 2002 00:06:51 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: ; from aleax@aleax.it on Fri, Jul 12, 2002 at 06:16:54PM +0200 References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <15662.65488.741894.155099@anthem.wooz.org> Message-ID: <20020715000651.A35319@doublegemini.com> On Fri, Jul 12, 2002 at 06:16:54PM +0200, Alex Martelli wrote: | On Friday 12 July 2002 06:12 pm, Barry A. Warsaw wrote: | > >>>>> "AM" == Alex Martelli writes: | > >> I think Alex is in a great position to become co-author of PEP | > >> 246. | > | > AM> Aye aye, cap'n. What's the procedure for "becoming co-author" | > AM> -- edit python/nondist/peps/pep-0246.txt and send the cvs diff | > AM> to Barry, or ... ? | > | > That would work fine, although I would like to get /some/ | > acknowledgement from Clark Evans that passing the torch (or sharing | > the flame as it were) was okay with him. | | Makes sense (& thanks to the others who suggested the same thing). | I mailed Clark and I'll wait to hear from him. Wow! I'm thrilled to hear that this PEP hasn't died of neglect. When I wrote it I was relatively new to Python. Python makes you think differently about a great many things. What it means to be a particular "type" is one of those mind-bending experiences I had. If it looks like a file, acts like a file, it's a file. 
I love this straight-forward mentality and this clarity of thought which carries through all of Python makes it a true pleasure to code with. This PEP was written just by listening to people (and getting their feedback) on the interfaces list. It just seemed to me that people wanted a way to ask Python: Hey is this object a Thymagig? Although this is a nice question to ask; as a programmer with 10+ years of building components, and using other's components to build larger applications I often ask a similar but related question: Well, if it ain't a Thymagig, where is the wrapper so I can treat it like one? It is this second question that the Object Adaptation PEP is based. To me, this is the stuff FAQ's are made for ... and I wonder, why can't the language do this for me? Why not have a language where the library writers (who usually know each other) can't build in the glue to connect their components in such a way that the application builder doesn't have to do the "interface hunt". Speaking of which, I personally don't feel that interfaces is the way to go... there are many reasons why I'm using Python and not using Java. Interfaces are too inflexible and often times can cause more headaches than they save with additional typing. Frankly, I think that the whole "interface paradaigm" brings with it alot of extra baggage to the "Is this object a Thymagig?" question; and I think this extra baggage is just not needed -- especially for Python. For example, interface inheritance is one of those bits of baggage (that others may disagree with me on). Interface inheritance is one of those "givens" that one must do to do interfaces right. Interface inheritance isn't needed. Why? Mix-ins are far more powerful mechanism as they make you think about operations which are othogonal to each other. You think that interface inheritance helps, but in my experience it just screws with your thought process... 
;) Anyway, I'm so glad that Alex has taken up the cause; I'm not all that actively involved in Python internals... but as a user I can't advocate more for something like this. Alex, I'm delighted if you would take ownership of the PEP. ON A RELATED NOTE, if you have not otherwise found out, YAML (YAML Ain't Markup Language) is doing wonderfully. It is a pythonish serialization format for native data structures of Python/Perl/Ruby/Etc. You should check it out... http://yaml.org ; we will have a last call for our working draft on Sept 1st. YAML is progressing nicely and feedback from the core Python team would be wonderful. There is a pure Python implementation written by Steve Howell and a "C" library written by Neil Watkiss with python glue written by yours-truly. The "C" library is still private for another few weeks, but the pure Python one is available now as a work-in-progress. The specification itself is very near the finish line. Kind Regards, Clark Yo! Try YAML on fer size. YAML is serialization for the masses. -- Clark C. Evans Axista, Inc. http://www.axista.com 800.926.5525 XCOLLA Collaborative Project Management Software From tim.one@comcast.net Mon Jul 15 05:22:19 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 15 Jul 2002 00:22:19 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <15666.13900.581653.909094@anthem.wooz.org> Message-ID: [Barry] > The problem that Jeff Epler brought up (extending the list after > StopIterator was returned, and having a subsequent .next() not give > StopIterator) has a precedence in dict iterators: > > -------------------- snip snip -------------------- > >>> d = {1:2, 3:4} > >>> it = iter(d) > >>> for x in d: print x > ... > 1 > 3 > >>> d[5] = 6 > >>> it.next() > Traceback (most recent call last): > File "", line 1, in ? > RuntimeError: dictionary changed size during iteration > -------------------- snip snip -------------------- > > So why doesn't that last .next() also return StopIterator? . 
According to the PEP as it exists, it should. > StopIterator is a sink state for dict iterators if I don't change the > size of the dict. That's an illusion . See below for why. > Shouldn't list and dict iterators should behave similarly for > mutation (or at least resizing) between .next() calls? Within a single for-loop, list iterators are constrained to be compatible with their previous implementation via the __getitem__ protocol. So, for example, this must work: >>> x = [1] >>> for y in x: ... print y ... x.append(y+1) ... if y == 5: ... break ... 1 2 3 4 5 >>> because that's the way "for elt in list" has always worked. Nothing about that violates what the PEP says, though (in particular, StopIteration isn't an issue there, as it's never raised). It's too difficult to do something similar for dict iterators, and that's why they raise an exception if they detect a size change. However, they *really* want to raise an exception if the dict mutates, but that's also too hard to do -- checking for a size change is a cheap & easy but vulnerable approximation. Ponder the output from this on a run, and then across several runs: from random import random for limit in range(1, 100): d = {} for i in range(limit): d[random()] = 1 i = d.iterkeys() x = list(i) # exhausts the iterator d.popitem() d[random()] = 1 # probably mutated, but # of elements is the same try: print i.next() # tries poking the iterator again print limit, list(i) except StopIteration: pass You'll find that this *usually* raises StopIteration on the lone i.next() call (and you don't get output in those cases). However, for *some* list sizes, it's quite likely that poking the iterator again not only produces another value, but that it can produce several more values. There's no predicting how many, which or when, though. It's a bit of a stretch to call that "a feature" too . 
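A minimal sketch of the "sticky StopIteration" question discussed above, written for later Python versions (`__next__` rather than the `.next()` method of 2.2). Both class names are hypothetical and are not taken from the thread or from the core: `RestartableListIter` stands in for an iterator that re-checks its sequence on every call, and `Sticky` is one possible wrapper that turns exhaustion into a sink state.

```python
class RestartableListIter:
    """Hypothetical non-sticky iterator: re-checks the list length on each call."""

    def __init__(self, seq):
        self._seq = seq
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._seq):
            raise StopIteration
        value = self._seq[self._pos]
        self._pos += 1
        return value


class Sticky:
    """Hypothetical wrapper: once StopIteration is seen, always raise it again."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._done = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._done:
            raise StopIteration
        try:
            return next(self._it)
        except StopIteration:
            self._done = True  # remember exhaustion: this is the sink state
            raise


def demo():
    data = [1, 2]
    loose = RestartableListIter(data)
    sticky = Sticky(RestartableListIter(data))
    first_loose = list(loose)    # exhausts the non-sticky iterator
    first_sticky = list(sticky)  # exhausts the wrapped iterator
    data.append(3)               # extend the list after exhaustion
    # The non-sticky iterator "wakes up" and yields the new item;
    # the wrapped one stays exhausted.
    return first_loose, list(loose), first_sticky, list(sticky)
```

Wrapping is only one way to get the sink state; an iterator could equally well drop its reference to the underlying sequence once it is exhausted.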
From tim.one@comcast.net Mon Jul 15 05:54:17 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 15 Jul 2002 00:54:17 -0400 Subject: [Python-Dev] AtExit Functions In-Reply-To: <3D2AFA97.2030402@lemburg.com> Message-ID: [Guido] >> I think you may be making a wrong use of Py_AtExit(). The docs state >> (since 1998): >> >> Since Python's internal finallization will have completed before the >> cleanup function, no Python APIs should be called by *func*. [Marc] > Hmm, and that includes Py_DECREF() and PyObject_Del() ? Certainly. In particular, Py_DECREF() can end up calling any Python code at all, via __del__ methods. > In that case, I have a problem since I'm using those > two to clean up caches and free lists in the mx tools. We have two sets of exit-function gimmicks, one that runs at the very start of Py_Finalize(), and the other at the very end. If you need to clean up Python objects, you have to get into the first group. The interpreter has been torn down beyond usefulness by the time we get to the second group (that's only useful for low-level OS and external non-Python C library cleanup). >> You may want to use the atexit.py module instead to schedule your >> module's cleanup action; these exit functions are called much earlier. > That's difficult to get right since I have to register such a > function from C. ? You know how to write Python-callable C functions. I'm not sure why you would need to call atexit.register from C, but if you must then that's easy too (PyObject_Call). > Also, atexit.py is not present in Python 1.5.2. What's that ? 
From oren-py-d@hishome.net Mon Jul 15 06:27:09 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 15 Jul 2002 01:27:09 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <200207150140.g6F1efK11783@oma.cosc.canterbury.ac.nz> References: <20020714110513.GA2280@hishome.net> <200207150140.g6F1efK11783@oma.cosc.canterbury.ac.nz> Message-ID: <20020715052709.GA13426@hishome.net> On Mon, Jul 15, 2002 at 01:40:41PM +1200, Greg Ewing wrote: > Oren Tirosh : > > > The hypothetical IteratorExhausted is an error. > > Calling it IteratorExhaustedError would make this clearer. > > But I'm not sure it would be a good idea to complexify > the iterator protocol any more than absolutely necessary, > and thus place an extra burden on all iterator implementors. Let's look at the options: (are there any I forgot?) 1. Define StopIteration as a sticky state. People will write code that relies on this behavior. The code will sometimes fail when run on 2.2.x or with certain existing user iterators. It's probably the worst possible combination: you have to implement this in your iterators but you can't rely on it in code that may run on 2.2 or get iterators from libraries written before this was made into a requirement. 2. Leave things the way they are. Since *almost* all builtin iterators behave this way people will continue to write code that relies on this. It will silently fail for some builtin iterators and user iterators. 3. Silently fix all iterators to be in a StopIteration sink state. Even worse than #2. It looks like version 2.2 is going to live a long time. This will cause subtle and hard-to-find differences in behavior between 2.2 and 2.3. 4. Require iterators to raise an exception. Places an extra burden on all iterator implementors. A lot of existing code will suddenly be redefined as not kosher. 5. Leave it officially undefined but raise an exception for all or even some builtin iterators. 
Raising an exception for even one popular type (listiter) would be more than enough to discourage code that relies on this behavior. No extra burden is placed on iterator implementers. No change to iterator protocol definition. No existing code is suddenly non-conforming. A small amount of code may break but at least it will raise a meaningful exception. silent-errors-delenda-est-ly yours, Oren From aleax@aleax.it Mon Jul 15 07:15:11 2002 From: aleax@aleax.it (Alex Martelli) Date: Mon, 15 Jul 2002 08:15:11 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <20020715000651.A35319@doublegemini.com> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <20020715000651.A35319@doublegemini.com> Message-ID: On Monday 15 July 2002 06:06 am, Clark C . Evans wrote: ... > Although this is a nice question to ask; as a programmer with 10+ > years of building components, and using other's components to > build larger applications I often ask a similar but related question: > > Well, if it ain't a Thymagig, where is the > wrapper so I can treat it like one? > > It is this second question that the Object Adaptation PEP is based. Right. The concept of adaptation definitely comes from the Components world, where "is-A" is a dirty word. High time the Objects world listened:-). > Speaking of which, I personally don't feel that interfaces > is the way to go... there are many reasons why I'm using Python > and not using Java. Interfaces are too inflexible and often times > can cause more headaches than they save with additional typing. A suitably Pythonic approach to interfaces won't cause any more headaches than, say, booleans (check Google for the firestorm that the Booleans PEP caused...:-). We do need the CONCEPT of an interface, just as we need the concept of true and false values. Whether reifying the concept into a type or language-blessed category buys you more than it costs -- much depends on how it's done. 
Personally, I'd rather have that superset of "interface" that is known in Haskell as a "typeclass" -- and I'd settle for that middle ground that is known in C++ or Java as an "abstract class". If interfaces/typeclasses/&c came with Eiffelish 'contracts', so much the better. But it's all in the narrow range between -0 and +0 for me. > especially for Python. For example, interface inheritance is > one of those bits of baggage (that others may disagree with me on). > Interface inheritance is one of those "givens" that one must do to I'm not sure what you mean by "interface inheritance". The ability to define an interface by adding some stuff to another interface -- that's the only sense in which one could possibly speak of such a thing in Java, say -- is extremely convenient but not a must-do... COM manages without, not conveniently but OK all the same. I suspect you mean something quite different. > do interfaces right. Interface inheritance isn't needed. Why? > Mix-ins are far more powerful mechanism as they make you think about Mix-in inheritance is basically about _implementation_. Just like most inheritance in Python most of the time -- issubclass, isinstance and exception handling being the exceptions to the 'most'. Nobody's planning to take it away from Python, anyway. > operations which are othogonal to each other. You think that > interface inheritance helps, but in my experience it just screws > with your thought process... ;) I've had no particular trouble designing interfaces either in Java, with interfaces able to inherit from each other, or in COM, without such convenience. In the latter case one ends up with a bit of copy and paste, not pleasant, but WTH. 
I see it mostly as a bug in COM's MIDL and supporting tools/wizards -- they let you ask for interface inheritance even though the underlying object model does not support it (by doing the copy-and-paste implicitly) but then don't go all the way (e.g., they'll freely and erroneously reuse in the inheriting interface some method dispatch-IDs that the interface inherited-from has already assigned -- eccch). The big question is rather: given that Isub inherits from Isuper, does any object implementing Isub also implicitly implement Isuper? That's the object-philosophy, where inheritance is thought to reflect deep IS-A relations. It's NOT the component-philosophy, where inheritance is an implementation-convenience detail. I much prefer the component-approach, where my component has full control, if it wants to, on exactly what interfaces it exposes. This lets you factor out any commonality between interfaces without giving any actual reality to the factored-out common subset -- it can even stay an "abstract interface" that no object actually supplies, if you want. I do think it's a respectable thesis, though I don't agree with it, that the OO rather than component approach to inheritance is, while less flexible for advanced uses, easier to use -- made simpler by conflating different concerns (how is this interface exactly -- what set of interfaces can I get from this object) into one powerful general concept. I think it's harder when you have to learn that said "one powerful general concept" has several rather separate uses, and the ability to have some specific uses fall in the gray zone between the typical use cases does not exactly help learning and understanding, either. But I do think that quite a reasonable debate could be held about this. > Anyway, I'm so glad that Alex has taken up the cause; I'm not all > that actively involved in Python internals... but as a user I can't > advocate more for something like this. 
Alex, I'm delighted if > you would take ownership of the PEP. OK, thanks! Alex From tim.one@comcast.net Mon Jul 15 07:18:32 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 15 Jul 2002 02:18:32 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <20020715052709.GA13426@hishome.net> Message-ID: [Oren Tirosh] > Let's look at the options: (are there any I forgot?) We can always pretend the issue was never raised <0.9 wink>. > 1. Define StopIteration as a sticky state. People will write code that > relies on this behavior. You say that like it's a bad thing. > The code will sometimes fail when run on 2.2.x or with certain existing > user iterators. But it's unlikely. Most people stick to "for x in object:". Those who don't and rely on anything other than StopIteration being sticky are relying on things explictly documented as not kosher. If we did decide to enforce the PEP, everything in the core that doesn't follow it is "a bug", and fixes would get backported to the 2.2 line. > It's probably the worst possible combination: you have to implement this > in your iterators but you can't rely on it in code that may run on 2.2 or > get iterators from libraries written before this was made into a > requirement. But it's already a requirement, according to the PEP. Regardless of what the PEP says or what things do, the safest course for users is not to provoke the issue, i.e. never to write code that pokes an iterator after it raises StopIteration. All these choices become irrelevant to code doing so. Your code may be an exception, but I'm sure the vast bulk of 2.2 code already plays that way; for example, I doubt there's any code in the std distribution that cares. > 2. Leave things the way they are. Since *almost* all builtin iterators > behave this way people will continue to write code that relies on this. 
It appears that more than half of the builtin iterators don't arrange to make StopIteration sticky (sequence iterators and three flavors of dict iterators and two-argument iter() iterators definitely do not; generator iterators definitely do; Zope3 BTree iterators definitely do, but they're not part of the Python core; the meta-rule here is that an iterator follows the PEP if and only if I wrote it ). > It will silently fail for some builtin iterators and user iterators. I'm not sure what "fail" means here. > 3. Silently fix all iterators to be in a StopIteration sink state. Even > worse than #2. It looks like version 2.2 is going to live a long time. > This will cause subtle and hard-to-find differences in behavior > between 2.2 and 2.3. We actively backport bugfixes to the 2.2 line. > 4. Require iterators to raise an exception. Places an extra burden on all > iterator implementors. A lot of existing code will suddenly be redefined > as not kosher. Raising any exception other than StopIteration is going to be a very hard sell. > 5. Leave it officially undefined but raise an exception for all > or even some builtin iterators. Raising an exception for even one > popular type (listiter) would be more than enough to discourage > code that relies on this behavior. But not to stop it, and then users can't predict what will happen. > No extra burden is placed on iterator implementers. Didn't you just propose raising exceptions in "all or even some" builtin iterators? They weren't implemented by elves . > No change to iterator protocol definition. The only way to achieve that is your #1: the current definition *is* sticky state, albeit honored mostly in the breach. If Guido doesn't want that now, the definition has to change. > No existing code is suddenly non-conforming. Any existing code that relies on, or supplies, anything other than sticky state is non-conforming right now. 
You could turn that into "an advantage" by flipping the claim to: Some non-conforming existing code would suddenly become officially blessed. From martin@v.loewis.de Mon Jul 15 08:04:54 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 15 Jul 2002 09:04:54 +0200 Subject: [Python-Dev] PEP 11: unsupported platforms In-Reply-To: <20020714212620.GA3192@cthulhu.gerg.ca> References: <20020714212620.GA3192@cthulhu.gerg.ca> Message-ID: Greg Ward writes: > The only feedback I have is consider changing the name to "Removing > Support for Obsolete Platforms", since that's what most of the PEP is > about. However, since it also includes a list of those obsolete > platforms, your title is not without merit. I deliberately did not choose the word "obsolete platform", since this PEP does not judge the obsoleteness of the platform: we do not recommend using other platforms instead, and so forth. Instead, all this PEP says is that we won't support Python anymore on those platforms, as we believe that nobody is interested in Python on those systems (for whatever reasons - mostly because the platform itself is dead). Regards, Martin From fredrik@pythonware.com Mon Jul 15 08:37:21 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 15 Jul 2002 09:37:21 +0200 Subject: [Python-Dev] PEP 11: unsupported platforms References: <20020714212620.GA3192@cthulhu.gerg.ca> Message-ID: <00b501c22bd2$77fbea80$ced241d5@hagrid> martin wrote: > I deliberately did not choose the word "obsolete platform", since this > PEP does not judge the obsoleteness of the platform: we do not > recommend using other platforms instead, and so forth. Instead, all > this PEP says is that we won't support Python anymore on those platforms, well, the title "unsupported platforms" sort of implies that if my favourite oddball platform is not mentioned in there, it is supported. wouldn't something like "no longer supported platforms" or "removing support for little-used platforms" be more accurate? 
From mal@lemburg.com Mon Jul 15 08:52:55 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 15 Jul 2002 09:52:55 +0200 Subject: [Python-Dev] AtExit Functions References: Message-ID: <3D327F57.4040705@lemburg.com> Tim Peters wrote: > [Guido] > >>>I think you may be making a wrong use of Py_AtExit(). The docs state >>>(since 1998): >>> >>> Since Python's internal finallization will have completed before the >>> cleanup function, no Python APIs should be called by *func*. >> > > [Marc] > >>Hmm, and that includes Py_DECREF() and PyObject_Del() ? > > > Certainly. In particular, Py_DECREF() can end up calling any Python code at > all, via __del__ methods. PyObject_Del() as well ? >>In that case, I have a problem since I'm using those >>two to clean up caches and free lists in the mx tools. > > > We have two sets of exit-function gimmicks, one that runs at the very start > of Py_Finalize(), and the other at the very end. If you need to clean up > Python objects, you have to get into the first group. The interpreter has > been torn down beyond usefulness by the time we get to the second group > (that's only useful for low-level OS and external non-Python C library > cleanup). I suppose the first one is what the atexit module exposes in Python 2.0+, right ? The problem with that approach is that there may still be some references to objects left in lists and dicts which are cleaned up after having called the atexit functions. This is not so much a problem in my cases, but something to watch out in other applications which use C level Python objects as globals. >>>You may want to use the atexit.py module instead to schedule your >>>module's cleanup action; these exit functions are called much earlier. >> > >>That's difficult to get right since I have to register such a >>function from C. > > > ? You know how to write Python-callable C functions. I'm not sure why you > would need to call atexit.register from C, but if you must then that's easy > too (PyObject_Call). 
Well, yeah :-) >>Also, atexit.py is not present in Python 1.5.2. > > > What's that ? That's the Python version which was brand new just 3 years ago. I know... in US terms that's for history books ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mhammond@skippinet.com.au Mon Jul 15 09:37:05 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Mon, 15 Jul 2002 18:37:05 +1000 Subject: [Python-Dev] threads, SIGINT and time.sleep() Message-ID: Tim and I have been thrashing around in http://python.org/sf/581232 trying to make time.sleep() interruptible for Windows. This turns out to be quite simple, but has unearthed some questions about thread interactions, and seems to have changed semantics on Linux. While I understand that the docs make almost no guarantees WRT threads and signals, I am wondering what the "most desirable" semantics would be assuming the platform supports it. Consider a Python program with a main thread + 2 extra threads. The 2 extra threads are both in time.sleep(). When Ctrl+C is pressed, the docs seem to clearly state that only the main thread should see a KeyboardInterrupt. My question is: what should happen to the time.sleep() threads? It seems that Python 1.5.2 on Linux (as supplied by RedHat) would interrupt the 2 threads with IOError(EINTR). CVS Python currently seems to not interrupt the threads at all, allowing the sleep() to continue the full period. (A time.sleep() in the main thread *is* interrupted in both versions) For Windows I can do either. However, the Python 1.5.2 semantics seems to make the most sense to me. Was this change in behaviour post 1.5 intentional? The code does not imply the new behaviour is intented (but the code doesn't imply much at all!) Test code and results below. 
All clues welcomed! Thanks, Mark. Test code: ---------- import time, threading threads=[] for i in range(2): t=threading.Thread(target=time.sleep, args=(30,)) t.start() threads.append(t) for t in threads: t.join() Python 1.5.2 on Linux: ---------------------- Exception in thread Thread-1: Traceback (innermost last): ... File "/usr/lib/python1.5/site-packages/threading.py", line 364, in run apply(self.__target, self.__args, self.__kwargs) IOError: [Errno 4] Interrupted system call Exception in thread Thread-2: Traceback (innermost last): ... File "/usr/lib/python1.5/site-packages/threading.py", line 364, in run apply(self.__target, self.__args, self.__kwargs) IOError: [Errno 4] Interrupted system call Traceback (innermost last): ... File "/usr/lib/python1.5/threading.py", line 189, in wait waiter.acquire() KeyboardInterrupt Current CVS on Linux: --------------------- [Pressing Ctrl+C has no effect - sleep() period expires, then...] Traceback (most recent call last): ... File "/home/skip/src/python/dist/src/Lib/threading.py", line 190, in wait waiter.acquire() KeyboardInterrupt From mwh@python.net Mon Jul 15 10:36:12 2002 From: mwh@python.net (Michael Hudson) Date: 15 Jul 2002 10:36:12 +0100 Subject: [Python-Dev] Python version of PySlice_GetIndicesEx In-Reply-To: Guido van Rossum's message of "Fri, 12 Jul 2002 14:38:31 -0400" References: <000d01c21cdb$eb03b720$91d8accf@othello> <20020630173903.GA37045@hishome.net> <200207121709.g6CH9Wb12714@pcp02138704pcs.reston01.va.comcast.net> <20020712212105.A8666@hishome.net> <200207121838.g6CIcV813352@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <2m8z4ds583.fsf@starship.python.net> Guido van Rossum writes: > (I changed the subject) > > > When I was going through the sources of sliceobject.c I found the function > > PySlice_GetIndicesEx. It performs the magic of trimming a slice into the > > range of indices of a sequence, including negative indices and intervals > > with None as start or stop value. 
A comment in this function says: > > > > /* this is harder to get right than you might think */ And it is. > > Wouldn't it be a good idea to expose this nontrivial functionality to > > Python code as a method of slice objects? > > I dunno. It seems that most code that actually uses slices is written > in C anyway. > > > The method would take an integer argument (length) and return an > > xrange object. > > Why an xrange object? That's not inspectable. *If* we were to do > this (which I doubt) it should return a tuple of three ints. Yes. > > It should make it much > > easier to implement user types that support extended slicing: > > > > def __getitem__(self, index): > > if isinstance(index, slice): > > return [get_item_at(i) for i in index.trim(len(self))] > > else: > > return get_item_at(index) > > > > Suggestions for a better name than trim? > > getindices() When I was debugging this function, I wrote a method called indices(). Actually, I think I'm probably in favour of adding this method, if only to make writing clearer test cases easier. [...] [Tim] > Just to be helpfully irritating, I'll note that Zope's C > implementation of slice index normalization for BTreeItems objects > was off in nearly every way possible, until a few weeks ago. It > really is difficult to get this right. No kidding. Cheers, M. -- I think perhaps we should have electoral collages and construct our representatives entirely of little bits of cloth and papier mache. -- Owen Dunn, ucam.chat, from his review of the year From mwh@python.net Mon Jul 15 11:03:33 2002 From: mwh@python.net (Michael Hudson) Date: 15 Jul 2002 11:03:33 +0100 Subject: [Python-Dev] threads, SIGINT and time.sleep() In-Reply-To: "Mark Hammond"'s message of "Mon, 15 Jul 2002 18:37:05 +1000" References: Message-ID: <2m3culs3yi.fsf@starship.python.net> "Mark Hammond" writes: > Tim and I have been thrashing around in http://python.org/sf/581232 trying > to make time.sleep() interruptible for Windows. 
This turns out to be quite
> simple, but has unearthed some questions about thread interactions, and
> seems to have changed semantics on Linux.
>
> While I understand that the docs make almost no guarantees WRT threads and
> signals,

Don't go there.

> I am wondering what the "most desirable" semantics would be assuming
> the platform supports it.
>
> Consider a Python program with a main thread + 2 extra threads. The 2 extra
> threads are both in time.sleep(). When Ctrl+C is pressed, the docs seem to
> clearly state that only the main thread should see a KeyboardInterrupt. My
> question is: what should happen to the time.sleep() threads?
>
> It seems that Python 1.5.2 on Linux (as supplied by RedHat) would interrupt
> the 2 threads with IOError(EINTR). CVS Python currently seems to not
> interrupt the threads at all, allowing the sleep() to continue the full
> period. (A time.sleep() in the main thread *is* interrupted in both
> versions)

Are you saying that your patch changes behaviour, or that behaviour
changed somewhere between 1.5.2 and current CVS? Or between 2.2 and
current CVS?

These lines:

    /* Mask all signals in the current thread before creating the new
     * thread. This causes the new thread to start with all signals
     * blocked.
     */
    sigfillset(&newmask);
    SET_THREAD_SIGMASK(SIG_BLOCK, &newmask, &oldmask);

might have something to do with it. Does anyone know where they come
from?

> For Windows I can do either. However, the Python 1.5.2 semantics seems to
> make the most sense to me. Was this change in behaviour post 1.5
> intentional? The code does not imply the new behaviour is intended (but the
> code doesn't imply much at all!)

I'd expect the 1.5.2 semantics, but ...

> Test code and results below. All clues welcomed!

Well, the behaviour is probably (pairwise) different on FreeBSD,
Darwin and Solaris.

"Cheers",
M.

--
  Gullible editorial staff continues to post links to any and all
  articles that vaguely criticize Linux in any way.
-- Reason #4 for quitting slashdot today, from http://www.cs.washington.edu/homes/klee/misc/slashdot.html From cce@clarkevans.com Mon Jul 15 12:24:00 2002 From: cce@clarkevans.com (Clark C . Evans) Date: Mon, 15 Jul 2002 07:24:00 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: ; from aleax@aleax.it on Mon, Jul 15, 2002 at 08:15:11AM +0200 References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <20020715000651.A35319@doublegemini.com> Message-ID: <20020715072400.B40101@doublegemini.com> On Mon, Jul 15, 2002 at 08:15:11AM +0200, Alex Martelli wrote: | > | > Well, if it ain't a Thymagig, where is the | > wrapper so I can treat it like one? | > | | Right. The concept of adaptation definitely comes from the Components | world, where "is-A" is a dirty word. High time the Objects world | listened:-). *grins* | Whether reifying the concept into a type or language-blessed | category buys you more than it costs -- much depends on how it's | done. Personally, I'd rather have that superset of "interface" that | is known in Haskell as a "typeclass" -- and I'd settle for that middle | ground that is known in C++ or Java as an "abstract class". If | interfaces/typeclasses/&c came with Eiffelish 'contracts', so much | the better. But it's all in the narrow range between -0 and +0 for me. I could see where something like Eiffel's "contracts" would be a neat addition to Python, but I must confess I don't have any serious experience with them. | I'm not sure what you mean by "interface inheritance". The ability | to define an interface by adding some stuff to another interface -- | that's the only sense in which one could possibly speak of such | a thing in Java, say -- is extremely convenient but not a must-do... My exposition was awful, sorry. Perhaps a specific (albeit contrived) example would better reflect the intent. Suppose that I have a iterator with one method, next(). 
Now suppose that I want a "mutable iterator", one which adds the
change() method. This is well and good, but the concept of something
being mutable is quite orthogonal to iteration and perhaps should have
its own interface rather than using inheritance. So, what I'm asserting
is that once the inheritance "feature" is there... people use it even
though other approaches are available.

Once headed down this path, people start defining these massively ugly
interfaces (see XML's DOM) where "lazy" implementations throw a
NotImplemented error if the object doesn't support particular methods
of the interface. Yikes. The other thing that interface inheritance
implies is substitutability; yet in practice this isn't always practical
(or even needed).

But alas, we are digressing here. The point of the PEP wasn't to
confront interfaces, as people who believe in them won't be swayed.
But frankly... I love python without interfaces.

| The big question is rather: given that Isub inherits from Isuper,
| does any object implementing Isub also implicitly implement Isuper?
|
| That's the object-philosophy, where inheritance is thought to reflect
| deep IS-A relations. It's NOT the component-philosophy, where
| inheritance is an implementation-convenience detail. I much prefer
| the component-approach, where my component has full control, if
| it wants to, on exactly what interfaces it exposes. This lets you
| factor out any commonality between interfaces without giving any
| actual reality to the factored-out common subset -- it can even stay
| an "abstract interface" that no object actually supplies, if you want.

Yes, this is the bigger question. And I'd rather see python swing
more towards the component/delegation model; it isn't really strictly
object oriented as it is, and I'd hate to see it become that way.

Best,

Clark

Yo! Check out YAML!
http://yaml.org Serialization for the masses From skip@pobox.com Mon Jul 15 13:43:37 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 15 Jul 2002 07:43:37 -0500 Subject: [Python-Dev] AtExit Functions In-Reply-To: <3D327F57.4040705@lemburg.com> References: <3D327F57.4040705@lemburg.com> Message-ID: <15666.50041.362691.287914@localhost.localdomain> mal> I suppose the first one is what the atexit module exposes in Python mal> 2.0+, right ? Not really. The atexit module is just a wrapper around sys.exitfunc which provides a standard protocol for registering more than one function to be called at exit. You should be able to easily backport it to 1.5.2 and deliver it with your package for installation on systems still running 1.5.2. Or, just deal directly with sys.exitfunc. Before 2.0 there was no rational way to use sys.exitfunc. The application, libraries, and core code had no rules about who could or couldn't set sys.exitfunc. Skip From mhammond@skippinet.com.au Mon Jul 15 14:03:23 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Mon, 15 Jul 2002 23:03:23 +1000 Subject: [Python-Dev] threads, SIGINT and time.sleep() In-Reply-To: <2m3culs3yi.fsf@starship.python.net> Message-ID: [Michael] > > While I understand that the docs make almost no guarantees WRT > > threads and signals, > > Don't go there. Didn't mean to > Are you saying that your patch changes behaviour, or that behaviour > changed somewhere between 1.5.2 and current CVS? Or between 2.2 and > current CVS? My patch is for Windows only. While examining my Linux builds to seek out the most desirable behaviour, I stumbled across the difference between my Linux 1.5.2 and CVS builds. I have no other Linux builds to try, but if someone has a few versions handy it would be interesting to know exactly where it changes (or indeed if others can even repro this behaviour). It appears that cygwin on Windows aborts the 2 threads without error - ie, sleep() silently returns early. 
> I'd expect the 1.5.2 semantics, but ... > > > Test code and results below. All clues welcomed! > > Well, the behaviour is probably (pairwise) different on FreeBSD, > Darwin and Solaris. Yeah, I appreciate that there will always be platform differences - but I still wouldn't mind knowing a "most desirable" behaviour should the platform support it and anyone be bothered - if for no better reason than for me to know what behaviour to check in for Windows! Thanks, Mark. From guido@python.org Mon Jul 15 14:05:29 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 09:05:29 -0400 Subject: Suggestion for fixing %(foo)s (Re: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting) In-Reply-To: Your message of "Mon, 15 Jul 2002 12:48:44 +1200." <200207150048.g6F0mi111593@oma.cosc.canterbury.ac.nz> References: <200207150048.g6F0mi111593@oma.cosc.canterbury.ac.nz> Message-ID: <200207151305.g6FD5Tp30343@pcp02138704pcs.reston01.va.comcast.net> > > > Guido, can you please, for our enlightenment, tell us what are the > > > reasons you feel %(foo)s was a mistake? > > > > Because of the trailing 's'. It's very easy to leave it out by > > mistake > > How about introducing a new format > > %{foo} > > which is defined to be the same as %(foo)s. Maybe too subtle (you'd really have to explain the history to make people understand why there's both %() and %()), and doesn't solve the compile time / run time issue IMO. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jul 15 14:27:53 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 09:27:53 -0400 Subject: [Python-Dev] Python version of PySlice_GetIndicesEx In-Reply-To: Your message of "Mon, 15 Jul 2002 10:36:12 BST." 
<2m8z4ds583.fsf@starship.python.net> References: <000d01c21cdb$eb03b720$91d8accf@othello> <20020630173903.GA37045@hishome.net> <200207121709.g6CH9Wb12714@pcp02138704pcs.reston01.va.comcast.net> <20020712212105.A8666@hishome.net> <200207121838.g6CIcV813352@pcp02138704pcs.reston01.va.comcast.net> <2m8z4ds583.fsf@starship.python.net> Message-ID: <200207151327.g6FDRrL30498@pcp02138704pcs.reston01.va.comcast.net> > > > Suggestions for a better name than trim? > > > > getindices() > > When I was debugging this function, I wrote a method called indices(). > Actually, I think I'm probably in favour of adding this method, if > only to make writing clearer test cases easier. OK. Michael, if you want to check in indices(), go ahead. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jul 15 14:28:54 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 09:28:54 -0400 Subject: [Python-Dev] Python version of PySlice_GetIndicesEx In-Reply-To: Your message of "Mon, 15 Jul 2002 09:27:53 EDT." Message-ID: <200207151328.g6FDSsI30513@pcp02138704pcs.reston01.va.comcast.net> > OK. Michael, if you want to check in indices(), go ahead. Of course, the possibility exists that indices() fails, when one of the indices is not an int or None. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jul 15 14:38:45 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 09:38:45 -0400 Subject: [Python-Dev] threads, SIGINT and time.sleep() In-Reply-To: Your message of "Mon, 15 Jul 2002 18:37:05 +1000." References: Message-ID: <200207151338.g6FDcjm30615@pcp02138704pcs.reston01.va.comcast.net> Python has always documented that signals go only to the main thread. Apparently in 2.1 and before this wasn't implemented properly (for Linux; I don't know about other platforms and this is notoriously platform-dependent). 
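One piece of the main-thread rule can be checked directly in a present-day CPython. The snippet below is only a small sketch, and it says nothing about how 1.5.2 or the 2002 CVS tree behaved: it relies on the documented behaviour that signal.signal() may only be called from the main thread and raises ValueError elsewhere. The helper name try_install_handler and the result dictionary are made up for the illustration.

```python
import signal
import threading

# Filled in by the worker thread so the main thread can inspect the outcome.
result = {}

def try_install_handler():
    """Try to install a SIGINT handler from a non-main thread."""
    try:
        signal.signal(signal.SIGINT, signal.default_int_handler)
        result["outcome"] = "installed"
    except ValueError:
        # Current CPython refuses: handlers can only be set from the main thread.
        result["outcome"] = "ValueError"

worker = threading.Thread(target=try_install_handler)
worker.start()
worker.join()
print(result["outcome"])
```

This only shows where handlers may be installed; whether a blocking call in a worker thread is interrupted by Ctrl+C remains platform-dependent, which is the question raised in this thread.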
I think that since ^C doesn't interrupt regular Python code running in a thread, it's strange that time.sleep() (and presumably other I/O!) would be interrupted. So I'd like to see the CVS behavior. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jul 15 15:04:54 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 10:04:54 -0400 Subject: [Python-Dev] PEP 246 - Object Adaptation In-Reply-To: Your message of "Mon, 15 Jul 2002 08:15:11 +0200." References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <20020715000651.A35319@doublegemini.com> Message-ID: <200207151404.g6FE4sa30738@pcp02138704pcs.reston01.va.comcast.net> (Changing the subject) > The big question is rather: given that Isub inherits from Isuper, > does any object implementing Isub also implicitly implement Isuper? This probably shows my naivete more than anything else... I'd say "of course", based on an example where Isuper is FileOpenForReading and Isub is FileOpenForReadingAndWriting. It would be strange if a file open for reading and writing was not acceptable in a place where a file open for reading is accepted (because it implements all the right methods). Or is the fact that it implements *more* the problem? Am I missing something? I also thought that there's a different dimension of interface inheritance: if class C implements interface I, and class D derives from class C, does D implicitly implement I also? Again, I'd say yes. But I believe Jim Fulton disagrees with me. And again, I haven't tried to use interfaces enough to understand what problems you could get into by this assumption. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jul 15 15:15:28 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 10:15:28 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Mon, 15 Jul 2002 12:45:11 +1200." 
<200207150045.g6F0jBb11587@oma.cosc.canterbury.ac.nz> References: <200207150045.g6F0jBb11587@oma.cosc.canterbury.ac.nz> Message-ID: <200207151415.g6FEFSP30815@pcp02138704pcs.reston01.va.comcast.net> > I think it's more complicated than that. If the file > object were to become an object obeying the iterator > protocol, its next() method should really return the > next *byte* of the file. Then you'd still want methods > like read(), readline() etc. for reading in larger > chunks. I don't think so. We should pick the most convenient chunking for the default iterator, and provide explicit ways to ask for other iterators (like dict iterators). Also, since "for line in file" already works, there's a strong precedent for iterating by line by default. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Mon Jul 15 15:19:33 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 15 Jul 2002 10:19:33 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <20020715000651.A35319@doublegemini.com> <200207150615.g6F6FJq28099@smtp.zope.com> Message-ID: <15666.55797.351811.317428@anthem.wooz.org> >>>>> "AM" == Alex Martelli writes: AM> The big question is rather: given that Isub inherits from AM> Isuper, does any object implementing Isub also implicitly AM> implement Isuper? There's another issue that Jim Fulton likes to bring up, IIRC. If class Super implements IInterface, does class Sub(Super) also (automatically) implement IInterface? I could be totally misremembering, but I believe that Jim would say "no". Class Sub would have to explicitly declare that it also implements IInterface. -Barry From aahz@pythoncraft.com Mon Jul 15 15:22:25 2002 From: aahz@pythoncraft.com (Aahz) Date: Mon, 15 Jul 2002 10:22:25 -0400 Subject: [Python-Dev] PEP 246, Object Adaptation (was Re: Single- vs. 
    Multi-pass iterability)
In-Reply-To: <20020715072400.B40101@doublegemini.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com>
    <20020715000651.A35319@doublegemini.com>
    <20020715072400.B40101@doublegemini.com>
Message-ID: <20020715142225.GA9006@panix.com>

On Mon, Jul 15, 2002, Clark C . Evans wrote:
>
> My exposition was awful, sorry. Perhaps a specific (albeit contrived)
> example would better reflect the intent. Suppose that I have an iterator
> with one method, next(). Now suppose that I want a "mutable iterator",
> one which adds the change() method. This is well and good, but the
> concept of something being mutable is quite orthogonal to iteration and
> perhaps should have its own interface rather than using inheritance.
> So, what I'm asserting is that once the inheritance "feature" is
> there... people use it even though other approaches are available.

On the whole, I'd say that Python is actually *less* prone to this
problem (because using an object doesn't generally require inheritance,
just protocol); in fact, it suffers from the obverse problem. Consider
this:

    class C:
        def open(self, name, flags=None):
        def read(self):
        def write(self, value):
        def close(self):

Can instances of C be used where a file object is expected?
--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/

From barry@zope.com Mon Jul 15 15:32:18 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 15 Jul 2002 10:32:18 -0400
Subject: [Python-Dev] PEP 246, Object Adaptation (was Re: Single- vs.
Multi-pass iterability) References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <20020715000651.A35319@doublegemini.com> <20020715072400.B40101@doublegemini.com> <20020715142225.GA9006@panix.com> Message-ID: <15666.56562.63130.943725@anthem.wooz.org> >>>>> "A" == Aahz writes: A> On the whole, I'd say that Python is actually *less* prone to A> this problem (because using an object doesn't generally require A> inheritance, just protocol); in fact, it suffers from the A> obverse problem. Consider this: | class C: | def open(self, name, flags=None): | def read(self): | def write(self, value): | def close(self): A> Can instances of C be used where a file object is expected? Maybe . That's why you tend to see things described like: "argument f must have a write() method that accepts a string." WIBNI we could define a protocol/interface/thingie that encapsulated that requirement? I'd even be happy to start out with no officially blessed interfaces, to give time to see what cream rises to the top. Zope's Interface and component model stuff is a good way to get some real experience with using these concepts in Python. -Barry From guido@python.org Mon Jul 15 15:39:51 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 10:39:51 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Fri, 12 Jul 2002 00:59:28 +0300." <20020712005928.A9833@hishome.net> References: <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net> <200207111219.g6BCJVU30095@pcp02138704pcs.reston01.va.comcast.net> <20020712005928.A9833@hishome.net> Message-ID: <200207151439.g6FEdpM31288@pcp02138704pcs.reston01.va.comcast.net> > http://www.python.org/sf/580331 > > No, it's not a complete rewrite of file buffering. This patch > implements Just's idea of xreadlines caching in the file object. 
It > also makes a file into an iterator: __iter__ returns self and next > calls the next method of the cached xreadlines object. Hm. What happens to the xreadlines object when you do a seek() on the file? With the old semantics, you could do f.seek(0) and get another iterator (assuming it's a seekable file of course). With the new semantics, the cached iterator keeps getting in the way. Maybe the xreadlines object could grow a flush() method that throws away its buffer, and f.seek() could call that if there's a cached xreadlines iterator? > See my previous postings for why I think a file should be an iterator. Haven't seen them but I would agree that this makes sense. > With this patch any combination of multiple xreadlines and iterator > protocol operations on a file object is safe. Using > xreadlines/iterator followed by regular readline has the same > buffering problem as before. Agreed. I just realized that the (existing) file_xreadlines() function has a subtle bug. It uses a local static variable to cache the function xreadlines imported from the module xreadlines. But if there are multiple interpreters or Py_Finalize() is called and then Py_Initialize() again, the cache is invalid. Would you mind fixing this? I think the caching just isn't worth it -- just do the import every time (it's fast enough if sys.modules['xreadlines'] already exists). --Guido van Rossum (home page: http://www.python.org/~guido/) From jmiller@stsci.edu Mon Jul 15 15:41:58 2002 From: jmiller@stsci.edu (Todd Miller) Date: Mon, 15 Jul 2002 10:41:58 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() References: Message-ID: <3D32DF36.5080906@stsci.edu> Here's the numarray perspective on things. Tim Peters wrote: >[Tim] > >>Fredrik pressed for details, but we haven't seen any concrete use cases. >>In the absence of the latter, it's impossible to guess what would be >>backward compatible for MAL's purposes. >> I updated my CVS copy of Python and tried out MAL's patch with numarray. 
Nothing broke as far as I can tell. I guess it probably doesn't matter anyway given that both buffer() and MAL's patch are headed to oblivion. >> > >[M.-A. Lemburg] > >>For my purposes, the strategy buffer slice returns a buffer >>would be more appropriate because it would save the buffer type >>information across the slicing operation... I mean, you don't >>want to get bananas when you slice an apple in real life either ;-) >> >>I use buffers to mean: this is a chunk of binary data. The purpose >>is to recognize this type of data for pickling via xml-rpc, >>soap and other rpc mechanisms etc. >> > >How do you use buffers? > We use buffers in numarray to store our array data. We use readinto to load array buffers efficiently from a file. We operate on the buffer data in-place. Since numarrays are python classe instances, buffers provide a place for the data to live. >Do you stick to their C API? > We use the C-API, and currently use the buffer object too. Using the buffer object has always seemed like a necessary evil, but having reviewed numarray usage of buffer(), ditching it sounds good to me. >Do you use the >Python-level buffer() function? > Yes. We go one step further, and expose writeable buffers using our own extension function. I had a feeling I was on thin ice when I did this. >If the latter, what do you do in Python >code with a buffer object after you get one? The only use I've seen made of >a buffer object in Python code is as a way to trick the interpreter into >crashing (via recycling the memory the buffer object points to). > I'm getting the following things by using the buffer object: 1. Knowledge that the C-type the buffer refers to meets the buffer C-API. 2. Mutable string behavior for any object which meets the buffer C-API. 3. Storage. At least we used to get storage until we found out that there's no guarantee on double alignment. I plan to work around each of these uses as follows: 1. 
Create an extension function which determines whether an object meets the buffer C-API. 2. Create an extension function which copies from one buffer region to another buffer region. 3. We already have our own memory object which is now typically referenced by a buffer object. With the above extensions, I don't need a buffer "wrapper" object around it anymore. > > >And from where do you get a buffer? There are darned few types in Python > We get ours from mmap and our own homegrown memory object. > >that buffer() accepts as an argument. Do your extension types implement >tp_as_buffer? I'm blindly casting for a reason why your appreciation of the > > >buffer object seems unique. > Numarray uses buffer() too, but dumping it sounds OK. Todd -- Todd Miller Space Telescope Science Institute From guido@python.org Mon Jul 15 15:50:32 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 10:50:32 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: Your message of "Mon, 15 Jul 2002 10:41:58 EDT." <3D32DF36.5080906@stsci.edu> References: <3D32DF36.5080906@stsci.edu> Message-ID: <200207151450.g6FEoW031433@pcp02138704pcs.reston01.va.comcast.net> > >How do you use buffers? > We use buffers in numarray to store our array data. We use readinto to > load array buffers efficiently from a file. We operate on the buffer > data in-place. Since numarrays are python classe instances, buffers > provide a place for the data to live. AFAIK the buffer() function can only create read-only buffers. How do you create your buffers? If you're just using the C buffer API, that's not going away. > >Do you stick to their C API? > > > We use the C-API, and currently use the buffer object too. Using the > buffer object has always seemed like a necessary evil, but having > reviewed numarray usage of buffer(), ditching it sounds good to me. Good. > >And from where do you get a buffer? 
There are darned few types in Python
> We get ours from mmap and our own homegrown memory object.

Maybe instead of the buffer() function/type, there should be a way to
allocate raw memory?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Mon Jul 15 16:15:58 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 11:15:58 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Mon, 15 Jul 2002 02:18:32 EDT."
References:
Message-ID: <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net>

I'm still only considering two options: (a) leave the status quo, or
(b) implement (and document!) the "sink-state" rule from the PEP.

If we end up adopting (b), what can we do to Python 2.2 that doesn't
break the "bug-fixes-only" promise of that branch? If there's code
that depends on the extendibility of list iterators, are we breaking
our promise by breaking that code?

OTOH, I just noticed that the general sequence iterator has a
different behavior than the list iterator (both in 2.3 and in 2.2):
the general sequence iterator increments its index before checking for
IndexError, while the list iterator only increments when it knows it's
got a valid item. That means that if you use the general list
iterator over an extensible sequence, you miss an item!

-----------------------------
class Seq:
    def __init__(self, n):
        self.n = n
    def __getitem__(self, i):
        if 0 <= i < self.n:
            return i
        else:
            raise IndexError

a = Seq(3)
it = iter(a)
for i in it:
    print i,
a.n = 5
for i in it:
    print i,
-----------------------------

This prints "0 1 2 4". This is sufficiently braindead that we can
assume that *if* this behavior is relied upon, it's only for lists.

Still, the question is, could "fixing" the list iterator in 2.2.2
become a problem? I'd like to think that more people are surprised by
this behavior than rely on it, but I'm not sure.
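For readers following along in a current Python, the "sink-state" rule from option (b) can be sketched in pure Python. This is only an illustration, not the C iterator code under discussion: the class name SinkSeqIter is made up, the syntax is present-day Python rather than 2.2, and using a negative index as the "exhausted for good" marker is just one possible convention. Seq is the same extensible sequence as in the example above.

```python
class Seq:
    """Extensible sequence: valid indices are 0 .. n-1, and n may grow later."""
    def __init__(self, n):
        self.n = n

    def __getitem__(self, i):
        if 0 <= i < self.n:
            return i
        raise IndexError(i)


class SinkSeqIter:
    """Hypothetical sequence iterator that obeys the sink-state rule."""
    def __init__(self, seq):
        self.seq = seq
        self.index = 0  # a negative index means "exhausted for good"

    def __iter__(self):
        return self

    def __next__(self):
        if self.index < 0:
            # Sink state: once exhausted, stay exhausted.
            raise StopIteration
        try:
            item = self.seq[self.index]
        except IndexError:
            self.index = -1
            raise StopIteration from None
        # Only advance after a valid item was fetched, so nothing is skipped.
        self.index += 1
        return item


a = Seq(3)
it = SinkSeqIter(a)
first_pass = list(it)   # consumes indices 0, 1, 2
a.n = 5                 # the sequence grows after exhaustion
second_pass = list(it)  # the iterator does not restart
print(first_pass, second_pass)
```

Under this rule the second loop produces nothing at all, instead of resuming at a later index as in the "0 1 2 4" output.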
A simple fix for the sequence and dict iterators is to let a negative index signal exhaustion. A simple fix for the callable iter is to set the callable to NULL to signal exhaustion. (Setting the main object to NULL could also work for the others, actually, and has the added -- minuscule -- advantage of releasing a reference early.) --Guido van Rossum (home page: http://www.python.org/~guido/) From jmiller@stsci.edu Mon Jul 15 16:17:37 2002 From: jmiller@stsci.edu (Todd Miller) Date: Mon, 15 Jul 2002 11:17:37 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() References: <3D32DF36.5080906@stsci.edu> <200207151450.g6FEoW031433@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D32E791.40809@stsci.edu> Guido van Rossum wrote: >>>How do you use buffers? >>> > >>We use buffers in numarray to store our array data. We use readinto to >>load array buffers efficiently from a file. We operate on the buffer >>data in-place. Since numarrays are python classe instances, buffers >>provide a place for the data to live. >> > >AFAIK the buffer() function can only create read-only buffers. How do you... > We have a very small extension function which creates writeable buffer objects using the buffer type C-API. We also wrap suitable type instances with a "buffer object wrapper". I'm slowly gathering that this is unsafe. :-( > >you create your buffers? If you're just using the C buffer API, >that's not going away. > >>>Do you stick to their C API? >>> >>We use the C-API, and currently use the buffer object too. Using the >>buffer object has always seemed like a necessary evil, but having >>reviewed numarray usage of buffer(), ditching it sounds good to me. >> > >Good. > >>>And from where do you get a buffer? There are darned few types in Python >>> > >>We get ours from mmap and our own homegrown memory object. >> > >Maybe instead of the buffer() function/type, there should be a way to >allocate raw memory? > Yes. It would also be nice to be able to: 1. 
Know (at the python level) that a type supports the buffer C-API. 2. Copy bytes from one buffer to another (writeable buffer). > > >--Guido van Rossum (home page: http://www.python.org/~guido/) > Todd -- Todd Miller Space Telescope Science Institute From guido@python.org Mon Jul 15 16:18:59 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 11:18:59 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: Your message of "Sun, 14 Jul 2002 22:41:16 EDT." <15666.13900.581653.909094@anthem.wooz.org> References: <20020714023729.Y79323-100000@mail.allcaps.org> <20020714112745.GB2280@hishome.net> <200207141320.g6EDKpJ27752@pcp02138704pcs.reston01.va.comcast.net> <15665.37242.446627.141013@anthem.wooz.org> <20020714160611.GA25950@hishome.net> <15666.13900.581653.909094@anthem.wooz.org> Message-ID: <200207151518.g6FFIxp31610@pcp02138704pcs.reston01.va.comcast.net> > StopIterator is a sink state for dict iterators if I don't change the > size of the dict. Shouldn't list and dict iterators should behave > similarly for mutation (or at least resizing) between .next() calls? No, mutating a list while the iterator is not exhausted is perfectly well defined: the iterator's state has the next index to try. This is totally predictable, and useful or not depending on what you're trying to do. The dict iterator tests for mutating the dict because the rehashing possibility makes this unpredictable. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jul 15 16:34:50 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 11:34:50 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: Your message of "Mon, 15 Jul 2002 11:17:37 EDT." 
<3D32E791.40809@stsci.edu> References: <3D32DF36.5080906@stsci.edu> <200207151450.g6FEoW031433@pcp02138704pcs.reston01.va.comcast.net> <3D32E791.40809@stsci.edu> Message-ID: <200207151534.g6FFYoE31790@pcp02138704pcs.reston01.va.comcast.net> > We have a very small extension function which creates writeable buffer > objects using the buffer type C-API. That's how the buffer API was supposed to be used. > We also wrap suitable type instances with a "buffer object wrapper". > I'm slowly gathering that this is unsafe. :-( I don't understand what you say, but I believe you. > >Maybe instead of the buffer() function/type, there should be a way to > >allocate raw memory? > Yes. It would also be nice to be able to: > > 1. Know (at the python level) that a type supports the buffer C-API. Good idea. (I guess right now you can see if calling buffer() with an instance as argument works. :-) > 2. Copy bytes from one buffer to another (writeable buffer). Maybe you would like to work on a requirements gathering for a memory object? --Guido van Rossum (home page: http://www.python.org/~guido/) From cce@clarkevans.com Mon Jul 15 16:58:37 2002 From: cce@clarkevans.com (Clark C . Evans) Date: Mon, 15 Jul 2002 11:58:37 -0400 Subject: [Python-Dev] Re: PEP 246, Object Adaptation (was Re: Single- vs. Multi-pass iterability) In-Reply-To: <20020715142225.GA9006@panix.com>; from aahz@pythoncraft.com on Mon, Jul 15, 2002 at 10:22:25AM -0400 References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <20020715000651.A35319@doublegemini.com> <20020715072400.B40101@doublegemini.com> <20020715142225.GA9006@panix.com> Message-ID: <20020715115837.A45200@doublegemini.com> On Mon, Jul 15, 2002 at 10:22:25AM -0400, Aahz wrote: | in fact, it suffers from the obverse problem. Consider this: | | class C: | def open(self, name, flags=None): | def read(self): | def write(self, value): | def close(self): | | Can instances of C be used where a file object is expected? 
From the "Object Adaptation" perspective, you would have a file protocol (perhaps the built-in File object works). And then you could call "check()" or "adapt()" built-in functions. check() at a high level, this built-in function first asks the object itself directly: "Hey are you a File?" if the response is affirmative or negative, then the search is done. If the object doesn't respond (either it lacks __check or __check returns None) then the built-in then goes and asks the protocol object if the file complies. When all else fails, the built-in could use some default logic of its own. adapt() returns the object itself if check() is true; otherwise it asks the object and then the protocol to provide a wrapper. If neither provide the wrapper, then an error is thrown. The key thing about the Object Adaptation proposal is that it leaves wide open what it means to comply. This flexibility is necessary since the methods for determining compliance may vary from situation to situation; no size fits all. With this proposal, both the Object and the Protocol can use whatever methods are at their disposal to gauge compliance and/or create an adaptative wrapper. That said, what built-in compliance systems Python may choose to integrate into the core system are orthogonal; or optionally, Python could have multiple compliance mechanisms; Eiffelish contract based mechanism for those who are in that school of thought, or a "type-safe" interface based compliance for those who think this is the best approach. This proposal leaves all of those options open and favors no-one. So, I'm sorry if I diverted this into "Are interfaces good or bad". Clearly the idea of a protocol is good, interfaces are OK, but I have my doubts about them being a good enough balance between power and complexity. It's nice to see a simple yet powerful mechanism like this being considered... thanks! Best, Clark From cce@clarkevans.com Mon Jul 15 17:01:33 2002 From: cce@clarkevans.com (Clark C .
Evans) Date: Mon, 15 Jul 2002 12:01:33 -0400 Subject: [Python-Dev] PEP 246, Object Adaptation (was Re: Single- vs. Multi-pass iterability) In-Reply-To: <15666.56562.63130.943725@anthem.wooz.org>; from barry@zope.com on Mon, Jul 15, 2002 at 10:32:18AM -0400 References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <20020715000651.A35319@doublegemini.com> <20020715072400.B40101@doublegemini.com> <20020715142225.GA9006@panix.com> <15666.56562.63130.943725@anthem.wooz.org> Message-ID: <20020715120133.B45200@doublegemini.com> On Mon, Jul 15, 2002 at 10:32:18AM -0400, Barry A. Warsaw wrote: | | class C: | | def open(self, name, flags=None): | | def read(self): | | def write(self, value): | | def close(self): | | A> Can instances of C be used where a file object is expected? | | Maybe . | | That's why you tend to see things described like: "argument f must | have a write() method that accepts a string." WIBNI we could define a | protocol/interface/thingie that encapsulated that requirement? Even if write accepts a string it may not do what you expect. *grin* | I'd even be happy to start out with no officially blessed interfaces, to | give time to see what cream rises to the top. Zope's Interface and | component model stuff is a good way to get some real experience with | using these concepts in Python. Well, with the Object Adaptation proposal you don't even need to bless a particular compliance mechanism. For ease of use, the built-in may want to use a few mechanisms; but this is a distinct difference. What ever is chosen, I hope the core syntax isn't mucked with! Best, Clark -- Clark C. Evans Axista, Inc. 
http://www.axista.com 800.926.5525 XCOLLA Collaborative Project Management Software From aahz@pythoncraft.com Mon Jul 15 16:54:05 2002 From: aahz@pythoncraft.com (Aahz) Date: Mon, 15 Jul 2002 11:54:05 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net> References: <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020715155405.GA7009@panix.com> On Mon, Jul 15, 2002, Guido van Rossum wrote: > > (b) implement (and document!) the "sink-state" rule from the PEP. > > If we end up adopting (b), what can we do to Python 2.2 that doesn't > break the "bug-fixes-only" promise of that branch? Well, from my POV, given that the PEP is mostly clear about the intent, fixing the implementation to match the PEP precisely matches the "bug-fix only" rule. We've been trying to move away from "reference defined by implementation", and this seems like a perfect opportunity to exercise it. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From aleax@aleax.it Mon Jul 15 16:56:56 2002 From: aleax@aleax.it (Alex Martelli) Date: Mon, 15 Jul 2002 17:56:56 +0200 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net> References: <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Monday 15 July 2002 05:15 pm, Guido van Rossum wrote: > I'm still only considering two options: > > (a) leave the status quo, or > (b) implement (and document!) the "sink-state" rule from the PEP. For what it's worth, I strongly prefer (b). > If we end up adopting (b), what can we do to Python 2.2 that doesn't > break the "bug-fixes-only" promise of that branch? > > If there's code that depends on the extendibility of list iterators, > are we breaking our promise by breaking that code? I have no opinion on this specific issue. 
Every other iterator could surely be made to implement the sink behavior, but I do not know if the empirically observed behavior of iterators on lists could be classified as a bug (I sure wish it could). Alex From aleax@aleax.it Mon Jul 15 17:08:08 2002 From: aleax@aleax.it (Alex Martelli) Date: Mon, 15 Jul 2002 18:08:08 +0200 Subject: [Python-Dev] Re: PEP 246, Object Adaptation (was Re: Single- vs. Multi-pass iterability) In-Reply-To: <20020715115837.A45200@doublegemini.com> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <20020715142225.GA9006@panix.com> <20020715115837.A45200@doublegemini.com> Message-ID: On Monday 15 July 2002 05:58 pm, Clark C . Evans wrote: > On Mon, Jul 15, 2002 at 10:22:25AM -0400, Aahz wrote: > | in fact, it suffers from the obverse problem. Consider this: > | > | class C: > | def open(self, name, flags=None): > | def read(self): > | def write(self, value): > | def close(self): > | > | Can instances of C be used where a file object is expected? > > From the "Object Adaptation" perspective, you would have a file > protocol (perhaps the built-in File object works). And then If so, then presumably the answer is "no", since the built-in file object has many more important methods such as seek and tell. If the file type itself serves as the protocol, surely that should mean "implement all of the methods" rather than just some of them. Moreover, a file's read method accepts an optional integer. Class C's read method does not. So, even the methods that C does supply are not compliant with those of a file object. Some, but not all, current uses of "file-like objects" may be satisfied with just a .read method that must be called without arguments -- others would need the argument to be accepted, others yet would need readline instead, not to speak of seeking behavior (which all file objects expose, but not all _implement_...).
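The signature mismatch described above can be sketched as follows. This is only an illustration written for present-day Python 3 (an assumption; the thread predates it): io.StringIO stands in for the built-in file object, the class C from the quoted example is given placeholder bodies so that it runs, and read_chunk is a hypothetical caller invented for the example.

```python
import io


class C:
    """The class from the quoted example, with placeholder bodies (assumed)."""

    def open(self, name, flags=None):
        pass

    def read(self):
        # Unlike a real file's read(), this accepts no size argument.
        return "data"

    def write(self, value):
        pass

    def close(self):
        pass


def read_chunk(f, size):
    """Hypothetical caller that needs read() to accept a size argument."""
    return f.read(size)


# A real file-like object honours the optional size argument.
ok = read_chunk(io.StringIO("hello world"), 5)

# C has a read() method, yet it is not usable by this caller.
try:
    read_chunk(C(), 5)
    c_accepts_size = True
except TypeError:
    c_accepts_size = False
```

So "has a read method" and "has a read method callable the way a file's is" are different claims, which is why a protocol probably has to say which subset it expects.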
To use adaptation, we may need to be more precise than just saying "a file object is expected" -- IF only a SUBSET of the file object's methods (or a subset of their signatures) is indeed expected. > you could call "check()" or "adapt()" built-in functions. > > check() at a high level, this built-in function first asks > the object itself directly: "Hey are you a File?" > if the response is affirmative or negative, then the > search is done. If the object doesn't respond (either > it lacks __check or __check returns None) then the > built-in then goes and asks the protocol object if the > file complies. When all else fails, the built-in > could use some default logic of its own. > > adapt() returns the object itself if check() is true; otherwise > it asks the object and then the protocol to provide > a wrapper. If neither provide the wrapper, then an > error is thrown. I don't see the need or opportunity to have a check() that is separate from adapt(). COM's QueryInterface only has the equivalent of adapt(), and that's quite enough. PEP 246 does not specify a check() built-in, either. > The key thing about the Object Adaptation proposal is that it > leaves wide open what it means to comply. This flexibility is Yes, but I see it as a minimum that a "compliant" object has a set of methods callable with given signatures. If a protocol is represented by a type, the set should comprise the type's methods. While it WOULD be nice to extend this further, we can see just from examining file objects that this is probably impractical -- they all do have (e.g.) methods write and seek, but if you call those methods on a given file object f, f may raise exceptions because it's not really writable or seekable. So "having a method" is not a sufficient condition for REALLY having it, if you see what I mean. Alex From barry@zope.com Mon Jul 15 17:09:33 2002 From: barry@zope.com (Barry A.
Warsaw) Date: Mon, 15 Jul 2002 12:09:33 -0400 Subject: [Python-Dev] PEP 246, Object Adaptation (was Re: Single- vs. Multi-pass iterability) References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <20020715000651.A35319@doublegemini.com> <20020715072400.B40101@doublegemini.com> <20020715142225.GA9006@panix.com> <15666.56562.63130.943725@anthem.wooz.org> <20020715120133.B45200@doublegemini.com> Message-ID: <15666.62397.411994.862638@anthem.wooz.org> >>>>> "CC" == Clark C writes: CC> Even if write accepts a string it may not do what you expect. CC> *grin* Very true. It's the best we can do now, but we can do better by being more explicit. :) CC> Well, with the Object Adaptation proposal you don't even need CC> to bless a particular compliance mechanism. For ease of use, CC> the built-in may want to use a few mechanisms; but this is a CC> distinct difference. What ever is chosen, I hope the core CC> syntax isn't mucked with! Me too! I need to go re-read that PEP now. -Barry From guido@python.org Mon Jul 15 17:12:23 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 12:12:23 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: Your message of "Mon, 15 Jul 2002 11:54:05 EDT." <20020715155405.GA7009@panix.com> References: <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net> <20020715155405.GA7009@panix.com> Message-ID: <200207151612.g6FGCNb32176@pcp02138704pcs.reston01.va.comcast.net> > > If we end up adopting (b), what can we do to Python 2.2 that doesn't > > break the "bug-fixes-only" promise of that branch? > > Well, from my POV, given that the PEP is mostly clear about the > intent, fixing the implementation to match the PEP precisely matches > the "bug-fix only" rule. We've been trying to move away from > "reference defined by implementation", and this seems like a perfect > opportunity to exercise it. 
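For reference, the behaviours at issue in this thread can be sketched as below. The sketch runs on current CPython 3 (an assumption; it does not reproduce the 2.2 implementation being debated, whose list iterators reportedly behaved differently after exhaustion), and all variable names are invented for the illustration.

```python
# Mutating a list while its iterator is not exhausted is well defined:
# the iterator only remembers the next index to try.
items = [1, 2]
it = iter(items)
first = next(it)
items.append(3)
rest = list(it)              # picks up the element appended mid-iteration

# "Sink state": once the iterator has raised StopIteration it stays
# exhausted, even if the underlying list grows afterwards.
items.append(4)
after_exhaustion = list(it)

# Dict iterators refuse to continue once the dict has changed size.
d = {"a": 1}
dict_it = iter(d)
d["b"] = 2
try:
    next(dict_it)
    dict_error = None
except RuntimeError as exc:
    dict_error = str(exc)
```

In other words, current list iterators follow option (b), the sink-state rule, while still allowing mutation before exhaustion.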
Um, our docs are scattered enough that we prefer not to break anything (at least not in a bugfix release) that might have been useful before. Given that even Tim didn't find this in the PEP upon his first two readings, and that simple experimentation with the implementation shows otherwise, and that I at first misremembered my own ruling before I found it in the PEP, I'd say that *if* there's a useful use of this, we shouldn't break that in the 2.2 branch. 2.3 is a different issue. --Guido van Rossum (home page: http://www.python.org/~guido/) From nas@python.ca Mon Jul 15 17:26:09 2002 From: nas@python.ca (Neil Schemenauer) Date: Mon, 15 Jul 2002 09:26:09 -0700 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: ; from aleax@aleax.it on Mon, Jul 15, 2002 at 05:56:56PM +0200 References: <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020715092608.A1879@glacier.arctrix.com> Alex Martelli wrote: > On Monday 15 July 2002 05:15 pm, Guido van Rossum wrote: > > I'm still only considering two options: > > > > (a) leave the status quo, or > > (b) implement (and document!) the "sink-state" rule from the PEP. > > For what it's worth, I strongly prefer (b). Me too. I think option (b) is simpler for users to understand. Neil From cce@clarkevans.com Mon Jul 15 17:30:07 2002 From: cce@clarkevans.com (Clark C . Evans) Date: Mon, 15 Jul 2002 12:30:07 -0400 Subject: [Python-Dev] Re: PEP 246, Object Adaptation (was Re: Single- vs. Multi-pass iterability) In-Reply-To: ; from aleax@aleax.it on Mon, Jul 15, 2002 at 06:08:08PM +0200 References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <20020715142225.GA9006@panix.com> <20020715115837.A45200@doublegemini.com> Message-ID: <20020715123007.A45942@doublegemini.com> | > From the "Object Adaptation" perspective, you would have a file | > protocol (perhaps the built-in File object works). 
And then | | If so, then presumably the answer is "no", since the built-in | file object has many more important methods such as seek and | tell. If the file type itself serves as the protocol, surely | that should mean "implement all of the methods" rather than | just some of them. Yes. | Some, but not all, current uses of "file-like objects" may be satisfied | with just a .read method that must be called without arguments -- | others would need the argument to be accepted, others yet would | need readline instead, not to speak of seeking behavior (which all | file objects expose, but not all _implement_...). | | To use adaptation, we may need to be more precise than just saying | "a file object is expected" -- IF only a SUBSET of the file object's | methods (or a subset of their signatures) is indeed expected. Exactly. | | | > you could call "check()" or "adapt()" built-in functions. | > | > check() at a high level, this built-in function first asks | > the object itself directly: "Hey are you a File?" | > if the response is affirmative or negative, then the | > search is done. If the object doesn't respond (either | > it lacks __check or __check returns None) then the | > built-in then goes and asks the protocol object if the | > file complies. When all else fails, the built-in | > could use some default logic of its own. | > | > adapt() returns the object itself if check() is true; otherwise | > it asks the object and then the protocol to provide | > a wrapper. If neither provide the wrapper, then an | > error is thrown. | | I don't see the need or opportunity to have a check() that | is separate from adapt(). COM's QueryInterface only has the | equivalent of adapt(), and that's quite enough. PEP 246 does | not specify a check() built-in, either. I agree here; having two methods doubles the complication without giving much additional value. adapt() is more than adequate, although I use check() to help explain the innards.
If someone really insists on having check() exposed, then I don't see the harm... only that it makes the proposal seem more complicated than it is. | > The key thing about the Object Adaptation proposal is that it | > leaves wide open what it means to comply. This flexibility is | | Yes, but I see it as a minimum that a "compliant" object has | a set of methods callable with given signatures. If a protocol is | represented by a type, the set should comprise the type's methods. Yes. This would be an improvement of the proposal. How do we express this so that the protocol of core Types can do this sort of enforcement? Perhaps by giving the Protocol the ability to "veto" the final result? | While it WOULD be nice to extend this further, we can see just | from examining file objects that this is probably impractical -- they | all do have (e.g.) methods write and seek, but if you call those | methods on a given file object f, f may raise exceptions because | it's not really writable or seekable. So "having a method" is not | a sufficient condition for REALLY having it, if you see what I mean. *nods* Clark Yo! Check out YAML. http://yaml.org YAML is language independent readable object serialization. -- Clark C. Evans Axista, Inc. http://www.axista.com 800.926.5525 XCOLLA Collaborative Project Management Software From jmiller@stsci.edu Mon Jul 15 17:36:29 2002 From: jmiller@stsci.edu (Todd Miller) Date: Mon, 15 Jul 2002 12:36:29 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() References: <3D32DF36.5080906@stsci.edu> <200207151450.g6FEoW031433@pcp02138704pcs.reston01.va.comcast.net> <3D32E791.40809@stsci.edu> <200207151534.g6FFYoE31790@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D32FA0D.6020200@stsci.edu> Guido van Rossum wrote: >>We have a very small extension function which creates writeable buffer >>objects using the buffer type C-API. >> > >That's how the buffer API was supposed to be used.
> >>We also wrap suitable type instances with a "buffer object wrapper". >> I'm slowly gathering that this is unsafe. :-( >> > >I don't understand what you say, but I believe you. > I meant we call PyBuffer_FromReadWriteObject and the resulting buffer lives longer than the extension function call that created it. I have heard that it is possible for the original object to "move" leaving the buffer object pointer to it dangling. > > >>>Maybe instead of the buffer() function/type, there should be a way to >>>allocate raw memory? >>> > >>Yes. It would also be nice to be able to: >> >>1. Know (at the python level) that a type supports the buffer C-API. >> > >Good idea. (I guess right now you can see if calling buffer() with an >instance as argument works. :-) > >>2. Copy bytes from one buffer to another (writeable buffer). >> > >Maybe you would like to work on a requirements gathering for a memory >object > Sure. I'd be willing to poll comp.lang.python (python-list?) and collate the results of any discussion that ensues. Is that what you had in mind? > > >--Guido van Rossum (home page: http://www.python.org/~guido/) > > >_______________________________________________ >Python-Dev mailing list >Python-Dev@python.org >http://mail.python.org/mailman/listinfo/python-dev > Todd From pinard@iro.umontreal.ca Mon Jul 15 18:22:08 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 15 Jul 2002 13:22:08 -0400 Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings In-Reply-To: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > There's a full implementation for PEP 263. Martin von Loewis is ready > to commit it. It's of course possible to let him do this and deal with > the consequences once they're in CVS [...] 
There is one thing which bothers me in the `Concepts' section: Note that Python identifiers are restricted to the ASCII subset of the encoding, and thus need no further conversion after step 4. Could identifiers be produced according to the usual syntax (letters or underscore, then letters, digits and underscore), but without going to ASCII first? The fact that I can now interactively (but not in batch) do: ----------------------------------------------------------------------> 12:24 0 pinard@titan:~ $ python Python 2.2.1 (#1, Apr 29 2002, 14:27:21) [GCC 2.95.3 20010315 (SuSE)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> >>> >>> >>> >>> >>> élève = 3 >>> print élève 3 >>> ----------------------------------------------------------------------< surely let people dream. Other members in our local development team are even more excited than me about this! They keep asking me if and when this will become available for real, dependably, in Python! :-) They are eagerly (and understandably) hoping to start spelling identifiers correctly. We should try not missing the opportunity, if it happens to exist now. -- François Pinard http://www.iro.umontreal.ca/~pinard From guido@python.org Mon Jul 15 18:32:43 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 13:32:43 -0400 Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings In-Reply-To: Your message of "Mon, 15 Jul 2002 13:22:08 EDT." References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net> > Note that Python identifiers are restricted to the ASCII > subset of the encoding, and thus need no further conversion > after step 4. > > Could identifiers be produced according to the usual syntax (letters or > underscore, then letters, digits and underscore), but without going to > ASCII first? [...] 
> We should try not missing the opportunity, if it happens to exist now. To the contrary, I wish GNU readline didn't call setlocale(). Allowing non-ASCII identifiers may eventually happen, but there are lots of reasons why it's a bad idea (such as source code portability), and tying such a proposal to this PEP is definitely the wrong thing. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jul 15 18:29:08 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 13:29:08 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: Your message of "Mon, 15 Jul 2002 12:36:29 EDT." <3D32FA0D.6020200@stsci.edu> References: <3D32DF36.5080906@stsci.edu> <200207151450.g6FEoW031433@pcp02138704pcs.reston01.va.comcast.net> <3D32E791.40809@stsci.edu> <200207151534.g6FFYoE31790@pcp02138704pcs.reston01.va.comcast.net> <3D32FA0D.6020200@stsci.edu> Message-ID: <200207151729.g6FHT8A07987@pcp02138704pcs.reston01.va.comcast.net> > I meant we call PyBuffer_FromReadWriteObject and the resulting buffer > lives longer than the extension function call that created it. I have > heard that it is possible for the original object to "move" leaving the > buffer object pointer to it dangling. Yes, that can happen (depending on what kind of object it is). > Sure. I'd be willing to poll comp.lang.python (python-list?) and > collate the results of any discussion that ensues. Is that what you had > in mind? Yes, but beware that you will have to decide which requirements make sense and which ones don't -- the community is so large these days that you can't get agreement any more. :-) Feel free to come back with results to python-dev any time.
--Guido van Rossum (home page: http://www.python.org/~guido/) From xscottg@yahoo.com Mon Jul 15 18:37:09 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Mon, 15 Jul 2002 10:37:09 -0700 (PDT) Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: <200207151450.g6FEoW031433@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020715173709.20711.qmail@web40106.mail.yahoo.com> --- Guido van Rossum wrote: > > Maybe instead of the buffer() function/type, there should be a way to > allocate raw memory? > This is a part of my soon to be issued PEP. I've looked at their memory object, and Numarray is one of the use cases that I'm catering to. __________________________________________________ Do You Yahoo!? Yahoo! Autos - Get free new car price quotes http://autos.yahoo.com From aleax@aleax.it Mon Jul 15 18:40:11 2002 From: aleax@aleax.it (Alex Martelli) Date: Mon, 15 Jul 2002 19:40:11 +0200 Subject: [Python-Dev] Re: PEP 246, Object Adaptation (was Re: Single- vs. Multi-pass iterability) In-Reply-To: <20020715123007.A45942@doublegemini.com> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <20020715123007.A45942@doublegemini.com> Message-ID: On Monday 15 July 2002 06:30 pm, Clark C . Evans wrote: ... > If someone really insists on having check() exposed, the I > don't see the harm... only that it makes the proposal seem more > complicated than it is. The harm of exposing check is encouraging the "look before you leap" (LBYL) idiom: if allconditionsgreenformetodothis(): dothis() else: print "oops, cant dothis" rather than the generally more effective "it's easier to ask forgiveness than permission" (EAFP) idiom: try: dothis() except DoingThisWasWrongError: print "oops, cant dothis" With LBYL one more easily gets into duplication of work (the effort of checking duplicates the effort of actually doing the work) and multiprogramming issues (a check passes, but then immediately afterwards the situation has changed...). 
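The two idioms contrasted above can be sketched with a concrete operation. This is a minimal illustration in present-day Python 3, assuming that "doing the work" means reading a text file; the function names read_lbyl and read_eafp are hypothetical and chosen only for the example.

```python
import os
import tempfile


def read_lbyl(path):
    """Look before you leap: check first, then act."""
    if os.path.exists(path):      # the check ...
        # ... may already be stale here: another process could have
        # removed the file between the check and the open() call.
        with open(path) as f:
            return f.read()
    return None


def read_eafp(path):
    """Easier to ask forgiveness than permission: just try it."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:               # missing file, bad permissions, ...
        return None


# Small demonstration on a temporary directory.
workdir = tempfile.mkdtemp()
present = os.path.join(workdir, "present.txt")
missing = os.path.join(workdir, "missing.txt")
with open(present, "w") as out:
    out.write("hi")

results = (read_lbyl(present), read_eafp(present),
           read_lbyl(missing), read_eafp(missing))
```

Both versions give the same answers here, but only the EAFP version handles the case where the situation changes after the check, and it does not repeat the existence test that open() performs anyway.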
> | > The key thing about the Object Adaptation proposal is that it > | > leaves wide open what it means to comply. This flexibility is > | > | Yes, but I see it as a minimum that a "compliant" object has > | a set of methods callable with given signatures. If a protocol is > | represented by a type, the set should comprise the type's methods. > > Yes. This would be an improvement of the proposal. How do we > express this so that the protocol of core Types can do this > sort of enforcement. Perhaps by giving the Protocol the ability > to "veto" the final result? Dunno. I do plan to devote substantial concentrated effort to rewriting the PEP, and that's incompatible with my current situation wrt finishing the Nutshell. Further delay should be little problem given the time PEP 246 has already waited AND the BDFL's indication that it's not going to get into 2.3 anyway, so there's definitely no hurry. Alex From guido@python.org Mon Jul 15 18:43:40 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 13:43:40 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: Your message of "Mon, 15 Jul 2002 10:37:09 PDT." <20020715173709.20711.qmail@web40106.mail.yahoo.com> References: <20020715173709.20711.qmail@web40106.mail.yahoo.com> Message-ID: <200207151743.g6FHheg08123@pcp02138704pcs.reston01.va.comcast.net> > This is a part of my soon to be issued PEP. I've looked at their memory > object, and Numarray is one of the use cases that I'm catering to. OK, then I guess Todd doesn't have to go to c.l.py for requirements. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From xscottg@yahoo.com Mon Jul 15 19:00:48 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Mon, 15 Jul 2002 11:00:48 -0700 (PDT) Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: <200207151743.g6FHheg08123@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020715180048.32619.qmail@web40103.mail.yahoo.com> --- Guido van Rossum wrote: > > This is a part of my soon to be issued PEP. I've looked at their > memory > > object, and Numarray is one of the use cases that I'm catering to. > > OK, then I guess Todd doesn't have to go to c.l.py for requirements. > More information couldn't hurt too much, and since Todd Miller volunteered* to herd the information, I'll be interested to see if any new perspectives come out. * - Actually it looked like you volunteered him, but he seemed willing enough. :-) __________________________________________________ Do You Yahoo!? Yahoo! Autos - Get free new car price quotes http://autos.yahoo.com From aahz@pythoncraft.com Mon Jul 15 19:15:55 2002 From: aahz@pythoncraft.com (Aahz) Date: Mon, 15 Jul 2002 14:15:55 -0400 (EDT) Subject: [Python-Dev] OSCON: Community dinner Weds 7/24 6pm Message-ID: <200207151815.g6FIFtQ09567@panix1.panix.com> [posted to c.l.py with cc to c.l.py.announce and python-dev] I'm proposing a Python community dinner at OSCON next week, for Weds 7/24 at 6pm. Is there anyone familiar with the San Diego area who wants to suggest a location near the Sheraton? If I don't get any recommendations, we'll probably just have the dinner at the Sheraton. If you're interested, please send me an e-mail so I have some idea of the number of people. Also, please include a way of getting in touch with you at OSCON in case plans change (phone numbers accepted, but e-mail addresses preferred). (There's a meeting for PSF members at 8pm, so some of us will likely have to skip out early.) 
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ -- From martin@v.loewis.de Mon Jul 15 19:27:03 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 15 Jul 2002 20:27:03 +0200 Subject: [Python-Dev] PEP 11: unsupported platforms In-Reply-To: <00b501c22bd2$77fbea80$ced241d5@hagrid> References: <20020714212620.GA3192@cthulhu.gerg.ca> <00b501c22bd2$77fbea80$ced241d5@hagrid> Message-ID: "Fredrik Lundh" writes: > wouldn't something like "no longer supported platforms" or > "removing support for little used platforms" be more accurate? Indeed. I'll go for "Removing support for little used platforms". Regards, Martin From aleax@aleax.it Mon Jul 15 20:04:59 2002 From: aleax@aleax.it (Alex Martelli) Date: Mon, 15 Jul 2002 21:04:59 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207151439.g6FEdpM31288@pcp02138704pcs.reston01.va.comcast.net> References: <20020712005928.A9833@hishome.net> <200207151439.g6FEdpM31288@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Monday 15 July 2002 04:39 pm, Guido van Rossum wrote: ... > Maybe the xreadlines object could grow a flush() method that throws > away its buffer, and f.seek() could call that if there's a cached > xreadlines iterator? Couldn't f.seek just decref the xreadlines object and put a NULL into f's pointer to the xreadlines object? Alex From aleax@aleax.it Mon Jul 15 20:12:54 2002 From: aleax@aleax.it (Alex Martelli) Date: Mon, 15 Jul 2002 21:12:54 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <15666.55797.351811.317428@anthem.wooz.org> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <200207150615.g6F6FJq28099@smtp.zope.com> <15666.55797.351811.317428@anthem.wooz.org> Message-ID: On Monday 15 July 2002 04:19 pm, Barry A. 
Warsaw wrote: > >>>>> "AM" == Alex Martelli writes: > > AM> The big question is rather: given that Isub inherits from > AM> Isuper, does any object implementing Isub also implicitly > AM> implement Isuper? > > There's another issue that Jim Fulton likes to bring up, IIRC. If > class Super implements IInterface, does class Sub(Super) also > (automatically) implement IInterface? > > I could be totally misremembering, but I believe that Jim would say > "no". Class Sub would have to explicitly declare that it also > implements IInterface. I fully agree with Jim. Inheritance is often the handiest way to _implement_ some things, but not if it comes with a mandatory contract that you have to respect (specifically, supplying some interfaces because your superclasses supply them). In C++, you distinguish by using private inheritance when you are inheriting just to get implementation, public inheritance to signify that you're also accepting the IS-A obligations (and then it gets messy because private affects accessibility and not visibility, but that's C++'s specific problem:-). I _like_ to use inheritance of implementation exactly for that -- implementation purposes -- without mystical IS-A obligations. It _may_ be because most of my experience is with COM, which does not expose implementation inheritance and gives each object full control on what interfaces it wants to supply -- behind the scenes, the object's implementation is free to use inheritance, delegation, or, as far as COM's concerned, bat wings and newt blood -- that's the object's business. But I do have enough experience in bare (no-COM) C++ and Java to know that I found the COM approach distinctly preferable (at least when the tools offered easy ways to get typical behavior while still leaving enough hooks and handles for me to get fine-grained control when needed -- Microsoft's ATL library was quite good for that). 
Alex From pinard@iro.umontreal.ca Mon Jul 15 20:18:34 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 15 Jul 2002 15:18:34 -0400 Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings In-Reply-To: <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net> References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > To the contrary, I wish GNU readline didn't call setlocale(). So, that's why it works interactively, and not in batch, then! > and tying such a proposal to this PEP is definitely the wrong thing. Expected and understood. I merely hope that this PEP's "concept" will not later be quoted as rigid. Going back from Unicode through ASCII may be today's way for implementing PEP 263, but not necessarily the only one. > Allowing non-ASCII identifiers may eventually happen, but there are > lots of reasons why it's a bad idea (such as source code portability), If Unicode letters get eventually accepted in Python identifiers, I do not much see what portability problems it would create. I mean, not more than with generators, or any other Python feature. It's forward compatible. Unless you mean that by encouraging non-English writings, _this_ creates a threat to portability, where English is the planetary computer language for exchanges. The point is that many Python programs, contrarily to the Python distribution itself, are not aiming the planet, and for some communities or teams, would be more useful and comfortable if not English. Each thing in its proper time, of course. Just let's keep the doors opened. 
-- François Pinard http://www.iro.umontreal.ca/~pinard From aleax@aleax.it Mon Jul 15 20:20:25 2002 From: aleax@aleax.it (Alex Martelli) Date: Mon, 15 Jul 2002 21:20:25 +0200 Subject: [Python-Dev] PEP 246 - Object Adaptation In-Reply-To: <200207151404.g6FE4sa30738@pcp02138704pcs.reston01.va.comcast.net> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <200207151404.g6FE4sa30738@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Monday 15 July 2002 04:04 pm, Guido van Rossum wrote: > (Changing the subject) > > > The big question is rather: given that Isub inherits from Isuper, > > does any object implementing Isub also implicitly implement Isuper? > > This probably shows my naivete more than anything else... > > I'd say "of course", based on an example where Isuper is > FileOpenForReading and Isub is FileOpenForReadingAndWriting. > It would be strange if a file open for reading and writing was not > acceptable in a place where a file open for reading is accepted > (because it implements all the right methods). Or is the fact that it > implements *more* the problem? It often does look like a R/W container "implements more" than the corresponding R/O container, but in many cases the R/W "subclass" can guarantee fewer invariants -- and they're often invariants quite hard to express even in languages that do support contracts. To see that in the case of a file, imagine the file interface having a rewind method. With a R/O file, I know that: def firstbyte(f): f.rewind() return f.read(1) always returns the same byte for a given f. If f is R/W, then I can't be certain any more, which may change the caching strategy I need to use, for example. (Of course, the same uncertainty might be present if, while f is R/O, the OS/whatever also allows other processes to open the underlying file for R/W at the same time, so in the case of files this only goes so far). 
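The firstbyte() point above can be tried with in-memory files. This is only a minimal sketch, assuming modern Python 3: `seek(0)` stands in for the hypothetical `rewind()` method, and `io.BytesIO` stands in for a real file (it is always writable, so "read-only" here just means that nobody writes to it).

```python
import io

def firstbyte(f):
    # seek(0) plays the role of the hypothetical rewind() from the message.
    f.seek(0)
    return f.read(1)

# Used read-only: no write happens between calls, so the result is stable.
ro = io.BytesIO(b"spam")
assert firstbyte(ro) == firstbyte(ro)

# Used read/write: a write between two calls changes what firstbyte() sees.
rw = io.BytesIO(b"spam")
before = firstbyte(rw)
rw.seek(0)
rw.write(b"h")
after = firstbyte(rw)
print(before, after)
```

A cache keyed on `f` would be safe for the first object and stale for the second, which is the kind of invariant the R/W "subclass" can no longer promise.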
More generally, it's _nice_ to be able to use inheritance just for implementation purposes, without necessarily having to worry about IS-A. When I have two interfaces with, say, three methods in common, I can refactor those three methods up to a common base-interface -- even if no object actually deigns to supply that base-interface. This simply avoids a little nasty copy-and-paste coding -- not an earth-shaking concern, admittedly. But, still, nice. Alex From guido@python.org Mon Jul 15 20:33:06 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 15:33:06 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Mon, 15 Jul 2002 21:12:54 +0200." References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <200207150615.g6F6FJq28099@smtp.zope.com> <15666.55797.351811.317428@anthem.wooz.org> Message-ID: <200207151933.g6FJX6f10238@pcp02138704pcs.reston01.va.comcast.net> > > There's another issue that Jim Fulton likes to bring up, IIRC. If > > class Super implements IInterface, does class Sub(Super) also > > (automatically) implement IInterface? > > > > I could be totally misremembering, but I believe that Jim would say > > "no". Class Sub would have to explicitly declare that it also > > implements IInterface. > > I fully agree with Jim. Inheritance is often the handiest way to > _implement_ some things, but not if it comes with a mandatory > contract that you have to respect (specifically, supplying some > interfaces because your superclasses supply them). I'm happy to allow for a way to state explicitly that Sub doesn't implement IInterface, despite deriving from Super which does. But I think it ought to inherit this property by default (this is in fact what Zope does AFAIK). Otherwise creating minor variations on a class would be quite a pain -- you'd have to repeat all the interfaces implemented by the base class; and what if a later version of Super implements more interfaces?
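The default described above (a subclass keeps the base class's interfaces unless it explicitly opts out) might look roughly like this. It is a sketch under stated assumptions, not a description of Zope or of any real interface package: the `interfaces` and `not_interfaces` class attributes and the `provided_interfaces()` helper are hypothetical names made up for illustration.

```python
def provided_interfaces(cls):
    """Interfaces declared anywhere along the MRO, minus explicit opt-outs.

    Hypothetical helper: 'interfaces' and 'not_interfaces' are invented
    class attributes, not part of any real library.
    """
    declared, excluded = set(), set()
    for klass in cls.__mro__:
        namespace = vars(klass)
        declared.update(namespace.get("interfaces", ()))
        excluded.update(namespace.get("not_interfaces", ()))
    return declared - excluded

class Super:
    interfaces = ("IInterface",)

class Sub(Super):
    # Minor variation on Super: nothing to repeat, the declaration is inherited.
    pass

class ImplOnly(Super):
    # Inherits for implementation only and says so explicitly.
    not_interfaces = ("IInterface",)

print(sorted(provided_interfaces(Sub)))
print(sorted(provided_interfaces(ImplOnly)))
```

With this scheme, a later version of `Super` that declares a further interface would be picked up by both subclasses without any change to them, which is convenient for `Sub` but something the author of `ImplOnly` would have to notice.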
I would think that it's much more common to extend a class while maintaining its contract than to inherit for implementation only, even though there are important examples of the latter. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jul 15 20:38:01 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 15:38:01 -0400 Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings In-Reply-To: Your message of "Mon, 15 Jul 2002 15:18:34 EDT." References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net> > > and tying such a proposal to this PEP is definitely the wrong thing. > > Expected and understood. I merely hope that this PEP's "concept" will > not later be quoted as rigid. Going back from Unicode through ASCII may > be today's way for implementing PEP 263, but not necessarily the only one. Sure. > > Allowing non-ASCII identifiers may eventually happen, but there are > > lots of reasons why it's a bad idea (such as source code portability), > > If Unicode letters get eventually accepted in Python identifiers, I do not > much see what portability problems it would create. I mean, not more than > with generators, or any other Python feature. It's forward compatible. Well, for one, not everybody has an easy way to edit Unicode files. I expect I'd have to spend half a day downloading new stuff before I could. There's an issue with 8-bit encodings that is hopefully resolved by the encoding cookie proposed by this PEP -- but we'll have to see how well the PEP gets adapted. Not only Python itself needs to recognize these cookies -- also all tools that scan Python sources. (E.g. pyclbr.py and tokenize.py in the standard library.) 
> Unless you mean that by encouraging non-English writings, _this_ creates > a threat to portability, where English is the planetary computer language > for exchanges. The point is that many Python programs, contrarily to > the Python distribution itself, are not aiming the planet, and for some > communities or teams, would be more useful and comfortable if not English. This is exactly what the Chinese are already doing. I'm just worried that sooner or later they'll write something that's useful outside China. I hope that English will remain the language for libraries shared within the Python community at large. > Each thing in its proper time, of course. Just let's keep the doors opened. Always. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jul 15 20:47:26 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 15:47:26 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Mon, 15 Jul 2002 21:04:59 +0200." References: <20020712005928.A9833@hishome.net> <200207151439.g6FEdpM31288@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207151947.g6FJlQ310369@pcp02138704pcs.reston01.va.comcast.net> > > Maybe the xreadlines object could grow a flush() method that throws > > away its buffer, and f.seek() could call that if there's a cached > > xreadlines iterator? > > Couldn't f.seek just decref the xreadlines object and put a NULL > into f's pointer to the xreadlines object? Well, that wouldn't help for code that's hanging on to the iterator. I also just realized that having the file object point to the xreadlines object creates a cycle, since the xreadlines object already points to the file. And neither participates in GC. I guess the xreadlines object could drop the pointer to the file once it's raised StopIteration, as a way to ensure that this is a sink state. Or we could add GC support to file objects and xreadlines objects (sigh).
--Guido van Rossum (home page: http://www.python.org/~guido/) From aleax@aleax.it Mon Jul 15 20:49:11 2002 From: aleax@aleax.it (Alex Martelli) Date: Mon, 15 Jul 2002 21:49:11 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207151933.g6FJX6f10238@pcp02138704pcs.reston01.va.comcast.net> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <200207151933.g6FJX6f10238@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Monday 15 July 2002 09:33 pm, Guido van Rossum wrote: ... > I'm happy to allow for a way to state explicitly that Sub doesn't > implement IInterface, despite deriving from Super which does. But I > think it ought to inherit this property by default (this is in fact Yes, such a default would probably be handier than the opposite one in most cases. > what Zope does AFAIK). Otherwise creating minor variations on a class > would be quite a pain -- you'd have to repeat all the interfaces > implemented by the base class; and what if a later version of Super > implements more interfaces? This is actually a difficult point. If I have to explicitly state all the interfaces of Super that I want to _exclude_, and Super adds some more interfaces tomorrow, then it's quite possible that my class is suddenly broken -- it doesn't guarantee the invariants that says it guarantees, any more -- and I don't even know about it. At this point I'm thinking of my class as "a component", used by client code for its interfaces and contracts. Implementation inheritance is iffy enough in the component-world -- if it carries a baggage of exposing an a priori unknown set of interfaces it becomes basically unfeasible. > I would think that it's much more common > to extend a class while maintaining its contract than to inherit for > implementation only, even though there are important examples of the > latter. This is probably true. 
But maybe the explicitness we want is not per-interface: it's suddenly become an explicitness of "inheriting for implementation" vs "inheriting to extend" (with IS-A), just like C++'s private vs public inheritance (except, one hopes, done right -- i.e. with effect on visibility, not on accessibility). Alex From martin@v.loewis.de Mon Jul 15 20:50:33 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 15 Jul 2002 21:50:33 +0200 Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings In-Reply-To: <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net> References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net> <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > Well, for one, not everybody has an easy way to edit Unicode files. I > expect I'd have to spend half a day downloading new stuff before I > could. If the PEP is implemented, IDLE will be able to honor the encoding declarations. As a side effect, this will allow you to edit UTF-8 files in IDLE. Allowing arbitrary Unicode in identifiers is no challenge, either, except that __dict__ dictionaries may suddenly find Unicode as keys. I'm not sure what other implications this would have, so it definitely is a separate issue. Another issue with allowing Unicode is that a good definition of "letter" must be given (it clearly should not depend on the locale). The Unicode consortium gives guidelines, but those depend on the Unicode version. Regards, Martin From tim.one@comcast.net Mon Jul 15 20:50:32 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 15 Jul 2002 15:50:32 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <200207151612.g6FGCNb32176@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > ... 
> Given that even Tim didn't find this in the PEP upon his first two > readings, While agreeing that caution is prudent, this specific reason is a poor one: I didn't read the PEP like "a user", but like a standards geek. It never occurred to me that a once-open issue would be resolved *only* in an addendum without the resolution also being reflected back into the main text. So I read the main text carefully, but barely even noticed the existence of the rest. On a Bell curve, I expect that way of reading a PEP is hugging a tail. > ... > I'd say that *if* there's a useful use of this, we shouldn't break > that in the 2.2 branch. 2.3 is a different issue. If Marc-Andre hasn't complained yet, there's no use at all for it . From guido@python.org Mon Jul 15 21:00:12 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 16:00:12 -0400 Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings In-Reply-To: Your message of "Mon, 15 Jul 2002 21:50:33 +0200." References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net> <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207152000.g6FK0Ce10496@pcp02138704pcs.reston01.va.comcast.net> > If the PEP is implemented, IDLE will be able to honor the encoding > declarations. As a side effect, this will allow you to edit UTF-8 > files in IDLE. Who's gonna make the necessary changes to IDLE? > Allowing arbitrary Unicode in identifiers is no challenge, either, > except that __dict__ dictionaries may suddenly find Unicode as keys. > I'm not sure what other implications this would have, so it definitely > is a separate issue. As long as the only use of 8-bit strings is to contain pure ASCII, this shouldn't be a problem. > Another issue with allowing Unicode is that a good definition of > "letter" must be given (it clearly should not depend on the > locale). 
The Unicode consortium gives guidelines, but those depend on > the Unicode version. I'd just use the isalpha() method of Unicode string objects. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jul 15 21:06:42 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 16:06:42 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: Your message of "Mon, 15 Jul 2002 15:50:32 EDT." References: Message-ID: <200207152006.g6FK6gE10521@pcp02138704pcs.reston01.va.comcast.net> > While agreeing that caution is prudent, this specific reason is a poor one: > I didn't read the PEP like "a user", but like a standards geek. It never > occurred to me that a once-open issue would be resolved *only* in an > addendum without the resolution also being reflected back into the main > text. So I read the main text carefully, but barely even noticed the > existence of the rest. On a Bell curve, I expect that way of reading a PEP > is hugging a tail. I guess the true meaning of the term "BDFL Pronouncement" hasn't quite sunk in with you. :-) > > ... > > I'd say that *if* there's a useful use of this, we shouldn't break > > that in the 2.2 branch. 2.3 is a different issue. > > If Marc-Andre hasn't complained yet, there's no use at all for it . OK, on to practicalities. While preparing a patch, I discovered something strange: despite the fact that listiter_next() never raises StopIteration when it returns NULL, and despite the fact that it is used as the implementation for the next() method, calling iter(list()).next() *does* raise StopIteration, rather than a complaint about NULL without setting an exception condition. It took a brief debugging session to discover that in the presence of a tp_iternext function, the type machinery adds a next method that wraps tp_iternext. Cute, though unexpected! It means that the implementation of various iterators can be a little simpler, because no next() implementation needs to be given. 
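The behaviour described above can still be observed from Python code. A minimal sketch, assuming a current CPython 3, where the method is spelled `__next__` rather than `next`; what gets printed for the method's type is CPython-specific and is shown only as an illustration.

```python
# An exhausted (here: empty) list iterator raises StopIteration when its
# next-method is called directly, even though the C-level slot function
# only returns NULL.
it = iter([])
try:
    it.__next__()
except StopIteration:
    outcome = "StopIteration"
else:
    outcome = "no exception"
print(outcome)

# The method itself comes from the type machinery wrapping the C slot,
# not from a hand-written implementation on the list iterator type.
print(type(type(it).__next__).__name__)

# The builtin next() goes through the same slot and can take a default.
print(next(it, "exhausted"))
```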
--Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Mon Jul 15 21:14:26 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 15 Jul 2002 22:14:26 +0200 Subject: [Python-Dev] Termination of two-arg iter() References: <200207152006.g6FK6gE10521@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D332D22.9020103@lemburg.com> >>>I'd say that *if* there's a useful use of this, we shouldn't break >>>that in the 2.2 branch. 2.3 is a different issue. >> >>If Marc-Andre hasn't complained yet, there's no use at all for it . I'm not following this thread... perhaps that's why ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Mon Jul 15 21:32:08 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 15 Jul 2002 22:32:08 +0200 Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings In-Reply-To: <200207152000.g6FK0Ce10496@pcp02138704pcs.reston01.va.comcast.net> References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net> <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net> <200207152000.g6FK0Ce10496@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > > If the PEP is implemented, IDLE will be able to honor the encoding > > declarations. As a side effect, this will allow you to edit UTF-8 > > files in IDLE. > > Who's gonna make the necessary changes to IDLE? I am. idlefork patch #508973 implements most of that, but doesn't support UTF-8 signatures. It also doesn't give good diagnostics if the user did not declare an encoding but uses non-ASCII. 
> > Allowing arbitrary Unicode in identifiers is no challenge, either, > > except that __dict__ dictionaries may suddenly find Unicode as keys. > > I'm not sure what other implications this would have, so it definitely > > is a separate issue. > > As long as the only use of 8-bit strings is to contain pure ASCII, > this shouldn't be a problem. I thought we were talking about non-ASCII in identifiers. > > Another issue with allowing Unicode is that a good definition of > > "letter" must be given (it clearly should not depend on the > > locale). The Unicode consortium gives guidelines, but those depend on > > the Unicode version. > > I'd just use the isalpha() method of Unicode string objects. That might vary across platforms (which I consider a bug) and across Python releases. Regards, Martin From guido@python.org Mon Jul 15 21:41:15 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 16:41:15 -0400 Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings In-Reply-To: Your message of "Mon, 15 Jul 2002 22:32:08 +0200." References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net> <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net> <200207152000.g6FK0Ce10496@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207152041.g6FKfFV10971@pcp02138704pcs.reston01.va.comcast.net> > > Who's gonna make the necessary changes to IDLE? > > I am. idlefork patch #508973 implements most of that, but doesn't > support UTF-8 signatures. It also doesn't give good diagnostics if the > user did not declare an encoding but uses non-ASCII. Cool. > > > Allowing arbitrary Unicode in identifiers is no challenge, either, > > > except that __dict__ dictionaries may suddenly find Unicode as keys. > > > I'm not sure what other implications this would have, so it definitely > > > is a separate issue. 
> > > > As long as the only use of 8-bit strings is to contain pure ASCII, > > this shouldn't be a problem. > > I thought we were talking about non-ASCII in identifiers. Yes, but all the non-ASCII has to be represented as Unicode strings. I.e. no Latin-1 in 8-bit strings! > > > Another issue with allowing Unicode is that a good definition of > > > "letter" must be given (it clearly should not depend on the > > > locale). The Unicode consortium gives guidelines, but those depend on > > > the Unicode version. > > > > I'd just use the isalpha() method of Unicode string objects. > > That might vary across platforms (which I consider a bug) and across > Python releases. Really? I thought Unicode's isalpha() was built on the Unicode text database? --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Mon Jul 15 22:42:21 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 15 Jul 2002 23:42:21 +0200 Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net> <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net> <200207152000.g6FK0Ce10496@pcp02138704pcs.reston01.va.comcast.net> <200207152041.g6FKfFV10971@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D3341BD.30309@lemburg.com> Guido van Rossum wrote: >>>>Another issue with allowing Unicode is that a good definition of >>>>"letter" must be given (it clearly should not depend on the >>>>locale). The Unicode consortium gives guidelines, but those depend on >>>>the Unicode version. >>> >>>I'd just use the isalpha() method of Unicode string objects. >> >>That might vary across platforms (which I consider a bug) and across >>Python releases. > > Really? I thought Unicode's isalpha() was built on the Unicode text > database? 
It is, but on some platforms, the user can configure Python to use the C lib's versions instead of the Python provided ones (--with-ctype-functions). Also note that the Unicode database in Python was created from Unicode 3.0. Unicode 3.1 adds lots more characters and also changed a few character properties. I'd consider the case academic, though... I am not aware of any editor which can display the full Unicode 3.1 character set. The most complete font currently around seems to be the MS font for Arial (both cover Unicode 2.0): http://www.unicode.org/unicode/onlinedat/products.html -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Mon Jul 15 22:47:28 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 17:47:28 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: Your message of "Mon, 15 Jul 2002 16:06:42 EDT." Message-ID: <200207152147.g6FLlTQ12212@pcp02138704pcs.reston01.va.comcast.net> I've placed a patch for this on SF: http://python.org/sf/581944 . Comments please? --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Mon Jul 15 22:16:52 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 15 Jul 2002 23:16:52 +0200 Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings In-Reply-To: <200207152041.g6FKfFV10971@pcp02138704pcs.reston01.va.comcast.net> References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net> <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net> <200207152000.g6FK0Ce10496@pcp02138704pcs.reston01.va.comcast.net> <200207152041.g6FKfFV10971@pcp02138704pcs.reston01.va.comcast.net> Message-ID:

Guido van Rossum writes:

> Yes, but all the non-ASCII has to be represented as Unicode strings.
> I.e. no Latin-1 in 8-bit strings!

Exactly. This might still cause problems for inspect and other introspective tools. For ASCII identifiers, I agree that using byte strings is sensible, for best backwards compatibility.

> Really? I thought Unicode's isalpha() was built on the Unicode text
> database?

It isn't if it has a "usable wchar_t", see unicodeobject.h:

#if defined(HAVE_USABLE_WCHAR_T) && defined(WANT_WCTYPE_FUNCTIONS)
#include <wctype.h>
#define Py_UNICODE_ISSPACE(ch) iswspace(ch)
...

I was missing the part that it also requires active selection of wctype functions - that is probably a feature that is never used. So it is better than I thought: isletter might vary across builds on the same platform, but likely never varies in practice.
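For anyone who wants to check what a given interpreter actually does, here is a small sketch, assuming a current Python 3 where `str` is the Unicode string type. It reports the version of the Unicode database the interpreter was built with and asks `isalpha()` about a few sample characters; the samples are arbitrary choices for illustration.

```python
import unicodedata

# Version of the Unicode Character Database compiled into this interpreter.
# It differs between Python releases, which is the variation discussed above.
print(unicodedata.unidata_version)

# isalpha() on single characters: ASCII letter, accented Latin letter,
# CJK ideograph, digit, underscore.
samples = ["a", "\u00e9", "\u4e2d", "1", "_"]
results = {ch: ch.isalpha() for ch in samples}
for ch, is_letter in results.items():
    print(repr(ch), unicodedata.category(ch), is_letter)
```

Characters added or reclassified in later Unicode versions are where two Python releases could disagree; the five samples here are old and stable enough that they should not.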
Regards, Martin From pinard@iro.umontreal.ca Mon Jul 15 23:33:37 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 15 Jul 2002 18:33:37 -0400 Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings In-Reply-To: <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net> References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net> <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > This is exactly what the Chinese are already doing. I'm just worried > that sooner or later they'll write someting that's useful outside China. > I hope that English will remain the language for libraries shared within > the Python community at large. You know, people are already quite aware that if they want to contribute to the whole community, English is their best bet. If not evident enough already, this may be cut in writing within Python style guidelines: anything being contributed to Python has to be documented and commented in English. (On the other hand, the Python project might be kind enough for allowing various contributors to write their own name the way they like it best.) > Well, for one, not everybody has an easy way to edit Unicode files. In practice, from your viewpoint, it is unlikely that you'll have much to play with non-English and non-ASCII Python sources, if ever. And if it happens nevertheless, you are even in a position to request that modules be translated before you look at them. For a lot of years, in other projects, I never witnessed that it has been a real problem in practice. Of course, closed shops will take good care and make sure they have the proper tools. No need to protect them against themselves! 
:-) -- François Pinard http://www.iro.umontreal.ca/~pinard From greg@cosc.canterbury.ac.nz Mon Jul 15 23:48:28 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 16 Jul 2002 10:48:28 +1200 (NZST) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207151415.g6FEFSP30815@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207152248.g6FMmSf19013@oma.cosc.canterbury.ac.nz> Guido: > Me: > > If the file > > object were to become an object obeying the iterator > > protocol, its next() method should really return the > > next *byte* of the file. > > I don't think so. We should pick the most convenient chunking for the > default iterator But we're talking here about making the file object *be* an iterator itself, not just have a "default iterator". If that's to happen, all the other ways of iterating over a file ought to be implemented on top of the basic iteration facility provided by the file object -- lest we get unfortunate interactions between the different iteration methods a la xreadlines(). To me, this implies that the file object must iterate by bytes. I'm not necessarily advocating this, just following the idea to its logical conclusion. If the conclusion is distasteful, maybe that's a sign that the idea (i.e. making file objects into iterators) isn't so good in the first place. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From aahz@pythoncraft.com Tue Jul 16 00:06:19 2002 From: aahz@pythoncraft.com (Aahz) Date: Mon, 15 Jul 2002 19:06:19 -0400 Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: <200207152248.g6FMmSf19013@oma.cosc.canterbury.ac.nz> References: <200207151415.g6FEFSP30815@pcp02138704pcs.reston01.va.comcast.net> <200207152248.g6FMmSf19013@oma.cosc.canterbury.ac.nz> Message-ID: <20020715230619.GA12513@panix.com> On Tue, Jul 16, 2002, Greg Ewing wrote: > Guido: >> Greg: >>> >>> If the file object were to become an object obeying the iterator >>> protocol, its next() method should really return the next *byte* of >>> the file. >> >> I don't think so. We should pick the most convenient chunking for the >> default iterator > > But we're talking here about making the file object *be* an iterator > itself, not just have a "default iterator". If that's to happen, all > the other ways of iterating over a file ought to be implemented on top > of the basic iteration facility provided by the file object -- lest we > get unfortunate interactions between the different iteration methods a > la xreadlines(). To me, this implies that the file object must iterate > by bytes. "Practicality beats purity" -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From ping@zesty.ca Tue Jul 16 00:16:51 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Mon, 15 Jul 2002 16:16:51 -0700 (PDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207121336.g6CDaMD07592@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Fri, 12 Jul 2002, Guido van Rossum wrote: > I don't see what's wrong with the file object. Iterating over a file > changes the file's state, that's just a fact of life. That's exactly the point. Iterators and containers are different. Walking over a container shouldn't mutate it, whereas an iterator has mutable state independent of the container. The key problem is that the file's __iter__ method returns something whose state depends on the file, thus breaking this expectation. 
Either __iter__ should be implemented to fulfill its commitment, or there shouldn't be an __iter__ method on files at all. I'm not suggesting that the semantics of files themselves are "broken" or have a "wart" that needs to be fixed -- merely that we should decide on a place for files to live in our world of containers and iterators, so we can set and maintain consistent expectations. -- ?!ng From ping@zesty.ca Tue Jul 16 00:16:54 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Mon, 15 Jul 2002 16:16:54 -0700 (PDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207111616.g6BGGFB16385@europa.research.att.com> Message-ID: On Thu, 11 Jul 2002, Andrew Koenig wrote: > More seriously, I can imagine distinguishing a multiple iterator by > the presence of __copy__, but I can't imagine using the presence of > __copy__ to determine whether a *container* supports multiple > iteration. For example, there surely exist containers today that > support __copy__ but whose __iter__ methods yield iterators that do > not themselves support __copy__. Just fetch the iterator from the container and look for __copy__ on that. Or, what if there is no container to begin with, but the iterator is still copyable? You can't flag that by putting __multiter__ on anything; again it makes more sense to just provide __copy__ on the iterator. All that's really necessary here is to document the convention about what __copy__ is supposed to mean if it's available on an iterator. If we all agree that __copy__ should preserve an independent copy of the current state of the iterator, we're all set. > Another reason is that I can imagine this idea extended to encompass, > say, ambidextrous iterators that support prev() as well as next(), > and I would want to use __ambiter__ as a marker for those rather > than having to create an iterator and see if it has prev(). I think a proliferation of iterator-fetching methods would be a messy and unpleasant prospect. 
After __iter__, __multiter__, and __ambiter__, what next? __mutableiter__? __depthfirstiter__? __breadthfirstiter__? -- ?!ng "If I have seen farther than others, it is because I was standing on a really big heap of midgets." -- K. Eric Drexler From ping@zesty.ca Tue Jul 16 00:16:56 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Mon, 15 Jul 2002 16:16:56 -0700 (PDT) Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability In-Reply-To: Message-ID: On Tue, 9 Jul 2002, Tim Peters wrote: > > The context does not make it clear that the iterator's __iter__ method is > > *only* required whenever one *also* wants to use an iterator as an > > iterable. > > That's not how the iteration protocol is defined, and isn't how it should be > defined either. Requiring *some* method with a reserved name is an aid to > introspection This is a terrible reason for the existence of an __iter__ method, because (a) it's a bad way to do type-checking and (b) it doesn't even work. (a) If we followed this logic, we'd insist on having a useless __dict__ method on dictionaries, a useless __list__ method on lists, etc. etc. just so we could check types by looking for these methods. As i understood it, the Python way is to let the protocol speak for itself. Something that wants to give out keys can implement the keys() method, something that wants to act like a container can implement __getitem__, and so on. There's no need to make an additional declaration of dict-ness by adding a dummy __dict__ method -- indeed, sometimes we don't *want* to make that kind of commitment, and Python allows that flexibility. It seems to me that dictionaries are to keys() as iterators are to next(). (b) Looking for __iter__ is not a valid test for iterator-ness. Files and other iterable objects supply __iter__, but they are not iterators. So it doesn't work as a type test. 
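Point (b) can be checked directly. A minimal sketch, assuming modern Python 3 (where the iterator method is `__next__`); the `Countdown` class and the `drain()` helper are hypothetical names used only for this illustration.

```python
class Countdown:
    """Hypothetical iterator that supplies __next__ but no __iter__."""

    def __init__(self, n):
        self.n = n

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1

def drain(it):
    """Collect items using only the next() protocol."""
    out = []
    while True:
        try:
            out.append(next(it))
        except StopIteration:
            return out

# A list is iterable (it has __iter__) but it is not an iterator.
items = [1, 2, 3]
list_has_iter = hasattr(items, "__iter__")
list_has_next = hasattr(items, "__next__")
print(list_has_iter, list_has_next)

# Countdown works as an iterator without __iter__ ...
print(drain(Countdown(3)))

# ... but iter(), and therefore a "for" loop, rejects it as things stand.
try:
    iter(Countdown(3))
    accepted_by_iter = True
except TypeError:
    accepted_by_iter = False
print(accepted_by_iter)
```

So the presence of `__iter__` says "you can ask this for an iterator", not "this is an iterator", and an object can follow the `next()` protocol perfectly well without it.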
I agree with Oren that it makes more sense for iterator-fetching to be a convenience handled by the implementations of "for" and "in", rather than foisting the extra hassle of "def __iter__(self): return self" on every individual iterator implementation. -- ?!ng From ark@research.att.com Tue Jul 16 00:25:13 2002 From: ark@research.att.com (Andrew Koenig) Date: Mon, 15 Jul 2002 19:25:13 -0400 (EDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: (message from Ka-Ping Yee on Mon, 15 Jul 2002 16:16:54 -0700 (PDT)) References: Message-ID: <200207152325.g6FNPD128921@europa.research.att.com> Ping> Just fetch the iterator from the container and look for __copy__ on that. Yes, that's an alternative. However the purpose my suggestion of __multiter__ was not to use it to test for multiple iteration, but to enable a container to be able to yield either a single or a multiple iterator on request. Ping> Or, what if there is no container to begin with, but the iterator is still Ping> copyable? You can't flag that by putting __multiter__ on anything; again Ping> it makes more sense to just provide __copy__ on the iterator. You could flag it by putting __multiter__ on the iterator, just as iterators presently have __iter__. Ping> All that's really necessary here is to document the convention about what Ping> __copy__ is supposed to mean if it's available on an iterator. If we Ping> all agree that __copy__ should preserve an independent copy of the Ping> current state of the iterator, we're all set. Not quite. We also need an agreement that calling __iter__ on a container is not a destructive operation unless you call next() on the iterator that you get back. >> Another reason is that I can imagine this idea extended to encompass, >> say, ambidextrous iterators that support prev() as well as next(), >> and I would want to use __ambiter__ as a marker for those rather >> than having to create an iterator and see if it has prev(). 
Ping> I think a proliferation of iterator-fetching methods would be a Ping> messy and unpleasant prospect. After __iter__, __multiter__, Ping> and __ambiter__, what next? __mutableiter__? Ping> __depthfirstiter__? __breadthfirstiter__? A data structure that supports several different kinds of iteration has to provide that support somehow. What's your suggestion? From barry@zope.com Tue Jul 16 00:46:08 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 15 Jul 2002 19:46:08 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <200207151933.g6FJX6f10238@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15667.24256.621942.555107@anthem.wooz.org> >>>>> "AM" == Alex Martelli writes: >> what Zope does AFAIK). Otherwise creating minor variations on >> a class would be quite a pain -- you'd have to repeat all the >> interfaces implemented by the base class; and what if a later >> version of Super implements more interfaces? AM> This is actually a difficult point. If I have to explicitly AM> state all the interfaces of Super that I want to _exclude_, AM> and Super adds some more interfaces tomorrow, then it's quite AM> possible that my class is suddenly broken -- it doesn't AM> guarantee the invariants that says it guarantees, any more -- AM> and I don't even know about it. You'd need a way to explicitly state that you implement /none/ of the interfaces of your superclass, and then explicitly add back the ones you do implement. -Barry From guido@python.org Tue Jul 16 00:48:02 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 19:48:02 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Tue, 16 Jul 2002 10:48:28 +1200." 
<200207152248.g6FMmSf19013@oma.cosc.canterbury.ac.nz> References: <200207152248.g6FMmSf19013@oma.cosc.canterbury.ac.nz> Message-ID: <200207152348.g6FNm8C12883@pcp02138704pcs.reston01.va.comcast.net> > > I don't think so. We should pick the most convenient chunking for the > > default iterator > > But we're talking here about making the file object > *be* an iterator itself, not just have a "default > iterator". If that's to happen, all the > other ways of iterating over a file ought to be > implemented on top of the basic iteration facility > provided by the file object -- lest we get unfortunate > interactions between the different iteration methods > a la xreadlines(). To me, this implies that the file > object must iterate by bytes. Why should all iteration over a file be defined in terms of its basic iteration? I don't see that as dogma. --Guido van Rossum (home page: http://www.python.org/~guido/) From ping@zesty.ca Tue Jul 16 00:51:23 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Mon, 15 Jul 2002 16:51:23 -0700 (PDT) Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <200207142219.g6EMJaJ28788@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Sun, 14 Jul 2002, Guido van Rossum wrote: > The question is, should we place the burden on iterator users to avoid > calling next() after the first StopIteration, or should we place the > burden on iterator implementations? Since by far the most common > iterator use case is still a single for loop, which already does the > right thing, it's not at all clear to me which is worse. As a general design philosophy question, my vote would be for placing the burden on the implementations. If code reuse is all it's cracked up to be, you're going to use the iterator more times than you implemented it. Moreover, the more consistent the implementation is, the more widely it can be used. (Tim just said this.) As for the specifics of the iterator protocol, there seem to be two separate issues here: 1. 
After StopIteration, should iterators be allowed to keep going? 2. Should an empty iterator be distinguishable from an exhausted iterator? For 1, i don't think i've seen anyone come down too strongly on the "yes" side. There have been a couple of examples as to why this might be cute, but i don't think they are compelling. My opinion is that, if you are trying to make an iterator keep going after it has stopped, it's just a way of abusing the iterator to represent a sequence of sequences. You can always get the behaviour you want by explicitly describing both kinds of sequence. Tim's example of getting paragraphs out of a file demonstrates exactly why we don't want to encourage the abuse of one iterator to represent a sequence of sequences: you're going to be in trouble if you can't distinguish between the termination conditions for the two kinds of sequences. For 2, i believe Andrew and Oren want the answer to be "yes", but Guido and Aahz want the answer to be "no". I think the answer should be "yes". An exhausted iterator is not the same thing as a freshly-created iterator on an empty sequence, and allowing one to silently pass for the other is going to lead to problems. I'm not going to insist that IndexError should be the effect, as Guido's preference to keep IndexError for randomly-indexable sequences seems reasonable; anything distinguishable from StopIteration is fine. -- ?!ng From guido@python.org Tue Jul 16 00:53:52 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 15 Jul 2002 19:53:52 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Mon, 15 Jul 2002 16:16:51 PDT." References: Message-ID: <200207152353.g6FNrqm13200@pcp02138704pcs.reston01.va.comcast.net> > On Fri, 12 Jul 2002, Guido van Rossum wrote: > > I don't see what's wrong with the file object. Iterating over a file > > changes the file's state, that's just a fact of life. [Ping] > That's exactly the point. Iterators and containers are different. 
> Walking over a container shouldn't mutate it, whereas an iterator > has mutable state independent of the container. > > The key problem is that the file's __iter__ method returns something > whose state depends on the file, thus breaking this expectation. > Either __iter__ should be implemented to fulfill its commitment, or > there shouldn't be an __iter__ method on files at all. What commitment? Iterators don't have to have an undelying container! (E.g. generators.) > I'm not suggesting that the semantics of files themselves are "broken" > or have a "wart" that needs to be fixed -- merely that we should decide > on a place for files to live in our world of containers and iterators, > so we can set and maintain consistent expectations. What are your expectations? I think that both file.__iter__() returning file (as it does with Oren's patch) or file.__iter__() returning an xreadlines object (as it still does in CVS) are fine as far as reasonable expectations for iterators go. --Guido van Rossum (home page: http://www.python.org/~guido/) From David Abrahams" <200207152325.g6FNPD128921@europa.research.att.com> Message-ID: <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com> From: "Andrew Koenig" > Ping> Just fetch the iterator from the container and look for __copy__ on that. > > Yes, that's an alternative. > > However the purpose my suggestion of __multiter__ was not to use it to > test for multiple iteration, but to enable a container to be able to > yield either a single or a multiple iterator on request. Why would you want that? Seems like a corner case at best. > A data structure that supports several different kinds of iteration > has to provide that support somehow. What's your suggestion? 
class DataStructure(object):
    def __init__(self):
        self._numbers = range(10);
        self._names = [ str(x) for x in range(10) ];

    names = property(lambda self: iter(self._names))
    numbers = property(lambda self: iter(self._numbers))

x = DataStructure();
for y in x.names:
    print repr(y),

print

for y in x.numbers:
    print repr(y),

[Y'know, Python is great. That worked the first time I ran it.]

-Dave

From ark@research.att.com Tue Jul 16 01:32:36 2002
From: ark@research.att.com (Andrew Koenig)
Date: Mon, 15 Jul 2002 20:32:36 -0400 (EDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com> (david.abrahams@rcn.com)
References: <200207152325.g6FNPD128921@europa.research.att.com> <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com>
Message-ID: <200207160032.g6G0Wac29265@europa.research.att.com>

>> However the purpose my suggestion of __multiter__ was not to use it to
>> test for multiple iteration, but to enable a container to be able to
>> yield either a single or a multiple iterator on request.

David> Why would you want that? Seems like a corner case at best.

You're right -- I wasn't thinking clearly. What I meant to say was that I would like a program that expects to be able to use a multiple iterator to be able to say so simply and efficiently in code. For example:

    for i in multiter(x):
        # whatever

I would like this to fail cleanly if x does not support multiple iterators.

>> A data structure that supports several different kinds of iteration
>> has to provide that support somehow. What's your suggestion?
David> class DataStructure(object):
David>     def __init__(self):
David>         self._numbers = range(10);
David>         self._names = [ str(x) for x in range(10) ];

David>     names = property(lambda self: iter(self._names))
David>     numbers = property(lambda self: iter(self._numbers))

David> x = DataStructure();
David> for y in x.names:
David>     print repr(y),

David> print

David> for y in x.numbers:
David>     print repr(y),

David> [Y'know, Python is great. That worked the first time I ran it.]

I don't understand how this code answers my question. You've asked for iterators over two different data structures. What I was asking was, for example, how one might arrange for a single tree to yield either a depth-first or breadth-first iterator.

From David Abrahams" <200207152325.g6FNPD128921@europa.research.att.com> <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com> <200207160032.g6G0Wac29265@europa.research.att.com>
Message-ID: <247901c22c61$3c8c9050$6601a8c0@boostconsulting.com>

From: "Andrew Koenig"

> >> A data structure that supports several different kinds of iteration
> >> has to provide that support somehow. What's your suggestion?
>
> David> class DataStructure(object):
> David>     def __init__(self):
> David>         self._numbers = range(10);
> David>         self._names = [ str(x) for x in range(10) ];
>
> David>     names = property(lambda self: iter(self._names))
> David>     numbers = property(lambda self: iter(self._numbers))
>
> David> x = DataStructure();
> David> for y in x.names:
> David>     print repr(y),
>
> David> print
>
> David> for y in x.numbers:
> David>     print repr(y),
>
> David> [Y'know, Python is great. That worked the first time I ran it.]
>
> I don't understand how this code answers my question.
> You've asked for iterators over two different data structures.
> What I was asking was, for example, how one might arrange for a single
> tree to yield either a depth-first or breadth-first iterator.

Just replace 'names' by breadth_first and 'numbers' by depth_first.
or-vice-versa-ly y'rs, dave From aahz@pythoncraft.com Tue Jul 16 01:46:36 2002 From: aahz@pythoncraft.com (Aahz) Date: Mon, 15 Jul 2002 20:46:36 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: References: <200207142219.g6EMJaJ28788@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020716004635.GA2384@panix.com> On Mon, Jul 15, 2002, Ka-Ping Yee wrote: > > 2. Should an empty iterator be distinguishable from an exhausted iterator? > > For 2, i believe Andrew and Oren want the answer to be "yes", > but Guido and Aahz want the answer to be "no". I think the answer > should be "yes". An exhausted iterator is not the same thing as > a freshly-created iterator on an empty sequence, and allowing one > to silently pass for the other is going to lead to problems. I don't think I expressed an opinion on this, and if you think I did, either you misunderstood me or I misunderstood what I was expounding on. I also think that's the wrong question, given the nature of iterators; before you can ask that question, you need to demonstrate that there is in fact a difference between an empty iterator and an exhausted iterator. I think that you can't demonstrate that, but I'm certainly willing to be convinced. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From tdelaney@avaya.com Tue Jul 16 02:10:16 2002 From: tdelaney@avaya.com (Delaney, Timothy) Date: Tue, 16 Jul 2002 11:10:16 +1000 Subject: [Python-Dev] Termination of two-arg iter() Message-ID: > From: Aahz [mailto:aahz@pythoncraft.com] > On Mon, Jul 15, 2002, Ka-Ping Yee wrote: > > > I also think that's the wrong question, given the nature of iterators; > before you can ask that question, you need to demonstrate > that there is > in fact a difference between an empty iterator and an > exhausted iterator. > I think that you can't demonstrate that, but I'm certainly > willing to be > convinced. 
I think the definition that some people are using is:

An exhausted iterator is one for which StopIteration has already been raised.

An empty iterator OTOH is one which will raise StopIteration the next time next() is called. An iterator for an empty list is the simplest example of this, although it should be applied to any iterator.

FWIW I think the "best" behaviour for iterators is that once an iterator begins raising StopIteration it must continue to do so under any circumstances. Given that, I don't see a lot of point in distinguishing between the two above cases.

One way this could be enforced (and the burden removed from iterator writers) is to have iter() always return a wrapper around an iterator:

class EnforcementIterator(object):

    __slots__ = ('iterator', 'exhausted',)

    def __init__(self, iterator):
        self.iterator = iterator
        self.exhausted = False

    # getattr, setattr, delattr delegate to self.iterator

    def __iter__(self):
        return self

    def next(self):
        # Once exhausted, keep raising StopIteration regardless of
        # what the wrapped iterator would do.
        if self.exhausted:
            raise StopIteration()

        try:
            return self.iterator.next()
        except StopIteration:
            self.exhausted = True
            raise

def iter(iterable):
    # testing for type - optimisation ;)
    if isinstance(iterable, EnforcementIterator):
        return iterable
    else:
        return EnforcementIterator(iterable.__iter__())

Tim Delaney

From tim.one@comcast.net Tue Jul 16 02:10:31 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 15 Jul 2002 21:10:31 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To:
Message-ID:

[Ping]
> As a general design philosophy question, my vote would be for
> placing the burden on the implementations. If code reuse is all
> it's cracked up to be, you're going to use the iterator more times
> than you implemented it. Moreover, the more consistent the
> implementation is, the more widely it can be used. (Tim just said this.)

OTOH, the less the protocol defines, the more open it is to unforeseen uses. Tim just said that too .
> As for the specifics of the iterator protocol, there seem to be > two separate issues here: > > 1. After StopIteration, should iterators be allowed to keep going? > > 2. Should an empty iterator be distinguishable from an exhausted > iterator? > > For 1, i don't think i've seen anyone come down too strongly on > the "yes" side. There have been a couple of examples as to why > this might be cute, but i don't think they are compelling. I haven't seen an example of why it might useful, although I could have made some up, and have been pleasantly surprised all along that nobody else made one up either . We saw a few examples illustrating that StopIteration is in fact not sticky today, but nobody claimed such uses "were features". Jeff Epler made one up to get clarification, and I showed a dict iter example that demonstrated how unpredictable it can get now. > My opinion is that, if you are trying to make an iterator keep going > after it has stopped, it's just a way of abusing the iterator to > represent a sequence of sequences. > > You can always get the behaviour you want by explicitly describing > both kinds of sequence. Tim's example of getting paragraphs out > of a file demonstrates exactly why we don't want to encourage the > abuse of one iterator to represent a sequence of sequences: you're > going to be in trouble if you can't distinguish between the > termination conditions for the two kinds of sequences. That example relied on StopIteration being sticky (which it already happens to be for the specific iter(file.readline, "") case), not on iteration doing "something useful" after StopIteration had been raised. A sequence is either empty, or an element followed by a sequence. Sticky StopIteration makes the "empty" case at the end reliably empty, and, I think, for much the same reason Python has always kept returning "" from file.read() after it reaches EOF. 
There's simply nothing erroneous about reaching the end of a sequence, or about probing it again to determine emptiness instead of carrying around fiddly flags in parallel.

> For 2, i believe Andrew and Oren want the answer to be "yes",
> but Guido and Aahz want the answer to be "no". I think the answer
> should be "yes". An exhausted iterator is not the same thing as
> a freshly-created iterator on an empty sequence, and allowing one
> to silently pass for the other is going to lead to problems.

I'm on the "no" side there -- an empty sequence is no more error-prone than that range(10, 10) returns an empty list, or string[i:i] an empty string, or that file("some_empty_file").read() returns an empty string. An iterator-based algorithm works on some prefix of the elements "from here until the end": an exhausted sequence and an empty sequence are indeed indistinguishable from that view. Indeed, I'm having a hard time imagining *wanting* to distinguish the two.

> I'm not going to insist that IndexError should be the effect, as
> Guido's preference to keep IndexError for randomly-indexable
> sequences seems reasonable; anything distinguishable from
> StopIteration is fine.

OK, if we have to do this, let's call it StopIteration2 and make it a subclass of StopIteration so my code won't have to know it exists .

From aahz@pythoncraft.com Tue Jul 16 02:39:42 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 15 Jul 2002 21:39:42 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To:
References:
Message-ID: <20020716013942.GA11513@panix.com>

On Tue, Jul 16, 2002, Delaney, Timothy wrote:
> From: Aahz [mailto:aahz@pythoncraft.com]
>>
>> I also think that's the wrong question, given the nature of
>> iterators; before you can ask that question, you need to demonstrate
>> that there is in fact a difference between an empty iterator and an
>> exhausted iterator. I think that you can't demonstrate that, but I'm
>> certainly willing to be convinced.
> > I think the definition that some people are using is: > > An exhausted iterator is one for which StopIteration has already been > raised. > > An empty iterator OTOH is one which will raise StopIteration the next time > next() is called. An iterator for an empty list is the simplest example of > this, although it should be applied to any iterator. In order to draw this distinction, you have to change the definition of "iterator" that we've been using. The sole protocol of iterator to date has been the existence of a next() method that either returns an item or raises StopIteration. Making statements about what an iterator *will* do counts as abuse IMO. If you want a feature like that, go use something else -- don't break the simplicity of iterators. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From tdelaney@avaya.com Tue Jul 16 02:47:57 2002 From: tdelaney@avaya.com (Delaney, Timothy) Date: Tue, 16 Jul 2002 11:47:57 +1000 Subject: [Python-Dev] Termination of two-arg iter() Message-ID: > From: Aahz [mailto:aahz@pythoncraft.com] > > On Tue, Jul 16, 2002, Delaney, Timothy wrote: > > From: Aahz [mailto:aahz@pythoncraft.com] > > > > I think the definition that some people are using is: ^^^^^ > > > > An exhausted iterator is one for which StopIteration has > > > > An empty iterator OTOH is one which will raise > > In order to draw this distinction, you have to change the > definition of > "iterator" that we've been using. The sole protocol of > iterator to date > has been the existence of a next() method that either returns > an item or > raises StopIteration. Making statements about what an iterator *will* > do counts as abuse IMO. If you want a feature like that, go use > something else -- don't break the simplicity of iterators. Aahz - you did read the next paragraph didn't you. "... I don't see a lot of point in distinguishing between the two above cases." 
I'm *against* distinguishing between the two - I do not want a feature like that. Tim Delaney From ark@research.att.com Tue Jul 16 03:05:33 2002 From: ark@research.att.com (Andrew Koenig) Date: Mon, 15 Jul 2002 22:05:33 -0400 (EDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <247901c22c61$3c8c9050$6601a8c0@boostconsulting.com> (david.abrahams@rcn.com) References: <200207152325.g6FNPD128921@europa.research.att.com> <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com> <200207160032.g6G0Wac29265@europa.research.att.com> <247901c22c61$3c8c9050$6601a8c0@boostconsulting.com> Message-ID: <200207160205.g6G25XC29723@europa.research.att.com> David> Just replace 'names' by breadth_first and 'numbers' by depth_first. David> or-vice-versa-ly y'rs, which doesn't address the question of a uniform convention. From aahz@pythoncraft.com Tue Jul 16 03:06:12 2002 From: aahz@pythoncraft.com (Aahz) Date: Mon, 15 Jul 2002 22:06:12 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: References: Message-ID: <20020716020612.GA16827@panix.com> On Tue, Jul 16, 2002, Delaney, Timothy wrote: > From: Aahz [mailto:aahz@pythoncraft.com] >> On Tue, Jul 16, 2002, Delaney, Timothy wrote: >>> >>> I think the definition that some people are using is: > ^^^^^ >>> >>> An exhausted iterator is one for which StopIteration has >>> >>> An empty iterator OTOH is one which will raise >> >> In order to draw this distinction, you have to change the definition >> of "iterator" that we've been using. The sole protocol of iterator >> to date has been the existence of a next() method that either returns >> an item or raises StopIteration. Making statements about what an >> iterator *will* do counts as abuse IMO. If you want a feature >> like that, go use something else -- don't break the simplicity of >> iterators. > > Aahz - you did read the next paragraph didn't you. > > "... I don't see a lot of point in distinguishing between the two > above cases." 
Sorry for being unclear; that was the generic "you", not pointing at you (Tim) specifically. s/you/one/ -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From tim.one@comcast.net Tue Jul 16 03:25:08 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 15 Jul 2002 22:25:08 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com> Message-ID: [David Abrahams] > [Y'know, Python is great. That worked the first time I ran it.] Oops! Please file a bug report on SourceForge . From David Abrahams" <200207152325.g6FNPD128921@europa.research.att.com> <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com> <200207160032.g6G0Wac29265@europa.research.att.com> <247901c22c61$3c8c9050$6601a8c0@boostconsulting.com> <200207160205.g6G25XC29723@europa.research.att.com> Message-ID: <251d01c22c75$581bfc20$6601a8c0@boostconsulting.com> From: "Andrew Koenig" > David> Just replace 'names' by breadth_first and 'numbers' by depth_first. > > David> or-vice-versa-ly y'rs, > > which doesn't address the question of a uniform convention. I'm with you on the desire to have a way to get a multipass-iterator-or-error in one swell foop. That says "I want to iterate over this thing without changing it". I still think that hasattr(iter(x), '__copy__') is a pretty clean way to do that, despite the fact that it potentially creates an iterator (which some people apparently view as too heavyweight as an introspection step). However, I don't see any point in trying to define a protocol for every different possible iteration view of a thing. Dicts have keys, values, and items. Trees have breadth-first, depth-first, inorder, postorder, blah, blah, blah. There are just too many of these, and they're all different. 
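As a sketch of the "named views" idea applied to the tree case raised earlier in the thread: the class and attribute names below (Tree, depth_first, breadth_first) are hypothetical, and the code is written for a current Python 3 interpreter rather than the 2.2 of this discussion. Each property access hands back a fresh generator, so no iterator-fetching protocol beyond plain attribute lookup is assumed.

```python
from collections import deque

class Tree:
    """Hypothetical n-ary tree node used only for illustration."""

    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

    @property
    def depth_first(self):
        # Pre-order traversal with an explicit stack.
        stack = [self]
        while stack:
            node = stack.pop()
            yield node.value
            # Push children reversed so the leftmost child is visited first.
            stack.extend(reversed(node.children))

    @property
    def breadth_first(self):
        # Level-order traversal with a FIFO queue.
        queue = deque([self])
        while queue:
            node = queue.popleft()
            yield node.value
            queue.extend(node.children)

tree = Tree(1, [Tree(2, [Tree(4), Tree(5)]), Tree(3)])
dfs = list(tree.depth_first)
bfs = list(tree.breadth_first)
print(dfs)
print(bfs)
```

Because neither traversal mutates the tree, the same tree can be walked any number of times in either order.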
-Dave From ark@research.att.com Tue Jul 16 04:22:46 2002 From: ark@research.att.com (Andrew Koenig) Date: Mon, 15 Jul 2002 23:22:46 -0400 (EDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <251d01c22c75$581bfc20$6601a8c0@boostconsulting.com> (david.abrahams@rcn.com) References: <200207152325.g6FNPD128921@europa.research.att.com> <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com> <200207160032.g6G0Wac29265@europa.research.att.com> <247901c22c61$3c8c9050$6601a8c0@boostconsulting.com> <200207160205.g6G25XC29723@europa.research.att.com> <251d01c22c75$581bfc20$6601a8c0@boostconsulting.com> Message-ID: <200207160322.g6G3Mkh00041@europa.research.att.com> David> I'm with you on the desire to have a way to get a David> multipass-iterator-or-error in one swell foop. That says "I David> want to iterate over this thing without changing it". I still David> think that hasattr(iter(x), '__copy__') is a pretty clean way David> to do that, despite the fact that it potentially creates an David> iterator (which some people apparently view as too heavyweight David> as an introspection step). In particular, creating an iterator had better not be a destructive operation. David> However, I don't see any point in trying to define a protocol David> for every different possible iteration view of a thing. Dicts David> have keys, values, and items. Trees have breadth-first, David> depth-first, inorder, postorder, blah, blah, blah. There are David> just too many of these, and they're all different. I wasn't suggesting defining a protocol for every possible iteration view. I was raising the question of whether multi-pass iteration was likely to be a common enough operation that it is worth defining a protocol for it, while leaving the door open to defining protocols for others should it turn out to be desirable to do so. 
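The multiter(x) spelling wished for earlier in this thread, combined with the hasattr(iter(x), '__copy__') test, could be sketched roughly as follows. This is an illustration only: multiter is not a real builtin, CountDown is a made-up iterator, and the convention that __copy__ yields an independent copy of the iteration state is exactly the assumption under discussion, not something Python enforces. The code targets a current Python 3 interpreter.

```python
import copy

class CountDown:
    """Hypothetical iterator that advertises copyability via __copy__."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

    def __copy__(self):
        # Independent snapshot of the current iteration state.
        return CountDown(self.current)

def multiter(obj):
    """Hypothetical helper: return iter(obj) only if it can be copied."""
    it = iter(obj)
    if not hasattr(it, "__copy__"):
        raise TypeError("%s does not support multi-pass iteration"
                        % type(obj).__name__)
    return it

first = multiter(CountDown(3))
next(first)                  # consume the first item (3)
second = copy.copy(first)    # fork the iterator at this point
rest_first = list(first)
rest_second = list(second)

# A generator has no __copy__, so the request fails cleanly.
try:
    multiter(n for n in range(3))
    failed_cleanly = False
except TypeError:
    failed_cleanly = True
```

Note that this relies on creating an iterator being non-destructive, which is the extra agreement asked for above.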
From David Abrahams" <200207152325.g6FNPD128921@europa.research.att.com> <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com> <200207160032.g6G0Wac29265@europa.research.att.com> <247901c22c61$3c8c9050$6601a8c0@boostconsulting.com> <200207160205.g6G25XC29723@europa.research.att.com> <251d01c22c75$581bfc20$6601a8c0@boostconsulting.com> <200207160322.g6G3Mkh00041@europa.research.att.com>
Message-ID: <254001c22c79$155d82b0$6601a8c0@boostconsulting.com>

From: "Andrew Koenig"

> I wasn't suggesting defining a protocol for every possible iteration
> view. I was raising the question of whether multi-pass iteration
> was likely to be a common enough operation that it is worth defining
> a protocol for it, while leaving the door open to defining protocols
> for others should it turn out to be desirable to do so.

I think your examples are confusing different beasts, then. Multipass (=copyable in these examples) should be a capability of iterators, just as bidirectional or random-access would be. Breadth-first/depth-first is not a capability of the iterator in that sense, but an implementation detail -- from the POV of the iterator's user, there's no way to tell what the traversal order is.

-Dave

From oren-py-d@hishome.net Tue Jul 16 06:25:03 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 16 Jul 2002 08:25:03 +0300
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207151439.g6FEdpM31288@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jul 15, 2002 at 10:39:51AM -0400
References: <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net> <200207111219.g6BCJVU30095@pcp02138704pcs.reston01.va.comcast.net> <20020712005928.A9833@hishome.net> <200207151439.g6FEdpM31288@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020716082503.A20992@hishome.net>

On Mon, Jul 15, 2002 at 10:39:51AM -0400, Guido van Rossum wrote:
> > http://www.python.org/sf/580331
> >
> > No, it's not a complete rewrite of file buffering.
This patch > > implements Just's idea of xreadlines caching in the file object. It > > also makes a file into an iterator: __iter__ returns self and next > > calls the next method of the cached xreadlines object. > > Hm. What happens to the xreadlines object when you do a seek() on the > file? > With the old semantics, you could do f.seek(0) and get another > iterator (assuming it's a seekable file of course). With the new > semantics, the cached iterator keeps getting in the way. On the new version of patch #580331 the cache is invalidated on a seek. > Maybe the xreadlines object could grow a flush() method that throws > away its buffer, and f.seek() could call that if there's a cached > xreadlines iterator? The behavior of an xreadlines object is already undefined after a seek on the file. This patch doesn't try to fix that. The invalidation makes sure that the next iter() call will produce a fresh xreadlines, though. Flushing would be too much work for this little hack. The right solution would be to fully integrate buffering into the file object and get rid of the dependency on the xreadlines module. The xreadlines method will then be equivalent to __iter__ (i.e. return self). I assume that after this rewrite the xreadlines module would be deprecated. > > See my previous postings for why I think a file should be an iterator. > > Haven't seen them but I would agree that this makes sense. For some reason I got the impression that you disagreed. > I just realized that the (existing) file_xreadlines() function has a > subtle bug. It uses a local static variable to cache the function > xreadlines imported from the module xreadlines. But if there are > multiple interpreters or Py_Finalize() is called and then > Py_Initialize() again, the cache is invalid. Would you mind fixing > this? I think the caching just isn't worth it -- just do the import > every time (it's fast enough if sys.modules['xreadlines'] already > exists). Done. 
Oren

From tim.one@comcast.net Tue Jul 16 06:24:23 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 16 Jul 2002 01:24:23 -0400
Subject: [Python-Dev] AtExit Functions
In-Reply-To: <3D327F57.4040705@lemburg.com>
Message-ID:

[MAL, on Py_AtExit()]
> PyObject_Del() [must be avoided] as well ?

I don't think that one's a problem. The Py{Object,Mem}_{Del,DEL,Free,FREE} spellings have resolved to plain free(). Under pymalloc that's different, but pymalloc never tears itself down so it's safe there too.

>> We have two sets of exit-function gimmicks, one that runs at
>> the very start of Py_Finalize(), and the other at the very end.
>> If you need to clean up Python objects, you have to get into
>> the first group.

> I suppose the first one is what the atexit module exposes
> in Python 2.0+, right ?

Yes, but do read Skip's message too. atexit.py wraps a primitive gimmick that was also in 1.5.2. See the docs for sys.exitfunc at

    http://www.python.org/doc/1.5.2p2/lib/module-sys.html

The docs are pretty much the same now, except atexit.py provides a rational way to register multiple exit functions now.

Hmm! The logic in atexit.py looks wrong if sys.exitfunc already exists: then atexit appends it to the module's own list of exit functions, but then forgets to do anything to ensure that its own list gets run at the end. I conclude that nobody has tried mixing these gimmicks.

In any case, you can do something safe across all versions via:

    import sys
    try:
        inherited = sys.exitfunc
    except AttributeError:
        def inherited():
            pass

    def myexitfunc():
        clean_up_my_stuff()
        inherited()

    sys.exitfunc = myexitfunc
    del sys

You can get screwed then if somebody else sets sys.exitfunc later without being as considerate of your hook as the code above is of a pre-existing hook, but then that's why atexit.py was created and you should move to a later Python if you want saner behavior.
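A minimal sketch of the hook-chaining idea above, written for present-day Python 3, where sys.exitfunc no longer exists. Because of that, the sketch only models the chaining itself with ordinary callables; the names chain_exit_hook and calls are invented for illustration and are not part of any library.

```python
def chain_exit_hook(new_hook, inherited=None):
    """Return a hook that runs new_hook first, then the hook it replaced.

    `inherited` plays the role of a pre-existing sys.exitfunc; None means
    that no hook had been installed before.
    """
    def chained():
        new_hook()
        if inherited is not None:
            inherited()
    return chained


calls = []

# The first component installs its hook; nothing was installed before it.
installed = chain_exit_hook(lambda: calls.append("first component"))

# A second, considerate component wraps whatever is already installed.
installed = chain_exit_hook(lambda: calls.append("second component"),
                            inherited=installed)

# Simulate interpreter shutdown calling the single installed hook.
installed()
```

After the simulated shutdown, calls holds the second component's entry followed by the first component's, which is the "run mine, then the inherited one" order of the snippet above. In current Python the usual route is atexit.register(), which keeps the list of exit functions itself.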
> The problem with that approach is that there may still be some > references to objects left in lists and dicts which are cleaned > up after having called the atexit functions. This is not so > much a problem in my cases, but something to watch out in other > applications which use C level Python objects as globals. I don't know specifically what you have in mind there, but I expect that it would kick off another round of the every-18-months discussion of what kind of module finalization protocol Python should start to support. A PEP for that is long overdue. >>> Also, atexit.py is not present in Python 1.5.2. >> What's that ? > That's the Python version which was brand new just 3 years > ago. I know... in US terms that's for history books ;-) Oh, 3 years ago is sooooo 20th century! Goodness, they didn't even have cellophane sleeping tubes back then. May as well go back to worshipping cats while you're at it. From tim.one@comcast.net Tue Jul 16 06:56:33 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 16 Jul 2002 01:56:33 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <200207152006.g6FK6gE10521@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > While preparing a patch, I discovered something strange: despite the > fact that listiter_next() never raises StopIteration when it returns > NULL, That much is a documented part of the tp_iternext protocol. > and despite the fact that it is used as the implementation for > the next() method, Oops. > calling iter(list()).next() *does* raise StopIteration, rather than a > complaint about NULL without setting an exception condition. That is surprising! > It took a brief debugging session to discover that in the presence > of a tp_iternext function, the type machinery adds a next method that > wraps tp_iternext. Cute, though unexpected! 
It also explains an old mystery I never got around to investigating:

    >>> print iter([]).next.__doc__
    x.next() -> the next value, or raise StopIteration
    >>>

That was a mystery because that's not the docstring attached to the list iterator's next() method:

    static PyMethodDef listiter_methods[] = {
        {"next", (PyCFunction)listiter_next, METH_NOARGS,
         "it.next() -- get the next value, or raise StopIteration"},

> It means that the implementation of various iterators can be a little
> simpler, because no next() implementation needs to be given.

I'm not sure that's a feature we always want to use. Going thru a wrapper function (a) adds another layer of function call, and (b) adds a

    if (!PyArg_ParseTuple(args, ""))
        return NULL;

call via wrap_next(). Both expenses could be avoided if an existing next method were left alone. I suppose only the second expense is actually "real", though, as most explicit xyz_next methods naturally call the tp_iternext slot function anyway. Still, when the body of a "next" method is as simple as it is for lists, a call to PyArg_ParseTuple is a significant overhead.

From oren-py-d@hishome.net Tue Jul 16 07:08:26 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 16 Jul 2002 02:08:26 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net>
References: <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020716060826.GA92278@hishome.net>

On Mon, Jul 15, 2002 at 11:15:58AM -0400, Guido van Rossum wrote:
> I'm still only considering two options:
>
> (a) leave the status quo, or
> (b) implement (and document!) the "sink-state" rule from the PEP.

(c) leave it officially undefined but make all builtin iterators behave consistently.
Implementing consistent post-StopIteration behavior for builtin iterators is not too hard and doesn't require adding flags and special cases - when the iterator is exhausted it can clean up and decref any referenced objects and change its type to a StoppedIterator type. I can write a patch. I would prefer this StoppedIterator type to raise a new exception when its next() is called. I assume you would want it to be a StopIteration sink. At the risk of sounding like a broken record I will repeat my position: I consider the StopIteration sink state to be a silent error. It makes an exhausted iterator behave just like an iterator of an empty sequence. Because iterators and iterables can be mixed freely it results in silent failures when a function that requires a re-iterable object gets an iterator. Iterables can serve as a replacement for sequences in most cases. When they are not I'd like to get an error, please. When I pass a popened pipe to a function that expects a real file I will get an error if the function tries to perform a seek. I wouldn't want the seek operation to fail silently but that's more-or-less the equivalent of what iterators currently do. silent errors delenda est Oren From aleax@aleax.it Tue Jul 16 08:51:11 2002 From: aleax@aleax.it (Alex Martelli) Date: Tue, 16 Jul 2002 09:51:11 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <15667.24256.621942.555107@anthem.wooz.org> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <15667.24256.621942.555107@anthem.wooz.org> Message-ID: On Tuesday 16 July 2002 01:46 am, Barry A. Warsaw wrote: > >>>>> "AM" == Alex Martelli writes: > >> what Zope does AFAIK). Otherwise creating minor variations on > >> a class would be quite a pain -- you'd have to repeat all the > >> interfaces implemented by the base class; and what if a later > >> version of Super implements more interfaces? > > AM> This is actually a difficult point.
If I have to explicitly > AM> state all the interfaces of Super that I want to _exclude_, > AM> and Super adds some more interfaces tomorrow, then it's quite > AM> possible that my class is suddenly broken -- it doesn't > AM> guarantee the invariants that says it guarantees, any more -- > AM> and I don't even know about it. > > You'd need a way to explicitly state that you implement /none/ of the > interfaces of your superclass, and then explicitly add back the ones > you do implement. Right -- "i inherit the implementation but none of the interfaces". You can express this either by appropriately tagging the "I inherit" part, as C++ does (private inheritance -- the default, but that's yet _another_ C++ issue... defaults that may or may not be appropriate for typical use!-), or with a variation of "exclude-interfaces" however you spell that. Alternatively, "I inherit" could default to "not the interfaces", and, if needed, one might add a clause "oh, and all the interfaces too, please" when that is positively desired. Maybe the default would best be chosen on the basis of "what is good for them" rather on "what appears most desirable intuitively", as is currently done for module imports. Defaulting to "i inherit all" is roughly as convenient as defaulting to "from amodule import *" would be felt to be by naive users unfamiliar with the issues of namespace pollution. You know, "convenience" is getting to be something of a dirty word:-). Knuth said "premature optimization is the root of all evil in programming", and no doubt he was right for HIS generation -- people who grew up on machines with a few KB of memory and small fractions of MIPs were warped for life by the need to squeeze every possible drop of optimization. We still have some of that, no doubt. But current generations of programmers grew up on machines of overwhelming power -- gradually, the pitfall of premature optimization becomes less pervasive. 
OTOH, the same programmers grew up on machines overburdened to the gills with a surfeit of "convenient" features... a new "root of some evil" is emerging, and it's spelled "convenience". Perl's surfeit of ad-hoc, context-dependent, highly-"convenient" surprises just waiting to trip you at every step should be an object lesson in "convenience". Simple, clean, orthogonal, predictable, clear, unsurprising, regular. Now THESE are the buzzwords I long for... "convenient", OTOH, makes me wary. Convenience has its place, just as does optimization, and Python has traditionally done a great job of supplying just enough optimization and just enough convenience without compromising the really important buzzwords above listed. OTOH, the BDFL does say that he's not very experienced with "components" (interface-based programming); and few can claim extensive experience with many different nuances of that (on introspection, I _would_ claim for myself extensive experience with production use of COM, but a bit lesser with "bare C++" [a la Lakos, say] and definitely not "extensive" for Java, Haskell and others). Would we REALLY like to have: import foo do the equivalent of today's from foo import * and have to explicitly say import foo dont_pollute_my_namespace to get today's "import foo" behavior? I'm sure many beginners would love it -- you have to pound their heads with a mallet to wean them off the "import *" even today, particularly if they come from languages which offer the equivalent facility (e.g., C++'s "using namespace" -- back in think3, I got hoarse from having to repeat over and over again that making "using namespace std;" a standard prologue of every source file was NOT clever -- and I'm talking about able, mature programmers, quite used to large-scale programming in C++... but namespaces were new, and were perceived as "inconvenient"...!). 
Now, the amount of desirable separation between components may be lesser than the high separation that is most desirable between modules / namespaces -- but it IS higher than that most desirable between "ordinary" objects under inheritance. I surely don't know "the solution", but I just as surely do feel there ARE issues here that are worth pondering about. Just my two Eurocents (now worth slightly MORE than 2 cents of US$...!-) Alex From mal@lemburg.com Tue Jul 16 08:57:41 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 16 Jul 2002 09:57:41 +0200 Subject: [Python-Dev] AtExit Functions References: Message-ID: <3D33D1F5.3030302@lemburg.com> Tim Peters wrote: > [MAL, on Py_AtExit()] > >>PyObject_Del() [must be avoided] as well ? > > > I don't think that one's a problem. The Py{Object,Mem}_{Del,DEL,Free,FREE} > spellings have resolved to plain free(). Under pymalloc that's different, > but pymalloc never tears itself down so it's safe there too. Good, because I've been using that particular API for years in Py_AtExit() functions. >>>We have two sets of exit-function gimmicks, one that runs at >>>the very start of Py_Finalize(), and the other at the very end. >>>If you need to clean up Python objects, you have to get into >>>the first group. >> > >>I suppose the first one is what the atexit module exposes >>in Python 2.0+, right ? > > > Yes, but do read Skip's message too. atexit.py wraps a primitive gimmick > that was also in 1.5.2. See the docs for sys.exitfunc at > > http://www.python.org/doc/1.5.2p2/lib/module-sys.html > > The docs are pretty much the same now, except atexit.py provides a rational > way to register multiple exit functions now. The problem is not finding code to work in Python 1.5.2. I have my own ExitFunctions.py module which did pretty much the same as atexit.py for Python 1.5.2. 
The problem is that if there's no standard for managing atexit functions available in Python 1.5.2, then it is likely that others will have used a similar method and maybe failed to play nice with other such hooks.

> Hmm! The logic in atexit.py
> looks wrong if sys.exitfunc already exists: then atexit appends it to the
> module's own list of exit functions, but then forgets to do anything to
> ensure that its own list gets run at the end. I conclude that nobody has
> tried mixing these gimmicks.

Indeed.

    try:
        x = sys.exitfunc
    except AttributeError:
        sys.exitfunc = _run_exitfuncs
    else:
        # if x isn't our own exit func executive, assume it's another
        # registered exit function - append it to our list...
        if x != _run_exitfuncs:
            register(x)

The logic seems a bit wrong for that case: how could x possibly be _run_exitfuncs ? I think this code should look something like this:

    try:
        x = sys.exitfunc
    except AttributeError:
        pass
    else:
        # if x isn't our own exit func executive, assume it's another
        # registered exit function - append it to our list...
        register(x)
    sys.exitfunc = _run_exitfuncs

> In any case, you can do something safe across all versions via:
>
>     import sys
>     try:
>         inherited = sys.exitfunc
>     except AttributeError:
>         def inherited():
>             pass
>
>     def myexitfunc():
>         clean_up_my_stuff()
>         inherited()
>
>     sys.exitfunc = myexitfunc
>     del sys
>
> You can get screwed then if somebody else sets sys.exitfunc later without
> being as considerate of your hook as the code above is of a pre-existing
> hook, but then that's why atexit.py was created and you should move to a
> later Python if you want saner behavior .

Right. OTOH, if someone screws up here, the worst that can happen is a memory leak. Not all that much to lose nowadays with GB of RAM ;-)

>>The problem with that approach is that there may still be some
>>references to objects left in lists and dicts which are cleaned
>>up after having called the atexit functions.
This is not so >>much a problem in my cases, but something to watch out in other >>applications which use C level Python objects as globals. > > > I don't know specifically what you have in mind there, but I expect that it > would kick off another round of the every-18-months discussion of what kind > of module finalization protocol Python should start to support. A PEP for > that is long overdue. > >>>>Also, atexit.py is not present in Python 1.5.2. >>> > >>>What's that ? >> > >>That's the Python version which was brand new just 3 years >>ago. I know... in US terms that's for history books ;-) > > > Oh, 3 years ago is sooooo 20th century! Goodness, they didn't even have > cellophane sleeping tubes back then. May as well go back to worshipping > cats while you're at it. Naa, I'll stick to the Python 1.2 tutorial I still keep under my pillow as per instructions from Guido at the time. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From gmcm@hypernet.com Tue Jul 16 12:46:19 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Tue, 16 Jul 2002 07:46:19 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: References: <200207142219.g6EMJaJ28788@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D33CF4B.5890.18BBC3E8@localhost> On 15 Jul 2002 at 16:51, Ka-Ping Yee wrote: > ... An exhausted > iterator is not the same thing as a freshly-created > iterator on an empty sequence, Right. But a freshly-created iterator on an empty sequence is exactly like an iterator which will be exhausted on the next call to next(). This is something the calling code can easily detect if it desires, so having the iterator track it is needless complication. -- Gordon http://www.mcmillan-inc.com/ PS to ?ing: Seen Aaron & Cindy lately?
From guido@python.org Tue Jul 16 12:51:20 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 16 Jul 2002 07:51:20 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Tue, 16 Jul 2002 09:51:11 +0200." References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <15667.24256.621942.555107@anthem.wooz.org> Message-ID: <200207161151.g6GBpKs14485@pcp02138704pcs.reston01.va.comcast.net> > But current generations of programmers grew up on machines of > overwhelming power -- gradually, the pitfall of premature > optimization becomes less pervasive. I don't see that yet. We get an awful number of contributions that are broken or obfuscated by premature attempts at optimization. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jul 16 13:07:54 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 16 Jul 2002 08:07:54 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: Your message of "Tue, 16 Jul 2002 02:08:26 EDT." <20020716060826.GA92278@hishome.net> References: <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net> <20020716060826.GA92278@hishome.net> Message-ID: <200207161207.g6GC7s414566@pcp02138704pcs.reston01.va.comcast.net> > On Mon, Jul 15, 2002 at 11:15:58AM -0400, Guido van Rossum wrote: > > I'm still only considering two options: > > > > (a) leave the status quo, or > > (b) implement (and document!) the "sink-state" rule from the PEP. > > (c) leave it officially undefined but make all builtin iterator behave > consistently. What would be the point of that? Since we can't enforce the sink-state rule for 3rd party iterators, this is no different from (b) except that 3rd party implementers have less of an incentive to fix their implementations. 
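To make option (b) concrete, here is a rough sketch of what the "sink-state" rule asks of an iterator, using Python 3 spelling (__next__ rather than next). Both classes are hypothetical illustrations: QueueReader stands in for a third-party iterator that can "come back to life" after raising StopIteration, and SinkStateIterator is one possible wrapper that enforces the rule.

```python
class QueueReader:
    """Iterator that does NOT obey the sink-state rule.

    It raises StopIteration whenever the queue is empty, but yields items
    again if more are appended afterwards.
    """

    def __init__(self, queue):
        self.queue = queue

    def __iter__(self):
        return self

    def __next__(self):
        if not self.queue:
            raise StopIteration
        return self.queue.pop(0)


class SinkStateIterator:
    """Wrapper: once StopIteration has been raised, keep raising it."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        if self._it is None:          # already in the sink state
            raise StopIteration
        try:
            return next(self._it)
        except StopIteration:
            self._it = None           # drop the reference and stay exhausted
            raise


# Without the wrapper the reader revives after new data arrives.
raw_queue = [1]
raw = QueueReader(raw_queue)
raw_first_pass = list(raw)
raw_queue.append(2)
raw_second_pass = list(raw)

# With the wrapper, exhaustion is permanent.
wrapped_queue = [1]
wrapped = SinkStateIterator(QueueReader(wrapped_queue))
wrapped_first_pass = list(wrapped)
wrapped_queue.append(2)
wrapped_second_pass = list(wrapped)
```

Here raw_second_pass picks up the late item while wrapped_second_pass is empty, which is the behavioral difference the rule is about. Nothing forces a third-party iterator to behave like the wrapper, which is the enforcement problem raised above.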
> Implementing consistent post-StopIteration behavior for builtin > iterators is not too hard and doesn't require adding flags and > special cases - when the iterator is exhausted it can clean up and > decref any referenced objects and change its type to a > StoppedIterator type. I can write a patch. Don't bother. I already wrote a patch, SF patch 580331. Changing the type is evil (you can't change the type unless the memory deallocation policies are the same), so I won't do that. > I would prefer this StoppedIterator type to raise a new exception > when its next() is called. I assume you would want it to be a > StopIteration sink. You got that right, buddy. > As the risk of sounding like a broken record I will repeat my > position: I consider the StopIteration sink state to be a silent > error. It makes an exhausted iterator behave just like an iterator > of an empty sequence. Because iterators and iterables can be mixed > freely it results in silent failures when a function that requires a > re-iterable object gets an iterator. Iterables can serve as a > replacement for sequences in most cases. When they are not I'd like > to get an error, please. This is inconsistent with your position (c) above, which gives you no guarantees in this case. I also think you're mistaken in your desire. Iterables do *not* serve as sequence replacements. > When I pass a popened pipe to a function that expects a real file I > will get an error if the function tries to perform a seek. I > wouldn't want the seek operation to fail silently but that's > more-or-less the equivalent of what iterators currently do. It would be an error to try to use __getitem__ on an iterator. Please give up this line of request -- I'm tired of this argument. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jul 16 13:49:36 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 16 Jul 2002 08:49:36 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: Your message of "Tue, 16 Jul 2002 01:56:33 EDT." References: Message-ID: <200207161249.g6GCnbN14616@pcp02138704pcs.reston01.va.comcast.net> > > It means that the implementation of various iterators can be a little > > simpler, because no next() implementation needs to be given. > > I'm not sure that's a feature we always want to use. Going thru a wrapper > function (a) adds another layer of function call, and (b) adds a > > if (!PyArg_ParseTuple(args, "")) > return NULL; > > call via wrap_next(). Both expenses could be avoided if an existing > next method were left alone. I suppose only the seoond expense is > actually "real", though, as most explicit xyz_next methods naturally > call the tp_iternext slot function anyway. Still, when the body of > a "next" method is as simple as it is for lists, a call to > PyArg_ParseTuple is a significant overhead. I'm not worried. There's considerable expense in the attribute lookup for the next method too, if you call it from Python, which probably drowns the PyArg_ParseTuple overhead. The whole idea is that usually the tp_iternext slot will be used directly, and as long as that's fast, I'm happy. (If you're not, we can always write faster code to check for no arguments here.) The code in typeobject.c maintains a correspondence between tp_iternext and the next() method, just like it does for tp_iter and __iter__(), and for tp_hash and __hash__(). It goes both ways: if you assign to C.next, the tp_iternext slot will be set to a wrapper that calls C.next(). This also prevents inconsistency between next() and tp_iternext. (Both the list iterator and hotshot had broken next() implementations, BTW.) It needs to be documented, though! 
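The two-way correspondence between the slot and the method can be observed from Python code. The sketch below assumes CPython 3, where the method is spelled __next__ instead of next; Countdown, countdown_next and is_iterator are invented names used only for this illustration.

```python
class Countdown:
    """Class that starts out WITHOUT a next-method."""

    def __init__(self, start):
        self.remaining = start

    def __iter__(self):
        return self


def countdown_next(self):
    """Plain function that will be attached to the class afterwards."""
    if self.remaining <= 0:
        raise StopIteration
    value = self.remaining
    self.remaining -= 1
    return value


def is_iterator(obj):
    """Report whether next() accepts obj, i.e. whether the slot is filled."""
    try:
        next(obj)
    except TypeError:
        return False
    except StopIteration:
        return True
    return True


# Direction 1: assigning the method on the class fills the C-level slot.
before = is_iterator(Countdown(3))
Countdown.__next__ = countdown_next
after = is_iterator(Countdown(3))
values = list(Countdown(3))

# Direction 2: a builtin iterator that only implements the slot still
# exposes a method wrapper that can be called explicitly.
list_iterator_next = type(iter([])).__next__
wrapped_value = list_iterator_next(iter([7]))
```

Under these assumptions before is False, after is True, and the wrapper taken from the list iterator's type returns the first element like any other method would.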
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jul 16 13:52:56 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 16 Jul 2002 08:52:56 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Tue, 16 Jul 2002 08:25:03 +0300." <20020716082503.A20992@hishome.net> References: <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net> <200207111219.g6BCJVU30095@pcp02138704pcs.reston01.va.comcast.net> <20020712005928.A9833@hishome.net> <200207151439.g6FEdpM31288@pcp02138704pcs.reston01.va.comcast.net> <20020716082503.A20992@hishome.net> Message-ID: <200207161252.g6GCquk14627@pcp02138704pcs.reston01.va.comcast.net> > On the new version of patch #580331 the cache is invalidated on a seek. > > > Maybe the xreadlines object could grow a flush() method that throws > > away its buffer, and f.seek() could call that if there's a cached > > xreadlines iterator? > > The behavior of an xreadlines object is already undefined after a seek on > the file. This patch doesn't try to fix that. The invalidation makes sure > that the next iter() call will produce a fresh xreadlines, though. OK, good enough. > Flushing would be too much work for this little hack. The right solution > would be to fully integrate buffering into the file object and get rid of > the dependency on the xreadlines module. The xreadlines method will then be > equivalent to __iter__ (i.e. return self). I assume that after this rewrite > the xreadlines module would be deprecated. Yes, that's what I've called rewriting the I/O system. :-) > > > See my previous postings for why I think a file should be an iterator. > > > > Haven't seen them but I would agree that this makes sense. > > For some reason I got the impression that you disagreed. I disagreed with making next simply point to readline, because that would defeat the speedup we get from using the file iterator. The solution in your patch doesn't have this problem. 
(Though one *could* argue that making the file object its own iterator is only confusing; given that I'm also not sure what problem it solves, I'm at best +0 on it.) --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax@aleax.it Tue Jul 16 14:29:17 2002 From: aleax@aleax.it (Alex Martelli) Date: Tue, 16 Jul 2002 15:29:17 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207161252.g6GCquk14627@pcp02138704pcs.reston01.va.comcast.net> References: <20020716082503.A20992@hishome.net> <200207161252.g6GCquk14627@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Tuesday 16 July 2002 02:52 pm, Guido van Rossum wrote: ... > argue that making the file object its own iterator is only confusing; > given that I'm also not sure what problem it solves, I'm at best +0 on Personally, I think it solves at least a teaching problem -- it helps me teach the difference between iterators and iterables. In the Europython tutorial I had to gloss a bit over the fact that the difference was rather blurred. According to the principles I mentioned, as easiest for the audience to understand and apply, the file object SHOULD have been an iterator, not an iterable -- i.e. it SHOULD have been the case that f is iter(f) when f is a file object -- but it wasn't. When it IS, that's one less micro-wart I need to mention when teaching or writing about it. I don't see any downside to having this micro-wart removed. In particular, I don't see what's confusing. Things that respond to iter(x) fall in two categories: iterators: also have x.next(), and iter(x) is x iterables: iter(x) is not x, so you can presumably get another iterator out of x at some later point in time if needed. It's not QUITE as simple as this, but moving file objects from the second category to the first seems to _simplify_ things a bit.
E.g.:

    def useIterable(x):
        try:
            it = iter(x)
        except TypeError:
            raise TypeError, "Need iterable object, not %s" % type(x)
        if it is x:
            raise TypeError, "Need iterable object, not iterator"
        # keep happily using it and/or x as needed, and in particular
        # the code is able to call it1 = iter(x) if it needs to iterate again

Not perfect -- but having a file-object argument fail this simplistic test seems better to me, less confusing, than having it pass.

So, I, personally, am +1. It might be even nicer (from the point of view of teaching, at least) if iterating on f interoperated more smoothly with other method calls on f, but I do see your point that the right way to achieve THAT would be a complete rewrite of the I/O system, and thus a vastly heavier project than the current one. Still, the current step seems to be in the right direction.

Alex

From guido@python.org Tue Jul 16 14:50:22 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jul 2002 09:50:22 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Tue, 16 Jul 2002 15:29:17 +0200."
References: <20020716082503.A20992@hishome.net> <200207161252.g6GCquk14627@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207161350.g6GDoM522277@odiug.zope.com>

> > argue that making the file object its own iterator is only confusing;
> > given that I'm also not sure what problem it solves, I'm at best +0 on
>
> Personally, I think it solves at least a teaching problem -- it
> helps me teach the difference between iterators and iterables. In
> the Europython tutorial I had to gloss a bit over the fact that the
> difference was rather blurred. According to the principles I
> mentioned, as easiest for the audience to understand and apply, the
> file object SHOULD have been an iterator, not an iterable -- i.e. it
> SHOULD have been the case that f is iter(f) when f is a file object
> -- but it wasn't.
When it IS, that's one less micro-wart I need to > mention when teaching or writing about it. I dunno. The presence of seek() and write() makes the behavior of files a rather unique blend of iterator and iterable. > I don't see any downside to having this micro-wart removed. In > particular, I don't see what's confusing. Things that respond to > iter(x) fall in two categories: > iterators: also have x.next(), and iter(x) is x > iterables: iter(x) is not x, so you can presumably get another > iterator out of x at some later point in time if needed. > It's not QUITE as simple as this, but moving file objects from > the second category to the first seems to _simplify_ things a bit. I worry that equating a file with its iterable makes it more likely that people mix next() with readline() or seek(), which doesn't work (at least not until the I/O system is rewritten). I'd be more comfortable with teaching people that you should *either* use a file in a for loop (the common case, probably) *or* use its native I/O methods (readline() etc.), but not mix both. > E.g.: > > def useIterable(x): > try: > it = iter(x) > except TypeError: > raise TypeError, "Need iterable object, not %s" % type(x) > if it is x: > raise TypeError, "Need iterable object, not iterator" > # keep happily using it and/or x as needed, and in particular > # the code is able to call it1 = iter(x) if it needs to iterate again > > Not perfect -- but having a file-object argument fail this simplistic > test seems better to me, less confusing, than having it pass. This actually looks like an example of the "look before you leap" (LBYL) syndrome, which you disapproved of recently. > So, I, personally, am +1. 
It might be even nicer (from the point of > view of teaching, at least) if iterating on f interoperated more > smoothly with other method calls on f, but I do see your point that > the right way to achieve THAT would be a complete rewrite of the I/O > system, and thus a vastly heavier project than the current one. > Still, the current step seems to be in the right direction. Somehow I'd rather emphasize the relative brokenness of the current situation. Anyway, I'm somewhere between -0 and +0 (inclusive) on this. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Tue Jul 16 14:55:49 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 16 Jul 2002 08:55:49 -0500 Subject: [Python-Dev] AtExit Functions In-Reply-To: <3D33D1F5.3030302@lemburg.com> References: <3D33D1F5.3030302@lemburg.com> Message-ID: <15668.9701.455650.465650@localhost.localdomain> mal> The problem is that if there's no standard for managing atexit mal> functions available in Python 1.5.2, then it is likely that others mal> will have used a similar method and maybe failed to play nice with mal> other such hooks. That was precisely why I wrote the atexit module. (I agree there's a bug in the init code.) Skip From barry@zope.com Tue Jul 16 15:30:51 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 16 Jul 2002 10:30:51 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <15667.24256.621942.555107@anthem.wooz.org> <200207160751.g6G7pIq13900@smtp.zope.com> Message-ID: <15668.11803.110215.686549@anthem.wooz.org> >>>>> "AM" == Alex Martelli writes: AM> Right -- "i inherit the implementation but none of the AM> interfaces". You can express this either by appropriately AM> tagging the "I inherit" part, as C++ does (private inheritance AM> -- the default, but that's yet _another_ C++ issue... 
defaults AM> that may or may not be appropriate for typical use!-), or with AM> a variation of "exclude-interfaces" however you spell that. AM> Alternatively, "I inherit" could default to "not the AM> interfaces", and, if needed, one might add a clause "oh, and AM> all the interfaces too, please" when that is positively AM> desired. Maybe the default would best be chosen on the basis AM> of "what is good for them" rather on "what appears most AM> desirable intuitively", as is currently done for module AM> imports. I don't know, it seems like 6-one-way, half-dozen-the-other, but I tend to agree with Guido on this one. AM> Defaulting to "i inherit all" is roughly as convenient as AM> defaulting to "from amodule import *" would be felt to be by AM> naive users unfamiliar with the issues of namespace pollution. Interface conformance seems totally different than name importing, so I don't think the analogy holds. I just feel that in Python, I rarely use inheritance for implementation convenience only. -Barry From aahz@pythoncraft.com Tue Jul 16 15:34:36 2002 From: aahz@pythoncraft.com (Aahz) Date: Tue, 16 Jul 2002 10:34:36 -0400 Subject: [Python-Dev] Termination of two-arg iter() In-Reply-To: <20020716060826.GA92278@hishome.net> References: <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net> <20020716060826.GA92278@hishome.net> Message-ID: <20020716143436.GB4012@panix.com> On Tue, Jul 16, 2002, Oren Tirosh wrote: > > I consider the StopIteration sink state to be a silent error. It > makes an exhausted iterator behave just like an iterator of an empty > sequence. Because iterators and iterables can be mixed freely it > results in silent failures when a function that requires a re-iterable > object gets an iterator. Iterables can serve as a replacement for > sequences in most cases. When they are not I'd like to get an error, > please. 
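The "silent failure" described in the quoted text can be reproduced in a few lines. This is a rough sketch in Python 3 with made-up helper names (average, require_reiterable); the check mirrors the "iter(x) is x" test discussed elsewhere in this thread and is a convention, not something the language enforces.

```python
def average(values):
    """Two-pass computation: it quietly assumes `values` can be iterated twice."""
    count = sum(1 for _ in values)   # first pass consumes an iterator
    total = sum(values)              # second pass then sees nothing at all
    return total / count if count else 0.0


def require_reiterable(values):
    """Reject iterators up front instead of failing silently later."""
    if iter(values) is values:
        raise TypeError("need a re-iterable object, not an iterator")
    return values


from_list = average([2, 4, 6])            # a list can be walked twice
from_iterator = average(iter([2, 4, 6]))  # wrong answer, but no error

try:
    require_reiterable(iter([2, 4, 6]))
    rejected = False
except TypeError:
    rejected = True
```

With a list the result is 4.0; with an iterator over the same data the function returns 0.0 without complaint, because the second pass finds the iterator already exhausted. The explicit check turns that into an immediate TypeError.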
So the real problem isn't that you can't distinguish between an empty iterator and an exhausted one, but that you can't distinguish between re-iterable objects and objects that can't be re-iterated. If my understanding of your POV is correct, you can't get there from here. You're talking about two different concepts and conflating them, which to my mind breaks, "Simple is better than complex," and, "Beautiful is better than ugly." Your sole hope IMO, is to get behind Alex's bandwagon so that there is a mechanism available for documenting such behaviors at the code level. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From neal@metaslash.com Wed Jul 17 00:05:03 2002 From: neal@metaslash.com (Neal Norwitz) Date: Tue, 16 Jul 2002 19:05:03 -0400 Subject: [Python-Dev] atexit test failing Message-ID: <3D34A69F.91B60F74@metaslash.com> Tim: test_atexit is failing for me because the test assumes that the python being tested is the first one found in the path. This is not true on my system. Would it be safer to use the patch below which replaces: "python " + fname with "%s %s" % (sys.executable, fname) Neal -- > import sys 25c26 < p = os.popen("python " + fname) --- > p = os.popen("%s %s" % (sys.executable, fname)) 53c54 < p = os.popen("python " + fname) --- > p = os.popen("%s %s" % (sys.executable, fname)) From tim.one@comcast.net Wed Jul 17 01:33:54 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 16 Jul 2002 20:33:54 -0400 Subject: [Python-Dev] atexit test failing In-Reply-To: <3D34A69F.91B60F74@metaslash.com> Message-ID: [Neal Norwitz] > test_atexit is failing for me because the test assumes that > the python being tested is the first one found in the path. > This is not true on my system. I wondered about that, but figured "it must be safe" since test_popen's _do_test_commandline also does a popen("python ..."). So what I'm actually assuming is that you don't have a broken Python first on your path . 
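The idea behind the proposed change can be sketched as follows, assuming a current Python 3 interpreter and the subprocess module in place of the os.popen call used in the test at the time (a substitution for illustration, not the code in test_atexit). The script contents and variable names are made up for the example.

```python
import os
import subprocess
import sys
import tempfile

# Write a tiny script that registers an exit handler, much as the test does.
script = "import atexit\natexit.register(lambda: print('handler ran'))\n"
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write(script)
    fname = f.name

try:
    # sys.executable names the interpreter running this code, so the child
    # process is the Python under test, whatever "python" on PATH points to.
    result = subprocess.run([sys.executable, fname],
                            capture_output=True, text=True, check=True)
    output = result.stdout.strip()
finally:
    os.remove(fname)
```

Passing the interpreter and script as a list also sidesteps quoting problems when either path contains spaces, which the "%s %s" string form would not handle.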
> Would it be safer to use the patch below which replaces: > "python " + fname > with > "%s %s" % (sys.executable, fname) I don't know. For example, does that work for you? If so, that would be a good start. It works for me, so I'll check that in. From ping@zesty.ca Wed Jul 17 02:18:05 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Tue, 16 Jul 2002 18:18:05 -0700 (PDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207152325.g6FNPD128921@europa.research.att.com> Message-ID: On Mon, 15 Jul 2002, Andrew Koenig wrote: > However the purpose my suggestion of __multiter__ was not to use it to > test for multiple iteration, but to enable a container to be able to > yield either a single or a multiple iterator on request. I see what you want, though i have a hard time imagining a situation where it's really necessary to have both (as opposed to just the multiple iterator, which is strictly more capable). I can certainly see how you might want to be able to ask for a breadth-first or depth-first iterator on a tree, though. > > Or, what if there is no container to begin with, but the iterator is still > > copyable? You can't flag that by putting __multiter__ on anything; again > > it makes more sense to just provide __copy__ on the iterator. > > You could flag it by putting __multiter__ on the iterator, just as iterators > presently have __iter__. Ugh. I don't like this, for the reasons i outlined in another message: an iterator is not the same as a container. Iterators always mutate; containers usually do not (at least not as a result of looking at the elements). > > All that's really necessary here is to document the convention about what > > __copy__ is supposed to mean if it's available on an iterator. If we > > all agree that __copy__ should preserve an independent copy of the > > current state of the iterator, we're all set. > > Not quite. 
We also need an agreement that calling __iter__ on a container > is not a destructive operation unless you call next() on the iterator that > you get back. What i'd like is an agreement that calling __iter__ on a container is not a destructive operation at all. If it's destructive, then what you started with is not really a container, and we should encourage people to call attention to this irregularity in their documentation. > > I think a proliferation of iterator-fetching methods would be a > > messy and unpleasant prospect. After __iter__, __multiter__, > > and __ambiter__, what next? __mutableiter__? > > __depthfirstiter__? __breadthfirstiter__? > > A data structure that supports several different kinds of iteration > has to provide that support somehow. Agreed. I was unclear: what makes me uncomfortable is the pollution of the double-underscore namespace. When you do have a container-like object that supports various kinds of iteration, naturally you are going to need some methods for getting iterators. I just think it's not appropriate to establish special names for them. To me, the presence of double-underscores around a method name means that the method is called automatically. My expectation is that when i write a method with a "normal" name, the name itself will appear after a dot wherever that method is used; and that when there's a method with a "__special__" name, the method is called implicitly. The implicit call can occur via an operator (e.g. __add__), or to implement a protocol defined in the language (e.g. __init__), etc. If you see the string ".__" it means that something unusual is going on. If you follow this convention, then "__iter__" deserves a special name, because it is the specially blessed iterator-getter used by "for". There may be other iterator-getters, but they must be called explicitly, so they shouldn't get underscores. * * * An aside on "next" vs. 
"__next__": Note that this convention would also suggest that "next" should be called "__next__", since "for" calls "next" implicitly. I forget why we ended up going with "next" instead of "__next__". I think "__next__" would have been better, especially in light of this: Tim Peters wrote: > Requiring *some* method with a reserved name is an aid to > introspection, lest it become impossible to distinguish, say, > an iterator from an instance of a doubly-linked list node class > that just happens to supply methods named .prev() and .next() > for an unrelated purpose. This is exactly why the iterator protocol should consist of one method named "__next__" rather than two methods named "__iter__" (which has nothing to do with the act of iterating!) and "next" (which is the one we really care about, but can collide with existing method names). As far as i know, "next" is the only implicitly-called method of an internal protocol that has no underscores. It's a little late to fix the name of "next" in Python 2, though it might be worth considering for Python 3. -- ?!ng From guido@python.org Wed Jul 17 02:29:55 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 16 Jul 2002 21:29:55 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Tue, 16 Jul 2002 18:18:05 PDT." References: Message-ID: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> > An aside on "next" vs. "__next__": > > Note that this convention would also suggest that "next" should be > called "__next__", since "for" calls "next" implicitly. I forget > why we ended up going with "next" instead of "__next__". 
I think > "__next__" would have been better, especially in light of this: > > Tim Peters wrote: > > Requiring *some* method with a reserved name is an aid to > > introspection, lest it become impossible to distinguish, say, > > an iterator from an instance of a doubly-linked list node class > > that just happens to supply methods named .prev() and .next() > > for an unrelated purpose. > > This is exactly why the iterator protocol should consist of one > method named "__next__" rather than two methods named "__iter__" > (which has nothing to do with the act of iterating!) and "next" > (which is the one we really care about, but can collide with > existing method names). > > As far as i know, "next" is the only implicitly-called method of > an internal protocol that has no underscores. It's a little late > to fix the name of "next" in Python 2, though it might be worth > considering for Python 3. Yup. I regret this too. We should have had a built-in next(x) which calls x.__next__(). I think that if it had been __next__() we wouldn't have the mistake that I just discovered -- that all the iterator types that define a next() method shouldn't have done so, because you get one automatically which is the tp_iternext slot wrapped. :-( But yes, it's too late to change now. --Guido van Rossum (home page: http://www.python.org/~guido/) From ping@zesty.ca Wed Jul 17 03:16:35 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Tue, 16 Jul 2002 19:16:35 -0700 (PDT) Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability In-Reply-To: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Tue, 16 Jul 2002, Guido van Rossum wrote: > Yup. I regret this too. We should have had a built-in next(x) which > calls x.__next__(). Ah! I think one of the hang-ups i (we? the BOF?) got stuck on was that users of iterators would sometimes call next() directly, and so it wouldn't do to call it __next__. 
But it's clear to me now that a built-in next() is exactly the right answer, by analogy to the built-in repr() and method __repr__. > But yes, it's too late to change now. Sigh. Well, in the hope that it makes the change a little easier to swallow, i'll say now that if the protocol is fixed in some future version of Python, i'll volunteer to update the standard library to the new protocol. I guess when Python 3 comes around, there's going to be some sort of migration helper tool, and that tool can check for classes that have __iter__ and next, and suggest changing the name to __next__. -- ?!ng From mhammond@skippinet.com.au Wed Jul 17 14:37:06 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 17 Jul 2002 23:37:06 +1000 Subject: [Python-Dev] Review of build system patch requested Message-ID: I would like review of a patch that touches the configure/build system. The patch is to fix/deprecate the DL_IMPORT macros that pepper the Python source code. These macros were originally introduced for the Windows port many years ago as a way of declaring special linkage for the Python API functions and data exposed in the Python DLL. It has since been used in the cygwin and BeOS ports, and, to put it bluntly, is broken! The patch touches the configure/build system to provide a consistent mechanism for declaring special linkage macros regardless of platform. Specifically:
* configure.in has been changed to #define Py_ENABLE_SHARED in pyconfig.h if Python has been configured for building as a shared library.
* Makefile.pre.in has been changed to pass "-DPy_BUILD_CORE" to the compiler when building Python itself and any builtin modules. This flag is not passed to extension modules.
* pyport.h has been changed to define the macros PyAPI_FUNC, PyAPI_DATA and PyMODINIT_FUNC. For Windows, cygwin and BeOS, these will resolve to "__declspec" directives (depending on Py_ENABLE_SHARED and Py_BUILD_CORE). For all other platforms these will resolve to nothing.
The patch also contains significant changes to PC/pyconfig.h - while reviews of that code are welcome, I am primarily interested in reviews of the above three points, and some indication from gurus on Linux and other platforms that these changes are reasonable (or if I am lucky, desirable ) www.python.org/sf/566100 - Patch [ 566100 ] Rationalize DL_IMPORT and DL_EXPORT Thanks, Mark. From cce@clarkevans.com Wed Jul 17 14:45:04 2002 From: cce@clarkevans.com (Clark C . Evans) Date: Wed, 17 Jul 2002 09:45:04 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Tue, Jul 16, 2002 at 09:29:55PM -0400 References: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020717094504.A85351@doublegemini.com> On Tue, Jul 16, 2002 at 09:29:55PM -0400, Guido van Rossum wrote: | > An aside on "next" vs. "__next__": | > | > As far as i know, "next" is the only implicitly-called method of | > an internal protocol that has no underscores. It's a little late | > to fix the name of "next" in Python 2, though it might be worth | > considering for Python 3. | | Yup. I regret this too. | But yes, it's too late to change now. I don't think it is too late. 90% ++ of the python code base out there doesn't use iterators yet... people are still wrapping their minds around it to see how they can use it in their applications. If it was publicly stated that this could be "fixed" in the next version I don't think that it would hurt. These things happen, and sometimes its best to "roll back". Programmers understand this. Best, Clark -- Clark C. Evans Axista, Inc. http://www.axista.com 800.926.5525 XCOLLA Collaborative Project Management Software From barry@zope.com Wed Jul 17 14:48:09 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 17 Jul 2002 09:48:09 -0400 Subject: [Python-Dev] Single- vs. 
Multi-pass iterability References: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> <20020717094504.A85351@doublegemini.com> Message-ID: <15669.30105.128625.569434@anthem.wooz.org> >>>>> "CC" == Clark C writes: CC> I don't think it is too late. 90% ++ of the python code base CC> out there doesn't use iterators yet... people are still CC> wrapping their minds around it to see how they can use it in CC> their applications. If it was publicly stated that this could CC> be "fixed" in the next version I don't think that it would CC> hurt. These things happen, and sometimes its best to "roll CC> back". Programmers understand this. And besides (to continue Clark's devils advocacy), how much of the code out there that /does/ use iterators, calls .next() explicitly? -Barry From David Abrahams" <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net><20020717094504.A85351@doublegemini.com> <15669.30105.128625.569434@anthem.wooz.org> Message-ID: <040701c22d98$bdf45320$6501a8c0@boostconsulting.com> From: "Barry A. Warsaw" > CC> I don't think it is too late. 90% ++ of the python code base > CC> out there doesn't use iterators yet... people are still > CC> wrapping their minds around it to see how they can use it in > CC> their applications. If it was publicly stated that this could > CC> be "fixed" in the next version I don't think that it would > CC> hurt. These things happen, and sometimes its best to "roll > CC> back". Programmers understand this. > > And besides (to continue Clark's devils advocacy), how much of the > code out there that /does/ use iterators, calls .next() explicitly? Hmm, I'm getting excited! We rarely get an opportunity to fix mistakes in language design. Probably someone will bring me back to reality shortly, though ;-) Maybe I'll do it: the problem is really the iterators people have written. However, you could implicitly generate __next__() which calls next() when the result of __iter__() lacks a __next__() function... 
with a warning, of course. -Dave From guido@python.org Wed Jul 17 15:09:13 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 10:09:13 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Wed, 17 Jul 2002 09:45:04 EDT." <20020717094504.A85351@doublegemini.com> References: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> <20020717094504.A85351@doublegemini.com> Message-ID: <200207171409.g6HE9Di00659@odiug.zope.com> > I don't think it is too late. 90% ++ of the python code base out > there doesn't use iterators yet... people are still wrapping their > minds around it to see how they can use it in their applications. > If it was publicly stated that this could be "fixed" in the next > version I don't think that it would hurt. These things happen, > and sometimes its best to "roll back". Programmers understand this. I find this really hard to believe, given that such a big deal has been made of iterators. Care to conduct a survey on c.l.py? Given that it's really only a very minor problem, I'd rather not expend the effort to 'fix" this. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jul 17 15:18:34 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 10:18:34 -0400 Subject: [Python-Dev] Review of build system patch requested In-Reply-To: Your message of "Wed, 17 Jul 2002 23:37:06 +1000." References: Message-ID: <200207171418.g6HEIZo00747@odiug.zope.com> > * Makefile.pre.in has been changed to pass "-DPy_BUILD_CORE" to the compiler > when building Python itself and any builtin modules. This flag is > not passed to extension modules. My only concern would be that tools which parse the Makefile (I believe distutils does this?) should not accidentally pick up the "-DPy_BUILD_CORE" flag. Apart from that I trust your judgement and Neal's test drive. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From cce@clarkevans.com Wed Jul 17 15:49:35 2002 From: cce@clarkevans.com (Clark C . Evans) Date: Wed, 17 Jul 2002 10:49:35 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207171409.g6HE9Di00659@odiug.zope.com>; from guido@python.org on Wed, Jul 17, 2002 at 10:09:13AM -0400 References: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> <20020717094504.A85351@doublegemini.com> <200207171409.g6HE9Di00659@odiug.zope.com> Message-ID: <20020717104935.A86293@doublegemini.com> On Wed, Jul 17, 2002 at 10:09:13AM -0400, Guido van Rossum wrote: | > I don't think it is too late. 90% ++ of the python code base out | > there doesn't use iterators yet... people are still wrapping their | > minds around it to see how they can use it in their applications. | > If it was publicly stated that this could be "fixed" in the next | > version I don't think that it would hurt. These things happen, | > and sometimes its best to "roll back". Programmers understand this. | | I find this really hard to believe, given that such a big deal has | been made of iterators. None of my code uses explicit use of iterators, and I was very aware of them. My new code that I'm building now does, but it wouldn't take much effort to fix it. I myself personally would rather keep Python "clean" of blemish. For the most part, Python is really free of dragons and that's why I like it. I'm willing to put up with short-term pain for long term gain. Unlike Java or Visual Basic, I intend to be programming in Python 10+ years from now; so from my perspective, it is an investment. Plus, most features don't get used by the public for at least a year or so as it takes a while for the code-examples to start using them and books to be updated. | Care to conduct a survey on c.l.py? Sure. I'll run the survey and report back. What would be the options?
It'll be a simple CGI form using a radio or check boxes and a button. I'll aggregate the results. To do this I need: - A specific description of what would change - An example of what would break, plus what it would be replaced with. - An explanation of what problems occur when the blemish isn't fixed (what can't you do?) | Given that it's really only a very minor problem, I'd rather not | expend the effort to 'fix" this. Well, if it is a minor problem, it shouldn't be that hard to fix. *evil grins* Clark -- Clark C. Evans Axista, Inc. http://www.axista.com 800.926.5525 XCOLLA Collaborative Project Management Software From guido@python.org Wed Jul 17 16:03:48 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 11:03:48 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Wed, 17 Jul 2002 10:49:35 EDT." <20020717104935.A86293@doublegemini.com> References: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> <20020717094504.A85351@doublegemini.com> <200207171409.g6HE9Di00659@odiug.zope.com> <20020717104935.A86293@doublegemini.com> Message-ID: <200207171503.g6HF3mW01047@odiug.zope.com> > None of my code uses explicit use of iterators, and I was very > aware of them. My new code that I'm building now does, but it > wouldn't take much effort to fix it. I myself personally would > rather keep Python "clean" of blemish. For the most part, > Python is really free of dragons and that's why I like it. I'm > willing to put up with short-term pain for long term gain. Unlike > Java or Visual Basic, I intend to be programming in Python 10+ > years from now; so from my perspective, it is an investment. Calling it a dragon sounds way overstated. Another issue is that we can't really fix this retroactively in Python 2.2. 
Python 2.2 has been elected to be the "Python-in-a-Tie" favored by the Python Business Forum, giving it a very long life expectancy -- 18 months from the first official release of Python-in-a-Tie (probably Python 2.2.2), plus however long it takes people to want to upgrade after that.
> Plus, most features don't get used by the public for at least > a year or so as it takes a while for the code-examples to > start using them and books to be updated. > > | Care to conduct a survey on c.l.py? > > Sure. I'll run the survey and report back. What would > be the options? It'll be a simple CGI form using a radio > or check boxes and a button. I'll aggregate the results. > To do this I need:
> - A specific description of what would change
> - An example of what would break, plus what it would be replaced with.
> - An explanation of what problems occur when the blemish isn't fixed (what can't you do?)
- The mapping between the next() method and the tp_iternext slot in the type object would disappear, and instead the __next__() method would be mapped to this slot. This means that every iterator definition written in Python has to be changed from "def next(self): ..." to "def __next__(self): ...".
- There would be a new built-in function, next(), which calls the __next__() method on its argument.
- Calls to it.next() will have to be changed to call next(it) instead. (it.__next__() would also work but is not recommended.)
- There really isn't anything "broken" about the current situation; it's just that "next" is the only method name mapped to a slot in the type object that doesn't have leading and trailing double underscores.
--Guido van Rossum (home page: http://www.python.org/~guido/) From ark@research.att.com Wed Jul 17 16:27:31 2002 From: ark@research.att.com (Andrew Koenig) Date: Wed, 17 Jul 2002 11:27:31 -0400 (EDT) Subject: [Python-Dev] Single- vs.
Multi-pass iterability In-Reply-To: (message from Ka-Ping Yee on Tue, 16 Jul 2002 18:18:05 -0700 (PDT)) References: Message-ID: <200207171527.g6HFRV214431@europa.research.att.com> Ping> On Mon, 15 Jul 2002, Andrew Koenig wrote: >> However the purpose my suggestion of __multiter__ was not to use it to >> test for multiple iteration, but to enable a container to be able to >> yield either a single or a multiple iterator on request. Ping> I see what you want, though i have a hard time imagining a situation Ping> where it's really necessary to have both (as opposed to just the Ping> multiple iterator, which is strictly more capable). I can certainly Ping> see how you might want to be able to ask for a breadth-first or Ping> depth-first iterator on a tree, though. How about a class that represents a file? If you ask it for a single iterator, that's easy. If you ask it for a multiple iterator, it checks whether the file is really an interactive device such as a pipe or a keyboard. If so, it uses a buffering mechanism to simulate multiple iteration; otherwise, it lets the multiple iterators access the file directly. Then when you ask to iterate over the file, you automatically get the least cumbersome mechanism needed to support the kind of iteration that you requested. >> > Or, what if there is no container to begin with, but the iterator is still >> > copyable? You can't flag that by putting __multiter__ on anything; again >> > it makes more sense to just provide __copy__ on the iterator. >> You could flag it by putting __multiter__ on the iterator, just as iterators >> presently have __iter__. Ping> Ugh. I don't like this, for the reasons i outlined in another message: Ping> an iterator is not the same as a container. Iterators always mutate; Ping> containers usually do not (at least not as a result of looking at the Ping> elements). 
The scenario is this:
    def search(thing):
        iter = thing.__multiter__()
        # now either iter is an iterator that supports __copy__
        # or we will raise an exception (and raise it here, rather
        # than waiting for the first time we try to copy iter).
>> Not quite. We also need an agreement that calling __iter__ on a container >> is not a destructive operation unless you call next() on the iterator that >> you get back.
Ping> What i'd like is an agreement that calling __iter__ on a container is Ping> not a destructive operation at all. If it's destructive, then what you Ping> started with is not really a container, and we should encourage people Ping> to call attention to this irregularity in their documentation.
Is a file a container or not? Isn't making an iterator from a file and calling next() on it a destructive operation?
>> > I think a proliferation of iterator-fetching methods would be a >> > messy and unpleasant prospect. After __iter__, __multiter__, >> > and __ambiter__, what next? __mutableiter__? >> > __depthfirstiter__? __breadthfirstiter__?
>> A data structure that supports several different kinds of iteration >> has to provide that support somehow.
Ping> Agreed. I was unclear: what makes me uncomfortable is the pollution Ping> of the double-underscore namespace. When you do have a container-like Ping> object that supports various kinds of iteration, naturally you are Ping> going to need some methods for getting iterators. I just think it's Ping> not appropriate to establish special names for them.
Fair enough. But then why is __iter__ special?
Ping> To me, the presence of double-underscores around a method name means Ping> that the method is called automatically. My expectation is that when Ping> i write a method with a "normal" name, the name itself will appear Ping> after a dot wherever that method is used; and that when there's a Ping> method with a "__special__" name, the method is called implicitly.
Ping> The implicit call can occur via an operator (e.g. __add__), or to Ping> implement a protocol defined in the language (e.g. __init__), etc. Ping> If you see the string ".__" it means that something unusual is going on. Ping> If you follow this convention, then "__iter__" deserves a special name, Ping> because it is the specially blessed iterator-getter used by "for". Ping> There may be other iterator-getters, but they must be called explicitly, Ping> so they shouldn't get underscores. Ah, is it only "for" that makes __iter__ special, and not iter()? Ping> An aside on "next" vs. "__next__": Ping> This is exactly why the iterator protocol should consist of one Ping> method named "__next__" rather than two methods named "__iter__" Ping> (which has nothing to do with the act of iterating!) and "next" Ping> (which is the one we really care about, but can collide with Ping> existing method names). Ping> As far as i know, "next" is the only implicitly-called method of Ping> an internal protocol that has no underscores. It's a little late Ping> to fix the name of "next" in Python 2, though it might be worth Ping> considering for Python 3. One way to clarify a discussion of a protocol is to append an "s" and think of a plurality of protocols, so as to see which properties are truly intrinsic and which can vary between protocols. That's part of what I'm trying to do in this discussion. (and I don't presently have a strong opinion about what the right answer is. I don't even know for sure what the question is.) From aleax@aleax.it Wed Jul 17 17:08:51 2002 From: aleax@aleax.it (Alex Martelli) Date: Wed, 17 Jul 2002 18:08:51 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207161350.g6GDoM522277@odiug.zope.com> References: <200207161350.g6GDoM522277@odiug.zope.com> Message-ID: On Tuesday 16 July 2002 03:50 pm, Guido van Rossum wrote: ... > I dunno.
The presence of seek() and write() makes the behavior of > files a rather unique blend of iterator and iterable. All files have seek and write, but not on all files do they work -- and the same goes for iteration. I.e., it IS something of a mess, probably because the file object's is the only example of "fat interface" problem in Python -- an interface that exposes a lot of methods, with many objects claiming they implement that interface but actually lying (because they only implement a subset of it -- trying to use methods they can't in fact provide raises exceptions). The galaxy of Microsoft interfaces based on COM has sadly many fat interfaces and it IS the worst mess with that galaxy. Anyway, a rewindable-iterator is not an iterable in any case. You can't have two nested loops on it -- that's crucial. Making a file into an iterable requires wrapping it with a class that caches it. If and when rewindable iterators are recognized as such by Python, files whose seek(0) method doesn't raise will make a fine example. But iterables, they ain't, just like rewindable iterators in general aren't. > > I don't see any downside to having this micro-wart removed. In > > particular, I don't see what's confusing. Things that respond to > > iter(x) fall in two categories: > > iterators: also have x.next(), and iter(x) is x > > iterables: iter(x) is not x, so you can presumably get another > > iterator out of x at some later point in time if needed. > > It's not QUITE as simple as this, but moving file objects from > > the second category to the first seems to _simplify_ things a bit. > > I worry that equating a file with its iterable makes it more likely > that people mix next() with readline() or seek(), which doesn't work > (at least not until the I/O system is rewritten). It's exactly to DISTINGUISH a file from "its iterable" (which it does not have) that I'd like files to be iterators, NOT fake iterables. f.seek does cooperate with f.next now, doesn't it? 
since it invalidates f's xreadlines object, if any?
> I'd be more comfortable with teaching people that you should *either* > use a file in a for loop (the common case, probably) *or* use its > native I/O methods (readline() etc.), but not mix both.
Fine (I think BOTH cases are very common), although it will probably be handier one day if/when the I/O system is indeed rewritten. But having "iter(f) is f" isn't really germane to this issue.
> > E.g.:
> >
> > def useIterable(x):
> >     try:
> >         it = iter(x)
> >     except TypeError:
> >         raise TypeError, "Need iterable object, not %s" % type(x)
> >     if it is x:
> >         raise TypeError, "Need iterable object, not iterator"
> >     # keep happily using it and/or x as needed, and in particular
> >     # the code is able to call it1 = iter(x) if it needs to iterate again
> >
> > Not perfect -- but having a file-object argument fail this simplistic > > test seems better to me, less confusing, than having it pass.
>
> This actually looks like an example of the "look before you leap" > (LBYL) syndrome, which you disapproved of recently.
Only if you don't look carefully enough. It uses try/except when it can (just to change the exception's contents -- probably might as well not bother and just do it=iter(x) without a try), it uses a guarded raise statement when it must, because there's no way it could get an exception out of the case it can't handle. Consider, by analogy:
    def loopUntilConvergence(f, x, epsilon):
        y = f(x)
        while abs(x-y) > epsilon:
            x = y
            y = f(x)
        return y
Now what happens if you mistakenly pass epsilon<0? Oops -- an infinite loop. So, one may add:
        if epsilon<0:
            raise ValueError, "Need epsilon>=0, not %s" % epsilon
Is this an example of erroneous use of LBYL rather than EAFP? No, because no exception would be raised by the infinite loop, so there is no alternative to doing the checks.
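The analogy can be made concrete with a runnable version of the loop, assuming a current Python 3 interpreter (where raise takes a call rather than the comma form used at the time). The snake_case name and the max_iterations parameter are assumptions added for illustration; the latter is one way to bound a loop that might never converge.

```python
import math

def loop_until_convergence(f, x, epsilon, max_iterations=1000):
    # A negative epsilon would never raise on its own; the loop would just
    # spin forever, so an explicit check is the only way to report it.
    if epsilon < 0:
        raise ValueError("Need epsilon>=0, not %s" % epsilon)
    y = f(x)
    for _ in range(max_iterations):
        if abs(x - y) <= epsilon:
            return y
        x = y
        y = f(x)
    # Checked after the fact: f and the starting x did not converge in time.
    raise RuntimeError("no convergence after %d iterations" % max_iterations)

# Iterating cos from 1.0 settles on its fixed point.
root = loop_until_convergence(math.cos, 1.0, 1e-9)
```

Both guards are explicit tests rather than try/except because neither error condition would ever surface as an exception by itself.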
In exactly the same way, there is no alternative to checking in useIterable, because there is no exception one could count on -- rather, we'd have a case of an error passing silently. In other words: that EAFP is preferable to LBYL does NOT mean that one should NEVER use: if whatever: raise something because certain error conditions do reveal themselves only in ways testable with an if, NOT by raising exceptions themselves. And some you can't even test with an if, and then you're in trouble (e.g., in loopUntilConvergence, nothing assures us that f and the initial x ARE such as to converge -- so, one would further have a maximum-iteration-count argument, defaulting to something suitably big, count iterations, and do something of a look-AFTER-you've-leaped to raise on non-iteration:-). This doesn't have all that much to do with file objects being or not being iterators, but I love rambling discussions anyway:-). Alex From aahz@pythoncraft.com Wed Jul 17 17:17:55 2002 From: aahz@pythoncraft.com (Aahz) Date: Wed, 17 Jul 2002 12:17:55 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <20020717104935.A86293@doublegemini.com> References: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> <20020717094504.A85351@doublegemini.com> <200207171409.g6HE9Di00659@odiug.zope.com> <20020717104935.A86293@doublegemini.com> Message-ID: <20020717161754.GA22297@panix.com> On Wed, Jul 17, 2002, Clark C . Evans wrote: > > Sure. I'll run the survey and report back. What would > be the options? It'll be a simple CGI form using a radio > or check boxes and a button. I'll aggregate the results. Make sure you ask for e-mail addresses to prevent duplicates. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From guido@python.org Wed Jul 17 17:23:46 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 12:23:46 -0400 Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: Your message of "Wed, 17 Jul 2002 18:08:51 +0200." References: <200207161350.g6GDoM522277@odiug.zope.com> Message-ID: <200207171623.g6HGNki02368@odiug.zope.com> > All files have seek and write, but not on all files do they work -- and > the same goes for iteration. I.e., it IS something of a mess, probably > because the file object's is the only example of "fat interface" problem > in Python -- an interface that exposes a lot of methods, with many > objects claiming they implement that interface but actually lying > (because they only implement a subset of it -- trying to use methods > they can't in fact provide raises exceptions). Yup. I inherited this from C stdio. :-( > But iterables, they ain't, just like rewindable iterators in general aren't. Can you remind me of your definition of "iterable"? Mine is "something for which iter() works", which clearly isn't yours. :-) > f.seek does cooperate with f.next now, doesn't it? since it > invalidates f's xreadlines object, if any? Not yet. You may have seen Oren's patch for this. Unfortunately it has a problem in that it creates a cycle, and neither type supports GC... So I'm not sure if it ever will -- this is an implementation mess as much as a conceptual mess. :-( --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax@aleax.it Wed Jul 17 17:34:09 2002 From: aleax@aleax.it (Alex Martelli) Date: Wed, 17 Jul 2002 18:34:09 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207171623.g6HGNki02368@odiug.zope.com> References: <200207171623.g6HGNki02368@odiug.zope.com> Message-ID: On Wednesday 17 July 2002 06:23 pm, Guido van Rossum wrote: ... > > But iterables, they ain't, just like rewindable iterators in general > > aren't. > > Can you remind me of your definition of "iterable"? Mine is > "something for which iter() works", which clearly isn't yours. :-) Right -- I mean something closer to what I've seen others call "a container". 
By your definition, iterators are indeed iterable. I would love for all iterables-by-your-definition to divide neatly into iterators and what-many-call-containers. The file object, unless you make it into an iterator, is not "a container" like all others and just sits there -- a bit of a wart. > > f.seek does cooperate with f.next now, doesn't it? since it > > invalidates f's xreadlines object, if any? > > Not yet. You may have seen Oren's patch for this. Unfortunately it Right -- that's what I had in mind. I had also tweaked it so that readline sort of interoperated with it (delegating to next if the file object is holding an xreadlines object) and sent the modified patch to Oren but he disliked it (because it meant readline would not respect its numeric argument, if any, in that case). > has a problem in that it creates a cycle, and neither type supports > GC... > > So I'm not sure if it ever will -- this is an implementation mess as > much as a conceptual mess. :-( I see your point. Darn!-(. Alex From guido@python.org Wed Jul 17 17:40:11 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 12:40:11 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Wed, 17 Jul 2002 18:34:09 +0200." References: <200207171623.g6HGNki02368@odiug.zope.com> Message-ID: <200207171640.g6HGeB302603@odiug.zope.com> > > Can you remind me of your definition of "iterable"? Mine is > > "something for which iter() works", which clearly isn't yours. :-) > > Right -- I mean something closer to what I've seen others call > "a container". By your definition, iterators are indeed iterable. > I would love for all iterables-by-your-definition to divide neatly > into iterators and what-many-call-containers. > > The file object, unless you make it into an iterator, is not "a > container" like all others and just sits there -- a bit of a wart. I must be misunderstanding. How does making the file object into an iterator make it a container??? 
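The two categories under discussion can be told apart with the `iter(x) is x` test quoted earlier in the thread. The following is a small sketch in present-day Python 3, using a list and its iterator as stand-ins; the helper name `nested_pairs` is made up for illustration.

```python
def nested_pairs(iterable):
    # Two nested loops over the same object: fine for a container,
    # surprising for an iterator, which the inner loop exhausts.
    pairs = []
    for a in iterable:
        for b in iterable:
            pairs.append((a, b))
    return pairs


numbers = [1, 2, 3]
it = iter(numbers)

# A container hands out a fresh iterator each time; an iterator hands out itself.
container_gives_new_iterator = iter(numbers) is not numbers
iterator_gives_itself = iter(it) is it

pairs_from_container = nested_pairs(numbers)  # all nine combinations
pairs_from_iterator = nested_pairs(it)        # inner loop eats the rest
```

With the iterator, the outer loop yields 1, the inner loop consumes 2 and 3, and the outer loop then finds nothing left, which is the "you can't have two nested loops on it" point.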
> > > f.seek does cooperate with f.next now, doesn't it? since it > > > invalidates f's xreadlines object, if any? > > > > Not yet. You may have seen Oren's patch for this. Unfortunately it > > Right -- that's what I had in mind. I had also tweaked it so that > readline sort of interoperated with it (delegating to next if the file > object is holding an xreadlines object) and sent the modified patch > to Oren but he disliked it (because it meant readline would not > respect its numeric argument, if any, in that case). Hm, you should've sent it to me. The numeric argument was a mistake I think. Who ever uses it? --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Wed Jul 17 17:49:44 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 17 Jul 2002 12:49:44 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207171527.g6HFRV214431@europa.research.att.com> References: (message from Ka-Ping Yee on Tue, 16 Jul 2002 18:18:05 -0700 (PDT)) Message-ID: <3D3567E8.17472.1EF7E920@localhost> On 17 Jul 2002 at 11:27, Andrew Koenig wrote: > How about a class that represents a file? If you > ask it for a single iterator, that's easy. If you > ask it for a multiple iterator, it checks whether > the file is really an interactive device such as a > pipe or a keyboard. If so, it uses a buffering > mechanism to simulate multiple iteration; otherwise, > it lets the multiple iterators access the file > directly. > > Then when you ask to iterate over the file, you > automatically get the least cumbersome mechanism > needed to support the kind of iteration that you > requested. OK, it's a pipe, and one iterator wants to go past what's been received. Is that iterator at EOF? Not really, just "temporary EOF". So should it block? But I'm single threaded and receiving asynchronously. Oh, and it turns out to be a humongous download, and what happens if the buffering mechanism runs out of memory / disk space. Does my process die? 
Aargh. Too much magic. Too many corner cases. -- Gordon http://www.mcmillan-inc.com/ From ark@research.att.com Wed Jul 17 17:53:27 2002 From: ark@research.att.com (Andrew Koenig) Date: Wed, 17 Jul 2002 12:53:27 -0400 (EDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <3D3567E8.17472.1EF7E920@localhost> (gmcm@hypernet.com) References: (message from Ka-Ping Yee on Tue, 16 Jul 2002 18:18:05 -0700 (PDT)) <3D3567E8.17472.1EF7E920@localhost> Message-ID: <200207171653.g6HGrRb15897@europa.research.att.com> Gordon> OK, it's a pipe, and one iterator wants to go past Gordon> what's been received. Is that iterator at EOF? Not Gordon> really, just "temporary EOF". So should it block? Gordon> But I'm single threaded and receiving asynchronously. Gordon> Oh, and it turns out to be a humongous download, Gordon> and what happens if the buffering mechanism runs Gordon> out of memory / disk space. Does my process die? Gordon> Aargh. Too much magic. Too many corner cases. The implementation of the file should have a choice: Either refuse to yield a multiple iterator (which seems to be your preference) or yield one that works (which might or might not be my preference, depending on circumstances). In the latter case, I don't think your questions are hard to answer, because most of the answers fall out of the single-iterator case. So if the iterator is at EOF, it should either block or not, depending on what a single iterator should do. The only real question is what happens if the buffering mechanism runs out of space, but that's always a question for such mechanisms; I don't see why it's any more irksome in this particular context. From aleax@aleax.it Wed Jul 17 17:55:48 2002 From: aleax@aleax.it (Alex Martelli) Date: Wed, 17 Jul 2002 18:55:48 +0200 Subject: [Python-Dev] Single- vs.
Multi-pass iterability In-Reply-To: <200207171640.g6HGeB302603@odiug.zope.com> References: <200207171640.g6HGeB302603@odiug.zope.com> Message-ID: On Wednesday 17 July 2002 06:40 pm, Guido van Rossum wrote: ... > > The file object, unless you make it into an iterator, is not "a > > container" like all others and just sits there -- a bit of a wart. > > I must be misunderstanding. How does making the file object into an > iterator make it a container??? My fault for unclear expression! I mean: if it's an iterator, it's an iterator. All OTHER iterables (iterables that aren't iterators) are (what some call) containers. It's not QUITE that way, but Python would be easier to teach if it were. > > to Oren but he disliked it (because it meant readline would not > > respect its numeric argument, if any, in that case). > > Hm, you should've sent it to me. The numeric argument was a mistake I > think. Who ever uses it? Not me, and I think it's advisory anyway according to the docs. Still, it doesn't solve the reference-loop-between-two-deuced-things- that-don't-cooperate-with-gc problem. And I can't see how either could be made into a WEAK reference given that xreadlines objects in other contexts need to hold a strong ref to the file they work on -- we'd have to refactor xreadlines objects too, a core part holding a weak ref and a shell around it (holding a strong ref to the file) to support ordinary calls to xreadlines.xreadlines. Messy:-(. Alex From mal@lemburg.com Wed Jul 17 17:55:36 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Wed, 17 Jul 2002 18:55:36 +0200 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21 References: Message-ID: <3D35A188.20407@lemburg.com> jhylton@users.sourceforge.net wrote: > Update of /cvsroot/python/python/dist/src/Objects > In directory usw-pr-cvs1:/tmp/cvs-serv17711/Objects > > Modified Files: > dictobject.c floatobject.c intobject.c listobject.c > longobject.c rangeobject.c stringobject.c tupleobject.c > typeobject.c unicodeobject.c xxobject.c > Log Message: > staticforward bites the dust. > > The staticforward define was needed to support certain broken C > compilers (notably SCO ODT 3.0, perhaps early AIX as well) botched the > static keyword when it was used with a forward declaration of a static > initialized structure. Standard C allows the forward declaration with > static, and we've decided to stop catering to broken C compilers. (In > fact, we expect that the compilers are all fixed eight years later.) You'd think so :-) From a support file of the mx tools: /* --- Platform or compiler specific tweaks ------------------------------- */ /* Add some platform specific symbols to enable work-arounds for the static forward declaration of type definitions; note that the GNU C compiler does not have this problem. Many thanks to all who have contributed to this list. */ #if (!defined(__GNUC__)) # if (defined(NeXT) || defined(sgi) || defined(_AIX) || (defined(__osf__) && defined(__DECC)) || defined(TrueComaq64) || defined(__VMS)) # define BAD_STATIC_FORWARD # endif #endif /* Some more tweaks for various platforms. */ /* VMS needs this define. 
Thanks to Jean-François PIÉRONNE */ #if defined(__VMS) # define __SC__ #endif /* xlC on AIX doesn't like the Python work-around for static forwards in ANSI mode (default), so we switch on extended mode. Thanks to Albert Chin-A-Young */ #if defined(__xlC__) # pragma langlvl extended #endif > I'm leaving staticforward and statichere defined in object.h as > static. This is only for backwards compatibility with C extensions > that might still use it. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Wed Jul 17 18:38:35 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 13:38:35 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Wed, 17 Jul 2002 18:55:48 +0200." References: <200207171640.g6HGeB302603@odiug.zope.com> Message-ID: <200207171738.g6HHcZa09875@odiug.zope.com> > > > The file object, unless you make it into an iterator, is not "a > > > container" like all others and just sits there -- a bit of a wart. > > > > I must be misunderstanding. How does making the file object into an > > iterator make it a container??? > > My fault for unclear expression! I mean: if it's an iterator, it's an > iterator. All OTHER iterables (iterables that aren't iterators) are > (what some call) containers. > > It's not QUITE that way, but Python would be easier to teach if > it were. But leaving the file object as an exception to the rule helps as a reminder that it's just a rule of thumb and cannot be taken as absolute law. > > > to Oren but he disliked it (because it meant readline would not > > > respect its numeric argument, if any, in that case). > > > > Hm, you should've sent it to me. The numeric argument was a mistake I > > think. Who ever uses it?
> > Not me, and I think it's advisory anyway according to the docs. > > Still, it doesn't solve the reference-loop-between-two-deuced-things- > that-don't-cooperate-with-gc problem. And I can't see how either > could be made into a WEAK reference given that xreadlines objects > in other contexts need to hold a strong ref to the file they work on -- > we'd have to refactor xreadlines objects too, a core part holding a > weak ref and a shell around it (holding a strong ref to the file) to > support ordinary calls to xreadlines.xreadlines. Messy:-(. I don't think that a weak ref to the file would be sufficient for xreadlines -- e.g. for line in open(filename): print line, would close the file right away. Likewise, the file needs a strong ref to the xreadlines, otherwise the following would create a new iterator in the second for loop, and lose data buffered by the first iterator. f = open(filename) it = iter(f) for i in range(10): it.next() del it for line in f: print line, I think I will have to reject Oren's patch because of this, and the situation with file iterators will remain as it is: once you've asked for the iterator, all operations on the file are unsafe, and the only way to get back to using the file is to abandon the file and do an absolute seek on the file. (This is sort of like switching between the raw integer file descriptor and the stream object in C -- or in Python if you care to use f.fileno() and os.read() etc.) --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax@aleax.it Wed Jul 17 18:46:42 2002 From: aleax@aleax.it (Alex Martelli) Date: Wed, 17 Jul 2002 19:46:42 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207171738.g6HHcZa09875@odiug.zope.com> References: <200207171738.g6HHcZa09875@odiug.zope.com> Message-ID: On Wednesday 17 July 2002 07:38 pm, Guido van Rossum wrote: ... 
> But leaving the file object as an exception to the rule helps as a > reminder that it's just a rule of thumb and cannot be taken as > absolute law. The sublunar world has enough reminders of its imperfections that we need not strive to add more. > > Still, it doesn't solve the reference-loop-between-two-deuced-things- > > that-don't-cooperate-with-gc problem. And I can't see how either > > could be made into a WEAK reference given that xreadlines objects > > in other contexts need to hold a strong ref to the file they work on -- > > we'd have to refactor xreadlines objects too, a core part holding a > > weak ref and a shell around it (holding a strong ref to the file) to > > support ordinary calls to xreadlines.xreadlines. Messy:-(. > > I don't think that a weak ref to the file would be sufficient for > xreadlines -- e.g. > > for line in open(filename): > print line, > > would close the file right away. If the iterator were the file itself, no it wouldn't, whatever kind of ref the xreadlines object had to the file. What would break without refactoring would be: for line in xreadlines.xreadlines(open(filename)): ... The refactoring would be to have a, say _xreadlines, object, with the functionality of today's xreadlines object BUT a weak ref to the file, and an xreadlines object with strong refs to the file and the _xreadlines object and delegating functionality to the latter. A bit of a mess. > Likewise, the file needs a strong ref to the xreadlines, otherwise the Definitely! Otherwise nothing keeps the xreadlines (or _xreadlines) object around _at all_ -- it's even worse than you indicate below, it seems to me: > following would create a new iterator in the second for loop, and lose > data buffered by the first iterator. > > f = open(filename) > it = iter(f) ...with the patch it would be "it is f", and so, I don't really get it... 
> for i in range(10): > it.next() > del it > for line in f: > print line, > > I think I will have to reject Oren's patch because of this, and the > situation with file iterators will remain as it is: once you've asked > for the iterator, all operations on the file are unsafe, and the only > way to get back to using the file is to abandon the file and do an Abandon the iterator, you mean? Or am I hopelessly confused? > absolute seek on the file. (This is sort of like switching between > the raw integer file descriptor and the stream object in C -- or in > Python if you care to use f.fileno() and os.read() etc.) In these cases you do get some control on the buffering, though, if you care to exercise it. Alex From guido@python.org Wed Jul 17 19:07:31 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 14:07:31 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Wed, 17 Jul 2002 19:46:42 +0200." References: <200207171738.g6HHcZa09875@odiug.zope.com> Message-ID: <200207171807.g6HI7VS10049@odiug.zope.com> OK, I'll wait to see if someone submits a working patch. I still find it a non-issue myself. --Guido van Rossum (home page: http://www.python.org/~guido/) From ark@research.att.com Wed Jul 17 19:08:50 2002 From: ark@research.att.com (Andrew Koenig) Date: 17 Jul 2002 14:08:50 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207171738.g6HHcZa09875@odiug.zope.com> References: <200207171640.g6HGeB302603@odiug.zope.com> <200207171738.g6HHcZa09875@odiug.zope.com> Message-ID: Guido> Likewise, the file needs a strong ref to the xreadlines, Guido> otherwise the following would create a new iterator in the Guido> second for loop, and lose data buffered by the first iterator. 
Guido> f = open(filename) Guido> it = iter(f) Guido> for i in range(10): Guido> it.next() Guido> del it Guido> for line in f: Guido> print line, Guido> I think I will have to reject Oren's patch because of this, and Guido> the situation with file iterators will remain as it is: once Guido> you've asked for the iterator, all operations on the file are Guido> unsafe, and the only way to get back to using the file is to Guido> abandon the file and do an absolute seek on the file. This implies that you don't expect the code above to work correctly, right? -- Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark From guido@python.org Wed Jul 17 19:30:09 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 14:30:09 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Wed, 17 Jul 2002 14:08:50 EDT." References: <200207171640.g6HGeB302603@odiug.zope.com> <200207171738.g6HHcZa09875@odiug.zope.com> Message-ID: <200207171830.g6HIU9i10168@odiug.zope.com> > Guido> Likewise, the file needs a strong ref to the xreadlines, > Guido> otherwise the following would create a new iterator in the > Guido> second for loop, and lose data buffered by the first iterator. > > Guido> f = open(filename) > Guido> it = iter(f) > Guido> for i in range(10): > Guido> it.next() > Guido> del it > Guido> for line in f: > Guido> print line, > > Guido> I think I will have to reject Oren's patch because of this, and > Guido> the situation with file iterators will remain as it is: once > Guido> you've asked for the iterator, all operations on the file are > Guido> unsafe, and the only way to get back to using the file is to > Guido> abandon the file and do an absolute seek on the file. > > This implies that you don't expect the code above to work correctly, right? 
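The data loss described above can be reproduced with a toy model. This sketch, in present-day Python 3, is not the real xreadlines implementation: the class `ReadAheadLines`, its `chunk_size` parameter, and the sample data are all assumptions made for illustration. It only shows why an iterator that reads ahead takes buffered lines with it when it is discarded.

```python
import io


class ReadAheadLines:
    """Toy line iterator that reads fixed-size chunks ahead (hypothetical)."""

    def __init__(self, f, chunk_size=12):
        self.f = f
        self.chunk_size = chunk_size
        self.buffer = ""

    def __iter__(self):
        return self

    def __next__(self):
        # Keep reading chunks until the buffer holds a complete line or EOF.
        while "\n" not in self.buffer:
            chunk = self.f.read(self.chunk_size)
            if not chunk:
                break
            self.buffer += chunk
        if not self.buffer:
            raise StopIteration
        line, sep, rest = self.buffer.partition("\n")
        self.buffer = rest
        return line + sep


f = io.StringIO("one\ntwo\nthree\nfour\nfive\n")

it = ReadAheadLines(f)
first = next(it)   # reads 12 characters, returns only the first line
del it             # the buffered "two\nthre" is thrown away with it

# A second, fresh iterator starts at the file's current position,
# which is already past the data the first one had buffered.
remaining = list(ReadAheadLines(f))
```

Caching a single iterator on the file object, as the patch under discussion does, avoids this loss, which is where the reference cycle between the two objects comes from.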
I think that Oren's patch would make this work (the iterator requested by the second for loop would return the same iterator as the first one, since it's cached in the file object), but at the cost of an unbreakable cycle between the file and the xreadlines object. --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@alum.mit.edu Wed Jul 17 19:38:57 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Wed, 17 Jul 2002 14:38:57 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21 In-Reply-To: <3D35A188.20407@lemburg.com> References: <3D35A188.20407@lemburg.com> Message-ID: <15669.47553.15097.651868@slothrop.zope.com> Sigh :-(. Both C89 and C99 say that what we're doing is legal. I'll try this on the SF compile farm's True64 and see where I get. Reports of failures on other platforms would be appreciated. (Actual compiler output rather than include files. I don't want to believe you <0.3 wink>.) Jeremy From guido@python.org Wed Jul 17 19:52:15 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 14:52:15 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21 In-Reply-To: Your message of "Wed, 17 Jul 2002 14:38:57 EDT." 
<15669.47553.15097.651868@slothrop.zope.com> References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> Message-ID: <200207171852.g6HIqFY10241@odiug.zope.com> I also note that there seems to be a typo in Marc-Andre's include file: (defined(__osf__) && defined(__DECC)) || defined(TrueComaq64) || defined(__VMS)) Note the missing 'p' in 'TrueComaq64'. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Wed Jul 17 19:57:07 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 17 Jul 2002 20:57:07 +0200 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21 In-Reply-To: <200207171852.g6HIqFY10241@odiug.zope.com> References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <200207171852.g6HIqFY10241@odiug.zope.com> Message-ID: Guido van Rossum writes: > (defined(__osf__) && defined(__DECC)) || defined(TrueComaq64) || defined(__VMS)) > > Note the missing 'p' in 'TrueComaq64'. I thought it had superfluous 'q' instead... Regards, Martin From tim@zope.com Wed Jul 17 20:09:04 2002 From: tim@zope.com (Tim Peters) Date: Wed, 17 Jul 2002 15:09:04 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Message-ID: Note that it's easy to make objects cooperate with gc. We've historically only done so when the need was clear, because the gc header takes about a dozen extra bytes per gc-tracked object. There aren't enough files or xreadlines objects in existence to care about the extra memory burden here, though; we simply thought that objects of these types could never be in cycles. 
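A small sketch of the difference between refcount cleanup and cycle cleanup, in present-day Python 3. It relies on CPython's reference counting and on the `gc` and `weakref` modules; the `Node` class is a made-up stand-in for the file and xreadlines objects, and other Python implementations may behave differently.

```python
import gc
import weakref


class Node:
    pass


# An object with no cycle is reclaimed by reference counting alone,
# the instant its last reference goes away (CPython behavior).
lonely = Node()
lonely_probe = weakref.ref(lonely)
del lonely
lonely_alive = lonely_probe() is not None

gc.disable()  # keep the cyclic collector from running by itself
try:
    # Two objects that refer to each other, like a file caching its iterator.
    a, b = Node(), Node()
    a.partner, b.partner = b, a
    probe = weakref.ref(a)
    del a, b

    # Reference counts never reach zero, so the pair is still alive...
    alive_before_collect = probe() is not None
    # ...until the cyclic collector gets a chance to run.
    gc.collect()
    alive_after_collect = probe() is not None
finally:
    gc.enable()
```

For an object that holds an open file, the gap between the last `del` and the next collector run is the period during which the file would stay open.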
OTOH, if that means lazy code like for fname in os.listdir('.'): for line in file(fname): n += 1 would accumulate an ever-growing number of open file objects until gc happened to run and break cycles, I expect a lot of CPython programs would "suddenly break" (they rely on refcount semantics now closing the anonymous file object the instant it becomes unreachable). From cce@clarkevans.com Wed Jul 17 20:41:36 2002 From: cce@clarkevans.com (Clark C . Evans) Date: Wed, 17 Jul 2002 15:41:36 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207171503.g6HF3mW01047@odiug.zope.com>; from guido@python.org on Wed, Jul 17, 2002 at 11:03:48AM -0400 References: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> <20020717094504.A85351@doublegemini.com> <200207171409.g6HE9Di00659@odiug.zope.com> <20020717104935.A86293@doublegemini.com> <200207171503.g6HF3mW01047@odiug.zope.com> Message-ID: <20020717154136.A91218@doublegemini.com> On Wed, Jul 17, 2002 at 11:03:48AM -0400, Guido van Rossum wrote: | | Calling it a dragon sounds way overstated. Oh. I wasn't calling it a dragon... I was stating that Python is dragon free. | > | Care to conduct a survey on c.l.py? | > | > Sure. I'll run the survey and report back. Ok. Here is the survey form for comment before it is posted. http://yaml.org/wk/survey?id=pyiter I'll summarize the results after the survey has run its course... Best, Clark http://yaml.org From guido@python.org Wed Jul 17 20:46:23 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 15:46:23 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Wed, 17 Jul 2002 15:41:36 EDT." 
<20020717154136.A91218@doublegemini.com> References: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> <20020717094504.A85351@doublegemini.com> <200207171409.g6HE9Di00659@odiug.zope.com> <20020717104935.A86293@doublegemini.com> <200207171503.g6HF3mW01047@odiug.zope.com> <20020717154136.A91218@doublegemini.com> Message-ID: <200207171946.g6HJkN210645@odiug.zope.com> > Ok. Here is the survey form for comment before it is posted. > > http://yaml.org/wk/survey?id=pyiter > > I'll summarize the results after the survey has run its > course... Fine with me. Maybe the 4th para ("There really isn't ...") should be moved up so it becomes the 2nd. --Guido van Rossum (home page: http://www.python.org/~guido/) From ping@zesty.ca Wed Jul 17 20:58:55 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Wed, 17 Jul 2002 14:58:55 -0500 (CDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207171409.g6HE9Di00659@odiug.zope.com> Message-ID: On Wed, 17 Jul 2002, Guido van Rossum wrote: > Given that it's really only a very minor problem, I'd rather not > expend the effort to "fix" this. I do agree that there is a policy decision to be made about when it's appropriate to make a protocol change, and that this should be left to you, Guido. But i think this is more than a minor problem. This is a namespace collision problem, and that's significant. Naming the method "next" means that any object with a "next" method cannot be adapted to support the iterator protocol. Unfortunately "next" is a pretty common word and it's quite possible that such a method name is already in use. So it's worth thinking through. -- ?!ng From guido@python.org Wed Jul 17 21:00:54 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 16:00:54 -0400 Subject: [Python-Dev] Going to OSCON? Give a lightning talk!
Message-ID: <200207172000.g6HK0sp13487@odiug.zope.com> If you're going to the O'Reilly Open Source Convention next week, please consider giving a lightning talk. We have reserved two 45-minute slots in the Python track on Thursday afternoon for lightning talks. A lightning talk is a 5-minute tightly-focused presentation on any subject you like. You can discuss your favorite extension, rant, sing the praises of an under-appreciated developer, plug your product or company, beg for a job, or even present a Shakespearean-style play (don't laugh --- we had one of these in 2001). To submit your idea, fill out this simple web form: http://conferences.oreillynet.com/cs/os2002/create/e_sess?x-t=os2002_lt.create.form --Guido van Rossum (home page: http://www.python.org/~guido/) From cce@clarkevans.com Wed Jul 17 21:11:44 2002 From: cce@clarkevans.com (Clark C . Evans) Date: Wed, 17 Jul 2002 16:11:44 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: ; from ping@zesty.ca on Wed, Jul 17, 2002 at 02:58:55PM -0500 References: <200207171409.g6HE9Di00659@odiug.zope.com> Message-ID: <20020717161144.A91916@doublegemini.com> On Wed, Jul 17, 2002 at 02:58:55PM -0500, Ka-Ping Yee wrote: | But i think this is more than a minor problem. This is a | namespace collision problem, and that's significant. Naming | the method "next" means that any object with a "next" method | cannot be adapted to support the iterator protocol. Unfortunately | "next" is a pretty common word and it's quite possible that such | a method name is already in use. Right, but such objects wouldn't be misleading because they'd be missing a __iter__ method, correct? Best, Clark From guido@python.org Wed Jul 17 21:04:12 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 16:04:12 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Wed, 17 Jul 2002 14:58:55 CDT."
References: Message-ID: <200207172004.g6HK4C613511@odiug.zope.com> > But i think this is more than a minor problem. This is a > namespace collision problem, and that's significant. Naming > the method "next" means that any object with a "next" method > cannot be adapted to support the iterator protocol. Unfortunately > "next" is a pretty common word and it's quite possible that such > a method name is already in use. Can you explain this? Last time I checked CVS, PEP 246 wasn't implemented yet, so I don't think you mean "adapted" in that sense. Generally speaking, iterator implementations aren't created by making changes to an existing class -- they're created by creating a new class. The only change to *existing* classes needed is the addition of an __iter__ method to the underlying container object. So I'm not sure what you mean. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Wed Jul 17 21:19:36 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 17 Jul 2002 22:19:36 +0200 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21 References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <200207171852.g6HIqFY10241@odiug.zope.com> Message-ID: <3D35D158.6010908@lemburg.com> Guido van Rossum wrote: > I also note that there seems to be a typo in Marc-Andre's include > file: > > (defined(__osf__) && defined(__DECC)) || defined(TrueComaq64) || defined(__VMS)) > > Note the missing 'p' in 'TrueComaq64'. Good catch. Thanks. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Jul 17 21:25:19 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 17 Jul 2002 22:25:19 +0200 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21 References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <200207171852.g6HIqFY10241@odiug.zope.com> Message-ID: <3D35D2AF.9050604@lemburg.com> Martin v. Loewis wrote: > Guido van Rossum writes: > > >> (defined(__osf__) && defined(__DECC)) || defined(TrueComaq64) || defined(__VMS)) >> >>Note the missing 'p' in 'TrueComaq64'. > > > I thought it had superfluous 'q' instead... The whole name is superfluous ... it should be TrueHP64 ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Jul 17 21:32:38 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 17 Jul 2002 22:32:38 +0200 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21 References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> Message-ID: <3D35D466.5090903@lemburg.com> Jeremy Hylton wrote: > Sigh :-(. Both C89 and C99 say that what we're doing is legal. 
> > I'll try this on the SF compile farm's True64 and see where I get. > Reports of failures on other platforms would be appreciated. (Actual > compiler output rather than include files. I don't want to believe > you <0.3 wink>.) Can't provide you with that. I simply collect feedback from users having compile problems in that file. Note that most of these problems are related to declaring arrays as static forward (rather than C functions as Python normally does): staticforward PyMethodDef mxODBCursor_Methods[]; ...tons of code... statichere PyMethodDef mxODBCursor_Methods[] = { /* DB API interface */ ... I could eliminate those by clever rearranging the code, but have never had an actual need for it. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Wed Jul 17 21:45:11 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 16:45:11 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21 In-Reply-To: Your message of "Wed, 17 Jul 2002 22:32:38 +0200." <3D35D466.5090903@lemburg.com> References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> Message-ID: <200207172045.g6HKjBg13729@odiug.zope.com> > Can't provide you with that. I simply collect feedback from > users having compile problems in that file. 
Of course you never hear from users when their compiler is fixed so that a particular work-around is no longer necessary, so you keep collecting cruft until it collapses under its own weight. > Note that most of these problems are related to declaring > arrays as static forward (rather than C functions as Python > normally does): Note that staticforward was *only* intended for data declarations. It was never intended (nor needed) for functions. > staticforward PyMethodDef mxODBCursor_Methods[]; > > ...tons of code... > > statichere PyMethodDef mxODBCursor_Methods[] = > { > /* DB API interface */ > ... > > I could eliminate those by clever rearranging the code, > but have never had an actual need for it. You shouldn't need to. I suggest that we keep Jeremy's checkins in 2.3. Hopefully during the alpha or beta release cycle we will find out if there *really* are still platforms with broken compilers. At worst, it will show up after 2.3 final is released, and then we can fix it in 2.3.1. You won't have to target mx for 2.3 for another 18 months (assuming the PBF ever releases Python-in-a-Tie). --Guido van Rossum (home page: http://www.python.org/~guido/) From ping@zesty.ca Wed Jul 17 21:36:45 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Wed, 17 Jul 2002 15:36:45 -0500 (CDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207172004.g6HK4C613511@odiug.zope.com> Message-ID: On Wed, 17 Jul 2002, Guido van Rossum wrote: > > the method "next" means that any object with a "next" method > > cannot be adapted to support the iterator protocol. Unfortunately > > "next" is a pretty common word and it's quite possible that such > > a method name is already in use. > > Can you explain this? Last time I checked CVS, PEP 246 wasn't > implemented yet, so I don't think you mean "adapted" in that sense. No, i didn't -- i just meant in the more general sense. 
To make an object support one of the other internal protocols, like repr(), you can just add a special method (in Python) or fill in a slot (in C). > Generally speaking, iterator implementations aren't created by making > changes to an existing class Well, i guess that's part of the protocol philosophy. There exist cursor-like objects that would be natural candidates for being used like iterators (files are one example, database cursors are another). Choosing __next__ makes it possible to add support to an existing object when appropriate, instead of requiring an auxiliary object regardless of whether it's appropriate or inappropriate. To me, the former feels like the more typical Python thing to do, because it's consistent with the way all the other protocols work. So it's from this perspective that "next" without underscores is a wart to me. For example, when something is container-like you can implement __getitem__ on the object itself, and then you can use [] with the object. Some objects let you fetch containers and some objects implement __getitem__ on their own. But we don't force everybody to provide a convert-to-container operation in all cases before allowing them to provide __getitem__. -- ?!ng From ping@zesty.ca Wed Jul 17 21:58:11 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Wed, 17 Jul 2002 15:58:11 -0500 (CDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <20020717161144.A91916@doublegemini.com> Message-ID: On Wed, 17 Jul 2002, Clark C . Evans wrote: > On Wed, Jul 17, 2002 at 02:58:55PM -0500, Ka-Ping Yee wrote: > | Naming > | the method "next" means that any object with a "next" method > | cannot be adapted to support the iterator protocol. > > Right, but such objects wouldn't be misleading because they'd > be missing a __iter__ method, correct? __iter__ is a red herring. It has nothing to do with the act of iterating. It exists only to support the use of "for" directly on the iterator.
Iterators that currently implement "next" but
not "__iter__" will work in some places and not others. For
example, given this:

    class Counter:
        def __init__(self, last):
            self.i = 0
            self.last = last

        def next(self):
            self.i += 1
            if self.i > self.last: raise StopIteration
            return self.i

    class Container:
        def __init__(self, size):
            self.size = size

        def __iter__(self):
            return Counter(self.size)

This will work:

    >>> for x in Container(3): print x
    ...
    1
    2
    3

But this will fail:

    >>> for x in Counter(3): print x
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: iteration over non-sequence

It's more accurate to say that there are two distinct protocols here.

    1. An object is "for-able" if it implements __iter__ or __getitem__.
       This is a subset of the sequence protocol.

    2. An object can be iterated if it implements next.

The Container supports only protocol 1, and the Counter supports
only protocol 2, with the above results.

Iterators are currently asked to support both protocols. The
semantics of iteration come only from protocol 2; protocol 1 is
an effort to make iterators look sorta like sequences. But the
analogy is very weak -- these are "sequences" that destroy
themselves while you look at them -- not like any typical
sequence i've ever seen!

The short of it is that whenever any Python programmer says
"for x in y", he or she had better be darned sure of whether
this is going to destroy y. Whatever we can do to make this
clear would be a good idea.

-- ?!ng

From mal@lemburg.com Wed Jul 17 21:58:15 2002 From: mal@lemburg.com (M.-A.
Lemburg) Date: Wed, 17 Jul 2002 22:58:15 +0200 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21 References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> Message-ID: <3D35DA67.8060206@lemburg.com> Guido van Rossum wrote: >>Can't provide you with that. I simply collect feedback from >>users having compile problems in that file. > > > Of course you never hear from users when their compiler is fixed so > that a particular work-around is no longer necessary, so you keep > collecting cruft until it collapses under its own weight. True; it doesn't hurt too much, though :-) >>Note that most of these problems are related to declaring >>arrays as static forward (rather than C functions as Python >>normally does): > > > Note that staticforward was *only* intended for data declarations. It > was never intended (nor needed) for functions. So what I'm doing is intended and what Jeremy corrected is not. Gald to hear that :-) >>staticforward PyMethodDef mxODBCursor_Methods[]; >> >>...tons of code... >> >>statichere PyMethodDef mxODBCursor_Methods[] = >>{ >> /* DB API interface */ >>... >> >>I could eliminate those by clever rearranging the code, >>but have never had an actual need for it. > > > You shouldn't need to. > > I suggest that we keep Jeremy's checkins in 2.3. Hopefully during the > alpha or beta release cycle we will find out if there *really* are > still platforms with broken compilers. At worst, it will show up > after 2.3 final is released, and then we can fix it in 2.3.1. 
You > won't have to target mx for 2.3 for another 18 months (assuming the > PBF ever releases Python-in-a-Tie). It's easy enough for me to add the #defines to the support header file if you take it out of the distribution, so it wouldn't hurt. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Jul 17 22:03:53 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 17 Jul 2002 23:03:53 +0200 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21 References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> Message-ID: <3D35DBB9.9000103@lemburg.com> M.-A. Lemburg wrote: >> I suggest that we keep Jeremy's checkins in 2.3. Hopefully during the >> alpha or beta release cycle we will find out if there *really* are >> still platforms with broken compilers. At worst, it will show up >> after 2.3 final is released, and then we can fix it in 2.3.1. You >> won't have to target mx for 2.3 for another 18 months (assuming the >> PBF ever releases Python-in-a-Tie). > > > It's easy enough for me to add the #defines to the > support header file if you take it out of the distribution, > so it wouldn't hurt. Just an addition: please leave the configure test in the distribution. 
While I could implement that using distutils as well, I would rather benefit from relying on config.h doing the right thing in case there are some newly broken compilers out there, e.g. the xlC one on AIX seems to be a very picky one... -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From ping@zesty.ca Wed Jul 17 22:07:07 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Wed, 17 Jul 2002 16:07:07 -0500 (CDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <20020717161144.A91916@doublegemini.com> Message-ID: On Wed, 17 Jul 2002, Clark C . Evans wrote: > > Right, but such objects wouldn't be misleading because they'd > be missing a __iter__ method, correct? Oh, i guess i didn't properly answer your question. Oops. :) My answer would be: you could say that, but wouldn't it suck to have to check for the existence of __iter__ every time you wanted to call next? You can legislate that everyone should implement __iter__ together with next; you can legislate that everyone should check for __iter__ before calling next. To some extent you have to do both or neither; one without the other is inconsistent and would lead to surprises. In practice no one's going to check. So in practice __iter__ isn't really part of the protocol. -- ?!ng From guido@python.org Wed Jul 17 22:09:33 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 17:09:33 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Wed, 17 Jul 2002 15:36:45 CDT." References: Message-ID: <200207172109.g6HL9XJ13871@odiug.zope.com> > Well, i guess that's part of the protocol philosophy.
There exist > cursor-like objects that would be natural candidates for being used > like iterators (files are one example, database cursors are another). > Choosing __next__ makes it possible to add support to an existing > object when appropriate, instead of requiring an auxiliary object > regardless of whether it's appropriate or inappropriate. OK, that's clear. > To me, the former feels like the more typical Python thing to do, > because it's consistent with the way all the other protocols work. > So it's from this perspective that "next" without underscores > is a wart to me. Yes. > For example, when something is container-like you can implement > __getitem__ on the object itself, and then you can use [] with the > object. Some objects let you fetch containers and some objects > implement __getitem__ on their own. But we don't force everybody > to provide a convert-to-container operation in all cases before > allowing them to provide __getitem__. Correct. Now, weren't you a co-author of the Iterator PEP? I wish you'd brought this up then. Or maybe you did, and I overruled you. Sorry then. But I don't think we can withdraw this so easily. It's not the end of the world. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jul 17 22:21:26 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 17:21:26 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Wed, 17 Jul 2002 15:58:11 CDT." References: Message-ID: <200207172121.g6HLLQH13946@odiug.zope.com> > __iter__ is a red herring. It has nothing to do with the act of > iterating. It exists only to support the use of "for" directly > on the iterator. Iterators that currently implement "next" but > not "__iter__" will work in some places and not others. 
For
> example, given this:
>
>     class Counter:
>         def __init__(self, last):
>             self.i = 0
>             self.last = last
>
>         def next(self):
>             self.i += 1
>             if self.i > self.last: raise StopIteration
>             return self.i
>
>     class Container:
>         def __init__(self, size):
>             self.size = size
>
>         def __iter__(self):
>             return Counter(self.size)
>
> This will work:
>
>     >>> for x in Container(3): print x
>     ...
>     1
>     2
>     3
>
> But this will fail:
>
>     >>> for x in Counter(3): print x
>     ...
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     TypeError: iteration over non-sequence
>
> It's more accurate to say that there are two distinct protocols here.
>
>     1. An object is "for-able" if it implements __iter__ or __getitem__.
>        This is a subset of the sequence protocol.
>
>     2. An object can be iterated if it implements next.
>
> The Container supports only protocol 1, and the Counter supports
> only protocol 2, with the above results.
>
> Iterators are currently asked to support both protocols. The
> semantics of iteration come only from protocol 2; protocol 1 is
> an effort to make iterators look sorta like sequences. But the
> analogy is very weak -- these are "sequences" that destroy
> themselves while you look at them -- not like any typical
> sequence i've ever seen!
>
> The short of it is that whenever any Python programmer says
> "for x in y", he or she had better be darned sure of whether
> this is going to destroy y. Whatever we can do to make this
> clear would be a good idea.

This is a very good summary of the two iterator protocols. Ping,
would you mind adding this to PEP 234?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleax@aleax.it Wed Jul 17 23:06:45 2002 From: aleax@aleax.it (Alex Martelli) Date: Thu, 18 Jul 2002 00:06:45 +0200 Subject: [Python-Dev] Single- vs.
Multi-pass iterability In-Reply-To: <200207171807.g6HI7VS10049@odiug.zope.com> References: <200207171807.g6HI7VS10049@odiug.zope.com> Message-ID: On Wednesday 17 July 2002 08:07 pm, Guido van Rossum wrote: > OK, I'll wait to see if someone submits a working patch. I still find > it a non-issue myself. OK, I'm gonna give it a try -- kludging up Oren's patch so that the xreadlines object is able to hold a non-addref'd pointer to the file object (when it's for internal use of the file object) and, as long as I'm at it, also including the little further kludge that makes f.readline delegate to f.next if f is holding an xreadlines object. Oh, and dropping the xreadlines object on a seek, too. It's just a few lines' changes to two files after all, Objects/fileobject.c and Modules/xreadlines.c. A bit kludgey and tricky, admittedly, which is perhaps not the nicest thing in the world given that fileobject.c isn't the shortest, simplest, or least crucial part of Python. But anyway, I think I'll have it ready by early tomorrow my time (it's past midnight and I'm past the age for all-nighters:-). Alex From ping@zesty.ca Wed Jul 17 23:40:27 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Wed, 17 Jul 2002 15:40:27 -0700 (PDT) Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability In-Reply-To: <200207172109.g6HL9XJ13871@odiug.zope.com> Message-ID: On Wed, 17 Jul 2002, Guido van Rossum wrote: [...] > OK, that's clear. [...] > Yes. [...] > Correct. Neat! So much understanding. > Now, weren't you a co-author of the Iterator PEP? I wish you'd > brought this up then. Or maybe you did, and I overruled you. Sorry > then. Indeed, i wrote the first draft of the PEP, though it was very different from what we have today; it's been largely rewritten. The big design changes happened at the iterator BOF, so unfortunately there's no e-mail record of the debate. I recall that __iter__ made me uncomfortable, but i don't recall to what extent i expressed this. 
I don't remember whether there was any overruling. But it doesn't really matter; it's today now, and here we are. It is true that i failed to understand or express the issue well enough to have an effect on the design. I will cheerfully accept blame if it somehow means we'll end up with a nicer language. :) > But I don't think we can withdraw this so easily. It's not the end of > the world. I would be pleased to see a migration path (perhaps along the lines of Dave's suggestion, with warnings for a while), but i won't throw myself off a bridge if it doesn't happen. I do think there is some potential for errors caused by misunderstandings about whether or not "for x in y" is destructive. That's the thing that worries me the most. I think this is the main reason why the old practice of abusing __getitem__ was bad, and thus helped to motivate iterators in the first place. It seems serious enough that migrating to something that distinguishes destructive-for from non-destructive-for could indeed be worth the cost. The destructive-for issue may seem superficially unrelated to the __next__-naming issue. As i see it, the __next__-naming issue is related to the mandatory-__iter__ issue (because some people view __iter__ as a type flag), which is related to the destructive-for issue. -- ?!ng From ping@zesty.ca Wed Jul 17 23:53:42 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Wed, 17 Jul 2002 15:53:42 -0700 (PDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207172121.g6HLLQH13946@odiug.zope.com> Message-ID: I wrote: > __iter__ is a red herring. [...blah blah blah...] > The short of it is that whenever any Python programmer says > "for x in y", he or she had better be darned sure of whether > this is going to destroy y. Whatever we can do to make this > clear would be a good idea. Guido wrote: > This is a very good summary of the two iterator protocols. Ping, > would you mind adding this to PEP 234? And i thought it was a critique. 
Fascinating, Captain. :) I'm happy to add the text, but i want to be clear, then: is it acceptable to write an iterator that only provides next() if you only care about the "iteration protocol" and not the "for-able protocol"? I see that "ought to" is the most opinion the PEP is willing to give on the topic: A class is a valid iterator object when it defines a next() method that behaves as described above. A class that wants to be an iterator also ought to implement __iter__() returning itself. -- ?!ng From greg@cosc.canterbury.ac.nz Thu Jul 18 00:18:28 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 18 Jul 2002 11:18:28 +1200 (NZST) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207171503.g6HF3mW01047@odiug.zope.com> Message-ID: <200207172318.g6HNIRS23784@oma.cosc.canterbury.ac.nz> Guido: > - The mapping between the next() method and the tp_iternext slot in > the type object would disappear, and instead the __next__() method > would be mapped to this slot. For what it's worth, I took it upon myself to "fix" this already in Pyrex extension types. So if you make this change, you'll be making Python more compatible with Pyrex. :-) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Thu Jul 18 00:25:26 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 18 Jul 2002 11:25:26 +1200 (NZST) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207171527.g6HFRV214431@europa.research.att.com> Message-ID: <200207172325.g6HNPQM23808@oma.cosc.canterbury.ac.nz> Andrew Koenig : > Is a file a container or not? I would say no, a file object is not a container in Python terms.
You can't index it with [] or use len() on it or any of the other things you expect to be able to do on containers. I think we just have to live with the idea that there are things other than containers that can supply iterators. Forcing everything that can supply an iterator to bend over backwards to try to be a random-access container as well would be too cumbersome. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Thu Jul 18 00:32:22 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 18 Jul 2002 11:32:22 +1200 (NZST) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Message-ID: <200207172332.g6HNWMp23835@oma.cosc.canterbury.ac.nz> Alex Martelli : > All files have seek and write, but not on all files do they work -- and > the same goes for iteration. I.e., it IS something of a mess I've just had a thought. Maybe it would be less of a mess if what we are calling "iterators" had been called "streams" instead. Then the term "iterator" could have been reserved for the special case of an object that provides stream access to a random-access collection. Then you could say that a file object is a stream object that provides line-by-line access to an OS file. Other stream objects can be constructed that give access to the OS file in other units. That would all make sense without seeming to imply any multi-pass ability. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Thu Jul 18 00:39:05 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 18 Jul 2002 11:39:05 +1200 (NZST) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Message-ID: <200207172339.g6HNd5j23845@oma.cosc.canterbury.ac.nz> Alex Martelli : > the file object's is the only example of "fat interface" problem > in Python -- an interface that exposes a lot of methods, with many > objects claiming they implement that interface but actually lying Maybe the existing file object should be split up into some number of other objects with smaller interfaces. For example, instead of the file object actually accessing an OS file itself, it could just be a wrapper around an underlying "bytestream" object, which implements only read() and write(). Then, instead of implementing your own file-like object, you would implement a new bytestream object instead, and wrap it in a standard file object. That would give you all the flavours of access automatically without having to implement them yourself and without lying about anything. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Thu Jul 18 00:48:14 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 18 Jul 2002 11:48:14 +1200 (NZST) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Message-ID: <200207172348.g6HNmEB23863@oma.cosc.canterbury.ac.nz> Alex Martelli : > Still, it doesn't solve the reference-loop-between-two-deuced-things- > that-don't-cooperate-with-gc problem. Would making them cooperate with GC be a difficult thing to do? 
Seems to me we should be moving towards making everything cooperate with GC, and fixing things like this whenever they come to light. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Thu Jul 18 00:55:27 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 18 Jul 2002 11:55:27 +1200 (NZST) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207171738.g6HHcZa09875@odiug.zope.com> Message-ID: <200207172355.g6HNtRG23895@oma.cosc.canterbury.ac.nz> Guido: > Likewise, the file needs a strong ref to the xreadlines, otherwise the > following would create a new iterator in the second for loop, and lose > data buffered by the first iterator. To me, these problems are screaming out that the buffer *shouldn't* be kept in the xreadlines object! Maybe the xreadlines object's buffer should be kept in the file object? Then it wouldn't matter if multiple xreadlines objects were created, as they'd all share the same buffer, and there would be no reference loops. Hmmm... then we're moving towards making the file object and the xreadlines object be the same object. What was the reason for not doing that again? Was it just to avoid changing a lot of code, or was there some reason it wouldn't work? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Thu Jul 18 01:01:39 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 18 Jul 2002 12:01:39 +1200 (NZST) Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: Message-ID: <200207180001.g6I01dU23913@oma.cosc.canterbury.ac.nz> > Unfortunately "next" is a pretty common word and it's quite possible > that such a method name is already in use. It is -- all my scanners have a "next" method that does something different from what an iterator's "next" is supposed to do. Fortunately I haven't had an urge to make any of them into an iterator yet. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tdelaney@avaya.com Thu Jul 18 01:05:32 2002 From: tdelaney@avaya.com (Delaney, Timothy) Date: Thu, 18 Jul 2002 10:05:32 +1000 Subject: [Python-Dev] Single- vs. Multi-pass iterability Message-ID: > From: Greg Ewing [mailto:greg@cosc.canterbury.ac.nz] > > Would making them cooperate with GC be a difficult > thing to do? Seems to me we should be moving towards > making everything cooperate with GC, and fixing > things like this whenever they come to light. It would sure annoy those people who insist that file(f, 'w').write(s) is a safe idiom ... :) Tim Delaney From greg@cosc.canterbury.ac.nz Thu Jul 18 01:05:39 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 18 Jul 2002 12:05:39 +1200 (NZST) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207172004.g6HK4C613511@odiug.zope.com> Message-ID: <200207180005.g6I05dO23924@oma.cosc.canterbury.ac.nz> Guido: > Generally speaking, iterator implementations aren't created by making > changes to an existing class Continuing with my scanner example, it's conceivable that I might want to give it an iterator interface as an alternative to the existing one -- it's already an iterator, really, it just doesn't have the new standard iterator interface. 
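A minimal sketch of what giving such a scanner an iterator interface could look like. Everything here is assumed for illustration: Scanner and ScannerIterator are invented names (not Plex or any real scanner), and the scanner's own next() is taken to return a token, or None at end of input. The iterator method is spelled __next__ so the sketch runs on current Python; in the Python 2.2 of this thread it would be spelled next, which collides with the scanner's existing method and is why the adapter here is a separate auxiliary object.

```python
class Scanner:
    """Hypothetical scanner whose next() is NOT the iterator protocol:
    it returns the next token, or None once the input is exhausted."""

    def __init__(self, text):
        self._tokens = text.split()
        self._pos = 0

    def next(self):
        if self._pos >= len(self._tokens):
            return None  # end of input is signalled by None, not StopIteration
        token = self._tokens[self._pos]
        self._pos += 1
        return token


class ScannerIterator:
    """Auxiliary object that gives a Scanner the standard iterator interface."""

    def __init__(self, scanner):
        self._scanner = scanner

    def __iter__(self):
        # Returning self is what makes the iterator usable directly in "for".
        return self

    def __next__(self):
        token = self._scanner.next()
        if token is None:
            raise StopIteration
        return token


it = ScannerIterator(Scanner("a b c"))
first_pass = list(it)    # consumes the scanner
second_pass = list(it)   # nothing left: iteration was destructive
```

The second pass coming back empty is the single-pass behaviour the thread is about: looping over the adapter uses up the underlying scanner.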
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Thu Jul 18 01:25:45 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 18 Jul 2002 12:25:45 +1200 (NZST) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Message-ID: <200207180025.g6I0Pjw23965@oma.cosc.canterbury.ac.nz> Ka-Ping: > is it > acceptable to write an iterator that only provides next() if you > only care about the "iteration protocol" and not the "for-able > protocol"? Probably just as acceptable as writing a file object that only provides some of the file methods. Seems quite Pythonic to me. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido@python.org Thu Jul 18 01:43:20 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 17 Jul 2002 20:43:20 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Wed, 17 Jul 2002 15:53:42 PDT." References: Message-ID: <200207180043.g6I0hKB25427@pcp02138704pcs.reston01.va.comcast.net> > > This is a very good summary of the two iterator protocols. Ping, > > would you mind adding this to PEP 234? > > And i thought it was a critique. Fascinating, Captain. :) > > I'm happy to add the text, but i want to be clear, then: is it > acceptable to write an iterator that only provides if you > only care about the "iteration protocol" and not the "for-able > protocol"? No, an iterator ought to provide both, but it's good to recognize that there *are* two protocols. > I see that "ought to" is the most opinion the PEP is willing to > give on the topic: > > A class is a valid iterator object when it defines a next() > method that behaves as described above. A class that wants > to be an iterator also ought to implement __iter__() > returning itself. I would like to see this strengthened. I envision "iterator algebra" code that really needs to be able to do a for loop over an iterator when it feels like it. --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax@aleax.it Thu Jul 18 07:52:08 2002 From: aleax@aleax.it (Alex Martelli) Date: Thu, 18 Jul 2002 08:52:08 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: References: <200207171807.g6HI7VS10049@odiug.zope.com> Message-ID: On Thursday 18 July 2002 12:06 am, Alex Martelli wrote: > On Wednesday 17 July 2002 08:07 pm, Guido van Rossum wrote: > > OK, I'll wait to see if someone submits a working patch. I still find > > it a non-issue myself. > > OK, I'm gonna give it a try -- kludging up Oren's patch so that Done, now submitted as patch 583235. 
Alex From jeremy@alum.mit.edu Thu Jul 18 17:58:05 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 18 Jul 2002 12:58:05 -0400 Subject: [Python-Dev] staticforward In-Reply-To: <3D35DA67.8060206@lemburg.com> References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> Message-ID: <15670.62365.517118.775364@slothrop.zope.com> >>>>> "MAL" == mal writes: >>> Note that most of these problems are related to declaring arrays >>> as static forward (rather than C functions as Python normally >>> does): >> >> >> Note that staticforward was *only* intended for data >> declarations. It was never intended (nor needed) for functions. MAL> So what I'm doing is intended and what Jeremy corrected is MAL> not. Gald to hear that :-) staticforward was intended for data declarations, but was widely misused within the core for function prototypes. The intended use for data declarations was to have the initial declaration be staticforward and the initialization use statichere. Although this was the intent, most uses did not follow this pattern. It was common to use staticforward the first time and static the second; this was pretty harmless as statichere always expanded to static. It was also common to use staticforward in both places, when ended up declaring it as extern rather than static. BTW, I'm also gald to hear that what I correct is not intended. 
Jeremy From jeremy@alum.mit.edu Thu Jul 18 18:02:11 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 18 Jul 2002 13:02:11 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21 In-Reply-To: <3D35DBB9.9000103@lemburg.com> References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> Message-ID: <15670.62611.943840.954629@slothrop.zope.com> >>>>> "MAL" == mal writes: MAL> M.-A. Lemburg wrote: >>> I suggest that we keep Jeremy's checkins in 2.3. Hopefully >>> during the alpha or beta release cycle we will find out if there >>> *really* are still platforms with broken compilers. At worst, >>> it will show up after 2.3 final is released, and then we can fix >>> it in 2.3.1. You won't have to target mx for 2.3 for another 18 >>> months (assuming the PBF ever releases Python-in-a-Tie). >> >> >> It's easy enough for me to add the #defines to the support header >> file if you take it out of the distribution, so it wouldn't hurt. MAL> Just an addition: please leave the configure test in the MAL> distribution. While I could implement that using distutils as MAL> well, I would rather benefit from relying on config.h doing the MAL> right thing in case there are some newly broken compilers out MAL> there, e.g. the xlC one on AIX seems to be a very picky one... I don't understand what your goal is. Why do you want the configure test if your header file has a bunch of platform-specific ifdefs? If these platforms actually had a problem, the configure test would have caught it and you wouldn't need the ifdefs. 
The only way the ifdefs would have an effect is if the configure test did not detect a problem; but if the configure test didn't detect a problem, then you don't need the ifdefs. Jeremy From jmiller@stsci.edu Thu Jul 18 16:59:16 2002 From: jmiller@stsci.edu (Todd Miller) Date: Thu, 18 Jul 2002 11:59:16 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() References: <20020715175256.5971.qmail@web40112.mail.yahoo.com> Message-ID: <3D36E5D4.80308@stsci.edu> Scott Gilbert wrote:
> --- Todd Miller <jmiller@stsci.edu> wrote:
>>> I don't understand what you say, but I believe you.
>>
>> I meant we call PyBuffer_FromReadWriteObject and the resulting buffer
>> lives longer than the extension function call that created it. I have
>> heard that it is possible for the original object to "move" leaving the
>> buffer object pointer to it dangling.
>
> Yes. The PyBufferObject grabs the pointer from the PyBufferProcs
> supporting object when the PyBufferObject is created. If the PyBufferProcs
> supporting object reallocates the memory (possibly from a resize) the
> PyBufferObject can be left with a bad pointer. This is easily possible if
> you try to use the array module arrays as a buffer.

Thanks for the example. This is good to know.
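
A minimal sketch of the dangling-pointer hazard described above, as it plays out in present-day Python 3. It assumes memoryview as a stand-in for the 2002 buffer object (memoryview did not exist when this thread was written): an array that has exported its memory refuses to resize, so the view cannot be left pointing at freed storage.

```python
import array

# array.array plays the role of the "PyBufferProcs supporting object".
arr = array.array("b", [1, 2, 3])

# memoryview is a modern stand-in for the buffer object (an assumption of
# this sketch, not something proposed in the thread).
view = memoryview(arr)

try:
    arr.append(4)       # a resize could move the underlying memory
    resized = True
except BufferError:
    resized = False     # refused while a view is still exported

view.release()          # drop the export
arr.append(4)           # now the resize is allowed
```

The design choice here is to lock the exporter rather than to re-fetch the pointer on every access; either approach would avoid the bad pointer.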


> I've submitted a patch to fix this particular problem (among others), but
> there are still enough things that the buffer object can't do that
> something new is needed.

I understand.  I saw your patches and they sounded good to me.

>>>>> Maybe instead of the buffer() function/type, there should be a way to
>>>>> allocate raw memory?
>>>>
>>>> Yes.  It would also be nice to be able to:
>>>>
>>>> 1. Know (at the python level) that a type supports the buffer C-API.
>>>
>>> Good idea.  (I guess right now you can see if calling buffer() with an
>>> instance as argument works. :-)
>>>
>>>> 2. Copy bytes from one buffer to another (writeable buffer).
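
Point 1 above (knowing at the Python level whether a type supports the buffer API) can be sketched with the same "try it and see" trick suggested in the reply. This assumes present-day Python 3, where memoryview() takes the place of buffer(); the helper name supports_buffer is hypothetical.

```python
def supports_buffer(obj):
    """Return True if obj exposes its memory through the buffer protocol.

    Hypothetical helper: it simply tries to take a view and reports
    whether that worked.
    """
    try:
        memoryview(obj).release()
    except TypeError:
        return False
    return True
```

For example, supports_buffer(bytearray(4)) is true, while supports_buffer(42) is false.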


> And the copy operations shouldn't create any large temporaries:
>
> buf1 = memory(50000)
> buf2 = memory(50000)
> # no 10K temporary should be created in the next line
> buf1[10000:20000] = buf2[30000:40000]
>
> The current buffer object could be used like this, but it would create a
> temporary string.

I agree with this completely.  I could summarize my opinion by saying that
while I regard the current buffering system as pretty complete, the buffer
object places emphasis on the wrong behavior.  In terms of modelling memory
regions, strings are the wrong way to go.

Looking at buffering most of this week, the fact that mmap slicing also
returns strings is one justification I've found for having a buffer object,
i.e., mmap slicing is not a substitute for the buffer object.  The buffer
object makes it possible to partition a mmap or any bufferable object into
pseudo-independent, possibly writable, pieces.

One justification to have a new buffer object is pickling (one of Scott's
posts alerted me to this).  I think the behavior we want for numarray is to
be able to pickle a view of a bufferable object more or less like a string
containing the buffer image, and to unpickle it as a memory object.  The
prospect of adding pickling support makes me wonder if separating the
allocator and view aspects of the buffer object is a good idea;  I thought
it was, but now I wonder.

> So getting an efficient copy operation seems to require that slices just
> create new "views" to the same memory.
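
A minimal sketch of the copy-without-temporaries idea quoted above, assuming present-day Python 3. The memory() type in the quoted example was hypothetical; here bytearray supplies the storage and memoryview supplies the slices-as-views behavior (both are substitutions made for this sketch, not what the thread proposed).

```python
buf1 = bytearray(50000)               # zero-filled destination
buf2 = bytearray(b"x" * 50000)        # source filled with a marker byte

# Slicing a memoryview yields another view of the same memory;
# no bytes are copied at this point.
src = memoryview(buf2)[30000:40000]
dst = memoryview(buf1)[10000:20000]

# Slice assignment copies straight from one region to the other,
# without building an intermediate 10K string.
dst[:] = src

# The views still alias their base objects: a write to buf2 is
# visible through src.
buf2[30000] = ord("y")
aliased = src[0] == ord("y")
```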
Other justifications for a new buffer object might be:

1. The ability to partition any bufferable object into regions which can be
   passed around.  These regions would themselves be buffers.

2. The ability to efficiently pickle a view of any bufferable object.
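
The pickling behavior described above (pickle a view like a string holding the buffer image, unpickle it as a memory object) might be sketched as follows. This assumes present-day Python 3; the Region class is hypothetical, and bytearray stands in for the "memory object" that had not been designed yet.

```python
import pickle


class Region:
    """Hypothetical view of base[start:stop] that shares memory with base."""

    def __init__(self, base, start, stop):
        self.view = memoryview(base)[start:stop]

    def __reduce__(self):
        # Pickle only the bytes the view covers, and rebuild them on
        # load as a fresh, writable bytearray.
        return (bytearray, (self.view.tobytes(),))


base = bytearray(b"0123456789")
region = Region(base, 2, 5)
restored = pickle.loads(pickle.dumps(region))
```

Note that this sketch copies the covered bytes once via tobytes(); it shows the round trip, not the efficient path asked for in point 2.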

>>> Maybe you would like to work on a requirements gathering for a memory
>>> object
>>
>> Sure.  I'd be willing to poll comp.lang.python (python-list?) and
>> collate the results of any discussion that ensues. Is that what you had
>> in mind?
>
> In the PEP that I'm drafting, I've been calling the new object "bytes"
> (since it is just a simple array of bytes). Now that you guys are
> referring to it as the "memory object", should I change the name? Doesn't
> really matter, but it might avoid confusion to know we're all talking about
> the same thing.

Calling this a memory type sounds the best to me.  The question I have not
resolved for myself is whether there should be one type which "does it all"
or two types, a memory allocator and a bufferable object manipulator.





From lalo@laranja.org Thu Jul 18 16:03:40 2002 From: lalo@laranja.org (Lalo Martins) Date: Thu, 18 Jul 2002 12:03:40 -0300 Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting In-Reply-To: <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net> References: <20020623181630.GN25927@laranja.org> <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020718150340.GB1209@laranja.org> On Fri, Jul 12, 2002 at 10:47:34AM -0400, Guido van Rossum wrote: > > Guido, can you please, for our enlightenment, tell us what are the > > reasons you feel %(foo)s was a mistake? > > Because of the trailing 's'. It's very easy to leave it out by > mistake, and because the definition of printf formats skips over > spaces (don't ask me why), the first character of the following word > is used as the type indicator. In case that wasn't clear, I agree with that - I asked because I wanted this in writing for the record. BTW: IIRC, it skips over spaces because spaces are a valid format modifier (meaning "pad with spaces"). []s, |alo +---- -- Those who trade freedom for security lose both and deserve neither. -- http://www.laranja.org/ mailto:lalo@laranja.org pgp key: http://www.laranja.org/pessoal/pgp Eu jogo RPG! (I play RPG) http://www.eujogorpg.com.br/ Python Foundry Guide http://www.sf.net/foundry/python-foundry/ From guido@python.org Thu Jul 18 17:27:25 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 18 Jul 2002 12:27:25 -0400 Subject: [Python-Dev] test_socket failure on FreeBSD In-Reply-To: Your message of "Mon, 08 Jul 2002 23:53:01 +1100." References: Message-ID: <200207181627.g6IGRPE21459@odiug.zope.com> > > There are probably some differences in the socket semantics. I'd > > appreciate it if you could provide a patch or at least a clue! 
> > I've not read enough Stevens to grok sockets code (yet) :-( > > However, I hope that the instrumented verbose output of test_socket might > give you a clue.... > > I've attached the diff from the version of test_socket (vs recent CVS) > that I used, as well as output from test_socket on FreeBSD 4.4 and > OS/2+EMX. Getting the FreeBSD issues sorted is a higher priority for me > than getting OS/2+EMX working (though that would be nice too). > > Please let me know if there's more testing/debugging I can do. I've got some time for this now. Ignoring your OS/2+EMX output and focusing on the FreeBSD logs, I notice: [...] > Testing recvfrom() in chunks over TCP. ... > seg1='Michael Gilfix was he', addr='None' > seg2='re > ', addr='None' > ERROR Hm. This looks like recvfrom() on a TCP stream doesn't return an address; not entirely unreasonable. I wonder if self.cli_conn.getpeername() returns the expected address; can you check this? Add this after each recvfrom() call. if addr is None: addr = self.cli_conn.getpeername() [...] > Testing large recvfrom() over TCP. ... > msg='Michael Gilfix was here > ', addr='None' > ERROR Ditto. > Testing non-blocking accept. ... > conn= > addr=('127.0.0.1', 3144) > FAIL This is different. It seems that the accept() call doesn't time out. But this could be because the client thread connects too fast. Can you add a sleep (e.g. time.sleep(5)) to _testAccept() before the connect() call? [...] > Testing non-blocking recv. ... > conn= > addr=('127.0.0.1', 3146) > FAIL Similar. Try putting a sleep in _testRecv() between the connect() and the send(). [...] Let me know if you want me to provide specific patches... 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jul 18 16:49:44 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 18 Jul 2002 11:49:44 -0400 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) In-Reply-To: Your message of "Mon, 08 Jul 2002 21:20:56 EDT." <20020709012056.GA2526@cthulhu.gerg.ca> References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com> <20020709012056.GA2526@cthulhu.gerg.ca> Message-ID: <200207181549.g6IFniw21368@odiug.zope.com> > > Perhaps we could have some kind of category for distutils > > packages which marks them as system add-ons vs. site add-ons. > > +1 -- this should definitely be up to the package author/packager, not > the local admin. I once tried to convince Guido that the ability to > occasionally upgrade standard library modules/packages would be a good > thing, but he wasn't having it. Any change of heart, O Mighty BDFL? Before I answer that, here's a question. Why do we think it's a good idea to distribute upgrades as separate add-ons while we don't think it's okay to distribute such upgrades with bugfix releases? Doesn't this just increase the variability of site configurations, and hence version interaction hell? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jul 18 15:22:11 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 18 Jul 2002 10:22:11 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Wed, 17 Jul 2002 15:40:27 PDT." References: Message-ID: <200207181422.g6IEMBr14526@odiug.zope.com> > I do think there is some potential for errors caused by > misunderstandings about whether or not "for x in y" is destructive. > That's the thing that worries me the most. 
I think this is the main > reason why the old practice of abusing __getitem__ was bad, and thus > helped to motivate iterators in the first place. It seems serious > enough that migrating to something that distinguishes > destructive-for from non-destructive-for could indeed be worth the > cost. I'm not sure I understand this (this seems to be my week for not understanding what people write :-( ). First of all, I'm not sure what exactly the issue is with destructive for-loops. If I have a function that contains a for-loop over its argument, and I pass iter(x) as the argument, then the iterator is destroyed in the process, but x may or may not be, depending on what it is. Maybe the for-loop is a red herring? Calling next() on an iterator may or may not be destructive on the underlying "sequence" -- if it is a generator, for example, I would call it destructive. Perhaps you're trying to assign properties to the iterator abstraction that aren't really there? Next, I'm not sure how renaming next() to __next__() would affect the situation w.r.t. the destructivity of for-loops. Or were you talking about some other migration? --Guido van Rossum (home page: http://www.python.org/~guido/) From pinard@iro.umontreal.ca Thu Jul 18 12:23:16 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 18 Jul 2002 07:23:16 -0400 Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability In-Reply-To: <200207180043.g6I0hKB25427@pcp02138704pcs.reston01.va.comcast.net> References: <200207180043.g6I0hKB25427@pcp02138704pcs.reston01.va.comcast.net> Message-ID: > > I'm happy to add the text, but i want to be clear, then: is it > > acceptable to write an iterator that only provides if you > > only care about the "iteration protocol" and not the "for-able > > protocol"? > No, an iterator ought to provide both, but it's good to recognize that > there *are* two protocols. 
> > A class is a valid iterator object when it defines a next() > > method that behaves as described above. A class that wants > > to be an iterator also ought to implement __iter__() > > returning itself. > I would like to see this strengthened. I envision "iterator algebra" > code that really needs to be able to do a for loop over an iterator > when it feels like it. Maybe the reasons behind having __iter__() returning itself should be clearly expressed in the PEP, too. On this list, Tim gave one recently, Guido gives another here, but unless I missed it, the PEP gives none. Usually, PEPs explain the reasons behind the choices. -- François Pinard http://www.iro.umontreal.ca/~pinard From cce@clarkevans.com Thu Jul 18 15:06:31 2002 From: cce@clarkevans.com (Clark C . Evans) Date: Thu, 18 Jul 2002 10:06:31 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: ; from ping@zesty.ca on Wed, Jul 17, 2002 at 02:58:55PM -0500 References: <200207171409.g6HE9Di00659@odiug.zope.com> Message-ID: <20020718100631.A3468@doublegemini.com> On Wed, Jul 17, 2002 at 02:58:55PM -0500, Ka-Ping Yee wrote: | But i think this is more than a minor problem. This is a | namespace collision problem, and that's significant. Naming | the method "next" means that any object with a "next" method | cannot be adapted to support the iterator protocol. Unfortunately | "next" is a pretty common word and it's quite possible that such | a method name is already in use. Ping, Do you have any suggestions for re-wording the Iterator questionare at http://yaml.org/wk/survey?id=pyiter to reflect this paragraph above? Best, Clark -- Clark C. Evans Axista, Inc. 
http://www.axista.com 800.926.5525 XCOLLA Collaborative Project Management Software From xscottg@yahoo.com Mon Jul 15 18:52:56 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Mon, 15 Jul 2002 10:52:56 -0700 (PDT) Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: <3D32FA0D.6020200@stsci.edu> Message-ID: <20020715175256.5971.qmail@web40112.mail.yahoo.com> --- Todd Miller wrote: > > > >I don't understand what you say, but I believe you. > > > I meant we call PyBuffer_FromReadWriteObject and the resulting buffer > lives longer than the extension function call that created it. I have > heard that it is possible for the original object to "move" leaving the > buffer object pointer to it dangling. Yes. The PyBufferObject grabs the pointer from the PyBufferProcs supporting object when the PyBufferObject is created. If the PyBufferProcs supporting object reallocates the memory (possibly from a resize) the PyBufferObject can be left with a bad pointer. This is easily possible if you try to use the array module arrays as a buffer. I've submitted a patch to fix this particular problem (among others), but there are still enough things that the buffer object can't do that something new is needed. > > > > > > >>>Maybe instead of the buffer() function/type, there should be a way to > >>>allocate raw memory? > >>> > > > >>Yes. It would also be nice to be able to: > >> > >>1. Know (at the python level) that a type supports the buffer C-API. > >> > > > >Good idea. (I guess right now you can see if calling buffer() with an > >instance as argument works. :-) > > > >>2. Copy bytes from one buffer to another (writeable buffer). > >> And the copy operations shouldn't create any large temporaries: buf1 = memory(50000) buf2 = memory(50000) # no 10K temporary should be created in the next line buf1[10000:20000] = buf2[30000:40000] The current buffer object could be used like this, but it would create a temporary string. 
So getting an efficient copy operation seems to require that slices just create new "views" to the same memory. > > > >Maybe you would like to work on a requirements gathering for a memory > >object > > > Sure. I'd be willing to poll comp.lang.python (python-list?) and > collate the results of any discussion that ensues. Is that what you had > in mind? > In the PEP that I'm drafting, I've been calling the new object "bytes" (since it is just a simple array of bytes). Now that you guys are referring to it as the "memory object", should I change the name? Doesn't really matter, but it might avoid confusion to know we're all talking about the same thing. __________________________________________________ Do You Yahoo!? Yahoo! Autos - Get free new car price quotes http://autos.yahoo.com From aleax@aleax.it Thu Jul 18 07:02:23 2002 From: aleax@aleax.it (Alex Martelli) Date: Thu, 18 Jul 2002 08:02:23 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207172348.g6HNmEB23863@oma.cosc.canterbury.ac.nz> References: <200207172348.g6HNmEB23863@oma.cosc.canterbury.ac.nz> Message-ID: On Thursday 18 July 2002 01:48 am, Greg Ewing wrote: > Alex Martelli : > > Still, it doesn't solve the reference-loop-between-two-deuced-things- > > that-don't-cooperate-with-gc problem. > > Would making them cooperate with GC be a difficult > thing to do? Seems to me we should be moving towards > making everything cooperate with GC, and fixing > things like this whenever they come to light. Tim Peters says it wouldn't be, but I have not explored that. Alex From aleax@aleax.it Thu Jul 18 06:52:34 2002 From: aleax@aleax.it (Alex Martelli) Date: Thu, 18 Jul 2002 07:52:34 +0200 Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: <200207172325.g6HNPQM23808@oma.cosc.canterbury.ac.nz> References: <200207172325.g6HNPQM23808@oma.cosc.canterbury.ac.nz> Message-ID: On Thursday 18 July 2002 01:25 am, Greg Ewing wrote: > Andrew Koenig : > > Is a file a container or not? > > I would say no, a file object is not a container in Python terms. > You can't index it with [] or use len() on it or any of > the other things you expect to be able to do on containers. > > I think we just have to live with the idea that there are > things other than containers that can supply iterators. Yes, there are such things, and there may be cases in which no other alternative makes sense. But I don't think files are necessarily in such a bind. > Forcing everything that can supply an iterator to bend > over backwards to try to be a random-access container > as well would be too cumbersome. Absolutely. But what Oren's patch does, and my mods of it preserve, is definitely NOT "forcing" files "to be random- access containers": on the contrary, it accepts the fact that files aren't containers and conceptually simplifies things by making them iterators instead. I'm not sure about "random access" being needed to be a container. Consider sets, e.g. as per Greg Wilson's soapbox implementation (as modified by my patch to allow immutable-sets, maybe, but that's secondary). They're doubtlessly containers, able to produce on request as many iterators as you wish, each iterator not affecting the set's state in any way -- the ideal. But what sense would it make to force sets to expose a __getitem__? Right now they inherit from dict and thus do happen to expose it, but that's really an implementation artefact showing through (and a good example of why one might like to inherit without needing to expose all of the superclass's interface, to tie this in to another recent thread -- inheritance for implementation). 
Ideally, sets would expose __contains__, __iter__, __len__, ways to add and remove elements, and perhaps (it's so in Greg's implementation, and I didn't touch that) set ops such as union, intersection &c. someset[anindex] is really a weird thing to have... yet sets _are_ containers! Alex From guido@python.org Thu Jul 18 19:42:30 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 18 Jul 2002 14:42:30 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: Your message of "Mon, 15 Jul 2002 10:52:56 PDT." <20020715175256.5971.qmail@web40112.mail.yahoo.com> References: <20020715175256.5971.qmail@web40112.mail.yahoo.com> Message-ID: <200207181842.g6IIgUo22271@odiug.zope.com> > Yes. The PyBufferObject grabs the pointer from the PyBufferProcs > supporting object when the PyBufferObject is created. If the PyBufferProcs > supporting object reallocates the memory (possibly from a resize) the > PyBufferObject can be left with a bad pointer. This is easily possible if > you try to use the array module arrays as a buffer. > > I've submitted a patch to fix this particular problem (among others), but > there are still enough things that the buffer object can't do that > something new is needed. Can you remind me of the patch#? (I'm curious how you plan to fix this...) > In the PEP that I'm drafting, I've been calling the new object "bytes" > (since it is just a simple array of bytes). Now that you guys are > referring to it as the "memory object", should I change the name? Doesn't > really matter, but it might avoid confusion to know we're all talking about > the same thing. I like bytes just fine. PS, Todd, if you can, please don't send HTML-only mail to python-dev... --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jul 18 19:49:19 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 18 Jul 2002 14:49:19 -0400 Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: Your message of "Thu, 18 Jul 2002 08:02:23 +0200." References: <200207172348.g6HNmEB23863@oma.cosc.canterbury.ac.nz> Message-ID: <200207181849.g6IInJa22327@odiug.zope.com> > > Would making them cooperate with GC be a difficult > > thing to do? Seems to me we should be moving towards > > making everything cooperate with GC, and fixing > > things like this whenever they come to light. > > Tim Peters says it wouldn't be, but I have not explored that. But he also warned that it introduces new surprises. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jul 18 19:45:41 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 18 Jul 2002 14:45:41 -0400 Subject: [Python-Dev] Re: Sets In-Reply-To: Your message of "Thu, 18 Jul 2002 07:52:34 +0200." References: <200207172325.g6HNPQM23808@oma.cosc.canterbury.ac.nz> Message-ID: <200207181845.g6IIjfw22307@odiug.zope.com> > But what sense would it make to force sets to expose > a __getitem__? Right now they inherit from dict and > thus do happen to expose it, but that's really an > implementation artefact showing through (and a good > example of why one might like to inherit without needing > to expose all of the superclass's interface, to tie this in > to another recent thread -- inheritance for implementation). > > Ideally, sets would expose __contains__, __iter__, __len__, > ways to add and remove elements, and perhaps (it's so in > Greg's implementation, and I didn't touch that) set ops such > as union, intersection &c. someset[anindex] is really a weird > thing to have... yet sets _are_ containers! I believe I recommended to Greg to make sets "have" a dict instead of "being" dicts, and I think he agreed. But I guess he never got to implementing that change. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From aleax@aleax.it Thu Jul 18 06:57:37 2002 From: aleax@aleax.it (Alex Martelli) Date: Thu, 18 Jul 2002 07:57:37 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207172332.g6HNWMp23835@oma.cosc.canterbury.ac.nz> References: <200207172332.g6HNWMp23835@oma.cosc.canterbury.ac.nz> Message-ID: On Thursday 18 July 2002 01:32 am, Greg Ewing wrote: > Alex Martelli : > > All files have seek and write, but not on all files do they work -- and > > the same goes for iteration. I.e., it IS something of a mess > > I've just had a thought. Maybe it would be less of a mess > if what we are calling "iterators" had been called "streams" Possibly -- I did use the "streams" name often in the tutorial on iterators and generators, it's a very natural term. > instead. Then the term "iterator" could have been reserved > for the special case of an object that provides stream > access to a random-access collection. Nice touch, except that I keep quibbling on the "random access" need -- see my previous msg about sets. > Then you could say that a file object is a stream object That's what I'd love to do -- and requires the file object to expose a next method and have iter(f) is f. That's what Oren's patch does, and the reason I'm trying to save it from the need for a reference loop. > that provides line-by-line access to an OS file. Other > stream objects can be constructed that give access to > the OS file in other units. That would all make sense > without seeming to imply any multi-pass ability. Seekable files can be multi-pass, but in the strict sense that you can rewind them -- it's still impractical to have them produce multiple *independent* iterators (needing some sort of in-memory caching). 
Alex From jeremy@alum.mit.edu Thu Jul 18 20:08:16 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 18 Jul 2002 15:08:16 -0400 Subject: [Python-Dev] configure problems porting to Tru64 Message-ID: <15671.4640.361811.434411@slothrop.zope.com> I've been trying to build with the current CVS on Tru64 today. This is Tru64 Unix 5.1a with Compaq C++ 6.5. I've run into a bunch of problems with posixmodule.c (not surprise there), but I don't know what the right strategy for fixing them is. Here is a conflicting set of problems: fchdir() is only defined if _XOPEN_SOURCE_EXTENDED is defined. setpgrp() takes no arguments if _XOPEN_SOURCE_EXTENDED is defined, but two arguments if it is not. I found the fchdir() problem first and though the solution would be to change this bit of code in Python.h: /* Forcing SUSv2 compatibility still produces problems on some platforms, True64 and SGI IRIX being two of them, so for now the define is switched off. */ #if 0 #ifndef _XOPEN_SOURCE # define _XOPEN_SOURCE 500 #endif #endif And change "#if 0" to "#if __digital__", but that causes the setpgrp() problem to appear. It seems that configure has a test for whether setpgrp() takes arguments, but configure runs its test without defining _XOPEN_SOURCE. (I'll also note that configure.in has a rather complex test for this, when it appears that autoconf has a builtin AC_FUNC_SETPGRP. Anyone know why we don't use this?) How should we actually fix this problem? It seems to me that the right solution is to define _XOPEN_SOURCE in Tru64 and somehow guarantee that configure runs its tests with that defined, too. How would we achieve that? Jeremy From mal@lemburg.com Thu Jul 18 20:13:37 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 18 Jul 2002 21:13:37 +0200 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21 References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> Message-ID: <3D371361.7050908@lemburg.com> Jeremy Hylton wrote: >>>>>>"MAL" == mal writes: >>>>> > > MAL> M.-A. Lemburg wrote: > >>> I suggest that we keep Jeremy's checkins in 2.3. Hopefully > >>> during the alpha or beta release cycle we will find out if there > >>> *really* are still platforms with broken compilers. At worst, > >>> it will show up after 2.3 final is released, and then we can fix > >>> it in 2.3.1. You won't have to target mx for 2.3 for another 18 > >>> months (assuming the PBF ever releases Python-in-a-Tie). > >> > >> > >> It's easy enough for me to add the #defines to the support header > >> file if you take it out of the distribution, so it wouldn't hurt. > > MAL> Just an addition: please leave the configure test in the > MAL> distribution. While I could implement that using distutils as > MAL> well, I would rather benefit from relying on config.h doing the > MAL> right thing in case there are some newly broken compilers out > MAL> there, e.g. the xlC one on AIX seems to be a very picky one... > > I don't understand what your goal is. Why do you want the configure > test if your header file has a bunch of platform-specific ifdefs? If > these platforms actually had a problem, the configure test would have > caught it and you wouldn't need the ifdefs. 
The only way the ifdefs > would have an effect is if the configure test did not detect a > problem; but if the configure test didn't detect a problem, then you > don't need the ifdefs. Correct, but I don't want to add more cruft to the file: The configure script tests whether static forwards work or not. If you'd rip out the test as well, then I'd have to add those platforms which still have problems manually. The problem is: I don't know which platforms these are (because configure found these itself). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From greg@cosc.canterbury.ac.nz Thu Jul 18 11:03:47 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 18 Jul 2002 22:03:47 +1200 (NZST) Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? Message-ID: <200207181003.g6IA3l127038@oma.cosc.canterbury.ac.nz> Someone told me that Pyrex should be generating __declspec(dllexport) for the module init func. But someone else says this is only needed if you're importing a dll as a library, and that it's not needed for Python extensions. Can anyone who knows what they're doing on Windows give me a definitive answer about whether it's really needed or not? Thanks, Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From aleax@aleax.it Thu Jul 18 07:01:54 2002 From: aleax@aleax.it (Alex Martelli) Date: Thu, 18 Jul 2002 08:01:54 +0200 Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: <200207172339.g6HNd5j23845@oma.cosc.canterbury.ac.nz> References: <200207172339.g6HNd5j23845@oma.cosc.canterbury.ac.nz> Message-ID: On Thursday 18 July 2002 01:39 am, Greg Ewing wrote: > Alex Martelli : > > the file object's is the only example of "fat interface" problem > > in Python -- an interface that exposes a lot of methods, with many > > objects claiming they implement that interface but actually lying > > Maybe the existing file object should be split up into > some number of other objects with smaller interfaces. In an ideal world, yes. In practice, I strongly doubt it's feasible to break backwards compatibility THAT heavily. > For example, instead of the file object actually accessing an > OS file itself, it could just be a wrapper around an > underlying "bytestream" object, which implements only > read() and write(). I suspect read and write would best be kept on separate interfaces. Ability to read, write, seek-and-tell, being three atoms of which it makes sense to have about 6 combos (R, W, R+W, each with or without S&T). Rewind might make sense separately from S&T if streaming tapes were still in fashion and OS's gave natural access to them. But I do think it's all pretty academic. Alex From mal@lemburg.com Thu Jul 18 20:19:21 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 18 Jul 2002 21:19:21 +0200 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com> <20020709012056.GA2526@cthulhu.gerg.ca> <200207181549.g6IFniw21368@odiug.zope.com> Message-ID: <3D3714B9.1060807@lemburg.com> Guido van Rossum wrote: >>>Perhaps we could have some kind of category for distutils >>>packages which marks them as system add-ons vs. site add-ons. 
>> >>+1 -- this should definitely be up to the package author/packager, not >>the local admin. I once tried to convince Guido that the ability to >>occasionally upgrade standard library modules/packages would be a good >>thing, but he wasn't having it. Any change of heart, O Mighty BDFL? > > > Before I answer that, here's a question. Why do we think it's a good > idea to distribute upgrades as separate add-ons while we don't think > it's okay to distribute such upgrades with bugfix releases? The idea is to provide bugfixes for Python versions which are no longer being maintained. Of course, the effect would only show a few years ahead. > Doesn't > this just increase the variability of site configurations, and hence > version interaction hell? I don't think that core packages are any different than other third party packages: they are usually independent enough from the rest of the code that upgrades don't affect the workings of the other code using it. The internals are free to change, though, e.g. to accomodate bug fixes, etc. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From xscottg@yahoo.com Thu Jul 18 20:24:50 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Thu, 18 Jul 2002 12:24:50 -0700 (PDT) Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: <200207181842.g6IIgUo22271@odiug.zope.com> Message-ID: <20020718192450.15024.qmail@web40105.mail.yahoo.com> --- Guido van Rossum wrote: > > Yes. The PyBufferObject grabs the pointer from the PyBufferProcs > > supporting object when the PyBufferObject is created. If the > > PyBufferProcs supporting object reallocates the memory (possibly > > from a resize) the PyBufferObject can be left with a bad pointer. 
> > This is easily possible if you try to use the array module arrays > > as a buffer. > > > > I've submitted a patch to fix this particular problem (among others), > > but there are still enough things that the buffer object can't do that > > something new is needed. > > Can you remind me of the patch#? (I'm curious how you plan to fix > this...) > Patch number 552438. Instead of cacheing the pointer, it grabs it from the other object every time it is needed. Might be a little slower, but I think it's correct. > Barry (the PEP czar) forwarded me your PEP. I'll try to do some > triage on it so I can tell Barry whether to check it in (that doesn't > mean it's accepted :-). I'm bad at patience, but I'm not terribly naive. I fully expect everyone and their dog will find something to dislike before it gets approved/rejected. Cheers, -Scott __________________________________________________ Do You Yahoo!? Yahoo! Autos - Get free new car price quotes http://autos.yahoo.com From haering_python@gmx.de Thu Jul 18 20:28:51 2002 From: haering_python@gmx.de (Gerhard =?iso-8859-1?Q?H=E4ring?=) Date: Thu, 18 Jul 2002 21:28:51 +0200 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: <200207181003.g6IA3l127038@oma.cosc.canterbury.ac.nz> References: <200207181003.g6IA3l127038@oma.cosc.canterbury.ac.nz> Message-ID: <20020718192851.GA2759@lilith.my-fqdn.de> * Greg Ewing [2002-07-18 22:03 +1200]: > Someone told me that Pyrex should be generating > __declspec(dllexport) for the module init func. That's wrong. You should be using DL_EXPORT instead, which will do the right thing no matter which platform you're on: on Windows, it will expand to __declspec(dllexport), iff you're compiling an extension module (in contrast to compiling the Python core). 
I believe that on Unix, it will expand to an empty string :-) You also don't need any #ifdefs for win32 for setting ob_type, just set them _only_ in your init function and leave them as NULL in the declarations. Gerhard -- This sig powered by Python! Außentemperatur in München: 14.3 °C Wind: 1.9 m/s From guido@python.org Thu Jul 18 20:30:41 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 18 Jul 2002 15:30:41 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Thu, 18 Jul 2002 08:01:54 +0200." References: <200207172339.g6HNd5j23845@oma.cosc.canterbury.ac.nz> Message-ID: <200207181930.g6IJUfX22643@odiug.zope.com> > I suspect read and write would best be kept on separate > interfaces. Ability to read, write, seek-and-tell, being three > atoms of which it makes sense to have about 6 combos > (R, W, R+W, each with or without S&T). Rewind might > make sense separately from S&T if streaming tapes were still in > fashion and OS's gave natural access to them. 5, because R+W without S&T makes little sense. > But I do think it's all pretty academic. C++ has tried very hard to do this with its istream, ostream and iostream classes; I believe I heard C++ people say once that it's not considered a success. I believe Java has tried to address this too. What do you think of Java's solution? --Guido van Rossum (home page: http://www.python.org/~guido/) From ping@zesty.ca Thu Jul 18 20:31:45 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Thu, 18 Jul 2002 12:31:45 -0700 (PDT) Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability In-Reply-To: <20020718100631.A3468@doublegemini.com> Message-ID: On Thu, 18 Jul 2002, Clark C . Evans wrote: > On Wed, Jul 17, 2002 at 02:58:55PM -0500, Ka-Ping Yee wrote: > | But i think this is more than a minor problem. This is a > | namespace collision problem, and that's significant. 
Naming > | the method "next" means that any object with a "next" method > | cannot be adapted to support the iterator protocol. Unfortunately > | "next" is a pretty common word and it's quite possible that such > | a method name is already in use. > > Ping, > > Do you have any suggestions for re-wording the Iterator questionare > at http://yaml.org/wk/survey?id=pyiter to reflect this paragraph above? I might add something like: One motivation for this change is that the name "next()" might collide with the name of an existing "next()" method. This could cause a problem if someone wants to implement the iterator protocol for an object that already happens to have a method called "next()". So far no one has reported encountering this situation. It seems plausible that there will be some objects where it would be nice to support the iterator protocol, and we have heard of some objects with methods named "next()", but we don't know how likely or unlikely it is that there's an object where both are true. Does that seem fair? 
-- ?!ng From jeremy@alum.mit.edu Thu Jul 18 20:32:14 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 18 Jul 2002 15:32:14 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21 In-Reply-To: <3D371361.7050908@lemburg.com> References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> Message-ID: <15671.6078.577033.943393@slothrop.zope.com> >>>>> "MAL" == mal writes: MAL> The configure script tests whether static forwards work or MAL> not. If you'd rip out the test as well, then I'd have to add MAL> those platforms which still have problems manually. MAL> The problem is: I don't know which platforms these are (because MAL> configure found these itself). If you think the configure test works, why do you have platform specific ifdefs in your header file? Jeremy From mal@lemburg.com Thu Jul 18 20:35:01 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 18 Jul 2002 21:35:01 +0200 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? References: <200207181003.g6IA3l127038@oma.cosc.canterbury.ac.nz> Message-ID: <3D371865.5070908@lemburg.com> Greg Ewing wrote: > Someone told me that Pyrex should be generating > __declspec(dllexport) for the module init func. > But someone else says this is only needed if > you're importing a dll as a library, and that > it's not needed for Python extensions. > > Can anyone who knows what they're doing on > Windows give me a definitive answer about > whether it's really needed or not? 
You need to export at least the init() API and that is usually done using the dllexport flag. Note that this is only needed for shared modules (DLLs), not modules which are linked statically.

This is what I use for this:

/* Macro to "mark" a symbol for DLL export */

#if (defined(_MSC_VER) && _MSC_VER > 850 \
     || defined(__MINGW32__) || defined(__CYGWIN) || defined(__BEOS__))
# ifdef __cplusplus
# define MX_EXPORT(type) extern "C" type __declspec(dllexport)
# else
# define MX_EXPORT(type) extern type __declspec(dllexport)
# endif
#elif defined(__WATCOMC__)
# define MX_EXPORT(type) extern type __export
#elif defined(__IBMC__)
# define MX_EXPORT(type) extern type _Export
#else
# define MX_EXPORT(type) extern type
#endif

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From tim@zope.com Thu Jul 18 20:34:58 2002
From: tim@zope.com (Tim Peters)
Date: Thu, 18 Jul 2002 15:34:58 -0400
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <200207181003.g6IA3l127038@oma.cosc.canterbury.ac.nz>
Message-ID:

[Greg Ewing]
> Someone told me that Pyrex should be generating
> __declspec(dllexport) for the module init func.
> But someone else says this is only needed if
> you're importing a dll as a library,

1. What else could one do with a DLL? That is, in your view is the "importing ... as a library" part not redundant?

2. Does Pyrex compile to DLLs (or PYDs) on Windows? I simply don't know.

> and that it's not needed for Python extensions.

If an extension is compiled into a DLL/PYD, it must tell the linker which symbols are to be exported. __declspec(dllexport) in the source is one way to do that. 
Other possibilities include creating a .def file, or specifying exported names on the linker's command line (like "/export:init_sre"). The best thing to do for Windows is ask that Windows users supply patches. Or you could upgrade to Windows yourself . From fredrik@pythonware.com Thu Jul 18 20:37:09 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 18 Jul 2002 21:37:09 +0200 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? References: <200207181003.g6IA3l127038@oma.cosc.canterbury.ac.nz> Message-ID: <034701c22e92$9473dfc0$ced241d5@hagrid> greg wrote: > Someone told me that Pyrex should be generating > __declspec(dllexport) for the module init func. almost; for portability, it's better to use the DL_EXPORT provided by Python.h: DL_EXPORT(void) init_module(void) { ... } > But someone else says this is only needed if > you're importing a dll as a library, and that > it's not needed for Python extensions. that someone is confused; the dllexport declaration makes sure that the init function is exported from the DLL. if not, Python's PYD loader won't find the init function. From aleax@aleax.it Thu Jul 18 20:38:15 2002 From: aleax@aleax.it (Alex Martelli) Date: Thu, 18 Jul 2002 21:38:15 +0200 Subject: [Python-Dev] Re: Sets In-Reply-To: <200207181845.g6IIjfw22307@odiug.zope.com> References: <200207172325.g6HNPQM23808@oma.cosc.canterbury.ac.nz> <200207181845.g6IIjfw22307@odiug.zope.com> Message-ID: <02071821381500.04480@arthur> On Thursday 18 July 2002 20:45, Guido van Rossum wrote: ... > I believe I recommended to Greg to make sets "have" a dict instead of > "being" dicts, and I think he agreed. But I guess he never got to > implementing that change. Right. OK, guess I'll make a new patch using delegation instead of inheritance, then. 
Alex From guido@python.org Thu Jul 18 20:50:39 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 18 Jul 2002 15:50:39 -0400 Subject: [Python-Dev] Re: Sets In-Reply-To: Your message of "Thu, 18 Jul 2002 21:38:15 +0200." <02071821381500.04480@arthur> References: <200207172325.g6HNPQM23808@oma.cosc.canterbury.ac.nz> <200207181845.g6IIjfw22307@odiug.zope.com> <02071821381500.04480@arthur> Message-ID: <200207181950.g6IJodg22778@odiug.zope.com> > > I believe I recommended to Greg to make sets "have" a dict instead of > > "being" dicts, and I think he agreed. But I guess he never got to > > implementing that change. > > Right. OK, guess I'll make a new patch using delegation instead > of inheritance, then. Maybe benchmark the performance too. If the "has" version is much slower, perhaps we could remove unwanted interfaces from the public API by overriding them with something that raises an exception (and rename the internal versions to some internal name if they are needed). --Guido van Rossum (home page: http://www.python.org/~guido/) From ping@zesty.ca Thu Jul 18 20:59:01 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Thu, 18 Jul 2002 12:59:01 -0700 (PDT) Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability In-Reply-To: <200207172121.g6HLLQH13946@odiug.zope.com> Message-ID: I wrote: > __iter__ is a red herring. [...blah blah blah...] > Iterators are currently asked to support both protocols. The > semantics of iteration come only from protocol 2; protocol 1 is > an effort to make iterators look sorta like sequences. But the > analogy is very weak -- these are "sequences" that destroy > themselves while you look at them -- not like any typical > sequence i've ever seen! > > The short of it is that whenever any Python programmer says > "for x in y", he or she had better be darned sure of whether > this is going to destroy y. Whatever we can do to make this > clear would be a good idea. 
On Wed, 17 Jul 2002, Guido van Rossum wrote: > This is a very good summary of the two iterator protocols. Ping, > would you mind adding this to PEP 234? I have now done so. I didn't add the whole thing verbatim, because the tone doesn't fit: it was written with the intent of motivating a change to the protocol, rather than describing what the protocol is. Presumably we don't want the PEP to say "__iter__ is a red herring". There's a bunch of issues flying around here, which i'll try to explain better in a separate posting. But i wanted to take care of Guido's request first. I have toned down and abridged my text somewhat, and strengthened the requirement for __iter__(). Here is what the "API specification" section now says: Classes can define how they are iterated over by defining an __iter__() method; this should take no additional arguments and return a valid iterator object. A class that wants to be an iterator should implement two methods: a next() method that behaves as described above, and an __iter__() method that returns self. The two methods correspond to two distinct protocols: 1. An object can be iterated over with "for" if it implements __iter__() or __getitem__(). 2. An object can function as an iterator if it implements next(). Container-like objects usually support protocol 1. Iterators are currently required to support both protocols. The semantics of iteration come only from protocol 2; protocol 1 is present to make iterators behave like sequences. But the analogy is weak -- unlike ordinary sequences, iterators are "sequences" that are destroyed by the act of looking at their elements. Consequently, whenever any Python programmer says "for x in y", he or she must be sure of whether this is going to destroy y. 
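The distinction between the two protocols can be sketched roughly as follows. This is only an illustration, not part of the PEP text: the class names Countdown and CountdownIterator are hypothetical, and the iterator method is spelled __next__() as in present-day Python (rather than the next() of the protocol discussed in this thread) so that the sketch runs on a current interpreter.

```python
class Countdown:
    """Container-like object: supports protocol 1 only.

    Each call to __iter__() hands out a fresh iterator, so looping
    over the container does not use it up.
    """

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator: supports protocol 2, plus protocol 1 by returning self."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        # An iterator is its own iterator, so "for x in it" works.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


container = Countdown(3)
first_pass = list(container)    # iterating the container...
second_pass = list(container)   # ...can be repeated

iterator = iter(container)
consumed_once = list(iterator)  # iterating the iterator...
consumed_twice = list(iterator) # ...leaves nothing for a second pass

print(first_pass, second_pass, consumed_once, consumed_twice)
```

Both objects are accepted by "for", which is exactly why the reader of "for x in y" cannot tell from the loop alone whether y survives it.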
-- ?!ng From guido@python.org Thu Jul 18 20:58:50 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 18 Jul 2002 15:58:50 -0400 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) In-Reply-To: Your message of "Thu, 18 Jul 2002 21:19:21 +0200." <3D3714B9.1060807@lemburg.com> References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com> <20020709012056.GA2526@cthulhu.gerg.ca> <200207181549.g6IFniw21368@odiug.zope.com> <3D3714B9.1060807@lemburg.com> Message-ID: <200207181958.g6IJwoY22816@odiug.zope.com> > Guido van Rossum wrote: > >>>Perhaps we could have some kind of category for distutils > >>>packages which marks them as system add-ons vs. site add-ons. > >> > >>+1 -- this should definitely be up to the package author/packager, not > >>the local admin. I once tried to convince Guido that the ability to > >>occasionally upgrade standard library modules/packages would be a good > >>thing, but he wasn't having it. Any change of heart, O Mighty BDFL? > > > > > > Before I answer that, here's a question. Why do we think it's a good > > idea to distribute upgrades as separate add-ons while we don't think > > it's okay to distribute such upgrades with bugfix releases? [MAL] > The idea is to provide bugfixes for Python versions which are > no longer being maintained. Of course, the effect would only > show a few years ahead. Hm, if you really are fixing bugs in old versions, why not patch the Python installation in-place rather than trying to play nice? > > Doesn't > > this just increase the variability of site configurations, and hence > > version interaction hell? 
> > I don't think that core packages are any different than > other third party packages: they are usually independent > enough from the rest of the code that upgrades don't affect > the workings of the other code using it. The internals are > free to change, though, e.g. to accomodate bug fixes, etc. Well, I don't expect that we'll do independent upgrades for core packages, so I propose to end this thread. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jul 18 21:08:54 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 18 Jul 2002 16:08:54 -0400 Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability In-Reply-To: Your message of "Thu, 18 Jul 2002 12:59:01 PDT." References: Message-ID: <200207182008.g6IK8tb22853@odiug.zope.com> > I didn't add the whole thing verbatim, because the tone doesn't fit: > it was written with the intent of motivating a change to the > protocol, rather than describing what the protocol is. Presumably > we don't want the PEP to say "__iter__ is a red herring". > > There's a bunch of issues flying around here, which i'll try to > explain better in a separate posting. But i wanted to take care > of Guido's request first. I have toned down and abridged my text > somewhat, and strengthened the requirement for __iter__(). Here > is what the "API specification" section now says: > > Classes can define how they are iterated over by defining an > __iter__() method; this should take no additional arguments and > return a valid iterator object. A class that wants to be an > iterator should implement two methods: a next() method that behaves > as described above, and an __iter__() method that returns self. > > The two methods correspond to two distinct protocols: > > 1. An object can be iterated over with "for" if it implements > __iter__() or __getitem__(). > > 2. An object can function as an iterator if it implements next(). > > Container-like objects usually support protocol 1. 
Iterators are > currently required to support both protocols. The semantics of > iteration come only from protocol 2; protocol 1 is present to make > iterators behave like sequences. But the analogy is weak -- unlike > ordinary sequences, iterators are "sequences" that are destroyed > by the act of looking at their elements. Fine up to here. > Consequently, whenever any Python programmer says "for x in y", > he or she must be sure of whether this is going to destroy y. I don't understand why this is here. *Why* is it important to know whether this is going to destroy y? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jul 18 21:42:02 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 18 Jul 2002 16:42:02 -0400 Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability In-Reply-To: Your message of "Thu, 18 Jul 2002 07:23:16 EDT." References: <200207180043.g6I0hKB25427@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207182042.g6IKg2n22947@odiug.zope.com> > Maybe the reasons behind having __iter__() returning itself should be > clearly expressed in the PEP, too. On this list, Tim gave one recently, > Guido gives another here, but unless I missed it, the PEP gives none. > Usually, PEPs explain the reasons behind the choices. Ping added this to the PEP: The two methods correspond to two distinct protocols: 1. An object can be iterated over with "for" if it implements __iter__() or __getitem__(). 2. An object can function as an iterator if it implements next(). Container-like objects usually support protocol 1. Iterators are currently required to support both protocols. The semantics of iteration come only from protocol 2; protocol 1 is present to make iterators behave like sequences. But the analogy is weak -- unlike ordinary sequences, iterators are "sequences" that are destroyed by the act of looking at their elements. 
(I could do without the last sentence, since this expresses a value judgement rather than fact -- not a good thing to have in a PEP's "specification" section.) --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Thu Jul 18 21:50:31 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 18 Jul 2002 22:50:31 +0200 Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats) References: <3D220A86.5070003@lemburg.com> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com> <20020709012056.GA2526@cthulhu.gerg.ca> <200207181549.g6IFniw21368@odiug.zope.com> <3D3714B9.1060807@lemburg.com> <200207181958.g6IJwoY22816@odiug.zope.com> Message-ID: <3D372A17.50509@lemburg.com> Guido van Rossum wrote: >>Guido van Rossum wrote: >> >>>>>Perhaps we could have some kind of category for distutils >>>>>packages which marks them as system add-ons vs. site add-ons. >>>> >>>>+1 -- this should definitely be up to the package author/packager, not >>>>the local admin. I once tried to convince Guido that the ability to >>>>occasionally upgrade standard library modules/packages would be a good >>>>thing, but he wasn't having it. Any change of heart, O Mighty BDFL? >>> >>> >>>Before I answer that, here's a question. Why do we think it's a good >>>idea to distribute upgrades as separate add-ons while we don't think >>>it's okay to distribute such upgrades with bugfix releases? >> > > [MAL] > >>The idea is to provide bugfixes for Python versions which are >>no longer being maintained. Of course, the effect would only >>show a few years ahead. > > > Hm, if you really are fixing bugs in old versions, why not patch the > Python installation in-place rather than trying to play nice? 
We don't have an easy way of doing this, unless of course we trick python setup.py install to install directly into .../lib/pythonX.X rather than a sub directory on the path. >>>Doesn't >>>this just increase the variability of site configurations, and hence >>>version interaction hell? >> >>I don't think that core packages are any different than >>other third party packages: they are usually independent >>enough from the rest of the code that upgrades don't affect >>the workings of the other code using it. The internals are >>free to change, though, e.g. to accomodate bug fixes, etc. > > Well, I don't expect that we'll do independent upgrades for core > packages, so I propose to end this thread. Barry is already doing this with the email package and I would expect more such packages to make their way into the core. The PyXML package also has a life of its own outside the core distribution and could benefit from this. I think it's too early to end the thread. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Thu Jul 18 20:21:59 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 18 Jul 2002 15:21:59 -0400 Subject: [Python-Dev] configure problems porting to Tru64 In-Reply-To: Your message of "Thu, 18 Jul 2002 15:08:16 EDT." <15671.4640.361811.434411@slothrop.zope.com> References: <15671.4640.361811.434411@slothrop.zope.com> Message-ID: <200207181922.g6IJM0O22574@odiug.zope.com> > (I'll also note that configure.in has a rather complex test for this, > when it appears that autoconf has a builtin AC_FUNC_SETPGRP. Anyone > know why we don't use this?) I'll bet AC_FUNC_SETPGRP didn't exist in the autoconf version we were using when we wrote that test. Feel free to fix it. 
BTW, the snake farm build for AIX-2-000000042E00-hal now fails like this: ../python/dist/src/Modules/posixmodule.c: In function `posix_fdatasync': ../python/dist/src/Modules/posixmodule.c:902: `fdatasync' undeclared (first use this function) ../python/dist/src/Modules/posixmodule.c:902: (Each undeclared identifier is reported only once ../python/dist/src/Modules/posixmodule.c:902: for each function it appears in.) --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax@aleax.it Thu Jul 18 21:52:31 2002 From: aleax@aleax.it (Alex Martelli) Date: Thu, 18 Jul 2002 22:52:31 +0200 Subject: [Python-Dev] Re: Sets In-Reply-To: <200207181950.g6IJodg22778@odiug.zope.com> References: <200207172325.g6HNPQM23808@oma.cosc.canterbury.ac.nz> <02071821381500.04480@arthur> <200207181950.g6IJodg22778@odiug.zope.com> Message-ID: On Thursday 18 July 2002 09:50 pm, Guido van Rossum wrote: > > > I believe I recommended to Greg to make sets "have" a dict instead of > > > "being" dicts, and I think he agreed. But I guess he never got to > > > implementing that change. > > > > Right. OK, guess I'll make a new patch using delegation instead > > of inheritance, then. > > Maybe benchmark the performance too. If the "has" version is much > slower, perhaps we could remove unwanted interfaces from the public > API by overriding them with something that raises an exception (and > rename the internal versions to some internal name if they are > needed). I've just updated patch 580995 with the has-A rather than is-A version. OK, I'll now run some simple benchmarks... Looks good, offhand. 
Here's the simple benchmark script:

import time
import set
import sys

clock = time.clock

raw = range(10000)
times = [None]*20

print "Timing Set %s (Python %s)" % (set.__version__, sys.version)

print "Make 20 10k-items sets (no reps)...",
start = clock()
for i in times:
    s10k = set.Set(raw)
stend = clock()
print stend-start

witre = range(1000)*10
print "Make 20 1k-items sets (x10 reps)...",
for i in times:
    s1k1 = set.Set(witre)
stend = clock()
print stend-start

raw1 = range(500, 1500)
print "Make 20 more 1k-items sets (no reps)...",
for i in times:
    s1k2 = set.Set(raw1)
stend = clock()
print stend-start

print "20 unions of 1k-items sets 50% overlap...",
for i in times:
    result = s1k1 | s1k2
stend = clock()
print stend-start

print "20 inters of 1k-items sets 50% overlap...",
for i in times:
    result = s1k1 & s1k2
stend = clock()
print stend-start

print "20 diffes of 1k-items sets 50% overlap...",
for i in times:
    result = s1k1 - s1k2
stend = clock()
print stend-start

print "20 simdif of 1k-items sets 50% overlap...",
for i in times:
    result = s1k1 ^ s1k2
stend = clock()
print stend-start

And here's a few runs (with -O of course) on my PC:

[alex@lancelot has]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.2.1 (#2, Jul 15 2002, 17:32:26)
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)])
Make 20 10k-items sets (no reps)... 0.21
Make 20 1k-items sets (x10 reps)... 0.36
Make 20 more 1k-items sets (no reps)... 0.38
20 unions of 1k-items sets 50% overlap... 0.43
20 inters of 1k-items sets 50% overlap... 0.92
20 diffes of 1k-items sets 50% overlap... 1.41
20 simdif of 1k-items sets 50% overlap... 2.38
[alex@lancelot has]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.2.1 (#2, Jul 15 2002, 17:32:26)
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)])
Make 20 10k-items sets (no reps)... 0.22
Make 20 1k-items sets (x10 reps)... 0.37
Make 20 more 1k-items sets (no reps)... 0.39
20 unions of 1k-items sets 50% overlap... 0.44
20 inters of 1k-items sets 50% overlap... 0.93
20 diffes of 1k-items sets 50% overlap... 1.42
20 simdif of 1k-items sets 50% overlap... 2.39
[alex@lancelot has]$ cd ../is
[alex@lancelot is]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.2.1 (#2, Jul 15 2002, 17:32:26)
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)])
Make 20 10k-items sets (no reps)... 0.21
Make 20 1k-items sets (x10 reps)... 0.37
Make 20 more 1k-items sets (no reps)... 0.39
20 unions of 1k-items sets 50% overlap... 0.44
20 inters of 1k-items sets 50% overlap... 0.93
20 diffes of 1k-items sets 50% overlap... 1.42
20 simdif of 1k-items sets 50% overlap... 2.38
[alex@lancelot is]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.2.1 (#2, Jul 15 2002, 17:32:26)
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)])
Make 20 10k-items sets (no reps)... 0.22
Make 20 1k-items sets (x10 reps)... 0.38
Make 20 more 1k-items sets (no reps)... 0.4
20 unions of 1k-items sets 50% overlap... 0.44
20 inters of 1k-items sets 50% overlap... 0.93
20 diffes of 1k-items sets 50% overlap... 1.42
20 simdif of 1k-items sets 50% overlap... 2.41
[alex@lancelot is]$

They look much of a muchness to me. Sorry about the version stuck at 1.5 -- forgot to update that, but you can tell the difference by the directory name, 'is' and 'has' resp.:-).

Python 2.3 (built from CVS 22 hours ago) is substantially faster at some tasks (intersections and differences):

[alex@lancelot has]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.3a0 (#44, Jul 18 2002, 00:03:05)
[GCC 2.96 20000731 (Mandrake Linux 8.2 2.96-0.76mdk)])
Make 20 10k-items sets (no reps)... 0.21
Make 20 1k-items sets (x10 reps)... 0.36
Make 20 more 1k-items sets (no reps)... 0.37
20 unions of 1k-items sets 50% overlap... 0.42
20 inters of 1k-items sets 50% overlap... 0.75
20 diffes of 1k-items sets 50% overlap... 1.08
20 simdif of 1k-items sets 50% overlap... 1.73
[alex@lancelot has]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.3a0 (#44, Jul 18 2002, 00:03:05)
[GCC 2.96 20000731 (Mandrake Linux 8.2 2.96-0.76mdk)])
Make 20 10k-items sets (no reps)... 0.21
Make 20 1k-items sets (x10 reps)... 0.36
Make 20 more 1k-items sets (no reps)... 0.37
20 unions of 1k-items sets 50% overlap... 0.42
20 inters of 1k-items sets 50% overlap... 0.75
20 diffes of 1k-items sets 50% overlap... 1.08
20 simdif of 1k-items sets 50% overlap... 1.74
[alex@lancelot has]$
[alex@lancelot is]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.3a0 (#44, Jul 18 2002, 00:03:05)
[GCC 2.96 20000731 (Mandrake Linux 8.2 2.96-0.76mdk)])
Make 20 10k-items sets (no reps)... 0.21
Make 20 1k-items sets (x10 reps)... 0.35
Make 20 more 1k-items sets (no reps)... 0.37
20 unions of 1k-items sets 50% overlap... 0.41
20 inters of 1k-items sets 50% overlap... 0.74
20 diffes of 1k-items sets 50% overlap... 1.07
20 simdif of 1k-items sets 50% overlap... 1.72
[alex@lancelot is]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.3a0 (#44, Jul 18 2002, 00:03:05)
[GCC 2.96 20000731 (Mandrake Linux 8.2 2.96-0.76mdk)])
Make 20 10k-items sets (no reps)... 0.21
Make 20 1k-items sets (x10 reps)... 0.36
Make 20 more 1k-items sets (no reps)... 0.38
20 unions of 1k-items sets 50% overlap... 0.42
20 inters of 1k-items sets 50% overlap... 0.75
20 diffes of 1k-items sets 50% overlap... 1.08
20 simdif of 1k-items sets 50% overlap... 1.73
[alex@lancelot is]$

but as you can see, again it's uniformly faster on both 'is' and 'has' versions of sets. The 'has' version thus seems preferable here.
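For readers unfamiliar with the distinction, here is a minimal sketch of what a has-a (delegating) set can look like. It is hypothetical and is not the code in patch 580995; the class layout and the `_data` attribute name are assumptions, and it is written for a modern Python 3 rather than the 2.2/2.3 interpreters timed above.

```python
class Set:
    """Hypothetical has-a set: it holds a dict instead of subclassing dict."""

    def __init__(self, iterable=()):
        # The dict is an implementation detail; only its keys matter.
        self._data = {}
        for item in iterable:
            self._data[item] = True

    def __contains__(self, item):
        return item in self._data

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __or__(self, other):
        # Union: copy this set, then add the other set's elements.
        result = Set(self)
        result._data.update(other._data)
        return result

    def __and__(self, other):
        # Intersection: scan the smaller operand, probe the larger one.
        small, big = (self, other) if len(self) <= len(other) else (other, self)
        return Set(item for item in small if item in big)

    def __sub__(self, other):
        return Set(item for item in self if item not in other)

    def __xor__(self, other):
        return (self - other) | (other - self)


# Same shape of data as the benchmark: two 1k-item sets, 50% overlap.
s1 = Set(range(1000))
s2 = Set(range(500, 1500))
sizes = (len(s1 | s2), len(s1 & s2), len(s1 - s2), len(s1 ^ s2))
print(sizes)

# Because the dict is only held, dict-only methods never leak into the API.
print(hasattr(s1, "keys"))
```

With inheritance, unwanted dict methods would have to be overridden one by one to raise an exception; with delegation they are simply absent.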
Alex From jeremy@alum.mit.edu Thu Jul 18 20:10:22 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 18 Jul 2002 15:10:22 -0400 Subject: [Python-Dev] staticforward In-Reply-To: <15670.62365.517118.775364@slothrop.zope.com> References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <15670.62365.517118.775364@slothrop.zope.com> Message-ID: <15671.4766.961501.277589@slothrop.zope.com> FWIW I confirmed today that staticforward is not needed on Tru64 5.1. Jeremy From guido@python.org Thu Jul 18 20:18:38 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 18 Jul 2002 15:18:38 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Thu, 18 Jul 2002 07:57:37 +0200." References: <200207172332.g6HNWMp23835@oma.cosc.canterbury.ac.nz> Message-ID: <200207181918.g6IJIcW22539@odiug.zope.com> > > I've just had a thought. Maybe it would be less of a mess > > if what we are calling "iterators" had been called "streams" > > Possibly -- I did use the "streams" name often in the tutorial > on iterators and generators, it's a very natural term. OTOH in C++ and Java, "stream" refers to an open file object (to emphasize the iteratorish feeling of a file opened for sequential reading or writing, as opposed to the concept of a file as a random-access array of bytes on disk). > Seekable files can be multi-pass, but in the strict sense > that you can rewind them -- it's still impractical to have > them produce multiple *independent* iterators (needing > some sort of in-memory caching). It would be trivial if you had an object representing the notion of a file on disk rather than an open file. Each iterator would be implemented as a separate open file referring to the same filename.
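A minimal sketch of that idea, under stated assumptions: `DiskFile` is a hypothetical name (no such class exists in the standard library), the file is only read, and the code targets a modern Python 3.

```python
import os
import tempfile


class DiskFile:
    """Hypothetical object standing for a file on disk, not an open file."""

    def __init__(self, filename):
        self.filename = filename

    def __iter__(self):
        # Every iterator opens the file afresh, so each one has its own
        # file position and the iterators do not disturb one another.
        with open(self.filename) as f:
            for line in f:
                yield line


# Build a small scratch file to iterate over.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as f:
    f.write("a\nb\nc\n")

doc = DiskFile(path)
it1 = iter(doc)
it2 = iter(doc)

first = next(it1)    # it1 reads line one
second = next(it1)   # it1 reads line two
other = next(it2)    # it2 is independent and still starts at line one
print(repr(first), repr(second), repr(other))

# Closing the generators closes their underlying files.
it1.close()
it2.close()
os.remove(path)
```

The container here is the pathname; an open file plays the role of the iterator, which is the split between multi-pass and single-pass objects that the thread is circling around.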
--Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@alum.mit.edu Thu Jul 18 22:00:05 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 18 Jul 2002 17:00:05 -0400 Subject: [Python-Dev] configure problems porting to Tru64 In-Reply-To: <200207181922.g6IJM0O22574@odiug.zope.com> References: <15671.4640.361811.434411@slothrop.zope.com> <200207181922.g6IJM0O22574@odiug.zope.com> Message-ID: <15671.11349.924113.246257@slothrop.zope.com> >>>>> "GvR" == Guido van Rossum writes: GvR> BTW, the snake farm build for AIX-2-000000042E00-hal now fails GvR> like this: GvR> ../python/dist/src/Modules/posixmodule.c: In function GvR> `posix_fdatasync': GvR> ../python/dist/src/Modules/posixmodule.c:902: `fdatasync' GvR> undeclared (first use this function) GvR> ../python/dist/src/Modules/posixmodule.c:902: (Each undeclared GvR> identifier is reported only once GvR> ../python/dist/src/Modules/posixmodule.c:902: for each function GvR> it appears in.) (I already mentioned this to Guido, but) This problem has been occurring on AIX for a while. It's unrelated to staticforward. So we've now confirmed that staticforward is unneeded on AIX and Tru64. Perhaps MAL would like to find an SCO ODT compiler to try it out with. Jeremy From mal@lemburg.com Thu Jul 18 22:07:41 2002 From: mal@lemburg.com (M.-A.
Lemburg) Date: Thu, 18 Jul 2002 23:07:41 +0200 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21 References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> Message-ID: <3D372E1D.50009@lemburg.com> Jeremy Hylton wrote: >>>>>>"MAL" == mal writes: >>>>> > > MAL> The configure script tests whether static forwards work or > MAL> not. If you'd rip out the test as well, then I'd have to add > MAL> those platforms which still have problems manually. > > MAL> The problem is: I don't know which platforms these are (because > MAL> configure found these itself). > > If you think the configure test works, why do you have platform > specific ifdefs in your header file? Because it doesn't always work :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Thu Jul 18 22:09:34 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 18 Jul 2002 23:09:34 +0200 Subject: [Python-Dev] Re: configure problems porting to Tru64 In-Reply-To: <15671.4640.361811.434411@slothrop.zope.com> References: <15671.4640.361811.434411@slothrop.zope.com> Message-ID: Jeremy Hylton writes: > (I'll also note that configure.in has a rather complex test for this, > when it appears that autoconf has a builtin AC_FUNC_SETPGRP. Anyone > know why we don't use this?) That test was introduced in configure.in 1.9, on 1994/11/03. It might well be that autoconf did not support that test at that time. > How should we actually fix this problem? It seems to me that the > right solution is to define _XOPEN_SOURCE in Tru64 and somehow > guarantee that configure runs its tests with that defined, too. How > would we achieve that? I think it is generally the right thing to define _XOPEN_SOURCE on Unix, providing a negative list of systems that cannot support this setting (or preferably solving whatever problems remain). I'd put an (unconditional) AC_DEFINE into configure.in early on; it *should* go into confdefs.h as configure proceeds, and thus be active when other tests are performed. Regards, Martin From aleax@aleax.it Thu Jul 18 22:12:11 2002 From: aleax@aleax.it (Alex Martelli) Date: Thu, 18 Jul 2002 23:12:11 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207181930.g6IJUfX22643@odiug.zope.com> References: <200207172339.g6HNd5j23845@oma.cosc.canterbury.ac.nz> <200207181930.g6IJUfX22643@odiug.zope.com> Message-ID: On Thursday 18 July 2002 09:30 pm, Guido van Rossum wrote: > > I suspect read and write would best be kept on separate > > interfaces. Ability to read, write, seek-and-tell, being three > > atoms of which it makes sense to have about 6 combos > > (R, W, R+W, each with or without S&T). Rewind might > > make sense separately from S&T if streaming tapes were still in > > fashion and OS's gave natural access to them. > > 5, because R+W without S&T makes little sense. 
Reasonably little, yes -- hard to make up a non-contrived example ('preserve data up to the first occurrence of "bzz" and then overwrite the rest of the file with "spam"'...?-). > > But I do think it's all pretty academic. > > C++ has tried very hard to do this with its istream, ostream and > iostream classes; I believe I heard C++ people say once that it's not > considered a success. As a C++ person I agree. It's better by far than C, mind you -- for text I/O, at least -- but it's complex and intricate. > I believe Java has tried to address this too. > What do you think of Java's solution? In the only time in my life when I was using Java in earnest (in code intended for production purposes, though think3 later dropped the idea), Java hit me with a deprecation to the solar plexus exactly in this area, forcing me to do much unproductive rewriting -- so I find it hard to be unbiased. But even striving to be fair, I don't see the advantage compared e.g. to C++'s streams. Alex From jeremy@alum.mit.edu Thu Jul 18 22:16:09 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 18 Jul 2002 17:16:09 -0400 Subject: [Python-Dev] staticforward In-Reply-To: <3D372E1D.50009@lemburg.com> References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> Message-ID: <15671.12313.725886.680036@slothrop.zope.com> >>>>> "MAL" == mal writes: MAL> The configure script tests whether static forwards work or MAL> not. If you'd rip out the test as well, then I'd have to add MAL> those platforms which still have problems manually. MAL> The problem is: I don't know which platforms these are (because MAL> configure found these itself). 
>> >> If you think the configure test works, why do you have platform >> specific ifdefs in your header file? MAL> Because it doesn't always work :-) Let's make sure I've got this straight: You believe there are platforms on which staticforward is necessary, because you can not have a tentative definition of a static followed by a definition with an initializer. Yet the configure test of exactly this behavior succeeds. Further, you don't believe the configure test works but you want us to leave it in anyway? Jeremy From aleax@aleax.it Thu Jul 18 22:23:50 2002 From: aleax@aleax.it (Alex Martelli) Date: Thu, 18 Jul 2002 23:23:50 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207181918.g6IJIcW22539@odiug.zope.com> References: <200207172332.g6HNWMp23835@oma.cosc.canterbury.ac.nz> <200207181918.g6IJIcW22539@odiug.zope.com> Message-ID: On Thursday 18 July 2002 09:18 pm, Guido van Rossum wrote: > > > I've just had a thought. Maybe it would be less of a mess > > > if what we are calling "iterators" had been called "streams" > > > > Possibly -- I did use the "streams" name often in the tutorial > > on iterators and generators, it's a very natural term. > > OTOH in C++ and Java, "stream" refers to an open file object (to > emphasize the iteratorish feeling of a file opened for sequential > reading or writing, as opposed to the concept of a file as a > random-access array of bytes on disk). ...and in Unix Sys/V, if I recall correctly, it refered to an allegedly superior way to do things BSD did with sockets (and more). Any nice-looking term will be complicatedly overloaded by now. I think "seborrea" is still free, though (according to some old Dilbert strips, at least). > > Seekable files can be multi-pass, but in the strict sense > > that you can rewind them -- it's still impractical to have > > them produce multiple *independent* iterators (needing > > some sort of in-memory caching). 
> > It would be trivial if you had an object representing the notion of a > file on disk rather than an open file. Each iterator would be > implemented as a separate open file referring to the same filename. For a *read-only* disk file, yes -- at least on Unix-ish systems, you could also get the same effect with dup2 without even needing any filename around (e.g. on an already-unlinked file). Hmmm, I do think win32 has something like dup2 -- my copy of Richter remained with think3 (it was actually theirs:-), and I do little Windows these days so I haven't bought another, but I'm pretty sure half an hour on MSDN would let me find it. Maybe something can be built around this -- the underlying disk file as the container, dup2 or equivalent to make independent iterators/ streams (as long as nobody's writing the file... but that's not too different from iterating on e.g. a list, where an insert or del would mess things up...). But surely not by sticking with stdio. Which leads us back to my "this is rather academic" statement: don't we need to stick with stdio to support existing extensions which use FILE*'s, anyway? Alex From guido@python.org Thu Jul 18 22:28:03 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 18 Jul 2002 17:28:03 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Thu, 18 Jul 2002 23:23:50 +0200." References: <200207172332.g6HNWMp23835@oma.cosc.canterbury.ac.nz> <200207181918.g6IJIcW22539@odiug.zope.com> Message-ID: <200207182128.g6ILS3u04720@odiug.zope.com> > > > > I've just had a thought. Maybe it would be less of a mess > > > > if what we are calling "iterators" had been called "streams" > > > > > > Possibly -- I did use the "streams" name often in the tutorial > > > on iterators and generators, it's a very natural term. 
> > > > OTOH in C++ and Java, "stream" refers to an open file object (to > > emphasize the iteratorish feeling of a file opened for sequential > > reading or writing, as opposed to the concept of a file as a > > random-access array of bytes on disk). > > ...and in Unix Sys/V, if I recall correctly, it refered to an allegedly > superior way to do things BSD did with sockets (and more). Any > nice-looking term will be complicatedly overloaded by now. I > think "seborrea" is still free, though (according to some old Dilbert > strips, at least). Bah. I rather like the idea of using "stream" to denote the future rewritten I/O object, so I don't want to use it for iterators. > Which leads us back to my "this is rather academic" statement: > don't we need to stick with stdio to support existing extensions > which use FILE*'s, anyway? We'll need to support the old style files for a long time. But that doesn't mean we can't invent something new that does't use stdio (or perhaps it uses stdio, just doesn't rely on stdio for various features). --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Thu Jul 18 22:38:59 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 18 Jul 2002 23:38:59 +0200 Subject: [Python-Dev] staticforward References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> <15671.12313.725886.680036@slothrop.zope.com> Message-ID: <3D373573.8070001@lemburg.com> Jeremy Hylton wrote: >>>>>>"MAL" == mal writes: >>>>> > > MAL> The configure script tests whether static forwards work or > MAL> not. If you'd rip out the test as well, then I'd have to add > MAL> those platforms which still have problems manually. 
> > MAL> The problem is: I don't know which platforms these are (because > MAL> configure found these itself). > >> > >> If you think the configure test works, why do you have platform > >> specific ifdefs in your header file? > > MAL> Because it doesn't always work :-) > > Let's make sure I've got this straight: > > You believe there are platforms on which staticforward is necessary, > because you can not have a tentative definition of a static followed > by a definition with an initializer. Yet the configure test of > exactly this behavior succeeds. Yes. The test doesn't seem to catch the case of having arrays being declared as static forward. If you look in configure.in you'll find that the test code only checks whether struct behave well. > Further, you don't believe the > configure test works but you want us to leave it in anyway? I believe that it works in most cases, but not all of them. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jeremy@alum.mit.edu Thu Jul 18 23:02:41 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 18 Jul 2002 18:02:41 -0400 Subject: [Python-Dev] staticforward In-Reply-To: <3D373573.8070001@lemburg.com> References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> <15671.12313.725886.680036@slothrop.zope.com> <3D373573.8070001@lemburg.com> Message-ID: <15671.15105.563068.700997@slothrop.zope.com> >>>>> "MAL" == mal writes: MAL> Yes. 
The test doesn't seem to catch the case of having arrays MAL> being declared as static forward. If you look in configure.in MAL> you'll find that the test code only checks whether struct MAL> behave well. Then you'll be no better off if we leave the test in. I expect you don't actually have a problem. On the off chance that you do, you've already got all the ifdef trickery you need in your own .h file. Jeremy From barry@zope.com Thu Jul 18 23:05:31 2002 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 18 Jul 2002 18:05:31 -0400 Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability References: <200207180043.g6I0hKB25427@pcp02138704pcs.reston01.va.comcast.net> <200207182042.g6IKg2n22947@odiug.zope.com> Message-ID: <15671.15275.429784.303580@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: >> Container-like objects usually support protocol 1. Iterators are >> currently required to support both protocols. The semantics of >> iteration come only from protocol 2; protocol 1 is present to make >> iterators behave like sequences. But the analogy is weak -- unlike >> ordinary sequences, iterators are "sequences" that are destroyed by >> the act of looking at their elements. GvR> (I could do without the last sentence, since this expresses a GvR> value judgement rather than fact -- not a good thing to have GvR> in a PEP's "specification" section.) What about: "...sequences. Note that the act of looking at an iterator's elements mutates the iterator." -Barry From tim@zope.com Thu Jul 18 23:26:47 2002 From: tim@zope.com (Tim Peters) Date: Thu, 18 Jul 2002 18:26:47 -0400 Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability In-Reply-To: <15671.15275.429784.303580@anthem.wooz.org> Message-ID: > What about: > > "...sequences. Note that the act of looking at an iterator's > elements mutates the iterator." That doesn't belong in the spec either -- nothing requires an iterator to have mutable state, let alone to mutate it when next() is called. 
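To make that point concrete, here is a minimal sketch of an iterator whose state never changes when it is advanced. `Forever` is a hypothetical class invented for illustration; the code uses the Python 3 spelling of the protocol, where the method is `__next__` rather than the `next()` method of the 2.2-era protocol under discussion.

```python
class Forever:
    """Hypothetical iterator that returns the same value on every step."""

    def __init__(self, value):
        self.value = value

    def __iter__(self):
        return self

    def __next__(self):
        # Nothing is mutated here: advancing leaves the object unchanged.
        return self.value


ones = Forever(1)
before = dict(vars(ones))                # snapshot of the instance state
taken = [next(ones) for _ in range(3)]   # advance the iterator three times
after = dict(vars(ones))

print(taken)
print(before == after)
```

An object like this satisfies the iterator protocol in full, so wording that says iteration necessarily mutates or destroys the iterator would be too strong for a specification.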
From mal@lemburg.com Thu Jul 18 23:31:48 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 19 Jul 2002 00:31:48 +0200 Subject: [Python-Dev] staticforward References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> <15671.12313.725886.680036@slothrop.zope.com> <3D373573.8070001@lemburg.com> <15671.15105.563068.700997@slothrop.zope.com> Message-ID: <3D3741D4.8020408@lemburg.com> Jeremy Hylton wrote: >>>>>>"MAL" == mal writes: >>>>> > > MAL> Yes. The test doesn't seem to catch the case of having arrays > MAL> being declared as static forward. If you look in configure.in > MAL> you'll find that the test code only checks whether struct > MAL> behave well. > > Then you'll be no better off if we leave the test in. I expect you > don't actually have a problem. On the off chance that you do, you've > already got all the ifdef trickery you need in your own .h file. Except that I don't know on which other platforms I'd have to enable it... and no, I don't want to go through another two years of user feedback to find out ! What are you after here ? Remove the configure.in test as well ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... 
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jeremy@alum.mit.edu Thu Jul 18 23:32:30 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 18 Jul 2002 18:32:30 -0400 Subject: [Python-Dev] staticforward In-Reply-To: <3D3741D4.8020408@lemburg.com> References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> <15671.12313.725886.680036@slothrop.zope.com> <3D373573.8070001@lemburg.com> <15671.15105.563068.700997@slothrop.zope.com> <3D3741D4.8020408@lemburg.com> Message-ID: <15671.16894.185299.672286@slothrop.zope.com> >>>>> "MAL" == mal writes: MAL> What are you after here ? Remove the configure.in test as well MAL> ? It is already gone. And earlier in this thread, we established that it did you no good, right? You only care about compilers that choke on static array decls with later initialization, and the test doesn't catch that. Jeremy From jeremy@alum.mit.edu Thu Jul 18 23:36:46 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 18 Jul 2002 18:36:46 -0400 Subject: [Python-Dev] Re: configure problems porting to Tru64 In-Reply-To: References: <15671.4640.361811.434411@slothrop.zope.com> Message-ID: <15671.17150.922349.270282@slothrop.zope.com> Thanks. This suggestion gets the compile to succeed on Tru64 and does no harm on Linux. I'll check it in and see what happens on the snake farm tonight. There's one more problem with Tru64: cc -o python Modules/python.o libpython2.3.a -lrt -lpthread -lm -threads ld: Unresolved: makedev It looks like Tru64 doesn't have a makedev(). You added the patch that included this a while back. Do you have any idea what we should do on Tru64?
Jeremy From skip@pobox.com Thu Jul 18 23:51:09 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 18 Jul 2002 17:51:09 -0500 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21 In-Reply-To: <3D372E1D.50009@lemburg.com> References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> Message-ID: <15671.18013.841675.41967@localhost.localdomain> >> If you think the configure test works, why do you have platform >> specific ifdefs in your header file? mal> Because it doesn't always work :-) Why not just add the necessary goo to configure so it does work for the various reported cases? Skip From mhammond@skippinet.com.au Fri Jul 19 00:03:38 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Fri, 19 Jul 2002 09:03:38 +1000 Subject: [Python-Dev] Review of build system patch requested In-Reply-To: <200207171418.g6HEIZo00747@odiug.zope.com> Message-ID: > > * Makefile.pre.in has been changed to pass "-DPy_BUILD_CORE" to > the compiler > > when building Python itself and any builtin modules. This flag is > > not passed to extension modules. > > My only concern would be that tools which parse the Makefile (I > believe distutils does this?) should not accidentally pick up the > "-DPy_BUILD_CORE" flag. > > Apart from that I trust your judgement and Neal's test drive. Thanks Guido. I mailed the distutils sig, and Andrew Kuchling replied that my change should be safe. 
Now I need some help checking this baby in! My change touches Makefile.pre.in and configure.in, and requires that both "autoheader" and "autoconf" be run to correctly regenerate output files. How should I do this checkin? Is it necessary for me to perform any additional steps, or is there some magic that allows me to simply check these 2 files in and have everything else work? Thanks, Mark. From jeremy@alum.mit.edu Fri Jul 19 00:05:27 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 18 Jul 2002 19:05:27 -0400 Subject: [Python-Dev] Re: staticforward In-Reply-To: <15671.18013.841675.41967@localhost.localdomain> References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> <15671.18013.841675.41967@localhost.localdomain> Message-ID: <15671.18871.846980.217653@slothrop.zope.com> >>>>> "SM" == Skip Montanaro writes: SM> Why not just add the necessary goo to configure so it does work SM> for the various reported cases? Because there are no first-hand reported cases. The only case that MAL has mentioned is an unnecessary use of staticforward with an array declaration and later initialization in a third-party extension module. There's nothing in the core that needs help from configure. Jeremy From mhammond@skippinet.com.au Fri Jul 19 00:15:46 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Fri, 19 Jul 2002 09:15:46 +1000 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: <034701c22e92$9473dfc0$ced241d5@hagrid> Message-ID: Fredrik: > greg wrote: > > > Someone told me that Pyrex should be generating > > __declspec(dllexport) for the module init func.
> > almost; for portability, it's better to use the DL_EXPORT > provided by Python.h: > > DL_EXPORT(void) > init_module(void) > { > ... > } > > > But someone else says this is only needed if > > you're importing a dll as a library, and that > > it's not needed for Python extensions. FWIW, www.python.org/sf/566100 deprecates DL_IMPORT/DL_EXPORT as it is broken! Once this patch is checked in, the new blessed way to declare your function will be: PyMODINIT_FUNC init_module(void) { ... } This macro will do the right thing in all situations and for all platforms. It even provides the 'extern "C"' if your extension is in a C++ module. The-patch-even-updates-the-doc ly, Mark. From neal@metaslash.com Fri Jul 19 01:49:38 2002 From: neal@metaslash.com (Neal Norwitz) Date: Thu, 18 Jul 2002 20:49:38 -0400 Subject: [Python-Dev] Re: configure problems porting to Tru64 References: <15671.4640.361811.434411@slothrop.zope.com> <15671.17150.922349.270282@slothrop.zope.com> Message-ID: <3D376222.B0ED0D63@metaslash.com> Jeremy Hylton wrote: > > There's one more problem with Tru64: > > cc -o python Modules/python.o libpython2.3.a -lrt -lpthread -lm -threads > ld: > Unresolved: > makedev > > It looks like Tru64 doesn't have a makedev(). You added the patch > that included this a while back. Do you have any idea what we should > do on Tru64? >From a distant memory, makedev is a macro (or may be depending on #define's) and needs the proper header file. I hope my memory is correct, but I don't even trust it. ...maybe I should, there is a makedev macro in sys/types.h on a Compaq Tru64 UNIX V5.1 (Rev. 732) (192.233.54.155) (compaq testdrive box). It looks like _OSF_SOURCE must be defined, possibly other macros. Neal From greg@cosc.canterbury.ac.nz Fri Jul 19 01:48:47 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 19 Jul 2002 12:48:47 +1200 (NZST) Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: Message-ID: <200207190048.g6J0ml904071@oma.cosc.canterbury.ac.nz> Alex Martelli : > Me: > > Then the term "iterator" could have been reserved > > for the special case of an object that provides stream > > access to a random-access collection. > > > Nice touch, except that I keep quibbling on the "random > > access" need -- see my previous msg about sets. Well, substitute the term "non-destructively readable" or "multi-pass capable" or something like that if you prefer. > Seekable files can be multi-pass, but in the strict sense > that you can rewind them -- it's still impractical to have > them produce multiple *independent* iterators (needing > some sort of in-memory caching). Yes, that's the key idea I had in mind. So make it "independent multi-pass capable". :-) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Fri Jul 19 01:52:20 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 19 Jul 2002 12:52:20 +1200 (NZST) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Message-ID: <200207190052.g6J0qKS04080@oma.cosc.canterbury.ac.nz> Alex Martelli : > I suspect read and write would best be kept on separate > interfaces. Yes, obviously you would be allowed to have streams that implemented one or the other or both. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Fri Jul 19 01:55:22 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 19 Jul 2002 12:55:22 +1200 (NZST) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <200207181930.g6IJUfX22643@odiug.zope.com> Message-ID: <200207190055.g6J0tLk04092@oma.cosc.canterbury.ac.nz> > C++ has tried very hard to do this with its istream, ostream and > iostream classes; I believe I heard C++ people say once that it's not > considered a success. Well, everything in C++ seems to end up being way more complicated than it ought to. The Python version would be much simpler, since you wouldn't have to formally spell out all the interface conventions. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From neal@metaslash.com Fri Jul 19 02:04:13 2002 From: neal@metaslash.com (Neal Norwitz) Date: Thu, 18 Jul 2002 21:04:13 -0400 Subject: [Python-Dev] Review of build system patch requested References: Message-ID: <3D37658D.E41060C4@metaslash.com> Mark Hammond wrote: > > > > * Makefile.pre.in has been changed to pass "-DPy_BUILD_CORE" to > > the compiler > > > when building Python itself and any builtin modules. This flag is > > > not passed to extension modules. > > > > My only concern would be that tools which parse the Makefile (I > > believe distutils does this?) should not accidentally pick up the > > "-DPy_BUILD_CORE" flag. > > Thanks Guido. I mailed the distutils sig, and Andrew Kuchling replied that > my change should be safe. > > Now I need some help checking this baby in! My change touches > Makefile.pre.in and configure.in, and require that both "autoheader" and > "autoconf" be run to correctly regenerate output files. 
> > How should I do this checkin? Is it necessary for me to perform any > additional steps, or is there some magic that allows me to simply check > these 2 files in and have everything else work? I regenerated configure and Makefile.pre.in and attached it to the patch. While regenerating I got a warning: autoheader: missing template: _XOPEN_SOURCE It would be good to have someone look over/test the new configure, etc. Neal From greg@cosc.canterbury.ac.nz Fri Jul 19 02:25:53 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 19 Jul 2002 13:25:53 +1200 (NZST) Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: Message-ID: <200207190125.g6J1PrG04203@oma.cosc.canterbury.ac.nz> Tim Peters : > The best thing to do for Windows is ask that Windows users supply > patches. It was using a patch supplied by a Windows user that got me into this mess. He said that the DL_EXPORT macro didn't work for him. But it sounds like using DL_EXPORT is the officially correct thing to do, so I'll do that. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Fri Jul 19 02:40:06 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 19 Jul 2002 13:40:06 +1200 (NZST) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <01KK9VLD2I56A296UI@it.canterbury.ac.nz> Message-ID: <200207190140.g6J1e6U04243@oma.cosc.canterbury.ac.nz> > at least on Unix-ish systems, you > could also get the same effect with dup2 without even needing any > filename around No, you couldn't. dup() or dup2() will give you another file descriptor sharing the same file-position pointer. To get a completely independent access path I think you have to open the file again starting from the pathname. 
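
The point about dup() can be checked from Python itself. What follows is a minimal sketch, assuming a scratch temporary file whose contents and variable names are purely illustrative: a descriptor obtained with os.dup() shares its file position with the original, while a second os.open() on the same pathname gets a position of its own.

```python
import os
import tempfile

# Create a small scratch file to read from (illustrative data only).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abcdefghij")
    path = tmp.name

fd = os.open(path, os.O_RDONLY)
dup_fd = os.dup(fd)                    # shares the file-position pointer with fd
fresh_fd = os.open(path, os.O_RDONLY)  # separate open of the same pathname

first = os.read(fd, 4)            # moves the shared position from 0 to 4
via_dup = os.read(dup_fd, 4)      # continues at 4 because the position is shared
via_fresh = os.read(fresh_fd, 4)  # starts at 0, unaffected by the reads above

for descriptor in (fd, dup_fd, fresh_fd):
    os.close(descriptor)
os.remove(path)
```

Only the descriptor opened from the pathname behaves as an independent access path here; the duplicated one simply carries on where the original left off.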
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tim.one@comcast.net Fri Jul 19 04:52:24 2002 From: tim.one@comcast.net (Tim Peters) Date: Thu, 18 Jul 2002 23:52:24 -0400 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: <200207190125.g6J1PrG04203@oma.cosc.canterbury.ac.nz> Message-ID: [Tim] > The best thing to do for Windows is ask that Windows users supply > patches. [Greg Ewing] > It was using a patch supplied by a Windows user that got > me into this mess. He said that the DL_EXPORT macro > didn't work for him. Sucker . > But it sounds like using DL_EXPORT is the officially > correct thing to do, so I'll do that. Until Mark's patch, yes (see his post in this thread). From tim.one@comcast.net Fri Jul 19 04:54:16 2002 From: tim.one@comcast.net (Tim Peters) Date: Thu, 18 Jul 2002 23:54:16 -0400 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: Message-ID: [Mark Hammond] > FWIW, www.python.org/sf/566100 deprecates DL_IMPORT/DL_EXPORT as it is > broken! Once this patch is checked in, the new blessed way to > declare your function will be: > > PyMODINIT_FUNC init_module(void) > { > ... > } > > This macro will do the right thing in all situations and for all > platforms. > It even provides the 'extern "C"' if your extension is in a C++ module. > > The-patch-even-updates-the-doc ly, This patch is a Good Thing, and I demand that everyone show you more appreciation for it. 
for-my-next-act-i'll-command-the-tide-to-retreat-ly y'rs - tim From guido@python.org Fri Jul 19 05:24:13 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 00:24:13 -0400 Subject: [Python-Dev] Review of build system patch requested In-Reply-To: Your message of "Fri, 19 Jul 2002 09:03:38 +1000." References: Message-ID: <200207190424.g6J4ODA08239@pcp02138704pcs.reston01.va.comcast.net> > Now I need some help checking this baby in! My change touches > Makefile.pre.in and configure.in, and require that both "autoheader" and > "autoconf" be run to correctly regenerate output files. > > How should I do this checkin? Is it necessary for me to perform any > additional steps, or is there some magic that allows me to simply check > these 2 files in and have everything else work? You need to check in the files that result from running these two; I believe that's configure and pyconfig.h.in. Note that we require just about the latest and greatest autoconf. If you screw up MvL will correct you. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From greg@cosc.canterbury.ac.nz Fri Jul 19 05:50:03 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 19 Jul 2002 16:50:03 +1200 (NZST) Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: Message-ID: <200207190450.g6J4o3w05817@oma.cosc.canterbury.ac.nz> > > But it sounds like using DL_EXPORT is the officially > > correct thing to do, so I'll do that. > > Until Mark's patch, yes (see his post in this thread). Yeah, but I'm not going to worry about that until it becomes part of a regular release. Thanks, Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+

From aleax@aleax.it Fri Jul 19 07:16:34 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 19 Jul 2002 08:16:34 +0200
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To:
References:
Message-ID:

On Friday 19 July 2002 12:26 am, Tim Peters wrote:
> > What about:
> >
> > "...sequences. Note that the act of looking at an iterator's
> > elements mutates the iterator."
>
> That doesn't belong in the spec either -- nothing requires an iterator to
> have mutable state, let alone to mutate it when next() is called.

Right, for unbounded iterators returning constant values, such as:

    class Ones:
        def __iter__(self): return self
        def next(self): return 1

However, such "exceptions that prove the rule" are rare enough that I
wouldn't consider their existence as forbidding us from saying
_anything_ about state mutation. I _would_ similarly say that x[y]=z
normally mutates x, even though
"def __setitem__(self, key, value): pass" is quite legal. Inserting an
adverb such as "generally" or "usually" should suffice to make even the
most grizzled sea lawyer happy while keeping the information in.

Alex

From mal@lemburg.com Fri Jul 19 09:31:50 2002
From: mal@lemburg.com (M.-A.
 Lemburg)
Date: Fri, 19 Jul 2002 10:31:50 +0200
Subject: [Python-Dev] staticforward
References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> <15671.12313.725886.680036@slothrop.zope.com> <3D373573.8070001@lemburg.com> <15671.15105.563068.700997@slothrop.zope.com> <3D3741D4.8020408@lemburg.com> <15671.16894.185299.672286@slothrop.zope.com>
Message-ID: <3D37CE76.4020803@lemburg.com>

Jeremy Hylton wrote:
>>>>>>"MAL" == mal writes:
>>>>>
>
> MAL> What are you after here ? Remove the configure.in test as well
> MAL> ?
>
> It is already gone. And earlier in this thread, we established that
> it did you no good, right?

No, and I think I was clear about the fact that I don't want this
to be removed.

> You only care about compilers that choke
> on static array decls with later initialization, and the test doesn't
> catch that.

The test tries to catch a general problem in some compilers: that
static forward declarations cause compile time errors. However, it
only tests this for structs, not arrays and functions. So not all
problems related to static forward declarations are caught. That's
why I had to add support for this to the header file I'm using.

As a result, the test should be extended to also check for the array
case and the function case, so that all relevant static forward
declaration bugs in the compiler trigger the #define of
BAD_STATIC_FORWARD since that's what the symbol is all about.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From mal@lemburg.com Fri Jul 19 09:44:17 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jul 2002 10:44:17 +0200
Subject: [Python-Dev] Incompatible changes to xmlrpclib
References: <3D240FF2.3060708@lemburg.com> <3D2F3F06.1060800@lemburg.com>
Message-ID: <3D37D161.5@lemburg.com>

> Any news on this one ?

If no one objects, I'd like to restore the old interface.

>> I noticed yesterday that the xmlrpclib.py version in CVS
>> is incompatible with the version in Python 2.2: all the
>> .dump_XXX() interfaces changed and now include a third
>> argument.
>>
>> Since the Marshaller can be subclassed, this breaks all
>> existing application space subclasses extending or changing
>> the default xmlrpclib behaviour.
>>
>> I'd opt for moving back to the previous style of calling the
>> write method via self.write.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From martin@v.loewis.de Fri Jul 19 08:40:22 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 19 Jul 2002 09:40:22 +0200
Subject: [Python-Dev] Re: configure problems porting to Tru64
In-Reply-To: <15671.17150.922349.270282@slothrop.zope.com>
References: <15671.4640.361811.434411@slothrop.zope.com> <15671.17150.922349.270282@slothrop.zope.com>
Message-ID:

jeremy@alum.mit.edu (Jeremy Hylton) writes:

> It looks like Tru64 doesn't have a makedev(). You added the patch
> that included this a while back. Do you have any idea what we should
> do on Tru64?

Neal says you need to define _OSF_SOURCE, but it would be better if we
could do without. If not, we should both define _OSF_SOURCE (perhaps
only on OSF), and add an autoconf test for makedev.

Regards,
Martin

From fredrik@pythonware.com Fri Jul 19 10:31:56 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 19 Jul 2002 11:31:56 +0200
Subject: [Python-Dev] Incompatible changes to xmlrpclib
References: <3D240FF2.3060708@lemburg.com> <3D2F3F06.1060800@lemburg.com> <3D37D161.5@lemburg.com>
Message-ID: <003701c22f07$21945140$0900a8c0@spiff>

mal wrote:

> > Any news on this one ?
>
> If no one objects, I'd like to restore the old interface.

the dump methods are an internal implementation detail, and are
only accessed through an internal dispatcher table. even if you
override them, the marshaller won't use your new methods.

so what exactly is your use case?

(and whatever you did to make that use case work, how do I stop
you from doing the same thing with some other internal part of the
standard library? ;-)

From mal@lemburg.com Fri Jul 19 10:46:18 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jul 2002 11:46:18 +0200
Subject: [Python-Dev] PEP: Support for System Upgrades
Message-ID: <3D37DFEA.9070506@lemburg.com>

PEP: 0???
Title: Support for System Upgrades
Version: $Revision: 0.0 $
Author: mal@lemburg.com (Marc-André Lemburg)
Status: Draft
Type: Standards Track
Python-Version: 2.3
Created: 19-Jul-2001
Post-History:

Abstract

    This PEP proposes strategies to allow the Python standard library
    to be upgraded in parts without having to reinstall the complete
    distribution or having to wait for a new patch level release.

Problem

    Python currently does not allow overriding modules or packages in
    the standard library by default. Even though this is possible by
    defining a PYTHONPATH environment variable (the paths defined in
    this variable are prepended to the Python standard library path),
    there is no standard way of achieving this without changing the
    configuration.

    Since Python's standard library is starting to host packages which
    are also available separately, e.g.
    the distutils, email and PyXML packages, which can also be
    installed independently of the Python distribution, it is
    desirable to have an option to upgrade these packages without
    having to wait for a new patch level release of the Python
    interpreter to bring along the changes.

Proposed Solutions

    This PEP proposes two different but not necessarily conflicting
    solutions:

    1. Adding a new standard search path to sys.path:
       $stdlibpath/system-packages just before the $stdlibpath
       entry. This complements the already existing entry for site
       add-ons $stdlibpath/site-packages which is appended to the
       sys.path at interpreter startup time.

       To make use of this new standard location, distutils will need
       to grow support for installing certain packages in
       $stdlibpath/system-packages rather than the standard location
       for third-party packages $stdlibpath/site-packages.

    2. Tweaking distutils to install directly into $stdlibpath for the
       system upgrades rather than into $stdlibpath/site-packages.

    The first solution has a few advantages over the second:

    * upgrades can be easily identified (just look in
      $stdlibpath/system-packages)

    * upgrades can be deinstalled without affecting the rest
      of the interpreter installation

    * modules can be virtually removed from packages; this is
      due to the way Python imports packages: once it finds the
      top-level package directory it stays in this directory for
      all subsequent package submodule imports

    * the approach has an overall much cleaner design than the
      hackish install on top of an existing installation approach

    The only advantages of the second approach are that the Python
    interpreter does not have to be changed and that it works with
    older Python versions.

    Both solutions require changes to distutils. These changes can
    also be implemented by package authors, but it would be better to
    define a standard way of switching on the proposed behaviour.
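
    The search-path ordering that Solution 1 describes can be modelled
    in a few lines. This is only a sketch under stated assumptions:
    the helper name with_system_packages and the directory layout are
    hypothetical, and the proposal itself would have the interpreter
    set up the entry at startup rather than have user code do it.

```python
import os

def with_system_packages(search_path, stdlib_dir):
    """Return a copy of search_path with <stdlib_dir>/system-packages
    placed just before the stdlib_dir entry (hypothetical helper)."""
    system_dir = os.path.join(stdlib_dir, "system-packages")
    result = list(search_path)
    if system_dir in result:
        return result                  # already present, nothing to do
    if stdlib_dir in result:
        index = result.index(stdlib_dir)
    else:
        index = len(result)            # no stdlib entry found: append at the end
    result.insert(index, system_dir)
    return result

# Hypothetical layout, for illustration only.
stdlib = os.path.join("usr", "lib", "python2.3")
path = ["", stdlib, os.path.join(stdlib, "site-packages")]
new_path = with_system_packages(path, stdlib)
```

    With this ordering a package found in system-packages shadows the
    copy shipped in the standard library, while site-packages keeps
    its place after it.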

Scope

    Solution 1: Python 2.3 and up
    Solution 2: all Python versions supported by distutils

Credits

    None

References

    None

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From mal@lemburg.com Fri Jul 19 11:00:42 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jul 2002 12:00:42 +0200
Subject: [Python-Dev] Incompatible changes to xmlrpclib
References: <3D240FF2.3060708@lemburg.com> <3D2F3F06.1060800@lemburg.com> <3D37D161.5@lemburg.com> <003701c22f07$21945140$0900a8c0@spiff>
Message-ID: <3D37E34A.9050207@lemburg.com>

Fredrik Lundh wrote:
> mal wrote:
>
>
>>>Any news on this one ?
>>
>>If no one objects, I'd like to restore the old interface.
>
> the dump methods are an internal implementation detail, and are
> only accessed through an internal dispatcher table. even if you
> override them, the marshaller won't use your new methods.

If I subclass the Marshaller and Unmarshaller and then use the
subclasses, it would :-)

> so what exactly is your use case?

I needed to adapt the type mapping in xmlrpclib a bit to better
fit our needs.
This is done by adding a few more methods to the
Marshaller and Unmarshaller (it's a hack, but the module doesn't
allow any other method, AFAIK):

    def install_xmlrpclib_addons(xmlrpclib):

        m = xmlrpclib.Marshaller
        m.dump_datetime = _dump_datetime
        m.dispatch[DateTime.DateTimeType] = m.dump_datetime
        m.dump_buffer = _dump_buffer
        m.dispatch[types.BufferType] = m.dump_buffer
        m.dump_int = _dump_int
        m.dispatch[types.IntType] = m.dump_int

        u = xmlrpclib.Unmarshaller
        u.end_dateTime = _load_datetime
        u.dispatch['dateTime.iso8601'] = u.end_dateTime
        u.end_base64 = _load_buffer
        u.dispatch['base64'] = u.end_base64
        u.end_boolean = _load_boolean
        u.dispatch['boolean'] = u.end_boolean

> (and whatever you did to make that use case work, how do I stop
> you from doing the same thing with some other internal part of the
> standard library? ;-)

It would be nice to open up the module a little more so that
hacks like the one above are not necessary, e.g. by making
the used classes parameters to the loads/dumps functions.

Then you'd run into the same problem, though, since now subclasses
would need to access the dump/load methods.

PS: Standard support for None would be nice to have in xmlrpclib...
at least for the Marshalling side, since this is a very common
problem with xmlrpc.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From jmiller@stsci.edu Fri Jul 19 12:29:37 2002
From: jmiller@stsci.edu (Todd Miller)
Date: Fri, 19 Jul 2002 07:29:37 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
Message-ID: <3D37F821.8010908@stsci.edu>

This is a re-post in plain text of a message I sent yesterday in HTML.
Anyone not "consumed with interest" in the buffer object should
probably skip it.
Scott Gilbert wrote: >--- Todd Miller wrote: > >>>I don't understand what you say, but I believe you. >>> >>I meant we call PyBuffer_FromReadWriteObject and the resulting buffer >>lives longer than the extension function call that created it. I have >>heard that it is possible for the original object to "move" leaving the >>buffer object pointer to it dangling. >> > >Yes. The PyBufferObject grabs the pointer from the PyBufferProcs >supporting object when the PyBufferObject is created. If the PyBufferProcs >supporting object reallocates the memory (possibly from a resize) the > Thanks for the example. > >PyBufferObject can be left with a bad pointer. This is easily possible if >you try to use the array module arrays as a buffer. > This is good to know. > > >I've submitted a patch to fix this particular problem (among others), but >there are still enough things that the buffer object can't do that >something new is needed. > I understand. I saw your patches and they sounded good to me. > >>> >>>>>Maybe instead of the buffer() function/type, there should be a way to >>>>>allocate raw memory? >>>>> >>>>Yes. It would also be nice to be able to: >>>> >>>>1. Know (at the python level) that a type supports the buffer C-API. >>>> >>>Good idea. (I guess right now you can see if calling buffer() with an >>>instance as argument works. :-) >>> >>>>2. Copy bytes from one buffer to another (writeable buffer). >>>> > >And the copy operations shouldn't create any large temporaries: > I agree with this completely. I could summarize my opinion by saying that while I regard the current buffering system as pretty complete, the buffer object places emphasis on the wrong behavior. In terms of modelling memory regions, strings are the wrong way to go. > > > buf1 = memory(50000) > buf2 = memory(50000) > # no 10K temporary should be created in the next line > buf1[10000:20000] = buf2[30000:40000] > >The current buffer object could be used like this, but it would create a >temporary string. 
>

Looking at buffering most of this week, the fact that mmap slicing also
returns strings is one justification I've found for having a buffer
object, i.e., mmap slicing is not a substitute for the buffer object.
The buffer object makes it possible to partition a mmap or any
bufferable object into pseudo-independent, possibly writable, pieces.

One justification to have a new buffer object is pickling (one of
Scott's posts alerted me to this). I think the behavior we want for
numarray is to be able to pickle a view of a bufferable object more or
less like a string containing the buffer image, and to unpickle it as a
memory object. The prospect of adding pickling support makes me wonder
if separating the allocator and view aspects of the buffer object is a
good idea; I thought it was, but now I wonder.

>
>So getting an efficient copy operation seems to require that slices just
>create new "views" to the same memory.
>

Other justifications for a new buffer object might be:

1. The ability to partition any bufferable object into regions which
can be passed around. These regions would themselves be buffers.

2. The ability to efficiently pickle a view of any bufferable object.

>
>>>Maybe you would like to work on a requirements gathering for a memory
>>>object
>>>
>>Sure. I'd be willing to poll comp.lang.python (python-list?) and
>>collate the results of any discussion that ensues. Is that what you had
>>in mind?
>>
>
>
>In the PEP that I'm drafting, I've been calling the new object "bytes"
>(since it is just a simple array of bytes). Now that you guys are
>referring to it as the "memory object", should I change the name? Doesn't
>really matter, but it might avoid confusion to know we're all talking about
>the same thing.
>

Calling this a memory type sounds the best to me. The question I have
not resolved for myself is whether there should be one type which "does
it all" or two types, a memory allocator and a bufferable object
manipulator.
>
>
>
>__________________________________________________
>Do You Yahoo!?
>Yahoo! Autos - Get free new car price quotes
>http://autos.yahoo.com
>

From ping@zesty.ca Fri Jul 19 12:44:09 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Fri, 19 Jul 2002 04:44:09 -0700 (PDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207181422.g6IEMBr14526@odiug.zope.com>
Message-ID:

On Thu, 18 Jul 2002, Guido van Rossum wrote:
> First of all, I'm not sure what exactly the issue is with destructive
> for-loops.

It's just not the way i expect for-loops to work. Perhaps we would
need to survey people for objective data, but i feel that most people
would be surprised if

    for x in y: print x
    for x in y: print x

did not print the same thing twice, or if

    if x in y: print 'got it'
    if x in y: print 'got it'

did not do the same thing twice. I realize this is my own opinion,
but it's a fairly strong impression i have.

Even if it's okay for for-loops to destroy their arguments, i still
think it sets up a bad situation: we may end up with functions
manipulating sequence-like things all over, but it becomes unclear
whether they destroy their arguments or not. It becomes possible
to write a function which sometimes destroys its argument and
sometimes doesn't. Bugs get deeper and harder to find.

I believe this is where the biggest debate lies: whether "for"
should be non-destructive. I realize we are currently on the
other side of the fence, but i foresee enough potential pain that
i would like you to consider the value of keeping "for" loops
non-destructive.

> Maybe the for-loop is a red herring? Calling next() on an
> iterator may or may not be destructive on the underlying "sequence" --
> if it is a generator, for example, I would call it destructive.

Well, for a generator, there is no underlying sequence.

    while 1: print next(gen)

makes it clear that there is no sequence, but

    for x in gen: print x

seems to give me the impression that there is.
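
The difference under discussion is easy to observe. A minimal sketch, written for a current Python 3 interpreter (an assumption, since the thread itself uses Python 2 print statements); the names count_up and collect are illustrative only:

```python
def count_up(n):
    """Generator yielding 0 .. n-1; there is no stored sequence behind it."""
    for i in range(n):
        yield i

def collect(iterable):
    """Run a plain for-loop over iterable and return what it produced."""
    seen = []
    for x in iterable:
        seen.append(x)
    return seen

# Looping over a list twice gives the same elements both times.
items = [0, 1, 2]
list_first, list_second = collect(items), collect(items)

# Looping over a generator twice does not: the first loop uses it up.
gen = count_up(3)
gen_first, gen_second = collect(gen), collect(gen)
```

The same for-loop is repeatable over the list and one-shot over the generator, and nothing in the loop syntax distinguishes the two cases.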
> Perhaps you're trying to assign properties to the iterator abstraction > that aren't really there? I'm assigning properties to "for" that you aren't. I think they are useful properties, though, and worth considering. I don't think i'm assigning properties to the iterator abstraction; i expect iterators to destroy themselves. But the introduction of iterators, in the way they are now, breaks this property of "for" loops that i think used to hold almost all the time in Python, and that i think holds all the time in almost all other languages. > Next, I'm not sure how renaming next() to __next__() would affect the > situation w.r.t. the destructivity of for-loops. Or were you talking > about some other migration? The connection is indirect. The renaming is related to: (a) making __next__() a real, honest-to-goodness protocol independent of __iter__; and (b) getting rid of __iter__ on iterators. It's the presence of __iter__ on iterators that breaks the non-destructive-for property. I think the renaming of next() to __next__() is a good idea in any case. It is distant enough from the other issues that it can be done independently of any decisions about __iter__. -- ?!ng From ping@zesty.ca Fri Jul 19 12:28:32 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Fri, 19 Jul 2002 04:28:32 -0700 (PDT) Subject: [Python-Dev] The iterator story Message-ID: Here is a summary of the whole iterator picture as i currently see it. This is necessarily subjective, but i will try to be precise so that it's clear where i'm making a value judgement and where i'm trying to state fact, and so we can pinpoint areas where we agree and disagree. In the subjective sections, i have marked with [@] the places where i solicit agreement or disagreement. I would like to know your opinions on the issues listed below, and on the places marked [@]. Definitions (objective) ----------------------- Container: a thing that provides non-destructive access to a varying number of other things. 
Why "non-destructive"? Because i don't expect that merely looking at the contents will cause a container to be altered. For example, i expect to be able to look inside a container, see that there are five elements; leave it alone for a while, come back to it later and observe once again that there are five elements. Consequently, a file object is not a container in general. Given a file object, you cannot look at it to see if it contains an "A", and then later look at it once again to see if it contains an "A" and get the same result. If you could seek, then you could do this, but not all files support seeking. Even if you could seek, the act of reading the file would still alter the file object. The file object provides no way of getting at the contents without mutating itself. According to my definition, it's fine for a container to have ways of mutating itself; but there has to be *some* way of getting the contents without mutating the container, or it just ain't a container to me. A file object is better described as a stream. Hypothetically one could create an interface to seekable files that offered some non-mutating read operations; this would cause the file to look more like an array of bytes, and i would find it appropriate to call that interface a container. Iterator: a thing that you can poke (i.e. send a no-argument message), where each time you poke it, it either yields something or announces that it is exhausted. For an iterator to mutate itself every time you poke it is not part of my definition. But the only non-mutating iterator would be an iterator that returns the same thing forever, or an iterator that is always exhausted. So most iterators usually mutate. Some iterators are associated with a container, but not all. There can be many kinds of iterators associated with a container. 
The most natural kind is one that yields the elements of the container, one by one, mutating itself each time it is poked, until it has yielded all of the elements of the container and announces exhaustion. A Container's Natural Iterator: an iterator that yields the elements of the container, one by one, in the order that makes the most sense for the container. If the container has a finite size n, then the iterator can be poked exactly n times, and thereafter it is exhausted. Issues (objective) ------------------ I alluded to a set of issues in an earlier message, and i'll begin there, by defining what i meant more precisely. The Destructive-For Issue: In most languages i can think of, and in Python for the most part, a statement such as "for x in y: print x" is a non-destructive operation on y. Repeating "for x in y: print x" will produce exactly the same results once more. For pre-iterator versions of Python, this fails to be true only if y's __getitem__ method mutates y. The introduction of iterators has caused this to now be untrue when y is any iterator. The issue is, should "for" be non-destructive? The Destructive-In Issue: Notice that the iteration that takes place for the "in" operator is implemented in the same way as "for". So if "for" destroys its second operand, so will "in". The issue is, should "in" be non-destructive? (Similar issues exist for built-ins that iterate, like list().) The __iter__-On-Iterators Issue: Some people have mentioned that the presence of an __iter__() method is a way of signifying that an object supports the iterator protocol. It has been said that this is necessary because the presence of a "next()" method is not sufficiently distinguishing. Some have said that __iter__() is a completely distinct protocol from the iterator protocol. The issue is, what is __iter__() really for? And secondarily, if it is not part of the iterator protocol, then should we require __iter__() on iterators, and why? 
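
The Destructive-In and __iter__-On-Iterators issues can both be seen in a few lines. This is a minimal sketch for a current Python 3 interpreter (an assumption; the variable names are illustrative), using a plain list as the container:

```python
items = ["a", "b", "c"]

# A container hands out a fresh iterator on every iter() call ...
container_is_fresh = iter(items) is not iter(items)

# ... while an iterator's __iter__ returns the iterator itself.
it = iter(items)
iterator_is_self = iter(it) is it

# "in" on the container can be repeated with the same answer.
in_list = ("b" in items, "b" in items)

# "in" on the iterator consumes elements up to and including the match,
# so the second test runs through what is left and no longer finds "b".
in_iter = ("b" in it, "b" in it)
leftover = list(it)
```

Because iter(it) hands back it itself, any construct that accepts an iterator wherever it accepts a container will consume that iterator as a side effect.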
The __next__-Naming Issue: The iteration method is currently called "next()". Previous candidates for the name of this method were "next", "__next__", and "__call__". After some previous debate, it was pronounced to be "next()". There are concerns that "next()" might collide with existing methods named "next()". There is also a concern that "next()" is inconsistent because it is the only type-slot-method that does not have a __special__ name. The issue is, should it be called "next" or "__next__"? My Positions (subjective) ------------------------- I believe that "for" and "in" and list() should be non-destructive. I believe that __iter__() should not be required on iterators. I believe that __next__() is a better name than next(). Destructive-For, Destructive-In: I think "for" should be non-destructive because that's the way it has almost always behaved, and that's the way it behaves in any other language [@] i can think of. For a container's __getitem__ method to mutate the container is, in my opinion, bad behaviour. In pre-iterator Python, we needed some way to allow the convenience of "for" on user-implemented containers. So "for" supported a special protocol where it would call __getitem__ with increasing integers starting from 0 until it hit an IndexError. This protocol works great for sequence-like containers that were indexable by integers. But other containers had to be hacked somewhat to make them fit. For example, there was no good way to do "for" over a dictionary-like container. If you attempted "for" over a user-implemented dictionary, you got a really weird "KeyError: 0", which only made sense if you understood that the "for" loop was attempting __getitem__(0). (Hey! I just noticed that from UserDict import UserDict for k in UserDict(): print k still produces "KeyError: 0"! This oughta be fixed...) If you wanted to support "for" on something else, sometimes you would have to make __getitem__ mutate the object, like it does in the fileinput module. 
But then the user has to know that this object is a special case: "for" only works the first time.

When iterators were introduced, i believed they were supposed to solve this problem. Currently, they don't.

Currently, "in" can even be destructive. This is more serious. While one could argue that it's not so strange for

    for x in y: ...

to alter y (even though i do think it is strange), i believe just about anyone would find it very counterintuitive for

    if x in y:

to alter y.

__iter__-On-Iterators:

I believe __iter__ is not a type flag. As i argued previously, i think that looking for the presence of methods that don't actually implement a protocol is a poor way to check for protocol support. And as things stand, the presence of __iter__ doesn't even work as a type flag.

There are objects with __iter__ that are not iterators (like most containers). And there are objects without __iter__ that work as iterators. I know you can legislate the latter away, but i think such legislation would amount to fighting the programmers -- and it is infeasible to enforce the presence of __iter__ in practice.

Based on Guido's positive response, in which he asked me to make an addition to the PEP, i believe Guido agrees with me that __iter__ is distinct from the protocol of an iterator. This surprised me because it runs counter to the philosophy previously expressed in the PEP.

Now suppose we agree that __iter__ and next are distinct protocols. Then why require iterators to support both? The only reason we would want __iter__ on iterators is so that we can use "for" with an iterator as the second operand.

I have just argued, above, that it's *not* a good idea for "for" and "in" to be destructive. Since most iterators self-mutate, it follows that it's not advisable to use an iterator directly as the second operand of a "for" or "in".

I realize this seems radical! This may be the most controversial point i have made.
But if you accept that "in" should not destroy its second argument, the conclusion is unavoidable.

__next__-Naming:

I think the potential for collision, though small, is significant, and this makes "__next__" a better choice than "next". A built-in function next() should be introduced; this function would call the tp_iternext slot, and for instance objects tp_iternext would call the __next__ method implemented in Python.

The connection between this issue and the __iter__ issue is that, if next() were renamed to __next__(), the argument that __iter__ is needed as a flag would also go away.

The Current PEP (objective)
---------------------------

The current PEP takes the position that "for" and "in" can be destructive; that __iter__() and next() represent two distinct protocols, yet iterators are required to support both; and that the method on iterators is named "next()".

My Ideal Protocol (subjective)
------------------------------

So by now the biggest question/objection you probably have is "if i can't use an iterator with 'for', then how can i use it?"

The answer is that "for" is a great way to iterate over things; it's just that it iterates over containers and i want to preserve that. We need a different way to iterate over iterators.

In my ideal world, we would allow a new form of "for", such as

    for line from file:
        print line

The use of "from" instead of "in" would imply that we were (destructively) pulling things out of the iterator, and would remove any possible parallel to the test "x in y", which should rightly remain non-destructive.

Here's the whole deal:

- Iterators provide just one method, __next__().
- The built-in next() calls tp_iternext. For instances, tp_iternext calls __next__.
- Objects wanting to be iterated over provide just one method, __iter__(). Some of these are containers, but not all.
- The built-in iter(foo) calls tp_iter. For instances, tp_iter calls __iter__.
- "for x in y" gets iter(y) and uses it as an iterator.
- "for x from y" just uses y as the iterator.

That's it.

Benefits:

- We have a nice clean division between containers and iterators.
- When you see "for x in y" you know that y is a container.
- When you see "for x from y" you know that y is an iterator.
- "for x in y" never destroys y.
- "if x in y" never destroys y.
- If you have an object that is container-like, you can add an __iter__ method that gives its natural iterator. If you want, you can supply more iterators that do different things; no problem. No one using your object is confused about whether it mutates.
- If you have an object that is cursor-like or stream-like, you can safely make it into an iterator by adding __next__. No one using your object is confused about whether it mutates.

Other notes:

- Iterator algebra still works fine, and is still easy to write:

      def alternate(it):
          while 1:
              yield next(it)
              next(it)

- The file problem has a consistent solution. Instead of writing "for line in file" you write

      for line from file:
          print line

  Being forced to write "from" signals to you that the file is eaten up. There is no expectation that "for line from file" will work again.

  The best would be a convenience function "readlines", to make this even clearer:

      for line in readlines("foo.txt"):
          print line

  Now you can do this as many times as you want, and there is no possibility of confusion; there is no file object on which to call methods that might mess up the reading of lines.

My Not-So-Ideal Protocol
------------------------

All right. So new syntax may be hard to swallow. An alternative is to introduce an adapter that turns an iterator into something that "for" will accept -- that is, the opposite of iter().

- The built-in seq(it) returns x such that iter(x) yields it.

Then instead of writing

    for x from it:

you would write

    for x in seq(it):

and the rest would be the same. The use of "seq" here is what would flag the fact that "it" will be destroyed.
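To make the not-so-ideal protocol concrete, here is a rough sketch of how seq() and an iterator with only __next__ could look in present-day Python, where next() is a built-in. No built-in called seq exists; the seq class and the Countdown iterator below are hypothetical illustrations. The alternate() generator catches StopIteration explicitly, because current Python turns a StopIteration that escapes a generator body into a RuntimeError.

```python
class Countdown:
    """Hypothetical iterator in the proposed style: __next__ only, no __iter__."""

    def __init__(self, n):
        self.n = n

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


class seq:
    """Hypothetical adapter: iter(seq(it)) hands back the wrapped iterator."""

    def __init__(self, it):
        self._it = it

    def __iter__(self):
        return self._it


def alternate(it):
    """Yield every other element pulled from the iterator `it`."""
    while True:
        try:
            yield next(it)
            next(it)  # discard the element in between
        except StopIteration:
            return


# A bare iterator is rejected by iter(), and therefore by "for ... in".
try:
    iter(Countdown(3))
    plain_for_accepted = True
except TypeError:
    plain_for_accepted = False

# Wrapping it in seq() makes the destructive use explicit.
wrapped = [x for x in seq(Countdown(3))]
picked = [x for x in seq(alternate(Countdown(5)))]

print(plain_for_accepted, wrapped, picked)
```

The point of the adapter is only that the call site says seq(...) whenever the thing being looped over will be used up.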
-- ?!ng From jeremy@alum.mit.edu Fri Jul 19 13:20:20 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Fri, 19 Jul 2002 08:20:20 -0400 Subject: [Python-Dev] staticforward In-Reply-To: <3D37CE76.4020803@lemburg.com> References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> <15671.12313.725886.680036@slothrop.zope.com> <3D373573.8070001@lemburg.com> <15671.15105.563068.700997@slothrop.zope.com> <3D3741D4.8020408@lemburg.com> <15671.16894.185299.672286@slothrop.zope.com> <3D37CE76.4020803@lemburg.com> Message-ID: <15672.1028.161004.894848@slothrop.zope.com> >>>>> "MAL" == mal writes: MAL> What are you after here ? Remove the configure.in test as well MAL> ? >> >> It is already gone. And earlier in this thread, we established >> that it did you no good, right? MAL> No and I think I was clear about the fact that I don't want MAL> this to be removed. It's clear you don't want it to be removed, but not entirely clear why. We've got a whole alpha and beta cycle to see if anyone finds an actual compiler problem with the Python core. During that time, you can see if the problem occurs for the header file you mentioned. (The one where you use it for an array even though you could rearrange the code to eliminate it.) >> You only care about compilers that choke on static array decls >> with later initialization, and the test doesn't catch that. MAL> The test tries to catch a general problem in some compilers: No one has produced any evidence that there are still compilers that have this problem. MAL> that static forward declarations cause compile time MAL> errors. However, it only tests this for structs, not arrays and MAL> functions. 
So not all problems related to static forward MAL> declarations are catched. That's why I had to add support for MAL> this to the header file I'm using. The Python core has no need for tests on arrays or functions. (Indeed, staticforward was not intended for function prototypes.) Jeremy From neal@metaslash.com Fri Jul 19 13:42:58 2002 From: neal@metaslash.com (Neal Norwitz) Date: Fri, 19 Jul 2002 08:42:58 -0400 Subject: [Python-Dev] Re: configure problems porting to Tru64 References: <15671.4640.361811.434411@slothrop.zope.com> <15671.17150.922349.270282@slothrop.zope.com> Message-ID: <3D380952.CF927B10@metaslash.com> "Martin v. Loewis" wrote: > > jeremy@alum.mit.edu (Jeremy Hylton) writes: > > > It looks like Tru64 doesn't have a makedev(). You added the patch > > that included this a while back. Do you have any idea what we should > > do on Tru64? > > Neal says you need to define _OSF_SOURCE, but it would better if we > could do without. If not, we should both define _OSF_SOURCE (perhaps > only on OSF), and add an autoconf test for makedev. I agree with Martin. It would be best to only define _OSF_SOURCE if absolutely necessary and use autoconf. Neal From guido@python.org Fri Jul 19 13:59:15 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 08:59:15 -0400 Subject: [Python-Dev] staticforward In-Reply-To: Your message of "Fri, 19 Jul 2002 10:31:50 +0200." 
<3D37CE76.4020803@lemburg.com> References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> <15671.12313.725886.680036@slothrop.zope.com> <3D373573.8070001@lemburg.com> <15671.15105.563068.700997@slothrop.zope.com> <3D3741D4.8020408@lemburg.com> <15671.16894.185299.672286@slothrop.zope.com> <3D37CE76.4020803@lemburg.com> Message-ID: <200207191259.g6JCxGp24808@pcp02138704pcs.reston01.va.comcast.net> > The test tries to catch a general problem in some compilers: that > static forward declarations cause compile time errors. However, > it only tests this for structs, not arrays and functions. > So not all problems related to static forward declarations are > catched. That's why I had to add support for this to the > header file I'm using. > > As a result, the test should be extended to also check for the > array case and the function case, so that all relevant static > forward declaration bugs in the compiler trigger the > #define of BAD_STATIC_FORWARD since that's what the symbol > is all about. Sorry, Marc-Andre, this has lasted long enough. Compilers that don't support this are clearly broken according to the ANSI C std. When Python was first released, such broken compilers perhaps had the excuse that it was a tricky issue in the std and that K&R didn't do it that way. That was many years ago. Platforms whose compiler is still broken in this way ought to be extinct, and I have every reason to believe that they are. It's just not worth our while to try to cater for every possible way that compilers used to be broken in the distant past. 
When we spot a real live broken compiler, and there's no better work-around (like rewriting the code), and we care about that platform, and there's no alternative compiler available, we may add some cruft to the code. But there's no point in gathering cruft forever without every once in a while cleaning some things up. I'll gladly put this back in as soon as you have a paying customer who wants to run Python 2.3 on a platform where the compiler is still broken in this way. Until then, it's a non-issue. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 19 13:59:37 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 08:59:37 -0400 Subject: [Python-Dev] Incompatible changes to xmlrpclib In-Reply-To: Your message of "Fri, 19 Jul 2002 10:44:17 +0200." <3D37D161.5@lemburg.com> References: <3D240FF2.3060708@lemburg.com> <3D2F3F06.1060800@lemburg.com> <3D37D161.5@lemburg.com> Message-ID: <200207191259.g6JCxbW24819@pcp02138704pcs.reston01.va.comcast.net> > If noone objects, I'd like to restore the old interface. That's between you & Fredrik Lundh. --Guido van Rossum (home page: http://www.python.org/~guido/) From oren-py-d@hishome.net Fri Jul 19 14:23:51 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 19 Jul 2002 09:23:51 -0400 Subject: [Python-Dev] The iterator story In-Reply-To: References: Message-ID: <20020719132351.GA40829@hishome.net> > The Destructive-For Issue: > > In most languages i can think of, and in Python for the most > part, a statement such as "for x in y: print x" is a > non-destructive operation on y. Repeating "for x in y: print x" > will produce exactly the same results once more. > > For pre-iterator versions of Python, this fails to be true only > if y's __getitem__ method mutates y. The introduction of > iterators has caused this to now be untrue when y is any iterator. 
The most significant example of an object that mutates on __getitem__ in pre-iterator Python is the xreadlines object. Its __getitem__ method increments an internal counter and raises an exception if accessed out of order. This hack may be the 'original sin' - the first widely used destructive for. I just wish the time machine could have picked up your posting when the iteration protocols were designed. Good work. Your questions will require some serious meditation on the relative importance of semantic purity and backward compatibility. Oren From mal@lemburg.com Fri Jul 19 14:41:40 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 19 Jul 2002 15:41:40 +0200 Subject: [Python-Dev] staticforward References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> <15671.12313.725886.680036@slothrop.zope.com> <3D373573.8070001@lemburg.com> <15671.15105.563068.700997@slothrop.zope.com> <3D3741D4.8020408@lemburg.com> <15671.16894.185299.672286@slothrop.zope.com> <3D37CE76.4020803@lemburg.com> <200207191259.g6JCxGp24808@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D381714.7040606@lemburg.com> Guido van Rossum wrote: >>The test tries to catch a general problem in some compilers: that >>static forward declarations cause compile time errors. However, >>it only tests this for structs, not arrays and functions. >>So not all problems related to static forward declarations are >>catched. That's why I had to add support for this to the >>header file I'm using.
>> >>As a result, the test should be extended to also check for the >>array case and the function case, so that all relevant static >>forward declaration bugs in the compiler trigger the >>#define of BAD_STATIC_FORWARD since that's what the symbol >>is all about. > > > Sorry, Marc-Andre, this has lasted long enough. > > Compilers that don't support this are clearly broken according to the > ANSI C std. When Python was first released, such broken compilers > perhaps had the excuse that it was a tricky issue in the std and that > K&R didn't do it that way. That was many years ago. Platforms whose > compiler is still broken in this way ought to be extinct, and I have > every reason to believe that they are. """ Albert Chin-A-Young wrote on 2002-05-04: > > > > The AIX xlc ANSI compiler does not allow forward declaration of > > variables. This leads to a lot of problems with .c files that use > > staticforward (e.g. mxDateTime.c, mxProxy.c, etc.). Any chance of > > fixing these? """ I'm not making this up. > It's just not worth our while to try to cater for every possible way > that compilers used to be broken in the distant past. When we spot a > real live broken compiler, and there's no better work-around (like > rewriting the code), and we care about that platform, and there's no > alternative compiler available, we may add some cruft to the code. This sounds too much like "we == PythonLabs". Is that intended ? > But there's no point in gathering cruft forever without every once in > a while cleaning some things up. > > I'll gladly put this back in as soon as you have a paying customer who > wants to run Python 2.3 on a platform where the compiler is still > broken in this way. Until then, it's a non-issue. Hmm, a few messages ago you confirmed that my usage of staticforward and statichere was corrrect, later on, you say that it's not necessary anymore in the core so it's OK to rip it out. 
I am telling you that there are compilers around which don't get it right for arrays and propose to add a check for those as well -- if only to help extension writers like myself. Never mind, I'll add code to my stuff to emulate the configure.in check using distutils. Still, I find it frustrating that PythonLabs is giving me such a hard time because of 15 lines of code in configure.in. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Fri Jul 19 15:10:19 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 10:10:19 -0400 Subject: [Python-Dev] staticforward In-Reply-To: Your message of "Fri, 19 Jul 2002 15:41:40 +0200." <3D381714.7040606@lemburg.com> References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> <15671.12313.725886.680036@slothrop.zope.com> <3D373573.8070001@lemburg.com> <15671.15105.563068.700997@slothrop.zope.com> <3D3741D4.8020408@lemburg.com> <15671.16894.185299.672286@slothrop.zope.com> <3D37CE76.4020803@lemburg.com> <200207191259.g6JCxGp24808@pcp02138704pcs.reston01.va.comcast.net> <3D381714.7040606@lemburg.com> Message-ID: <200207191410.g6JEAKf25935@pcp02138704pcs.reston01.va.comcast.net> > """ > Albert Chin-A-Young wrote on 2002-05-04: > > > > > > The AIX xlc ANSI compiler does not allow forward declaration of > > > variables. This leads to a lot of problems with .c files that use > > > staticforward (e.g. mxDateTime.c, mxProxy.c, etc.).
Any chance of > > > fixing these? > """ > > I'm not making this up. He doesn't complain about the core. > > It's just not worth our while to try to cater for every possible way > > that compilers used to be broken in the distant past. When we spot a > > real live broken compiler, and there's no better work-around (like > > rewriting the code), and we care about that platform, and there's no > > alternative compiler available, we may add some cruft to the code. > > This sounds too much like "we == PythonLabs". Is that > intended ? I hope this is in general the attitude of most core Python developers. Adding cruft should be frowned upon! Else the code will become unmaintainable over time, and everybody loses. > Hmm, a few messages ago you confirmed that my usage of > staticforward and statichere was corrrect, later on, you say > that it's not necessary anymore in the core so it's OK > to rip it out. I am telling you that there are compilers > around which don't get it right for arrays and propose > to add a check for those as well -- if only to help extenions > writers like myself. You're the only person who seems to be suffering from this. > Nevermind, I'll add code to my stuff to emulate the > configure.in check using distutils. Still, I find > it frustrating that PythonLabs is giving me such a > hard time because of 15 lines of code in configure.in. I find it frustrating that you're not seeing our side. --Guido van Rossum (home page: http://www.python.org/~guido/) From David Abrahams" <20020719132351.GA40829@hishome.net> Message-ID: <0d3001c22f2f$5e2d2320$6501a8c0@boostconsulting.com> From: "Oren Tirosh" > > The Destructive-For Issue: > > > > In most languages i can think of, and in Python for the most > > part, a statement such as "for x in y: print x" is a > > non-destructive operation on y. Repeating "for x in y: print x" > > will produce exactly the same results once more. 
> > > > For pre-iterator versions of Python, this fails to be true only > > if y's __getitem__ method mutates y. The introduction of > > iterators has caused this to now be untrue when y is any iterator. > > The most significant example of an object that mutates on __getitem__ in > pre-iterator Python is the xreadlines object. Its __getitem__ method > increments an internal counter and raises an exception if accessed out of > order. This hack may be the 'original sin' - the first widely used > destructive for. > > I just wish the time machine could have picked up your posting when the > iteration protcols were designed. Good work. Yeah, Ping's article sure went "thunk" when I read it. At the risk of boring everyone, I think I should explain why I started the multipass iterator thread. One of the most important jobs of Boost.Python is the conversion between C++ and Python types (and if you don't give a fig for C++, hang on, because I hope this will be relevant to pure Python also). In order to support wrapping of overloaded C++ functions and member functions, it's important to be able to be able to do this in two steps: 1. Discover whether a Python object is convertible to a given C++ type 2. Perform the conversion The overload resolution mechanism is currently pretty simple-minded: it looks through the overloaded function objects until it can find one for which all the arguments are convertible to the corresponding C++ type, then it converts them and calls the wrapped C++ function. My users really want to be able to define converters which, given any Python iterable/sequence type, can extract a particular C++ container type. In order to do that, we might commonly need to inspect each element of the source object to see that it's convertible to the C++ container's value type. 
It's pretty easy to see that if step 1 destroys the state of an argument, it can foul the whole scheme: even if we store the result somewhere so that step 2 can re-use it, overload resolution might fail for arguments later in the function signature. Then the other overloads will be looking at a different argument object. What we were looking for was a way to quickly reject an overload if the source object was not re-iterable, without modifying it. It sure seems to me that we'd benefit from being able to do the same sort of thing in Pure Python. It's not clear to me that anyone else cares about this, but I hope one day we'll get built-in overloading or multimethod dispatch in Python beyond what's currently offered by the numeric operators. Incidentally, I'm not sure whether PEP 246 provides much help here. If the adaptation protocol only gives us a way to say "is this, or can this be adapted to be a re-iterable sequence", something could easily answer: [ x for x in y ] Which would produce a re-iterable sequence, but might also destroy the source. Of course, I'll say up front I've only skimmed the PEP and might've missed something crucial. -Dave From aahz@pythoncraft.com Fri Jul 19 15:16:58 2002 From: aahz@pythoncraft.com (Aahz) Date: Fri, 19 Jul 2002 10:16:58 -0400 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: References: Message-ID: <20020719141658.GA7919@panix.com> [Mark Hammond's patch -- with docs!] On Thu, Jul 18, 2002, Tim Peters wrote: > > This patch is a Good Thing, and I demand that everyone show you more > appreciation for it. If I still used Windoze for anything, I would. 
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From aleax@aleax.it Fri Jul 19 15:30:41 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 19 Jul 2002 16:30:41 +0200 Subject: [Python-Dev] The iterator story In-Reply-To: <0d3001c22f2f$5e2d2320$6501a8c0@boostconsulting.com> References: <20020719132351.GA40829@hishome.net> <0d3001c22f2f$5e2d2320$6501a8c0@boostconsulting.com> Message-ID: On Friday 19 July 2002 04:15 pm, David Abrahams wrote: ... > Incidentally, I'm not sure whether PEP 246 provides much help here. If the > adaptation protocol only gives us a way to say "is this, or can this be > adapted to be a re-iterable sequence", something could easily answer: Yes: that's all PEP 246 provides -- a unified way to express a request for adaptation of an object to a protocol, with the ability for the object's type, the protocol, AND a registry of installable adapters, to have a say about it (the registry is not well explained in the PEP as it stands, it's part of what I have to clarify when I rewrite it -- but my rewrite won't change what's being discussed in your quoted paragraph and the start of this one). > [ x for x in y ] or more concisely and speedily list(y). > Which would produce a re-iterable sequence, but might also destroy the > source. Of course, I'll say up front I've only skimmed the PEP and might've > missed something crucial. PEP 246 cannot in any way impede "something" (or more likely "somebody") from writing inappropriate or totally incorrect code, nor will it even try. Maybe I'm missing your point...? Alex From aahz@pythoncraft.com Fri Jul 19 15:23:49 2002 From: aahz@pythoncraft.com (Aahz) Date: Fri, 19 Jul 2002 10:23:49 -0400 Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: References: <200207181422.g6IEMBr14526@odiug.zope.com> Message-ID: <20020719142349.GA9051@panix.com> On Fri, Jul 19, 2002, Ka-Ping Yee wrote: > > I believe this is where the biggest debate lies: whether "for" should be > non-destructive. I realize we are currently on the other side of the > fence, but i foresee enough potential pain that i would like you to > consider the value of keeping "for" loops non-destructive. Consider for line in f.readlines(): in any version of Python. Adding iterators made this more convenient and efficient, but I just can't see your POV in the general case. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From aleax@aleax.it Fri Jul 19 15:39:11 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 19 Jul 2002 16:39:11 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <20020719142349.GA9051@panix.com> References: <200207181422.g6IEMBr14526@odiug.zope.com> <20020719142349.GA9051@panix.com> Message-ID: On Friday 19 July 2002 04:23 pm, Aahz wrote: > On Fri, Jul 19, 2002, Ka-Ping Yee wrote: > > I believe this is where the biggest debate lies: whether "for" should be > > non-destructive. I realize we are currently on the other side of the > > fence, but i foresee enough potential pain that i would like you to > > consider the value of keeping "for" loops non-destructive. > > Consider > > for line in f.readlines(): > > in any version of Python. Adding iterators made this more convenient > and efficient, but I just can't see your POV in the general case. The 'for', per se, is destroying nothing here -- the object returned by f.readlines() is destroyed by its reference count falling to 0 after the for, just as, say: for c in raw_input(): or x = raw_input()+raw_input() and so forth. I.e., any object gets destroyed if there are no more references to it -- that's a completely different issue. 
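The distinction being drawn here, between a method call that mutates f and a "for" loop that leaves the returned list alone, can be sketched as follows. This is an illustration in present-day Python; io.StringIO stands in for a real file, and the variable names are made up:

```python
import io

# An in-memory stand-in for an open text file.
f = io.StringIO("alpha\nbeta\n")

# readlines() advances f, but the list it returns is an ordinary container.
lines = f.readlines()

# Looping over the bound list is repeatable.
first = [line.strip() for line in lines]
second = [line.strip() for line in lines]

# Calling the method again shows that f itself was mutated by the first call.
again = f.readlines()

print(first, second, again)
```

Both passes over lines see the same two entries, while the second readlines() call finds nothing left to read.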
In all of these cases, you can, if you want, just bind a name to the object as you call the function, then use that object over and over again at will. _Method calls_ mutating the object on which they're called is indeed quite common, of course. f.readlines() does mutate f's state. But the object it returns, as long as there are references to it, remains. Alex From fredrik@pythonware.com Fri Jul 19 15:42:34 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 19 Jul 2002 16:42:34 +0200 Subject: [Python-Dev] Single- vs. Multi-pass iterability References: <200207181422.g6IEMBr14526@odiug.zope.com> <20020719142349.GA9051@panix.com> Message-ID: <017a01c22f32$865123d0$0900a8c0@spiff> aahz wrote: > > I believe this is where the biggest debate lies: whether "for" should be > > non-destructive. I realize we are currently on the other side of the > > fence, but i foresee enough potential pain that i would like you to > > consider the value of keeping "for" loops non-destructive. > > Consider > > for line in f.readlines(): > > in any version of Python. and? for-in doesn't modify the object returned by f.readlines(), and never has. From David Abrahams" <20020719132351.GA40829@hishome.net> <0d3001c22f2f$5e2d2320$6501a8c0@boostconsulting.com> Message-ID: <0d6701c22f32$a135c0c0$6501a8c0@boostconsulting.com> From: "Alex Martelli" > PEP 246 cannot in any way impede "something" (or more likely "somebody") from > writing inappropriate or totally incorrect code, nor will it even try. Maybe > I'm missing your point...? Maybe, or maybe not. I guess if the reiterable sequence adapter says "list(x)", nobody should be using it to find out whether a thing is reiterable. Or maybe the reiterable sequence adapter shouldn't say "list(x)" because that's destructive -- though that begs the question of finding out whether x is reiterable. Maybe the PEP is just a red herring as far as the iterator problem is concerned.
As long as the language has built-in facilities like 'for' and 'in' which use iteration protocols at the core of the language, re-iterability ought to be expressible likewise, in core language terms, regardless of the more-extensible mechanisms of PEP 246. whole-pile-of-maybes-ly y'rs, dave From barry@zope.com Fri Jul 19 15:59:33 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 19 Jul 2002 10:59:33 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability References: <200207181422.g6IEMBr14526@odiug.zope.com> Message-ID: <15672.10581.693016.553036@anthem.wooz.org> >>>>> "KY" == Ka-Ping Yee writes: KY> It's just not the way i expect for-loops to work. Perhaps we KY> would need to survey people for objective data, but i feel KY> that most people would be surprised if | for x in y: print x | for x in y: print x KY> did not print the same thing twice, or if As with many things Pythonic, it all depends. Specifically, I think it depends on the type of y. Certainly in a pre-iterator world there was little preventing (or encouraging?) you to write y's __getitem__() non-destructively, so I don't see much difference if y is an iterator. KY> Even if it's okay for for-loops to destroy their arguments, i KY> still think it sets up a bad situation: we may end up with KY> functions manipulating sequence-like things all over, but it KY> becomes unclear whether they destroy their arguments or not. KY> It becomes possible to write a function which sometimes KY> destroys its argument and sometimes doesn't. Bugs get deeper KY> and harder to find. How is that different than pre-iterators with __getitem__()? KY> I'm assigning properties to "for" that you aren't. I think KY> they are useful properties, though, and worth considering. These aren't properties of for-loops, they are properties of the things you're iterating (little-i) over. 
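The point that repeatability is a property of y, with or without iterators, can be sketched with the old __getitem__ protocol, which present-day Python still falls back to when a class has no __iter__. The two classes below are hypothetical, written only for this illustration:

```python
class Polite:
    """Sequence-style container: __getitem__ only reads."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]  # raises IndexError past the end


class Draining:
    """Sequence-style object whose __getitem__ mutates, fileinput-style."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        # The index is ignored: each call hands out and removes the next item.
        if not self._items:
            raise IndexError(index)
        return self._items.pop(0)


polite = Polite("abc")
polite_first = [x for x in polite]
polite_second = [x for x in polite]

draining = Draining("abc")
draining_first = [x for x in draining]
draining_second = [x for x in draining]

print(polite_first, polite_second)
print(draining_first, draining_second)
```

The same "for" machinery drives both objects; only the second one comes back empty on the repeat loop.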
-Barry From aahz@pythoncraft.com Fri Jul 19 16:20:29 2002 From: aahz@pythoncraft.com (Aahz) Date: Fri, 19 Jul 2002 11:20:29 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <017a01c22f32$865123d0$0900a8c0@spiff> References: <200207181422.g6IEMBr14526@odiug.zope.com> <20020719142349.GA9051@panix.com> <017a01c22f32$865123d0$0900a8c0@spiff> Message-ID: <20020719152029.GA18810@panix.com> On Fri, Jul 19, 2002, Fredrik Lundh wrote: > aahz wrote: >>Ping: >>> >>> I believe this is where the biggest debate lies: whether "for" should be >>> non-destructive. I realize we are currently on the other side of the >>> fence, but i foresee enough potential pain that i would like you to >>> consider the value of keeping "for" loops non-destructive. >> >> Consider >> >> for line in f.readlines(): >> >> in any version of Python. > > and? for-in doesn't modify the object returned > by f.readlines(), and never has. While technically true, that seems to be sidestepping the point from my POV. I think that few people see for loops as inherently non-destructive due to the use case I presented above. Beyond that, the for loop is itself inherently mutating in Python older than 2.2, which I see as functionally equivalent to "destructive"; the primary intention of iterators (from my recollections of the tenor of the discussions) was to package that mutating state in a way that could capture the iterability of objects other than sequences. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From Paul.Moore@atosorigin.com Fri Jul 19 16:28:11 2002 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Fri, 19 Jul 2002 16:28:11 +0100 Subject: [Python-Dev] Single- vs. Multi-pass iterability Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com> Ka-Ping Yee writes: > It's just not the way i expect for-loops to work. 
Perhaps we > would need to survey people for objective data, but i feel > that most people would be surprised if > > for x in y: print x > for x in y: print x > > did not print the same thing twice, or if Overall, I think I would say "it depends". Barry pointed out that it depends on the type of y. That's what I mean, although my intuition isn't quite that specific by itself. By the way, not all languages that I am aware of even have "for ... in" constructs. Perl does, and Visual Basic does. C and C++ don't. In Perl, "for $x (<>)" or whatever magic line noise Perl uses, does the same as Python's "for line in f", so the same non-repeatable for issue exists there (at least for files, and I *bet* you can do nasty things with tied variables to have it happen elsewhere, too). Even in Visual Basic, "for each x in obj" can in theory do anything (depending on the type of obj), much like Python. So I think that existing practice goes against your expectation. There *is* an issue of some sort with being able to find out whether a given object offers reproducible for behaviour in the way you describe above. The problem is determining real-world cases where knowing is useful. There are a lot of theoretical issues here, but few simple, comprehensible, practical use cases. FWIW, - I'm +1 for renaming next() to __next__(). - I'm +0 on dropping the requirements that iterators *must* implement __iter__() (as per your description of the 2 orthogonal proposals). I'd like to see iterators strongly advised to implement __iter__() as returning self (and all built in ones doing so), but not have it mandated. - I'm -1 on your for...from syntax. Hope this helps, Paul. From barry@zope.com Fri Jul 19 16:36:45 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 19 Jul 2002 11:36:45 -0400 Subject: [Python-Dev] The iterator story References: Message-ID: <15672.12813.512623.968270@anthem.wooz.org> Nice write-up Ka-Ping. 
Maybe you need to transform this into a PEP called Iterators.next() 1/2 :)

-Barry

From jafo-python-dev@tummy.com Fri Jul 19 16:43:03 2002
From: jafo-python-dev@tummy.com (Sean Reifschneider)
Date: Fri, 19 Jul 2002 09:43:03 -0600
Subject: [Python-Dev] Judy for replacing internal dictionaries?
Message-ID: <20020719094303.B24220@tummy.com>

Recently at a Hacking Society meeting someone was working on packaging
Judy for Debian. Apparently, Judy is a data structure designed by some
researchers at Hewlett-Packard. Its goal is to be a very fast
implementation of an associative array or (possibly sparse) integer
indexed array.

Judy has recently been released under the LGPL.

After reading the FAQ and 10 minute introduction, I started wondering
about whether it could improve the overall performance of Python by
replacing dictionaries used for namespaces, classes, etc... Since then,
I've realized that I probably won't have time to do the implementation
any time soon, and Evelyn urged me to bring it up here.

I realize that Python's dictionaries are fairly well optimized. It
sounds like Judy may be even faster though. It apparently works fairly
hard at reducing L2 cache misses, for example.

Some URLs:

Judy FAQ:
http://atwnt909.external.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,1949,00.html

Judy 10 minute introduction:
http://atwnt909.external.hp.com/dspp/ddl/ddl_Download_File_TRX/1,1249,702,00.pdf

SourceForge Project Page:
http://sourceforge.net/projects/judy/

Sean
--
YOU ARE WITNESSING A FRONT THREE-QUARTER VIEW OF TWO ADULTS SHARING A
TENDER MOMENT. -- Gordon Cole, _Twin_Peaks_
Sean Reifschneider, Inimitably Superfluous
tummy.com - Linux Consulting since 1995. Qmail, KRUD, Firewalls, Python

From fredrik@pythonware.com Fri Jul 19 17:07:21 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 19 Jul 2002 18:07:21 +0200
Subject: [Python-Dev] Single- vs.
Multi-pass iterability
References: <200207181422.g6IEMBr14526@odiug.zope.com> <20020719142349.GA9051@panix.com> <017a01c22f32$865123d0$0900a8c0@spiff> <20020719152029.GA18810@panix.com>
Message-ID: <001b01c22f3e$5e25ab40$0900a8c0@spiff>

aahz wrote:

> While technically true, that seems to be sidestepping the point from my
> POV.

really? are you arguing that when Ping says that for-in shouldn't
destroy the target, he's really saying that python shouldn't allow
methods to have side effects if they can be called from an
expression used in a for-in statement? why would he say that?

> I think that few people see for loops as inherently non-destructive
> due to the use case I presented above.

I think most people can tell the difference between an object and
a method with side-effects. I doubt they would be able to get much
done in Python if they couldn't.

> Beyond that, the for loop is itself inherently mutating in Python
> older than 2.2

in what sense? it calls the object's __getitem__ method with an
integer index value, until it gets an IndexError. in what way is that
"inherently mutating"?

From martin@v.loewis.de Fri Jul 19 17:09:40 2002
From: martin@v.loewis.de (Martin v.
Loewis)
Date: 19 Jul 2002 18:09:40 +0200
Subject: [Python-Dev] staticforward
In-Reply-To: <3D381714.7040606@lemburg.com>
References: <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> <15671.12313.725886.680036@slothrop.zope.com> <3D373573.8070001@lemburg.com> <15671.15105.563068.700997@slothrop.zope.com> <3D3741D4.8020408@lemburg.com> <15671.16894.185299.672286@slothrop.zope.com> <3D37CE76.4020803@lemburg.com> <200207191259.g6JCxGp24808@pcp02138704pcs.reston01.va.comcast.net> <3D381714.7040606@lemburg.com>
Message-ID:

"M.-A. Lemburg" writes:

> """
> Albert Chin-A-Young wrote on 2002-05-04:
> > >
> > > The AIX xlc ANSI compiler does not allow forward declaration of
> > > variables. This leads to a lot of problems with .c files that use
> > > staticforward (e.g. mxDateTime.c, mxProxy.c, etc.). Any chance of
> > > fixing these?
> """
>
> I'm not making this up.

Yes, but the user might be. I don't believe this statement is
factually correct - the compiler most certainly does allow forward
declaration of variables. Also, such a statement is of little value
unless associated with an operating system release number (or better a
compiler release number).

This conversation snippet indicates that the problem has not been
fully understood (at least by Albert Chin-A-Young); solving an
incompletely-understood problem is a recipe for disasters, when it
comes to portability.

Regards,
Martin

From pinard@iro.umontreal.ca Fri Jul 19 17:02:10 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 19 Jul 2002 12:02:10 -0400
Subject: [Python-Dev] Re: Single- vs.
Multi-pass iterability
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com>
Message-ID:

[Moore, Paul]

> - I'm +0 on dropping the requirements that iterators *must*
> implement __iter__() (as per your description of the 2
> orthogonal proposals).

In Ka-Ping's letter, I did not read that the proposals were orthogonal.
__iter__ would not be required anymore to identify an iterator as such,
because __next__ would be sufficient, alone, for this purpose. That would
have the effect of cleaning up the iterator protocol from the double
constraint it currently has, and probably makes things clearer as well.

> I'd like to see iterators strongly advised to implement __iter__() as
> returning self

Strong advice should not be merely given "ex cathedra", there should be some
kind of (convincing) justification behind it. It makes sense for generators
at least, so they could be used in a few places where Python expects
containers to provide their iterator. The justification is more fuzzy
outside generators, especially when programmers do not see the need of
obtaining an iterator from itself; the usual and only case I see right now
is resuming an iterator which has not been fully consumed.

Ka-Ping also stresses, indirectly, that `element in iterator' (resuming an
iterator instead of obtaining a new one from a container) could have a
strange meaning, and might even represent a user error. I even wonder if it
would not be wise to have iterators _not_ defining an __iter__ method!

--
François Pinard http://www.iro.umontreal.ca/~pinard

From guido@python.org Fri Jul 19 17:30:43 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 12:30:43 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 19 Jul 2002 12:02:10 EDT."
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <200207191630.g6JGUh626683@pcp02138704pcs.reston01.va.comcast.net>

> In Ka-Ping's letter, I did not read that the proposals were orthogonal.
> __iter__ would not be required anymore to identify an iterator as such,
> because __next__ would be sufficient, alone, for this purpose. That would
> have the effect of cleaning up the iterator protocol from the double
> constraint it currently has, and probably makes things clearer as well.

I think there's been some confusion. I never intended the test for
"is this an iterator" to be "does it have a next() and an __iter__()
method". I *do* strongly advise iterators to define __iter__(), but
only because I expect that "for x in iterator:" is useful in iterator
algebra functions and the like.

In fact, I don't really think that Python currently has foolproof ways
to test for *any* kind of abstract protocol. Questions like "Is x a
mapping" or "is x a sequence" are equally impossible to answer.

The recommended approach is simply to go ahead and use something; if
it doesn't obey the protocol, it will fail. Of course, you should
*document* the requirements (e.g., "argument x should be a sequence"),
but I've always considered it a case of LBYL syndrome if code wants to
check first. Note that you can't write code that does something
different for a sequence than for a mapping; for example, the
following class could be either:

    class C:
        def __getitem__(self, i): return i

I realize that this won't make David Abrahams and his Boost users
happy, but that's how Python has approached this issue since its
inception.

I'm fine with suggestions that we should really fix this; I expect
that some way to assert interfaces or protocols will eventually find
its way into the language.

But I *don't* think that the current inability to test for
iterator-ness (or iterable-ness, or multi-iteratable-ness, etc.)
should be used as an argument that there's anything wrong with the iterator protocol. (And I've *still* not read Ping's original message...) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Fri Jul 19 17:50:47 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 19 Jul 2002 12:50:47 -0400 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: <20020719141658.GA7919@panix.com> Message-ID: [Tim] > This patch is a Good Thing, and I demand that everyone show [MarkH] more > appreciation for it. [Aahz] > If I still used Windoze for anything, I would. Then you missed the point of the patch. My demand stands unabated. relentlessly y'rs - tim From aahz@pythoncraft.com Fri Jul 19 17:58:37 2002 From: aahz@pythoncraft.com (Aahz) Date: Fri, 19 Jul 2002 12:58:37 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <001b01c22f3e$5e25ab40$0900a8c0@spiff> References: <200207181422.g6IEMBr14526@odiug.zope.com> <20020719142349.GA9051@panix.com> <017a01c22f32$865123d0$0900a8c0@spiff> <20020719152029.GA18810@panix.com> <001b01c22f3e$5e25ab40$0900a8c0@spiff> Message-ID: <20020719165836.GA14402@panix.com> On Fri, Jul 19, 2002, Fredrik Lundh wrote: > aahz wrote: >> >> While technically true, that seems to be sidestepping the point from my >> POV. > > really? are you arguing that when Ping says that for-in shouldn't > destroy the target, he's really saying that python shouldn't allow > methods to have side effects if they can be called from an > expression used in a for-in statement? why would he say that? I'm saying that I think Ping is overstating the case in terms of the way people look at things. Whatever the technicalities of an implicit method versus an explicit method, people have long used for loops in destructive ways. >> I think that few people see for loops as inherently non-destructive >> due to the use case I presented above. 
> > I think most people can tell the difference between an object and > a method with side-effects. I doubt they would be able to get much > done in Python if they couldn't. To be sure. But I don't think there's much difference in the way for loops are actually used. Continuing my point above, I see the current usage of for loops as calling an implicit method with side-effects as opposed to an explicit method with side-effects. Lo and behold! That's actually the case. >> Beyond that, the for loop is itself inherently mutating in Python >> older than 2.2 > > in what sense? it calls the object's __getitem__ method with an > integer index value, until it gets an IndexError. in what way is that > "inherently mutating"? And how does that integer index change? The for loop in Python <2.2 has an internal state object. Iterators are the external manifestation of that state object, generalized to objects other than sequences. I'm surprised that anyone is surprised that the state object gets mutated/destroyed. I'm also surprised that people are surprised about what happens when that state object is coupled to an inherently mutating object such as file objects. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From barry@zope.com Fri Jul 19 18:07:29 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 19 Jul 2002 13:07:29 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability References: <200207181422.g6IEMBr14526@odiug.zope.com> <20020719142349.GA9051@panix.com> <017a01c22f32$865123d0$0900a8c0@spiff> <20020719152029.GA18810@panix.com> <001b01c22f3e$5e25ab40$0900a8c0@spiff> <20020719165836.GA14402@panix.com> Message-ID: <15672.18257.829735.736033@anthem.wooz.org> >>>>> "A" == Aahz writes: A> The for loop in Python <2.2 has an internal state object. A> Iterators are the external manifestation of that state object, A> generalized to objects other than sequences. 
I'm surprised A> that anyone is surprised that the state object gets A> mutated/destroyed. I'm also surprised that people are A> surprised about what happens when that state object is coupled A> to an inherently mutating object such as file objects. Well said. -Barry From aahz@pythoncraft.com Fri Jul 19 18:02:20 2002 From: aahz@pythoncraft.com (Aahz) Date: Fri, 19 Jul 2002 13:02:20 -0400 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: References: <20020719141658.GA7919@panix.com> Message-ID: <20020719170220.GB14402@panix.com> On Fri, Jul 19, 2002, Tim Peters wrote: > > [Tim] > > This patch is a Good Thing, and I demand that everyone show [MarkH] more > > appreciation for it. > > [Aahz] > > If I still used Windoze for anything, I would. > > Then you missed the point of the patch. My demand stands unabated. All right, then, I hereby show MarkH ill understood appreciation. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From David Abrahams" <200207191630.g6JGUh626683@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <0e2201c22f48$05ff1870$6501a8c0@boostconsulting.com> From: "Guido van Rossum" > > In Ka-Ping's letter, I did not read that the proposals were orthogonal. > > __iter__ would not be required anymore to identify an iterator as such, > > because __next__ would be sufficient, alone, for this purpose. That would > > have the effect of cleaning up the iterator protocol from the double > > constraint it currently has, and probably makes things clearer as well. > > I think there's been some confusion. I never intended the test for > "is this an iterator" to be "does it have a next() and an __iter__() > method". Do you intend to have a test for "is this an iterator" at all? > I *do* strongly advise iterators to define __iter__(), but > only because I expect that "for x in iterator:" is useful in iterator > algebra functions and the like. Makes sense. 
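
One way to picture the "for x in iterator:" case that the advice about __iter__() is aimed at, sketched in modern Python 3 (where next() is spelled __next__). The Countdown class and take_until helper are hypothetical names invented for this illustration, not anything from the thread:

```python
class Countdown:
    """Hypothetical iterator that yields n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # Returning self is what lets a for loop accept the iterator itself.
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


def take_until(iterator, stop):
    """Consume items up to and including `stop`, leaving the rest unread."""
    taken = []
    for x in iterator:
        taken.append(x)
        if x == stop:
            break
    return taken


it = Countdown(5)
head = take_until(it, 4)   # first loop stops early
rest = list(it)            # second pass resumes where the first left off
```

Because the iterator is its own iteration state, the second pass picks up the unread items instead of starting over.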
> In fact, I don't really think that Python currently has foolproof ways
> to test for *any* kind of abstract protocol. Questions like "Is x a
> mapping" or "is x a sequence" are equally impossible to answer.

True.

> The recommended approach is simply to go ahead and use something; if
> it doesn't obey the protocol, it will fail. Of course, you should
> *document* the requirements (e.g., "argument x should be a sequence"),
> but I've always considered it a case of LBYL syndrome if code wants to
> check first.

If LBYL is bad, what is introspection good for?

> Note that you can't write code that does something
> different for a sequence than for a mapping; for example, the
> following class could be either:
>
>     class C:
>         def __getitem__(self, i): return i
>
> I realize that this won't make David Abrahams and his Boost users
> happy, but that's how Python has approached this issue since its
> inception.

I understand that that's always been "the Python way". However, isn't there
also some implication that some of the special functions are more than just
a way to provide implementations of Python's syntax? Notes in the docs like
those on __getitem__ tend to argue for that, at least by convention. Unless
I'm misinterpreting things, "the Python way" isn't quite so one-sided where
protocols are concerned.

> I'm fine with suggestions that we should really fix this; I expect
> that some way to assert interfaces or protocols will eventually find
> its way into the language.
>
> But I *don't* think that the current inability to test for
> iterator-ness (or iterable-ness, or multi-iteratable-ness, etc.)
> should be used as an argument that there's anything wrong with the
> iterator protocol.

Just for the record, I never meant to imply that it was broken, only that
I'd like to get a little more from it than I currently can.
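
For readers who want to poke at the quoted class C, here is a small sketch in modern Python 3. The variable names are hypothetical; the class body is the one quoted above, laid out on separate lines:

```python
from itertools import islice


class C:
    def __getitem__(self, i):
        return i


c = C()

# Used like a sequence: integer index in, item out.
as_sequence = c[0]

# Used like a mapping: arbitrary key in, value out.
as_mapping = c["key"]

# The only hook an outside observer can see is the same in both readings.
has_hook = hasattr(c, "__getitem__")

# A for loop treats it as a sequence and never sees IndexError, so the
# iteration is unbounded; islice cuts it off after three items.
first_three = list(islice(c, 3))
```

Nothing in the object says which reading was intended, which is the ambiguity the quoted paragraph describes.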
-Dave From paul-python@svensson.org Fri Jul 19 18:21:26 2002 From: paul-python@svensson.org (Paul Svensson) Date: Fri, 19 Jul 2002 13:21:26 -0400 (EDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: <20020719165836.GA14402@panix.com> Message-ID: On Fri, 19 Jul 2002, Aahz wrote: >And how does that integer index change? The for loop in Python <2.2 has >an internal state object. Iterators are the external manifestation of >that state object, generalized to objects other than sequences. I'm >surprised that anyone is surprised that the state object gets >mutated/destroyed. I'm also surprised that people are surprised about >what happens when that state object is coupled to an inherently mutating >object such as file objects. All the surprises I see stem from confusion between what is the object being iterated over, and what is the object holding the state of the iteration. Iterators returning self for __iter__() is the major cause of this confusion. I agree that in the general case, the boundary may not always be clear, but Ping's proposal cleans up what's seen 99.9% of the time. Pending the pain of the yet unseen migration plan, I'm +1 on removing __iter__ from all core iterators +1 on renaming next() to __next__() +1 on presenting file objects as iterators rather than iterables +0 on the new 'for x from y' syntax /Paul From guido@python.org Fri Jul 19 18:23:19 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 13:23:19 -0400 Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability In-Reply-To: Your message of "Fri, 19 Jul 2002 13:16:27 EDT." 
<0e2201c22f48$05ff1870$6501a8c0@boostconsulting.com> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com> <200207191630.g6JGUh626683@pcp02138704pcs.reston01.va.comcast.net> <0e2201c22f48$05ff1870$6501a8c0@boostconsulting.com> Message-ID: <200207191723.g6JHNJf27635@pcp02138704pcs.reston01.va.comcast.net> > Do you intend to have a test for "is this an iterator" at all? Not right now, see the rest of my email. The best you can do is check for a next method and hope for the best. > If LBYL is bad, what is introspection good for? Ask Alex. > I understand that that's always been "the Python way". However, > isn't there also some implication that some of the special functions > are more than just a way to provide implementations of Python's > syntax? Like what? > Notes in the docs like those on __getitem__ tend to argue > for that, at least by convention. Unless I'm misinterpreting > things, "the Python way" isn't quite so one-sided where protocols > are concerned. Can you quote specific places in the docs you read this way? I don't see it, but I've only scanned chapter 3 of the Language Reference Manual. > Just for the record, I never meant to imply that it was broken, only > that I'd like to get a little more from it than I currently can. Maybe I should read Ping's email. From the discussion I figured he was arguing this way. I think you have to settle with what I proposed at the top. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 19 18:32:19 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 13:32:19 -0400 Subject: [Python-Dev] Where's time.daylight??? In-Reply-To: Your message of "Fri, 19 Jul 2002 13:13:40 EDT." <15672.18628.831787.897474@anthem.wooz.org> References: <15672.18628.831787.897474@anthem.wooz.org> Message-ID: <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net> [Barry, in python-checkins] > I've noticed one breakage already I believe. 
On my systems (RH6.1 and
> RH7.3) time.daylight has disappeared.
>
> I don't think test_time.py actually tests this parameter, but
> test_email.py which is what's failing for me:
[...]

Yup, time.daylight has disappeared. But the bizarre thing is that if
I roll back to rev. 1.129, it's *still* gone! Even rev 1.128 still
doesn't fix this. I wonder if something in configure changed???

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleax@aleax.it Fri Jul 19 18:35:53 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 19 Jul 2002 19:35:53 +0200
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <200207191723.g6JHNJf27635@pcp02138704pcs.reston01.va.comcast.net>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com> <0e2201c22f48$05ff1870$6501a8c0@boostconsulting.com> <200207191723.g6JHNJf27635@pcp02138704pcs.reston01.va.comcast.net>
Message-ID:

On Friday 19 July 2002 07:23 pm, Guido van Rossum wrote:
> > Do you intend to have a test for "is this an iterator" at all?
>
> Not right now, see the rest of my email. The best you can do is check
> for a next method and hope for the best.
>
> > If LBYL is bad, what is introspection good for?
>
> Ask Alex.

Introspection is good when you need to dispatch in a way that is
not supported by the language you're using. In Python (and most
other languages), this mostly means multiple dispatch -- you don't
get it from the language, therefore, on the non-frequent occasions
when you NEED it, you have to kludge it up. Very similar to
multiple inheritance in languages that don't support THAT, really.

(Particularly in how people who've never used multiple X don't
really understand that it buys you anything -- try interesting a
dyed-in-the-wool Smalltalker in multiple inheritance, or anybody
*but* a CLOS-head or Dylan-head in multiple dispatch...:-).

Other aspects of introspection help you implement other primitives
lacking in the language. E.g.
"make another like myself but not initialized" can be self.__class__.__new__(self.__class__) -- not the most elegant expression, but, hey, I've seen worse (such as NOT being able to express it at all, in languages lacking the needed ability to introspect:-). Looking at *ANOTHER* object this way isn't really INTROspection, btw -- it's EXTRAspection, by the Latin roots of these words:-). Alex From tim.one@comcast.net Fri Jul 19 18:36:47 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 19 Jul 2002 13:36:47 -0400 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: <20020719170220.GB14402@panix.com> Message-ID: [Aahz] > All right, then, I hereby show MarkH ill understood appreciation. Excellent! One down, about two hundred thousand to go. From David Abrahams" <200207191630.g6JGUh626683@pcp02138704pcs.reston01.va.comcast.net> <0e2201c22f48$05ff1870$6501a8c0@boostconsulting.com> <200207191723.g6JHNJf27635@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <0e4201c22f4b$840d44f0$6501a8c0@boostconsulting.com> From: "Guido van Rossum" > > Do you intend to have a test for "is this an iterator" at all? > > Not right now, see the rest of my email. The best you can do is check > for a next method and hope for the best. I only asked because the rest of your email seemed to imply that you didn't believe in such checks at this time, while the sentence above my question seemed to imply there is/should be such a test. Thanks for clarifying. > > If LBYL is bad, what is introspection good for? > > Ask Alex. OK. Alex, what's introspection good for? > > I understand that that's always been "the Python way". However, > > isn't there also some implication that some of the special functions > > are more than just a way to provide implementations of Python's > > syntax? > > Like what? > > > Notes in the docs like those on __getitem__ tend to argue > > for that, at least by convention. 
Unless I'm misinterpreting > > things, "the Python way" isn't quite so one-sided where protocols > > are concerned. > > Can you quote specific places in the docs you read this way? Just for example: __getitem__: "For sequence types, the accepted keys should be integers and slice objects. .... If key is of an inappropriate type, TypeError may be raised; if of a value outside the set of indexes for the sequence (after any special interpretation of negative values), IndexError should be raised. Note: for loops expect that an IndexError will be raised for illegal indexes to allow proper detection of the end of the sequence. " __delitem__: "Same note as for __getitem__(). This should only be implemented for mappings if the objects support removal of keys, or for sequences if elements can be removed from the sequence. The same exceptions should be raised for improper key values as for the __getitem__() method." __iter__: "This method should return a new iterator object that can iterate over all the objects in the container. For mappings, it should iterate over the keys of the container, and should also be made available as the method iterkeys()." The way I read these, the behavior of an implementation of these functions isn't really open-ended. It ought to follow certain conventions, if you want your type to behave sensibly. And that's about as strong as any legislation I've seen anywhere in the Python docs. > I don't > see it, but I've only scanned chapter 3 of the Language Reference > Manual. > > > Just for the record, I never meant to imply that it was broken, only > > that I'd like to get a little more from it than I currently can. > > Maybe I should read Ping's email. From the discussion I figured he > was arguing this way. I think you have to settle with what I proposed > at the top. Of course I do; I never expected otherwise. Like most of my other suggestions, this is a case of "OK, whatever you say Guido... 
but as long as people are interested in discussing the issues I'd like them to understand my reasons for bringing it up". -Dave From David Abrahams" <0e2201c22f48$05ff1870$6501a8c0@boostconsulting.com> <200207191723.g6JHNJf27635@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <0e5201c22f4c$37c62e30$6501a8c0@boostconsulting.com> From: "Alex Martelli" > Introspection is good when you need to dispatch in a way that is > not supported by the language you're using. In Python (and most > other languages), this mostly mean multiple dispatch -- you don't > get it from the language, therefore, on the non-frequent occasions > when you NEED it, you have to kludge it up. Very similar to > multiple inheritance in languages that don't support THAT, really. > > (Particularly in how people who've never used multiple X don't > really understand that it buys you anything -- try interesting a > dyed-in-the-wool Smalltalker in multiple inheritance, or anybody > *but* a CLOS-head or Dylan-head in multiple dispatch...:-). Ahem. *I'm* interested in multiple-dispatch (never used CLOS or Dylan). You might not have noticed that I mentioned multimethods in my post about supporting overloading in Boost.Python. > Other aspects of introspection help you implement other primitives > lacking in the language. E.g. "make another like myself but not > initialized" can be self.__class__.__new__(self.__class__) -- not > the most elegant expression, but, hey, I've seen worse (such as > NOT being able to express it at all, in languages lacking the > needed ability to introspect:-). Is that really introspection? It doesn't seem to ask a question. > Looking at *ANOTHER* object this way isn't really INTROspection, > btw -- it's EXTRAspection, by the Latin roots of these words:-). Okay. 
I hope you won't be offended if I continue to use the wrong term so that everyone else can understand me ;-) -Dave From tim.one@comcast.net Fri Jul 19 18:50:19 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 19 Jul 2002 13:50:19 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Message-ID: [Ping] > ... > I believe this is where the biggest debate lies: whether "for" should be > non-destructive. I realize we are currently on the other side of the > fence, but i foresee enough potential pain that i would like you to > consider the value of keeping "for" loops non-destructive. I'm having a hard time getting excited about this. If you had made this argument before the iterator protocol was implemented, it may have been more or less intriguing. But it was implemented and released some time ago, and I just haven't seen any evidence of such problems on c.l.py, the Help list, or the Tutor list (all of which I still pay significant attention to). "for" did and does work in accord with a simple protocol, and whether that's "destructive" depends on how the specific objects involved implement their pieces of the protocol, not on the protocol itself. The same is true of all of Python's hookable protocols. What's so special about "for" that it should pretend to deliver purely functional behavior in a highly non-functional language? State mutates. That's its purpose . From aahz@pythoncraft.com Fri Jul 19 18:54:56 2002 From: aahz@pythoncraft.com (Aahz) Date: Fri, 19 Jul 2002 13:54:56 -0400 (EDT) Subject: [Python-Dev] CANCEL: OSCON Community dinner Weds 7/24 6pm References: Message-ID: <200207191754.g6JHsuV00747@panix1.panix.com> Given the lack of response, I'm hereby canceling any official Python community dinner. I hope to see many of you at the conference, though. I'm including the original message below in case someone else wants to run with the ball. 
In article , Aahz wrote: >[posted to c.l.py with cc to c.l.py.announce and python-dev] > >I'm proposing a Python community dinner at OSCON next week, for Weds >7/24 at 6pm. Is there anyone familiar with the San Diego area who wants >to suggest a location near the Sheraton? If I don't get any >recommendations, we'll probably just have the dinner at the Sheraton. > >If you're interested, please send me an e-mail so I have some idea of >the number of people. Also, please include a way of getting in touch >with you at OSCON in case plans change (phone numbers accepted, but >e-mail addresses preferred). > >(There's a meeting for PSF members at 8pm, so some of us will likely >have to skip out early.) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ -- From guido@python.org Fri Jul 19 19:08:57 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 14:08:57 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Fri, 19 Jul 2002 13:50:19 EDT." References: Message-ID: <200207191808.g6JI8wE28214@pcp02138704pcs.reston01.va.comcast.net> > I'm having a hard time getting excited about this. If you had made > this argument before the iterator protocol was implemented, it may > have been more or less intriguing. But it was implemented and > released some time ago, and I just haven't seen any evidence of such > problems on c.l.py, the Help list, or the Tutor list (all of which I > still pay significant attention to). This is an important argument IMO that the theorists here seem to be missing somewhat. Releasing a feature and monitoring feedback is a good way of user testing, something that has been ignored too often by language designers. Elegant or minimal abstractions have their place; but in the end, users are more important. Quoting Steven Pemberton's home page (http://www.cwi.nl/~steven/): ABC: Simple but Powerful Interactive Programming Language and Environment. 
: A Simple but Powerful Interactive Programming Language and Environment. We did requirements and task analysis, iterative design, and user testing. You'd almost think programming languages were an interface between people and computers. Now famous because Python was strongly influenced by it. I still favor this approach to language design. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 19 19:15:45 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 14:15:45 -0400 Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability In-Reply-To: Your message of "Fri, 19 Jul 2002 13:41:24 EDT." <0e4201c22f4b$840d44f0$6501a8c0@boostconsulting.com> References: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com> <200207191630.g6JGUh626683@pcp02138704pcs.reston01.va.comcast.net> <0e2201c22f48$05ff1870$6501a8c0@boostconsulting.com> <200207191723.g6JHNJf27635@pcp02138704pcs.reston01.va.comcast.net> <0e4201c22f4b$840d44f0$6501a8c0@boostconsulting.com> Message-ID: <200207191815.g6JIFja28258@pcp02138704pcs.reston01.va.comcast.net> > The way I read these, the behavior of an implementation of these > functions isn't really open-ended. It ought to follow certain > conventions, if you want your type to behave sensibly. And that's > about as strong as any legislation I've seen anywhere in the Python > docs. Note the qualification: "if you want your type to behave sensibly". You can interpret the paragraphs you quoted as explaining what makes a good sequence or mapping. IOW they hint at some of the invariants of those protocols. But I wouldn't call this legislation. > Of course I do; I never expected otherwise. Like most of my other > suggestions, this is a case of "OK, whatever you say Guido... but as > long as people are interested in discussing the issues I'd like them > to understand my reasons for bringing it up". Maybe I should just tune out of this discussion if it's only of theoretical importance? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From trentm@ActiveState.com Fri Jul 19 19:26:02 2002 From: trentm@ActiveState.com (Trent Mick) Date: Fri, 19 Jul 2002 11:26:02 -0700 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: ; from tim.one@comcast.net on Fri, Jul 19, 2002 at 01:36:47PM -0400 References: <20020719170220.GB14402@panix.com> Message-ID: <20020719112602.A17763@ActiveState.com> [Tim Peters wrote] > Excellent! One down, about two hundred thousand to go. Mark rocks! 1,999,999-ly, Trent -- Trent Mick TrentM@ActiveState.com From aahz@pythoncraft.com Fri Jul 19 19:29:22 2002 From: aahz@pythoncraft.com (Aahz) Date: Fri, 19 Jul 2002 14:29:22 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: References: <20020719165836.GA14402@panix.com> Message-ID: <20020719182922.GA9585@panix.com> On Fri, Jul 19, 2002, Paul Svensson wrote: > > Pending the pain of the yet unseen migration plan, I'm > +1 on removing __iter__ from all core iterators > +1 on renaming next() to __next__() > +1 on presenting file objects as iterators rather than iterables > +0 on the new 'for x from y' syntax I'd vote this way: -0 on removing __iter__ +1 on renaming next() to __next__() +0 on presenting file objects as iterators +1 on finishing up the patch that fixes the xreadlines() mess -1 on for x from y -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From aahz@pythoncraft.com Fri Jul 19 19:30:30 2002 From: aahz@pythoncraft.com (Aahz) Date: Fri, 19 Jul 2002 14:30:30 -0400 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: <20020719112602.A17763@ActiveState.com> References: <20020719170220.GB14402@panix.com> <20020719112602.A17763@ActiveState.com> Message-ID: <20020719183029.GB9585@panix.com> On Fri, Jul 19, 2002, Trent Mick wrote: > [Tim Peters wrote] >> >> Excellent! One down, about two hundred thousand to go. 
>
> Mark rocks!
>
> 1,999,999-ly,

Next up: MarkH writes a patch to fix Trent's arithmetic.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/

From tim.one@comcast.net Fri Jul 19 19:39:38 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 19 Jul 2002 14:39:38 -0400
Subject: [Python-Dev] Judy for replacing internal dictionaries?
In-Reply-To: <20020719094303.B24220@tummy.com>
Message-ID:

[Sean Reifschneider]
> Recently at a Hacking Society meeting someone was working on
> packaging Judy for Debian.  Apparently, Judy is a data-structure
> designed by some researchers at Hewlett-Packard.  Its goal is to
> be a very fast implementation of an associative array or
> (possibly sparse) integer indexed array.
>
> Judy has recently been released under the LGPL.
>
> After reading the FAQ and 10 minute introduction, I started wondering
> about whether it could improve the overall performance of Python by
> replacing dictionaries used for namespaces, classes, etc...

Sorry, almost certainly not.  In a typical Python namespace lookup, the
pure overheads of calling and returning from the lookup function cost more
than doing the lookup.  Python dicts are more optimized for this use than
you realize.

Judy looks like it would be faster than Python dicts for large mappings,
though (and given the boggling complexity of Judy's data structures, it
damn well better be).

As a general replacement for Python dicts, it wouldn't fly because it
requires a total ordering on keys, and an ordering explicitly given by
bitstrings, not implicitly via calls to an opaque ordering function.

Looks like it may be an excellent alternative to in-memory B-Trees keyed by
manifest bitstrings (like ints and character strings or even addresses).
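To illustrate the total-ordering point: a Python dict only needs its keys to be hashable and comparable for equality, while any structure that keeps keys sorted needs an ordering on them. A minimal sketch in present-day Python syntax, using complex numbers as keys that hash fine but refuse "<". The names `table`, `found` and `ordered` are made up for the example, and nothing here reflects Judy's actual API:

```python
# Dict keys need only __hash__ and __eq__; no ordering is ever consulted.
table = {1 + 2j: "a", 3 + 4j: "b"}
found = table[1 + 2j]          # plain hash lookup

# A structure that keeps its keys sorted needs "<" to be defined on them.
try:
    sorted(table)              # compares the two keys with "<"
    ordered = True
except TypeError:
    ordered = False            # complex numbers define no ordering
```

Keys like these work in a dict today but could not be stored in a container that insists on ordering them.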
From nas@python.ca Fri Jul 19 20:00:43 2002
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 19 Jul 2002 12:00:43 -0700
Subject: [Python-Dev] The iterator story
In-Reply-To: ; from ping@zesty.ca on Fri, Jul 19, 2002 at 04:28:32AM -0700
References:
Message-ID: <20020719120043.A21503@glacier.arctrix.com>

Ka-Ping Yee wrote:
> I think "for" should be non-destructive because that's the way
> it has almost always behaved, and that's the way it behaves in
> any other language [@] i can think of.

I agree that it can be surprising to have "for" destroy the object it's
looping over.  I myself was bitten once by it.  I'm not yet sure if this is
something that will repeatedly bite.  I suspect it might. :-(

> And as things stand, the presence of __iter__ doesn't even work [@]
> as a type flag.

__iter__ is not a flag.  When you want to loop over an object you call
__iter__ to get an iterator.  Since you should be able to loop over all
iterators they should provide a __iter__ that returns self.

> Now suppose we agree that __iter__ and next are distinct protocols.

I suppose you can call them distinct but they both pertain to iteration.
One gets the iterator, the other uses it.

> Then why require iterators to support both?  The only reason we
> would want __iter__ on iterators is so that we can use "for" [@]
> with an iterator as the second operand.

Isn't that a good reason?  It's not just "for" though.  Anytime you have an
object that you want to loop over you should call iter() to get an iterator
and then call .next() on that object.

> I think the potential for collision, though small, is significant,
> and this makes "__next__" a better choice than "next".

When this issue originally came up, my position was that double underscores
should be used only if there is a risk of namespace collision.  The fact
that the method was stored on a type slot is irrelevant.  If objects
implement iterators as a separate, specialized object there wouldn't be any
namespace collisions.
Now it looks like people want to have iterators that also do other things.
In that case, __next__ would have been a better choice.

> The connection between this issue and the __iter__ issue is that,
> if next() were renamed to __next__(), the argument that __iter__
> is needed as a flag would also go away.

Sorry, I don't see the connection.  __iter__ is not a flag.  How does
renaming next() help?

> In my ideal world, we would allow a new form of "for", such as
>
>     for line from file:
>         print line

Nice syntax but I think it creates other problems.  Basically, you are
saying that iterators should not implement __iter__ and we should have some
other way of looping over them (in order to make it clear that they are
being mutated).

First, people could implement __iter__ such that it returns an iterator
that mutates the original object (e.g. a file object __iter__ that returns
xreadlines).

Second, it will be confusing to have two different ways of looping over
things.  Imagine a library with this bit of code:

    for item in sequence:
        do something

Now I want to use this library but I have an iterator, not something that
implements __iter__.  I would need to create a little wrapper with a
__iter__ method that returns my object.  Should people prefer to write:

    for item from iterator:
        do something

when they only need to loop over something once?  Doing so makes the code
most generally useful.

What about functions like map() and max()?  Should they accept iterators or
sequences as arguments?  It would be confusing if some functions accepted
iterators as arguments but not "container" objects (i.e. things that
implement __iter__) and vice versa.  People will wonder if they should call
iter() before passing their sequence as an argument.

To summarize, I agree that "for" mutating the object can be surprising.  I
don't think that removing the __iter__ from iterators is the right
solution.  Unfortunately I don't have any alternative suggestions.
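For illustration only (a sketch that is not part of the original message): the behaviour being argued over can be reproduced in a few lines. It assumes present-day Python, where the iterator method is spelled __next__ rather than the next() of Python 2.2. `Countdown` and `total` are hypothetical names, with `total` standing in for the library code that just does "for item in sequence".

```python
class Countdown:
    """Hypothetical single-pass iterator: yields n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # An iterator hands back itself, so "for" accepts it directly.
        return self

    def __next__(self):        # spelled next() in Python 2.2
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1


def total(items):
    """Stand-in for library code that simply loops over its argument."""
    result = 0
    for item in items:
        result += item
    return result


from_list = total([3, 2, 1])       # container: __iter__ builds a fresh iterator
from_iter = total(Countdown(3))    # iterator: accepted because __iter__ returns self

it = Countdown(3)
first_pass = list(it)
second_pass = list(it)             # nothing left: the first pass used it up
```

Because the iterator supplies its own __iter__, the library function needs no wrapper; the price is that a second loop over the same iterator sees nothing.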
  Neil

From aleax@aleax.it Fri Jul 19 19:55:06 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 19 Jul 2002 20:55:06 +0200
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <0e5201c22f4c$37c62e30$6501a8c0@boostconsulting.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com>
 <0e5201c22f4c$37c62e30$6501a8c0@boostconsulting.com>
Message-ID:

On Friday 19 July 2002 07:45 pm, David Abrahams wrote:
	...
> > dyed-in-the-wool Smalltalker in multiple inheritance, or anybody
> > *but* a CLOS-head or Dylan-head in multiple dispatch...:-).
>
> Ahem. *I'm* interested in multiple-dispatch (never used CLOS or Dylan). You
> might not have noticed that I mentioned multimethods in my post about
> supporting overloading in Boost.Python.

Sorry, I hadn't noticed.  I never did production work in CLOS or Dylan,
either, so I guess that enough C++ and templates warp one's brain
enough to increase one's perceptivity (only way to account for both
of us:-).

> > Other aspects of introspection help you implement other primitives
> > lacking in the language.  E.g. "make another like myself but not
> > initialized" can be self.__class__.__new__(self.__class__) -- not
> > the most elegant expression, but, hey, I've seen worse (such as
> > NOT being able to express it at all, in languages lacking the
> > needed ability to introspect:-).
>
> Is that really introspection? It doesn't seem to ask a question.

"What is this concrete object's actual runtime class?" is a question,
even though it may not look like one since the answer is in a special
attribute rather than being obtained from a method call.  Feel free
to code type(self) instead of self.__class__ if this feels more
question-ish, of course.  Six of one, half a dozen of the other.  The
object is "looking inside itself" -> introspection.  Specifically,
looking at its own metadata.
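The "make another like myself but not initialized" expression can be tried out directly. A small sketch with hypothetical classes (`Account`, `Savings`) and a hypothetical method name (`blank_like_me`), none of which come from the thread; it uses the type(self) spelling mentioned as an alternative to self.__class__:

```python
class Account:
    """Hypothetical class whose __init__ does real work."""

    def __init__(self, owner):
        self.owner = owner
        self.history = ["opened"]

    def blank_like_me(self):
        # Ask the object for its own runtime class, then allocate an
        # instance of that class without running __init__.
        cls = type(self)               # equivalent to self.__class__ here
        return cls.__new__(cls)


class Savings(Account):
    pass


original = Savings("alex")
blank = original.blank_like_me()
```

The new object has the subclass as its type, because the lookup happens on the instance at run time, and carries none of the attributes that __init__ would have set.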
> > Looking at *ANOTHER* object this way isn't really INTROspection, > > btw -- it's EXTRAspection, by the Latin roots of these words:-). > > Okay. I hope you won't be offended if I continue to use the wrong term so > that everyone else can understand me ;-) How depressingly pragmatic. Alex From guido@python.org Fri Jul 19 20:10:30 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 15:10:30 -0400 Subject: [Python-Dev] Where's time.daylight??? In-Reply-To: Your message of "Fri, 19 Jul 2002 13:32:19 EDT." <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net> References: <15672.18628.831787.897474@anthem.wooz.org> <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207191910.g6JJAUJ32606@pcp02138704pcs.reston01.va.comcast.net> > [Barry, in python-checkins] > > I've noticed one breakage already I believe. On my systems (RH6.1 and > > RH7.3) time.daylight as disappeared. > > > > I don't think test_time.py actually tests this parameter, but > > test_email.py which is what's failing for me: > [...] > > Yup, time.daylight has disappeared. But the bizarre thing is that if > I roll back to rev. 1.129, it's *still* gone! Even rev 1.128 still > doesn't fix this. I wonder if something in configure changed??? Alas, this is the effect of defining _XOPEN_SOURCE in configure.in. This somehow has the effect of not defining these symbols in pyconfig.h: HAVE_STRUCT_TM_TM_ZONE HAVE_TM_ZONE HAVE_TZNAME I'm going to remove the _XOPEN_SOURCE define; Jeremy and Martin can try to figure out what the right thing is for Tru64. --Guido van Rossum (home page: http://www.python.org/~guido/) From andymac@bullseye.apana.org.au Fri Jul 19 14:32:18 2002 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Sat, 20 Jul 2002 00:32:18 +1100 (edt) Subject: [Python-Dev] test_socket failure on FreeBSD In-Reply-To: <200207181627.g6IGRPE21459@odiug.zope.com> Message-ID: This message is in MIME format. 
The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. ---888574994-29658-1027085538=:42796 Content-Type: TEXT/PLAIN; charset=US-ASCII On Thu, 18 Jul 2002, Guido van Rossum wrote: {...} > > Testing recvfrom() in chunks over TCP. ... > > seg1='Michael Gilfix was he', addr='None' > > seg2='re > > ', addr='None' > > ERROR > > Hm. This looks like recvfrom() on a TCP stream doesn't return an > address; not entirely unreasonable. I wonder if > self.cli_conn.getpeername() returns the expected address; can you > check this? Add this after each recvfrom() call. > > if addr is None: > addr = self.cli_conn.getpeername() This appears to have the effect you desired. See the attached log. {...} > > Testing non-blocking accept. ... > > conn= > > addr=('127.0.0.1', 3144) > > FAIL > > This is different. It seems that the accept() call doesn't time out. > But this could be because the client thread connects too fast. Can > you add a sleep (e.g. time.sleep(5)) to _testAccept() before the > connect() call? Likewise. I took the sleep down to 1ms without failure, though that system has HZ=100 so std resolution I expect would be 10ms. I have also attached for info the log of the same modifications on EMX - situation improved, but still a hiccup there. Also attached is the diff I applied to test_socket.py (as of about 1900 UTC 020719). -- Andrew I MacIntyre "These thoughts are mine alone..." 
E-mail: andymac@bullseye.apana.org.au  | Snail: PO Box 370
        andymac@pcug.org.au            |        Belconnen  ACT  2616
Web:    http://www.andymac.org/        |        Australia

[Attachment test_socket.log.fbsd44 (base64-encoded in the archive, not reproduced): verbose test_socket log ending "Ran 27 tests in 10.312s" / "OK".]

[Attachment test_socket.log.os2emx (base64-encoded in the archive, not reproduced): verbose test_socket log ending "Ran 27 tests in 10.090s" / "FAILED (failures=1, errors=1)". The failure is "Testing sendall() with a 2048 byte string over TCP."; the error is "Testing non-blocking connect.", with AssertionError: (56, 'Socket is already connected').]

[Attachment test_socket.py.diff (base64-encoded in the archive, not reproduced): context diff against test_socket.py that adds the "if addr is None: addr = self.cli_conn.getpeername()" fallback after the recvfrom() calls, adds debugging print statements, adds time.sleep(5) to _testAccept() and _testRecv(), and renames main() to test_main().]

From andymac@bullseye.apana.org.au Fri Jul 19 14:37:12 2002
From: andymac@bullseye.apana.org.au (Andrew MacIntyre)
Date: Sat, 20 Jul
2002 00:37:12 +1100 (edt) Subject: [Python-Dev] test_socket failure on FreeBSD In-Reply-To: Message-ID: On Sat, 20 Jul 2002, Andrew MacIntyre wrote: {...} > Also attached is the diff I applied to test_socket.py (as of about 1900 > UTC 020719). Oops, that timestamp is still a couple of hours in the future. Should have been 1900 UTC 020718. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From gsw@agere.com Fri Jul 19 20:41:09 2002 From: gsw@agere.com (Gerald S. Williams) Date: Fri, 19 Jul 2002 15:41:09 -0400 Subject: [Python-Dev] The iterator story (Single- vs. Multi-pass iterability?) In-Reply-To: <20020719185602.21423.41415.Mailman@mail.python.org> Message-ID: I started to type this before looking back at the other threads, so feel free to ignore it if it's entirely superfluous. I'm sorry that I didn't have time to follow the "Single- vs. Multi-pass iterability" thread. Code freeze is today. :-) I'm a little confused about this destructive-for/iterator issue. Sure an iterator that destroys the original object might be unexpected, but wouldn't you expect a non-destructive iterator to be the default for any object unless there's a pretty good reason to use a destructive one? If there's a chance that the object may be destroyed/altered (such as a file stream or an iterator), shouldn't you already have some reason to suspect that? -Jerry Strong typing is for weak minds. Weak typing is for the real troublemakers. ;-) P.S. Leaving off the original subject line can be mildly annoying to those of us subscribing to the digest version of the list. Probably more so to those who read our responses. 
:-) From guido@python.org Fri Jul 19 21:24:04 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 16:24:04 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src configure,1.322,1.323 configure.in,1.333,1.334 pyconfig.h.in,1.43,1.44 In-Reply-To: Your message of "Fri, 19 Jul 2002 16:06:24 EDT." References: Message-ID: <200207192024.g6JKO4c14964@pcp02138704pcs.reston01.va.comcast.net> [Tim, in python-checkins] > I don't understand why this helps. Are you sure it does? Python.h still > contains: > > #ifndef _XOPEN_SOURCE > # define _XOPEN_SOURCE 500 > #endif > > The configure changes were consequences of that change, IIRC. We surely > shouldn't be defining this one way in Python.h and a different way in > config, right? I'm certain that it helps: test_time failed since Jeremy made the change to configure, now it succeeds again. It may not be the right fix, sure, but I recommend that we don't check in a fix that breaks other things. The search is on, and I trust that Jeremy and Martin will figure something out (and that Jeremy will run autoconf, autoheader, configure, *and* the test suite before checking in more changes). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 19 21:29:29 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 16:29:29 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Fri, 19 Jul 2002 04:44:09 PDT." References: Message-ID: <200207192029.g6JKTU015005@pcp02138704pcs.reston01.va.comcast.net> > It's just not the way i expect for-loops to work. Perhaps we would > need to survey people for objective data, but i feel that most people > would be surprised if > > for x in y: print x > for x in y: print x > > did not print the same thing twice, or if > > if x in y: print 'got it' > if x in y: print 'got it' > > did not do the same thing twice. I realize this is my own opinion, > but it's a fairly strong impression i have. 
I think it's a naive persuasion that doesn't hold under scrutiny. For a long time people have faked iterators by providing pseudo-sequences that did unspeakable things. In general, I'm pretty sure that if I asked an uninitiated user what "for line in file" would do, if it did anything, they would understand that if you tried that a second time you'd hit EOF right away. > Even if it's okay for for-loops to destroy their arguments, i still > think it sets up a bad situation: we may end up with functions > manipulating sequence-like things all over, but it becomes unclear > whether they destroy their arguments or not. It becomes possible > to write a function which sometimes destroys its argument and sometimes > doesn't. Bugs get deeper and harder to find. This sounds awfully similar to the old argument "functions (as opposed to procedures) should never have side effects". ABC implemented that literally (the environment was saved and restored around function calls, with an exception for the seed for the built-in random generator), with the hope that it would provide fewer surprises. It did the opposite: it drove people crazy because the language was trying to be smarter than them. > I believe this is where the biggest debate lies: whether "for" should be > non-destructive. I realize we are currently on the other side of the > fence, but i foresee enough potential pain that i would like you to > consider the value of keeping "for" loops non-destructive. I don't see any real debate. I only see you chasing windmills. Sorry. For-loops have had the possibility to destroy their arguments since the day __getitem__ was introduced. > > Maybe the for-loop is a red herring? Calling next() on an > > iterator may or may not be destructive on the underlying "sequence" -- > > if it is a generator, for example, I would call it destructive. > > Well, for a generator, there is no underlying sequence. 
>
>     while 1: print next(gen)
>
> makes it clear that there is no sequence, but
>
>     for x in gen: print x
>
> seems to give me the impression that there is.

This seems to be a misrepresentation.  The idiom for using any
iterator (not just generators) *without* using a for-loop would have
to be something like:

    while 1:
        try:
            item = it.next()    # or it.__next__() or next(it)
        except StopIteration:
            break
        ...do something with item...

(Similar to the traditional idiom for looping over the lines of a
file.)  The for-loop over an iterator was invented so you could write
this as:

    for item in it:
        ...do something with item...

I'm not giving that up so easily!

> > Perhaps you're trying to assign properties to the iterator abstraction
> > that aren't really there?
>
> I'm assigning properties to "for" that you aren't.  I think they
> are useful properties, though, and worth considering.

I'm trying to be open-minded, but I just don't see it.  The for loop
is more flexible than you seem to want it to be.  Alas, it's been like
this for years, and I don't think the for-loop needs a face lift.

> I don't think i'm assigning properties to the iterator abstraction;
> i expect iterators to destroy themselves.  But the introduction of
> iterators, in the way they are now, breaks this property of "for"
> loops that i think used to hold almost all the time in Python, and
> that i think holds all the time in almost all other languages.

Again, the widespread faking of iterators using destructive
__getitem__ methods that were designed to be only used in a for-loop
defeats your assertion.

> > Next, I'm not sure how renaming next() to __next__() would affect the
> > situation w.r.t. the destructivity of for-loops.  Or were you talking
> > about some other migration?
>
> The connection is indirect.  The renaming is related to: (a) making
> __next__() a real, honest-to-goodness protocol independent of __iter__;

next() is a real, honest-to-goodness protocol now, and it is
independent of __iter__() now.
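
[Editorial sketch, not part of the original message.  The two idioms
contrasted above can be run side by side.  This assumes the later
spelling in which the built-in next(it) stands in for the 2.2-era
it.next(); the helper names drain_with_while and drain_with_for are
hypothetical.]

```python
def drain_with_while(it):
    """Consume an iterator by calling next() by hand."""
    items = []
    while True:
        try:
            item = next(it)  # written it.next() in Python 2.2
        except StopIteration:
            break
        items.append(item)
    return items


def drain_with_for(it):
    """Consume an iterator with the for-loop shorthand."""
    items = []
    for item in it:
        items.append(item)
    return items


via_while = drain_with_while(iter("abc"))
via_for = drain_with_for(iter("abc"))

# A generator has no underlying sequence, so a second pass finds nothing.
gen = (ch for ch in "abc")
first_pass = drain_with_for(gen)
second_pass = drain_with_for(gen)
```

[Both helpers collect the same items; the only difference is who
catches StopIteration.  The second pass over the generator comes back
empty, which is the "destructive" behaviour the thread is arguing
about.]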
> and (b) getting rid of __iter__ on iterators.  It's the presence of
> __iter__ on iterators that breaks the non-destructive-for property.

So you prefer the while-loop version above over the for-loop version?
Gotta be kidding.

> I think the renaming of next() to __next__() is a good idea in any
> case.  It is distant enough from the other issues that it can be done
> independently of any decisions about __iter__.

Yeah, it's just a pain that it's been deployed in Python 2.2 since
last December, and by the time 2.3 is out it will probably have been
at least a full year.  Worse, 2.2 is voted to be Python-in-a-Tie,
giving that particular idiom a very long lifetime.

I simply don't think we can break compatibility that easily.  Remember
the endless threads we've had about the pace of change and stability.
We have to live with warts, alas.  And this is a pretty minor one if
you ask me.

(I realize that you're proposing another way out in a separate
message.  I'll reply to that next.  Since you changed the subject, I
can't very well reply to it here.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From nas@python.ca Fri Jul 19 21:57:09 2002
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 19 Jul 2002 13:57:09 -0700
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207171503.g6HF3mW01047@odiug.zope.com>; from guido@python.org on Wed, Jul 17, 2002 at 11:03:48AM -0400
References: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net>
 <20020717094504.A85351@doublegemini.com>
 <200207171409.g6HE9Di00659@odiug.zope.com>
 <20020717104935.A86293@doublegemini.com>
 <200207171503.g6HF3mW01047@odiug.zope.com>
Message-ID: <20020719135709.A22330@glacier.arctrix.com>

Guido van Rossum wrote:
> - There really isn't anything "broken" about the current situation;
>   it's just that "next" is the only method name mapped to a slot in
>   the type object that doesn't have leading and trailing double
>   underscores.

Are you saying the _only_ reason to rename it is for consistency with
the other type slot method names?  That's really weak, IMHO, and not
worth any kind of backwards incompatibility (which seems unavoidable).

  Neil

From paul@svensson.org Fri Jul 19 21:49:54 2002
From: paul@svensson.org (Paul Svensson)
Date: Fri, 19 Jul 2002 16:49:54 -0400 (EDT)
Subject: [Python-Dev] The iterator story
In-Reply-To: <20020719120043.A21503@glacier.arctrix.com>
Message-ID:

On Fri, 19 Jul 2002, Neil Schemenauer wrote:

>__iter__ is not a flag.  When you want to loop over an object you call
>__iter__ to get an iterator.  Since you should be able to loop over all
>iterators they should provide a __iter__ that returns self.

But you don't really loop _over_ the iterator, you loop _thru_ it.

To me there's a fundamental difference between providing a new object
and providing a reference to an existing object.  This difference is
mostly noticeable for objects containing state.  The raison d'etre for
iterators is to contain state.  If it's sensible to sometimes return
an old object and sometimes a new, then we could have 'list(x) is x'
being true when x is already a list.  What I'm trying to get to is,
__iter__(x) returning an existing object (self in this case) is really
something very much different from __iter__() creating new state, and
returning that.

The problem is that we do want a way to loop _thru_ an iterator, and
having __iter__() return self gives us that, at the cost of the above
mentioned confusing conflagration.  Ping's suggested seq() function
solves that quite nicely:

class seq:
    def __init__(self, i):
        self._iter = i
    def __iter__(self):
        return self._iter

        /Paul

From paul-python@svensson.org Fri Jul 19 21:52:42 2002
From: paul-python@svensson.org (Paul Svensson)
Date: Fri, 19 Jul 2002 16:52:42 -0400 (EDT)
Subject: [Python-Dev] The iterator story
Message-ID:

On Fri, 19 Jul 2002, Neil Schemenauer wrote:

>__iter__ is not a flag.
When you want to loop over an object you call
>__iter__ to get an iterator.  Since you should be able to loop over all
>iterators they should provide a __iter__ that returns self.

But you don't really loop _over_ the iterator, you loop _thru_ it.

To me there's a fundamental difference between providing a new object
and providing a reference to an existing object.  This difference is
mostly noticeable for objects containing state.  The raison d'etre for
iterators is to contain state.  If it's sensible to sometimes return
an old object and sometimes a new, then we could as well have
'list(x) is x' being true when x is already a list.  What I'm trying
to get to is, __iter__(x) returning an existing object (self in this
case) is really something very much different from __iter__() creating
new state, and returning that.

The problem is that we do want a way to loop _thru_ an iterator, and
having __iter__() return self gives us that, at the cost of the above
mentioned confusing conflagration.  Ping's suggested seq() function
solves that quite nicely:

class seq:
    def __init__(self, i):
        self._iter = i
    def __iter__(self):
        return self._iter

        /Paul

From Jack.Jansen@oratrix.com Fri Jul 19 21:58:57 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Fri, 19 Jul 2002 22:58:57 +0200
Subject: [Python-Dev] Added platform-specific directories to sys.path
Message-ID: <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com>

I've a question that I'd like some feedback on.  On MacOSX
there's a set of directories that are meant especially for
storing extensions to applications, and there's requests on the
pythonmac-sig that I add these directories to the Python search
path.  This could easily be done optionally, with a .pth file in
site-python.

MacOSX has rationalized where preferences, libraries, licenses,
extensions, etc are stored, and for all of these there's a
hierarchy of folders.
In the case of Python extension modules the logical places would be ~/Library/Application Support/Python (for user-installed extension modules), /Library/Application Support/Python (for machine-wide installed extension modules) and /Network/Library/Application Support/Python (for workgroup-wide installed modules). The final location, in /System, is for factory-installed stuff from Apple, not needed just yet for this example:-). I sympathize with the idea of making things more conform to the platform standard, on the other hand I'm a bit reluctant to do things differently again from what other Pythons do. But, one of the things that is sorely missing from Python is a standard place to install per-user extension modules, so this might well be the thing that triggers inclusion of such functionality into the grand scheme of things (including distutils support, etc). -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From guido@python.org Fri Jul 19 22:10:45 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 17:10:45 -0400 Subject: [Python-Dev] The iterator story In-Reply-To: Your message of "Fri, 19 Jul 2002 04:28:32 PDT." References: Message-ID: <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net> > Here is a summary of the whole iterator picture as i currently see it. > This is necessarily subjective, but i will try to be precise so that > it's clear where i'm making a value judgement and where i'm trying to > state fact, and so we can pinpoint areas where we agree and disagree. > > In the subjective sections, i have marked with [@] the places where > i solicit agreement or disagreement. > > I would like to know your opinions on the issues listed below, > and on the places marked [@]. > > > Definitions (objective) > ----------------------- > > Container: a thing that provides non-destructive access to a varying > number of other things. > > Why "non-destructive"? 
Because i don't expect that merely looking > at the contents will cause a container to be altered. For example, > i expect to be able to look inside a container, see that there are > five elements; leave it alone for a while, come back to it later > and observe once again that there are five elements. > > Consequently, a file object is not a container in general. Given > a file object, you cannot look at it to see if it contains an "A", > and then later look at it once again to see if it contains an "A" > and get the same result. If you could seek, then you could do > this, but not all files support seeking. Even if you could seek, > the act of reading the file would still alter the file object. > > The file object provides no way of getting at the contents without > mutating itself. According to my definition, it's fine for a > container to have ways of mutating itself; but there has to be > *some* way of getting the contents without mutating the container, > or it just ain't a container to me. > > A file object is better described as a stream. Hypothetically > one could create an interface to seekable files that offered some > non-mutating read operations; this would cause the file to look > more like an array of bytes, and i would find it appropriate to > call that interface a container. > > Iterator: a thing that you can poke (i.e. send a no-argument message), > where each time you poke it, it either yields something or announces > that it is exhausted. > > For an iterator to mutate itself every time you poke it is not > part of my definition. But the only non-mutating iterator would > be an iterator that returns the same thing forever, or an iterator > that is always exhausted. So most iterators usually mutate. > > Some iterators are associated with a container, but not all. > > There can be many kinds of iterators associated with a container. 
>     The most natural kind is one that yields the elements of the
>     container, one by one, mutating itself each time it is poked,
>     until it has yielded all of the elements of the container and
>     announces exhaustion.
>
> A Container's Natural Iterator: an iterator that yields the elements
> of the container, one by one, in the order that makes the most sense
> for the container.  If the container has a finite size n, then the
> iterator can be poked exactly n times, and thereafter it is exhausted.

Sure.  But I note that there are hybrids, and I think files (at least
seekable files) fall in the hybrid category.  Other examples of
hybrids:

- Some dbm variants (e.g. dbhash and gdbm) provide first() and next()
  or firstkey() and nextkey() methods that combine iterator state with
  the container object.  These objects simply provide two different
  interfaces, a containerish interface (__getitem__ in fact), and an
  iteratorish interface.

- Before we invented the concept of iterators, I believe it was common
  for tree data structures to provide iterators that didn't put the
  iteration state in a separate object, but simply kept a pointer to
  the current node of the iteration pass somewhere in the root of the
  tree.

The idea that a container also has some iterator state, and that you
have to do something simple (like calling firstkey() or seek(0)) to
reset the iterator, is quite common.  You may argue that this is poor
design that should be fixed, and in general I would agree (the
firstkey()/nextkey() protocol in particular is clumsy to use), but it
is common nevertheless, and sometimes common usage patterns as well as
the relative cost of random access make it a good compromise.

For example, while a tape file is a container in the sense that
reading the data doesn't destroy it, it's very heavily geared towards
sequential access, and you can't realistically have two iterators
going over the same tape at once.
If you're too young to remember, think of files on CD media -- there, random access, while possible, is several orders of magnitude slower than sequential access (better than tape, but a lot worse than regular magnetic hard drives). > Issues (objective) > ------------------ > > I alluded to a set of issues in an earlier message, and i'll begin > there, by defining what i meant more precisely. > > The Destructive-For Issue: > > In most languages i can think of, and in Python for the most > part, a statement such as "for x in y: print x" is a > non-destructive operation on y. Repeating "for x in y: print x" > will produce exactly the same results once more. > > For pre-iterator versions of Python, this fails to be true only > if y's __getitem__ method mutates y. The introduction of > iterators has caused this to now be untrue when y is any iterator. > > The issue is, should "for" be non-destructive? I don't see the benefit. We've done this for years and the only conceptual problem was the abuse of __getitem__, not the destructiveness of the for-loop. > The Destructive-In Issue: > > Notice that the iteration that takes place for the "in" operator > is implemented in the same way as "for". So if "for" destroys > its second operand, so will "in". > > The issue is, should "in" be non-destructive? If it can't be helped otherwise, sure, why not? > (Similar issues exist for built-ins that iterate, like list().) At least list() keeps a copy of all the items, so you can then iterate over them as often as you want. :-) > The __iter__-On-Iterators Issue: > > Some people have mentioned that the presence of an __iter__() > method is a way of signifying that an object supports the > iterator protocol. It has been said that this is necessary > because the presence of a "next()" method is not sufficiently > distinguishing. Not me. > Some have said that __iter__() is a completely distinct protocol > from the iterator protocol. > > The issue is, what is __iter__() really for? 
To support iter() and for-loops. > And secondarily, if it is not part of the iterator protocol, > then should we require __iter__() on iterators, and why? So that you can use an iterator in a for-loop. > The __next__-Naming Issue: > > The iteration method is currently called "next()". > > Previous candidates for the name of this method were "next", > "__next__", and "__call__". After some previous debate, > it was pronounced to be "next()". > > There are concerns that "next()" might collide with existing > methods named "next()". There is also a concern that "next()" > is inconsistent because it is the only type-slot-method that > does not have a __special__ name. > > The issue is, should it be called "next" or "__next__"? That's a separate issue, and cleans up only a small wart that in practice hasn't hurt anybody AFAIK. > My Positions (subjective) > ------------------------- > > I believe that "for" and "in" and list() should be non-destructive. > I believe that __iter__() should not be required on iterators. > I believe that __next__() is a better name than next(). > > Destructive-For, Destructive-In: > > I think "for" should be non-destructive because that's the way > it has almost always behaved, and that's the way it behaves in > any other language [@] i can think of. > > For a container's __getitem__ method to mutate the container is, > in my opinion, bad behaviour. In pre-iterator Python, we needed > some way to allow the convenience of "for" on user-implemented > containers. So "for" supported a special protocol where it would > call __getitem__ with increasing integers starting from 0 until > it hit an IndexError. This protocol works great for sequence-like > containers that were indexable by integers. > > But other containers had to be hacked somewhat to make them fit. > For example, there was no good way to do "for" over a dictionary-like > container. 
If you attempted "for" over a user-implemented dictionary,
>     you got a really weird "KeyError: 0", which only made sense if you
>     understood that the "for" loop was attempting __getitem__(0).
>
>     (Hey!  I just noticed that
>
>         from UserDict import UserDict
>         for k in UserDict(): print k
>
>     still produces "KeyError: 0"!  This oughta be fixed...)

Check the CVS logs.  At one point before 2.2 was released, UserDict
had an __iter__ method.  But then SF bug 448153 was filed, presenting
evidence that this broke previously working code.  So a separate
class, IterableUserDict, was added that has the __iter__ method.  I
agree that this is less than ideal, but that's life.

>     If you wanted to support "for" on something else, sometimes you
>     would have to make __getitem__ mutate the object, like it does
>     in the fileinput module.  But then the user has to know that
>     this object is a special case: "for" only works the first time.

This was and still is widespread.  There are a lot of objects that
have a way to return an iterator (old style using fake __getitem__,
and new ones using __iter__ and next) that are intended to be looped
over, once.  I have no desire to deprecate this behavior, since (a) it
would be a major upheaval for the user community (a lot worse than
integer division), and (b) I don't see that "fixing" this prevents a
particular category of programming errors.

>     When iterators were introduced, i believed they were supposed
>     to solve this problem.  Currently, they don't.

No, they solve the conceptual ugliness of providing a __getitem__ that
can only be called once.  The new rule is, if you provide __getitem__,
it must support random access; otherwise, you should provide __iter__.

>     Currently, "in" can even be destructive.  This is more serious.
>     While one could argue that it's not so strange for
>
>         for x in y: ...
>
>     to alter y (even though i do think it is strange), i believe
>     just about anyone would find it very counterintuitive for
>
>         if x in y:
>
>     to alter y.
[@]

That falls in the category of "then don't do that".

> __iter__-On-Iterators:
>
>     I believe __iter__ is not a type flag.  As i argued previously,
>     i think that looking for the presence of methods that don't actually
>     implement a protocol is a poor way to check for protocol support.
>     And as things stand, the presence of __iter__ doesn't even work [@]
>     as a type flag.

And I never said it was a type flag.  I'm tired of repeating myself,
but you keep repeating this broken argument, so I have to keep
correcting you.

>     There are objects with __iter__ that are not iterators (like most
>     containers).  And there are objects without __iter__ that work as
>     iterators.  I know you can legislate the latter away, but i think
>     such legislation would amount to fighting the programmers -- and
>     it is infeasible [@] to enforce the presence of __iter__ in practice.

I think having next without having __iter__ is like having __getitem__
without having __len__.  There are corner cases where you might get
away with this because you know it won't be called, but (as I've
repeated umpteen times now), a for-loop over an iterator is a common
idiom.

>     Based on Guido's positive response, in which he asked me to make
>     an addition to the PEP, i believe Guido agrees with me that
>     __iter__ is distinct from the protocol of an iterator.  This
>     surprised me because it runs counter to the philosophy previously
>     expressed in the PEP.

I recognize that they are separate protocols.  But because I like the
for-loop as a convenient way to get all of the elements of an
iterator, I want iterators to support __iter__.

The alternative would be for iter() to see if the object implements
next (after finding that it has neither __iter__ nor __getitem__), and
return the object itself unchanged.  If we had picked __next__ instead
of 'next', that would perhaps have been my choice (though I might
*still* have recommended implementing __iter__ returning self, to
avoid two failing getattr calls).
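
[Editorial sketch, not part of the original message.  It illustrates
the convention argued for above: an iterator whose __iter__ returns
self, so it can be handed straight to a for-loop.  The class name
Countdown is hypothetical, and the method is spelled __next__ as in
later Python versions; in Python 2.2 it would be called next.]

```python
class Countdown:
    """Hypothetical iterator yielding n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # An iterator is its own iterator, so "for x in it" works.
        return self

    def __next__(self):  # named next() in Python 2.2
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


it = Countdown(3)
same_object = iter(it) is it

first_pass = [x for x in it]
second_pass = [x for x in it]  # the iterator is already exhausted
```

[Because iter(it) hands back the same object, no new state is
created, and the second loop sees an exhausted iterator.  That is the
trade-off both sides of the thread describe, each with a different
opinion of whether it matters.]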

>     Now suppose we agree that __iter__ and next are distinct protocols.
>     Then why require iterators to support both?  The only reason we
>     would want __iter__ on iterators is so that we can use "for" [@]
>     with an iterator as the second operand.

Right.  Finally you got it.

>     I have just argued, above, that it's *not* a good idea for "for"
>     and "in" to be destructive.  Since most iterators self-mutate,
>     it follows that it's not advisable to use an iterator directly
>     as the second operand of a "for" or "in".
>
>     I realize this seems radical!  This may be the most controversial
>     point i have made.  But if you accept that "in" should not
>     destroy its second argument, the conclusion is unavoidable.

Since I have little sympathy for your premise, this conclusion is far
from unavoidable for me. :-)

> __next__-Naming:
>
>     I think the potential for collision, though small, is significant,
>     and this makes "__next__" a better choice than "next".  A built-in
>     function next() should be introduced; this function would call the
>     tp_iternext slot, and for instance objects tp_iternext would call
>     the __next__ method implemented in Python.
>
>     The connection between this issue and the __iter__ issue is that,
>     if next() were renamed to __next__(), the argument that __iter__
>     is needed as a flag would also go away.

I really wish we had had this insight 18 months ago.  Right now, it's
too late.  Dragging all the other stuff in doesn't strengthen the
argument for fixing it now.

> The Current PEP (objective)
> ---------------------------
>
> The current PEP takes the position that "for" and "in" can be
> destructive; that __iter__() and next() represent two distinct
> protocols, yet iterators are required to support both; and that
> the name of the method on iterators is called "next()".
>
>
> My Ideal Protocol (subjective)
> ------------------------------
>
> So by now the biggest question/objection you probably have is
> "if i can't use an iterator with 'for', then how can i use it?"
>
> The answer is that "for" is a great way to iterate over things;
> it's just that it iterates over containers and i want to preserve
> that.  We need a different way to iterate over iterators.
>
> In my ideal world, we would allow a new form of "for", such as
>
>     for line from file:
>         print line
>
> The use of "from" instead of "in" would imply that we were
> (destructively) pulling things out of the iterator, and would
> remove any possible parallel to the test "x in y", which should
> rightly remain non-destructive.

Alternative syntaxes for for-loops have been proposed as solutions to
all sorts of things (e.g. what's called enumerate() in 2.3, and a
simplified syntax for range(), and probably other things).  I'm not
keen on this.  I don't want to user-test it, but I expect that it's
too subtle a difference, and that we would see Aha! experiences of the
kind "Oh, it's a for-*from* loop!  I never noticed that, I always read
it as a for-*in* loop!  That explains the broken behavior."

> Here's the whole deal:
>
>   - Iterators provide just one method, __next__().
>
>   - The built-in next() calls tp_iternext.  For instances,
>     tp_iternext calls __next__.
>
>   - Objects wanting to be iterated over provide just one method,
>     __iter__().  Some of these are containers, but not all.
>
>   - The built-in iter(foo) calls tp_iter.  For instances,
>     tp_iter calls __iter__.
>
>   - "for x in y" gets iter(y) and uses it as an iterator.
>
>   - "for x from y" just uses y as the iterator.
>
> That's it.
>
> Benefits:
>
>   - We have a nice clean division between containers and iterators.
>
>   - When you see "for x in y" you know that y is a container.
>
>   - When you see "for x from y" you know that y is an iterator.
>
>   - "for x in y" never destroys y.
>
>   - "if x in y" never destroys y.
>
>   - If you have an object that is container-like, you can add
>     an __iter__ method that gives its natural iterator.  If
>     you want, you can supply more iterators that do different
>     things; no problem.
No one using your object is confused > about whether it mutates. > > - If you have an object that is cursor-like or stream-like, > you can safely make it into an iterator by adding __next__. > No one using your object is confused about whether it mutates. > > Other notes: > > - Iterator algebra still works fine, and is still easy to write: > > def alternate(it): > while 1: > yield next(it) > next(it) > > - The file problem has a consistent solution. Instead of writing > "for line in file" you write > > for line from file: > print line > > Being forced to write "from" signals to you that the file is > eaten up. There is no expectation that "for line from file" > will work again. > > The best would be a convenience function "readlines", to > make this even clearer: > > for line in readlines("foo.txt"): > print line > > Now you can do this as many times as you want, and there is > no possibility of confusion; there is no file object on which > to call methods that might mess up the reading of lines. > > > My Not-So-Ideal Protocol > ------------------------ > > All right. So new syntax may be hard to swallow. An alternative > is to introduce an adapter that turns an iterator into something > that "for" will accept -- that is, the opposite of iter(). > > - The built-in seq(it) returns x such that iter(x) yields it. > > Then instead of writing > > for x from it: > > you would write > > for x in seq(it): > > and the rest would be the same. The use of "seq" here is what > would flag the fact that "it" will be destroyed. I don't feel I have to drive it home any further, so I'll leave these last few paragraphs without comments. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jul 19 22:20:35 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 17:20:35 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Fri, 19 Jul 2002 13:57:09 PDT." 
 <20020719135709.A22330@glacier.arctrix.com>
References: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net>
 <20020717094504.A85351@doublegemini.com>
 <200207171409.g6HE9Di00659@odiug.zope.com>
 <20020717104935.A86293@doublegemini.com>
 <200207171503.g6HF3mW01047@odiug.zope.com>
 <20020719135709.A22330@glacier.arctrix.com>
Message-ID: <200207192120.g6JLKZw15241@pcp02138704pcs.reston01.va.comcast.net>

> Guido van Rossum wrote:
> > - There really isn't anything "broken" about the current situation;
> >   it's just that "next" is the only method name mapped to a slot in
> >   the type object that doesn't have leading and trailing double
> >   underscores.
>
> Are you saying the _only_ reason to rename it is for consistency with
> the other type slot method names?  That's really weak, IMHO, and not
> worth any kind of backwards incompatibility (which seems unavoidable).
>
>   Neil

Almost.  This means that we're retroactively saying that all objects
with a next method are iterators, thereby slightly stomping on the
user's namespace.  But as long as you don't use such an object as an
iterator, it's harmless.

And if my position wasn't clear already, I agree it's not worth
"fixing". :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Fri Jul 19 22:23:07 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 17:23:07 -0400
Subject: [Python-Dev] Added platform-specific directories to sys.path
In-Reply-To: Your message of "Fri, 19 Jul 2002 22:58:57 +0200."
 <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com>
References: <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com>
Message-ID: <200207192123.g6JLN7s15263@pcp02138704pcs.reston01.va.comcast.net>

> I've a question that I'd like some feedback on.  On MacOSX
> there's a set of directories that are meant especially for
> storing extensions to applications, and there's requests on the
> pythonmac-sig that I add these directories to the Python search
> path.
This could easily be done optionally, with a .pth file in > site-python. > > MacOSX has rationalized where preferences, libraries, licenses, > extensions, etc are stored, and for all of these there's a > hierarchy of folders. In the case of Python extension modules > the logical places would be ~/Library/Application Support/Python > (for user-installed extension modules), /Library/Application > Support/Python (for machine-wide installed extension modules) > and /Network/Library/Application Support/Python (for > workgroup-wide installed modules). The final location, in > /System, is for factory-installed stuff from Apple, not needed > just yet for this example:-). > > I sympathize with the idea of making things more conform to the > platform standard, on the other hand I'm a bit reluctant to do > things differently again from what other Pythons do. But, one of > the things that is sorely missing from Python is a standard > place to install per-user extension modules, so this might well > be the thing that triggers inclusion of such functionality into > the grand scheme of things (including distutils support, etc). Traditionally, on Unix per-user extensions are done by pointing PYTHONPATH to your per-user directory (-ies) in your .profile. On Windows you can do this too, but I bet most people just have a per-user computer. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From Jack.Jansen@oratrix.com Fri Jul 19 22:34:40 2002 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Fri, 19 Jul 2002 23:34:40 +0200 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: <20020719112602.A17763@ActiveState.com> Message-ID: <554A408C-9B5F-11D6-9B6B-003065517236@oratrix.com> On vrijdag, juli 19, 2002, at 08:26 , Trent Mick wrote: > [Tim Peters wrote] >> Excellent! One down, about two hundred thousand to go. > > Mark rocks! Oh, it's MarkH appreciation that's wanted! 
In that case I'll gladly chime in, I was afraid it was
__declspec(dllexport) appreciation.  Mark is one cool dude who knows
where his towel is!

199998 to go.  Should we start taking a poll who'll be the next
python-devver we start appreciating when the counter hits zero?
--
- Jack Jansen        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -

From barry@zope.com Fri Jul 19 22:46:30 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 19 Jul 2002 17:46:30 -0400
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
References: <20020719112602.A17763@ActiveState.com>
 <554A408C-9B5F-11D6-9B6B-003065517236@oratrix.com>
Message-ID: <15672.34998.636509.747342@anthem.wooz.org>

>>>>> "JJ" == Jack Jansen writes:

    JJ> Oh, it's MarkH appreciation that's wanted!  In that case I'll
    JJ> gladly chime in, I was afraid it was __declspec(dllexport)
    JJ> appreciation.  Mark is one cool dude who knows where his towel
    JJ> is!

    JJ> 199998 to go.  Should we start taking a poll who'll be the next
    JJ> python-devver we start appreciating when the counter hits
    JJ> zero?

My everlasting appreciation of MarkH was cemented the night, many IPCs
ago, that he drank me under the table and called us "purple".

199997-to-go-ly y'rs,
-Barry

From barry@zope.com Fri Jul 19 22:48:53 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 19 Jul 2002 17:48:53 -0400
Subject: [Python-Dev] Added platform-specific directories to sys.path
References: <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com>
 <200207192123.g6JLN7s15263@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15672.35141.803094.488541@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum writes:

    GvR> Traditionally, on Unix per-user extensions are done by
    GvR> pointing PYTHONPATH to your per-user directory (-ies) in your
    GvR> .profile.

Or adding them to sys.path via your $PYTHONSTARTUP file.
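
[Editorial sketch, not part of the original message.  One way a
startup file might add per-user directories to sys.path.  The helper
name add_user_dirs is hypothetical; the directory in the usage note
is the one Jack Jansen proposed earlier in the thread.  Note that
$PYTHONSTARTUP is only read for interactive sessions, so this would
not affect scripts.]

```python
import os
import sys


def add_user_dirs(candidates, path=None):
    """Append each existing directory in `candidates` to `path` once.

    `path` defaults to sys.path; passing another list makes the helper
    easy to try out without touching the real search path.
    """
    if path is None:
        path = sys.path
    added = []
    for candidate in candidates:
        directory = os.path.expanduser(candidate)
        if os.path.isdir(directory) and directory not in path:
            path.append(directory)
            added.append(directory)
    return added


# Demonstration on a scratch list rather than the real sys.path.
scratch = []
here = os.getcwd()
missing = os.path.join(here, "no-such-directory-for-demo")
added = add_user_dirs([here, missing], scratch)
```

[In a real startup file the call might look like
add_user_dirs(["~/Library/Application Support/Python"]).  Directories
that do not exist are skipped silently, and repeated calls do not add
duplicates.]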
OTOH, it might be nice if the distutils `install' command had some switches to make installing in some of these common alternative locations a little easier. That might dovetail nicely if/when we decide to add a site-updates directory to sys.path. -Barry From tommy@ilm.com Fri Jul 19 23:11:07 2002 From: tommy@ilm.com (Hambozo) Date: Fri, 19 Jul 2002 15:11:07 -0700 (PDT) Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: <15672.34998.636509.747342@anthem.wooz.org> References: <20020719112602.A17763@ActiveState.com> <554A408C-9B5F-11D6-9B6B-003065517236@oratrix.com> <15672.34998.636509.747342@anthem.wooz.org> Message-ID: <15672.36408.362000.540999@mace.lucasdigital.com> Barry A. Warsaw writes: | | My everlasting appreciation of MarkH was cemented the night, many IPCs | ago, that he drank me under the table and called us "purple". When anyone asks my opinion of Mark I always say: "F**kin' Ripper!" :) 199996 and counting... -Tommy From barry@zope.com Fri Jul 19 23:10:59 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 19 Jul 2002 18:10:59 -0400 Subject: [Python-Dev] Do we still need Lib/test/data? Message-ID: <15672.36467.645262.622848@anthem.wooz.org> I'm about to check in some changes to the email package, which will include a re-organization of its test suite. Part of this will be so that I can add some huge torture tests to the standalone mimelib project without committing megs of email samples to the Python project. It will also makes it easier for me to create the mimelib distro because I'll then be able to put the setup.py file in the email directory instead of having to maintain a fake hierarchy elsewhere just to make distutils happy. Specifically, I'm going to move the bulk of Lib/test_email.py and Lib/test_email_codes.py to Lib/email/test and make email.test a full-fledged subpackage of the email package. I'm also going to move the Lib/test/data directory to Lib/email/test. 
I'll do this by creating a new directory and cvs adding a copy of the files to the new location (the cvs revision history isn't important enough to preserve). I believe this should be entirely transparent to most of you. My question is whether I should cvsrm the files that are currently in Lib/test/data or not? On the one hand, I don't want to maintain duplicates, but OTOH, I'm not sure if any other code or tests depends on those files (I did some attempts at grepping for this and didn't /see/ anything but I'm trying to be conservative). Needless to say I won't be actually removing the Lib/test/data directory, but a "cvs up -P" would hide it from you. Any opinions? -Barry From neal@metaslash.com Fri Jul 19 23:32:09 2002 From: neal@metaslash.com (Neal Norwitz) Date: Fri, 19 Jul 2002 18:32:09 -0400 Subject: [Python-Dev] The iterator story References: <20020719120043.A21503@glacier.arctrix.com> Message-ID: <3D389369.948547E0@metaslash.com> Neil Schemenauer wrote: > > Ka-Ping Yee wrote: > > I think "for" should be non-destructive because that's the way > > it has almost always behaved, and that's the way it behaves in > > any other language [@] i can think of. > > I agree that it can be surprising to have "for" destory the object it's > looping over. I myself was bitten once by it. I'm not yet sure if this > is something that will repeatedly bite. I suspect it might. :-( In what context? Were you iterating over a file or something else? I'm wondering if this is a problem, perhaps pychecker could generate a warning? Neal From aahz@pythoncraft.com Fri Jul 19 23:29:38 2002 From: aahz@pythoncraft.com (Aahz) Date: Fri, 19 Jul 2002 18:29:38 -0400 Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: <200207192029.g6JKTU015005@pcp02138704pcs.reston01.va.comcast.net> References: <200207192029.g6JKTU015005@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020719222938.GA23413@panix.com> On Fri, Jul 19, 2002, Guido van Rossum wrote: >Ping: >> >> I think the renaming of next() to __next__() is a good idea in any >> case. It is distant enough from the other issues that it can be done >> independently of any decisions about __iter__. > > Yeah, it's just a pain that it's been deployed in Python 2.2 since > last December, and by the time 2.3 is out it will probably have been > at least a full year. Worse, 2.2 is voted to be Python-in-a-Tie, > giving that particular idiom a very long lifetime. I simply don't > think we can break compatibility that easily. Remember the endless > threads we've had about the pace of change and stability. We have to > live with warts, alas. And this is a pretty minor one if you ask me. Is this a Pronouncement, or are we still waiting on the results of the survey? Note that several people have suggested a multi-release strategy for fixing this problem; does that make any difference? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From neal@metaslash.com Fri Jul 19 23:47:38 2002 From: neal@metaslash.com (Neal Norwitz) Date: Fri, 19 Jul 2002 18:47:38 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability References: <200207192029.g6JKTU015005@pcp02138704pcs.reston01.va.comcast.net> <20020719222938.GA23413@panix.com> Message-ID: <3D38970A.2693833E@metaslash.com> Aahz wrote: > > On Fri, Jul 19, 2002, Guido van Rossum wrote: > >Ping: > >> > >> I think the renaming of next() to __next__() is a good idea in any > >> case. It is distant enough from the other issues that it can be done > >> independently of any decisions about __iter__. 
> > > > Yeah, it's just a pain that it's been deployed in Python 2.2 since > > last December, and by the time 2.3 is out it will probably have been > > at least a full year. Worse, 2.2 is voted to be Python-in-a-Tie, > > giving that particular idiom a very long lifetime. I simply don't > > think we can break compatibility that easily. Remember the endless > > threads we've had about the pace of change and stability. We have to > > live with warts, alas. And this is a pretty minor one if you ask me. > > Is this a Pronouncement, or are we still waiting on the results of the > survey? Note that several people have suggested a multi-release > strategy for fixing this problem; does that make any difference? Would it be good to use __next__() if it exists, else try next()? This doesn't fix the current 'wart,' however, it could allow moving closer to the desired end. It could cause confusion. For compatability, one would only need to do: next = __next__ or vica versa. Not sure this is worth it. But if there is a transition, it could ease the pain. Neal From nas@python.ca Sat Jul 20 00:22:26 2002 From: nas@python.ca (Neil Schemenauer) Date: Fri, 19 Jul 2002 16:22:26 -0700 Subject: [Python-Dev] The iterator story In-Reply-To: <3D389369.948547E0@metaslash.com>; from neal@metaslash.com on Fri, Jul 19, 2002 at 06:32:09PM -0400 References: <20020719120043.A21503@glacier.arctrix.com> <3D389369.948547E0@metaslash.com> Message-ID: <20020719162226.A22929@glacier.arctrix.com> Neal Norwitz wrote: > In what context? Were you iterating over a file or something else? > I'm wondering if this is a problem, perhaps pychecker could generate > a warning? I was switching between implementing something as a generator and returning a list. I was curious why I was getting different behavior until I realized I was iterating over the result twice. I don't think pychecker could warn about such a bug. Neil From martin@v.loewis.de Sat Jul 20 01:02:11 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 20 Jul 2002 02:02:11 +0200 Subject: [Python-Dev] Where's time.daylight??? In-Reply-To: <200207191910.g6JJAUJ32606@pcp02138704pcs.reston01.va.comcast.net> References: <15672.18628.831787.897474@anthem.wooz.org> <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net> <200207191910.g6JJAUJ32606@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > I'm going to remove the _XOPEN_SOURCE define; Jeremy and Martin can > try to figure out what the right thing is for Tru64. This is the wrong solution; instead, you need to define _GNU_SOURCE in addition to _XOPEN_SOURCE. Regards, Martin From martin@v.loewis.de Sat Jul 20 01:06:51 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 20 Jul 2002 02:06:51 +0200 Subject: [Python-Dev] Added platform-specific directories to sys.path In-Reply-To: <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com> References: <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com> Message-ID: Jack Jansen writes: > I sympathize with the idea of making things more conform to the > platform standard, on the other hand I'm a bit reluctant to do things > differently again from what other Pythons do. But, one of the things > that is sorely missing from Python is a standard place to install > per-user extension modules, so this might well be the thing that > triggers inclusion of such functionality into the grand scheme of > things (including distutils support, etc). If that is the platform convention, I see no problem following it. Windows already does things differently from Unix, by using the registry to compute sys.path. Regards, Martin From guido@python.org Sat Jul 20 01:30:04 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 20:30:04 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Fri, 19 Jul 2002 18:29:38 EDT." 
<20020719222938.GA23413@panix.com> References: <200207192029.g6JKTU015005@pcp02138704pcs.reston01.va.comcast.net> <20020719222938.GA23413@panix.com> Message-ID: <200207200030.g6K0U4P26218@pcp02138704pcs.reston01.va.comcast.net> > > Yeah, it's just a pain that it's been deployed in Python 2.2 since > > last December, and by the time 2.3 is out it will probably have been > > at least a full year. Worse, 2.2 is voted to be Python-in-a-Tie, > > giving that particular idiom a very long lifetime. I simply don't > > think we can break compatibility that easily. Remember the endless > > threads we've had about the pace of change and stability. We have to > > live with warts, alas. And this is a pretty minor one if you ask me. > > Is this a Pronouncement, or are we still waiting on the results of the > survey? That is my current opinion. I'm waiting for the results of the survey to see if I'll be swayed (but I don't think it's likely). > Note that several people have suggested a multi-release > strategy for fixing this problem; does that make any difference? Such a big gun for such a minor problem. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sat Jul 20 01:41:18 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 20:41:18 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Your message of "Fri, 19 Jul 2002 18:47:38 EDT." <3D38970A.2693833E@metaslash.com> References: <200207192029.g6JKTU015005@pcp02138704pcs.reston01.va.comcast.net> <20020719222938.GA23413@panix.com> <3D38970A.2693833E@metaslash.com> Message-ID: <200207200041.g6K0fIX26940@pcp02138704pcs.reston01.va.comcast.net> > Would it be good to use __next__() if it exists, else try next()? Then the code in typeobject.c (e.g. resolve_slotdups) would have to map tp_iternext to *both* __next__ and next. > This doesn't fix the current 'wart,' however, it could allow > moving closer to the desired end. It could cause confusion. 
> For compatability, one would only need to do: > > next = __next__ > > or vica versa. > > Not sure this is worth it. But if there is a transition, it could > ease the pain. I don't think it's worth it. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sat Jul 20 01:43:21 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 19 Jul 2002 20:43:21 -0400 Subject: [Python-Dev] Where's time.daylight??? In-Reply-To: Your message of "Sat, 20 Jul 2002 02:02:11 +0200." References: <15672.18628.831787.897474@anthem.wooz.org> <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net> <200207191910.g6JJAUJ32606@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207200043.g6K0hMJ27043@pcp02138704pcs.reston01.va.comcast.net> > > I'm going to remove the _XOPEN_SOURCE define; Jeremy and Martin can > > try to figure out what the right thing is for Tru64. > > This is the wrong solution; instead, you need to define _GNU_SOURCE in > addition to _XOPEN_SOURCE. Can you check that in? I'm about to disappear to OSCON for a week. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sat Jul 20 07:06:29 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 20 Jul 2002 02:06:29 -0400 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: Your message of "Mon, 24 Jun 2002 21:33:18 EDT." <20020624213318.A5740@arizona.localdomain> References: <20020624213318.A5740@arizona.localdomain> Message-ID: <200207200606.g6K66Um28510@pcp02138704pcs.reston01.va.comcast.net> > Any chance something like this could make it into the standard python > library? It would save a lot of time for lazy people like myself. 
:-)
>
> def heappush(heap, item):
>     pos = len(heap)
>     heap.append(None)
>     while pos:
>         parentpos = (pos - 1) / 2
>         parent = heap[parentpos]
>         if item <= parent:
>             break
>         heap[pos] = parent
>         pos = parentpos
>     heap[pos] = item
>
> def heappop(heap):
>     endpos = len(heap) - 1
>     if endpos <= 0:
>         return heap.pop()
>     returnitem = heap[0]
>     item = heap.pop()
>     pos = 0
>     while 1:
>         child2pos = (pos + 1) * 2
>         child1pos = child2pos - 1
>         if child2pos < endpos:
>             child1 = heap[child1pos]
>             child2 = heap[child2pos]
>             if item >= child1 and item >= child2:
>                 break
>             if child1 > child2:
>                 heap[pos] = child1
>                 pos = child1pos
>                 continue
>             heap[pos] = child2
>             pos = child2pos
>             continue
>         if child1pos < endpos:
>             child1 = heap[child1pos]
>             if child1 > item:
>                 heap[pos] = child1
>                 pos = child1pos
>         break
>     heap[pos] = item
>     return returnitem

I have read (or at least skimmed) this entire thread now. After I reconstructed the algorithm in my head, I went back to Kevin's code; I admire the compactness of his code. I believe that this would make a good addition to the standard library, as a friend of the bisect module.

The only change I would make would be to make heap[0] the lowest value rather than the highest. (That's one thing that I liked better about François Pinard's version, but a class seems too heavy for this, just like it is overkill for bisect [*]. Oh, and maybe we can borrow a few lines of François's description of the algorithm. :-)

I propose to call it heapq.py. (Got a better name? Now or never.)

[*] Afterthought: this could be made into a new-style class by adding something like this to the end of module:

    class heapq(list):
        __slots__ = []
        heappush = heappush
        heappop = heappop

A similar addition could easily be made to the bisect module. But this is very different from François' class, which hides the other list methods.
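
A minimal sketch of the one change suggested above, with heap[0] holding the lowest value rather than the highest. It is an illustration only, not a proposed patch: it assumes a current Python 3 interpreter (hence // for integer division), and the function bodies are rewritten rather than taken from Kevin's code.

```python
def heappush(heap, item):
    """Insert item into the list `heap`, keeping heap[0] the smallest element."""
    pos = len(heap)
    heap.append(item)
    # Sift up: pull larger parents down until the slot for item is found.
    while pos:
        parentpos = (pos - 1) // 2
        parent = heap[parentpos]
        if parent <= item:
            break
        heap[pos] = parent
        pos = parentpos
    heap[pos] = item


def heappop(heap):
    """Remove and return the smallest element of the list `heap`."""
    last = heap.pop()  # raises IndexError if the heap is empty
    if not heap:
        return last
    smallest = heap[0]
    pos, n = 0, len(heap)
    # Sift down: promote the smaller child until `last` fits at pos.
    while True:
        child = 2 * pos + 1
        if child >= n:
            break
        if child + 1 < n and heap[child + 1] < heap[child]:
            child += 1
        if last <= heap[child]:
            break
        heap[pos] = heap[child]
        pos = child
    heap[pos] = last
    return smallest
```

Pushing a handful of values and popping until the list is empty should hand them back in increasing order, which is a quick way to sanity-check the invariant.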
--Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Sat Jul 20 07:18:16 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 20 Jul 2002 02:18:16 -0400 Subject: [Python-Dev] Sorting Message-ID: An enormous amount of research has been done on sorting since the last time I wrote a sort for Python. Major developments have been in two areas: 1. Adaptive sorting. Sorting algorithms are usually tested on random data, but in real life you almost never see random data. Python's sort tries to catch some common cases of near-order via special- casing. The literature has since defined more than 15 formal measures of disorder, and developed algorithms provably optimal in the face of one or more of them. But this is O() optimality, and theoreticians aren't much concerned about how big the constant factor is. Some researchers are up front about this, and toward the end of one paper with "practical" in its title, the author was overjoyed to report that an implementation was only twice as slow as a naive quicksort . 2. Pushing the worst-case number of comparisons closer to the information-theoretic limit (ceiling(log2(N!))). I don't care much about #2 -- in experiments conducted when it was new, I measured the # of comparisons our samplesort hybrid did on random inputs, and it was never more than 2% over the theoretical lower bound, and typically closer. As N grows large, the expected case provably converges to the theoretical lower bound. There remains a vanishly small chance for a bad case, but nobody has reported one, and at the time I gave up trying to construct one. Back on Earth, among Python users the most frequent complaint I've heard is that list.sort() isn't stable. 
Alex is always quick to trot out the appropriate DSU (Decorate Sort Undecorate) pattern then, but the extra memory burden for that can be major (a new 2-tuple per list element costs about 32 bytes, then 4 more bytes for a pointer to it in a list, and 12 more bytes that don't go away to hold each non-small index).

After reading all those papers, I couldn't resist taking a crack at a new algorithm that might be practical, and have something you might call a non-recursive adaptive stable natural mergesort / binary insertion sort hybrid. In playing with it so far, it has two bad aspects compared to our samplesort hybrid:

+ It may require temp memory, up to 2*N bytes worst case (one pointer each for no more than half the array elements).

+ It gets *some* benefit for arrays with many equal elements, but not nearly as much as I was able to hack samplesort to get. In effect, partitioning is very good at moving equal elements close to each other quickly, but merging leaves them spread across any number of runs. This is especially irksome because we're sticking to Py_LT for comparisons, so can't even detect a==b without comparing a and b twice (and then it's a deduction from that not a < b and not b < a). Given the relatively huge cost of comparisons, it's a timing disaster to do that (compare twice) unless it falls out naturally. It was fairly natural to do so in samplesort, but not at all in this sort.

It also has good aspects:

+ It's stable (items that compare equal retain their relative order, so, e.g., if you sort first on zip code, and a second time on name, people with the same name still appear in order of increasing zip code; this is important in apps that, e.g., refine the results of queries based on user input).

+ The code is much simpler than samplesort's (but I think I can fix that).

+ It gets benefit out of more kinds of patterns, and without lumpy special-casing (a natural mergesort has to identify ascending and descending runs regardless, and then the algorithm builds on just that).

+ Despite that I haven't micro-optimized it, in the random case it's almost as fast as the samplesort hybrid. In fact, it might have been a bit faster had I run tests yesterday (the samplesort hybrid got sped up by 1-2% last night). This one surprised me the most, because at the time I wrote the samplesort hybrid, I tried several ways of coding mergesorts and couldn't make it as fast.

+ It has no bad cases (O(N log N) is worst case; N-1 compares is best).

Here are some typical timings, taken from Python's sortperf.py, over identical lists of floats:

Key:
    *sort: random data
    \sort: descending data
    /sort: ascending data
    3sort: ascending data but with 3 random exchanges
    ~sort: many duplicates
    =sort: all equal
    !sort: worst case scenario

That last one was a worst case for the last quicksort Python had before it grew the samplesort, and it was a very bad case for that. By sheer coincidence, turns out it's an exceptionally good case for the experimental sort:

samplesort
 i    2**i  *sort  \sort  /sort  3sort  ~sort  =sort  !sort
15   32768   0.13   0.01   0.01   0.10   0.04   0.01   0.11
16   65536   0.24   0.02   0.02   0.23   0.08   0.02   0.24
17  131072   0.54   0.05   0.04   0.49   0.18   0.04   0.53
18  262144   1.18   0.09   0.09   1.08   0.37   0.09   1.16
19  524288   2.58   0.19   0.18   2.34   0.76   0.17   2.52
20 1048576   5.58   0.37   0.36   5.12   1.54   0.35   5.46

timsort
 i    2**i  *sort  \sort  /sort  3sort  ~sort  =sort  !sort
15   32768   0.16   0.01   0.02   0.05   0.14   0.01   0.02
16   65536   0.24   0.02   0.02   0.06   0.19   0.02   0.04
17  131072   0.55   0.04   0.04   0.13   0.42   0.04   0.09
18  262144   1.19   0.09   0.09   0.25   0.91   0.09   0.18
19  524288   2.60   0.18   0.18   0.46   1.97   0.18   0.37
20 1048576   5.61   0.37   0.35   1.00   4.26   0.35   0.74

If it weren't for the ~sort column, I'd seriously suggest replacing the samplesort with this.
2*N extra bytes isn't as bad as it might sound, given that, in the absence of massive object duplication, each list element consumes at least 12 bytes (type pointer, refcount and value) + 4 bytes for the list pointer. Add 'em all up and that's a 13% worst-case temp memory overhead. From martin@v.loewis.de Sat Jul 20 09:59:55 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 20 Jul 2002 10:59:55 +0200 Subject: [Python-Dev] Where's time.daylight??? In-Reply-To: <200207200043.g6K0hMJ27043@pcp02138704pcs.reston01.va.comcast.net> References: <15672.18628.831787.897474@anthem.wooz.org> <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net> <200207191910.g6JJAUJ32606@pcp02138704pcs.reston01.va.comcast.net> <200207200043.g6K0hMJ27043@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > Can you check that in? I'm about to disappear to OSCON for a week. Done. I have no OSF/1 (aka whatever) system, so I can't really test whether it still helps on these systems. Regards, Martin From jacobs@penguin.theopalgroup.com Sat Jul 20 12:11:36 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Sat, 20 Jul 2002 07:11:36 -0400 (EDT) Subject: [Python-Dev] Sorting In-Reply-To: Message-ID: On Sat, 20 Jul 2002, Tim Peters wrote: > After reading all those papers, I couldn't resist taking a crack at a new > algorithm that might be practical, and have something you might call a > non-recursive adaptive stable natural mergesort / binary insertion sort > hybrid. Great work, Tim! I've got several Python implementations of stable-sorts that I can now retire. > If it weren't for the ~sort column, I'd seriously suggest replacing the > samplesort with this. If duplicate keys cannot be more efficiently handled, why not add a list.stable_sort() method? That way the user gets to decide if they want the ~sort tax. If that case is fixed later, then there is little harm in having list.sort == list.stable_sort. 
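
For reference, a minimal sketch of the DSU workaround mentioned earlier in the thread, showing what a stable sort buys in the zip-code-then-name case. The helper name stable_sorted and the sample data are made up for illustration, and the code assumes a current Python 3 interpreter. Including the original index in each decorated tuple means the result is stable no matter how the underlying sort treats equal keys.

```python
def stable_sorted(items, key):
    """Return a new list sorted by key(item); ties keep their input order."""
    # Decorate: (key, original index, item). The index is unique, so tuple
    # comparison is always settled before the item itself is looked at.
    decorated = [(key(item), index, item) for index, item in enumerate(items)]
    decorated.sort()
    # Undecorate: drop the key and the index again.
    return [item for _, _, item in decorated]


people = [
    ("Smith", "90210"),
    ("Jones", "10001"),
    ("Smith", "10001"),
    ("Jones", "90210"),
]

# Sort on zip code first, then on name: within each name the zip codes
# stay in increasing order because the second pass preserves ties.
by_zip = stable_sorted(people, key=lambda p: p[1])
by_name = stable_sorted(by_zip, key=lambda p: p[0])
print(by_name)
```

The cost is the extra tuple per element, which is the memory burden the DSU pattern is criticized for above.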
> 2*N extra bytes isn't as bad as it might sound, given > that, in the absence of massive object duplication, each list element > consumes at least 12 bytes (type pointer, refcount and value) + 4 bytes for > the list pointer. Add 'em all up and that's a 13% worst-case temp memory > overhead. It doesn't bother me in the slightest (and I tend to sort big things). 13% is a reasonable trade-off for stability. Thanks, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From pinard@iro.umontreal.ca Sat Jul 20 13:24:45 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 20 Jul 2002 08:24:45 -0400 Subject: [Python-Dev] Re: Priority queue (binary heap) python code In-Reply-To: <200207200606.g6K66Um28510@pcp02138704pcs.reston01.va.comcast.net> References: <20020624213318.A5740@arizona.localdomain> <200207200606.g6K66Um28510@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > Oh, and maybe we can borrow a few lines of François's description of > the algorithm. :-) Borrow liberally! I would prefer that nothing worth remains un-borrowed from mine, so I can happily get rid of my copy when the time comes! :-) > I propose to call it heapq.py. (Got a better name? Now or never.) I like `heapq' as it is not an English common name, like `heap' would be, so less likely to clash with user chosen variable names! This principle should be good in general. Sub-classing `heapq' from `list' is a good idea! P.S. - In other languages, I have been using `string' a lot, and this has been one of the minor irritations when I came to Python, that it forced me away of that identifier; so I'm now using `text' everywhere, instead. Another example is the name `socket', which is kind of reserved from the module name, I never really know how to name variables holding sockets :-). 
-- François Pinard http://www.iro.umontreal.ca/~pinard From ping@zesty.ca Sat Jul 20 13:32:41 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Sat, 20 Jul 2002 05:32:41 -0700 (PDT) Subject: [Python-Dev] The iterator story In-Reply-To: <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net> Message-ID: If you only have ten seconds read this: --------------------------------------- Guido, i believe i understand your position. My interpretation is: I'd like "iterate destructively" and "iterate non-destructively" to be spelled differently. You don't. I'd like to be able to establish conventions so that "x in y" doesn't destroy y. This isn't so important to you. We have a difference of opinion. I don't think we have a failure in understanding. If the opinions won't change, we might as well move on. I did not mean to waste your time, only to achieve understanding. Actual reply follows: --------------------- On Fri, 19 Jul 2002, Guido van Rossum wrote: > But I note that there are hybrids, and I think files (at least > seekable files) fall in the hybrid category. Indeed, files are unusual. In the particular way that i've chosen my definitions, though, classification of files is clear: files are not containers (there's no non-mutating read) and files are iterators (due to the behaviour of the read() method). Files aside, i do agree that hybrids exist. The dbm and tree examples you gave indeed mix container and iterator behaviour. I agree with you that mixing these things isn't usually a good design. In some cases you do end up providing both container-like and iterator-like interfaces. This is fine. But then when you use the object, you ought to be able to know which interface you are using. The argument in the "iterator story" message is that we should have a way to say "i want to use the non-destructive interface" and a way to say "i want to use the destructive interface". Depending what makes sense, one can choose to implement either interface, or both. 
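
The difference between the two interfaces can be seen in a small sketch. It assumes a current Python 3 interpreter, and the names squares, as_list and as_gen are made up for illustration. The same "for x in y" spelling is non-destructive for the list and destructive for the generator.

```python
def squares(n):
    """Generator: yields its values once, then it is exhausted."""
    for i in range(n):
        yield i * i


as_list = [i * i for i in range(4)]  # container: each loop gets a fresh iterator
as_gen = squares(4)                  # iterator: looping consumes it

first_list = [x for x in as_list]
second_list = [x for x in as_list]   # same values again

first_gen = [x for x in as_gen]
second_gen = [x for x in as_gen]     # nothing left the second time

print(first_list, second_list)
print(first_gen, second_gen)

# An iterator returns itself from iter(); a container returns a new object.
print(iter(as_gen) is as_gen)
print(iter(as_list) is as_list)
```

Nothing at the loop site says which of the two behaviours will occur; only the object being looped over determines it.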
> For example, while a tape file is a > container in the sense that reading the data doesn't destroy it, it's > very heavily geared towards sequential access, and you can't > realistically have two iterators going over the same tape at once. Indeed, you can't. But a tape file object is not a container (if we're using my definition), because the act of reading changes the tape file object -- it advances the tape. It's the same as file.read() -- even though file.read() doesn't mutate the data on the disk, it does mutate the file object, and that is what makes the file object not a container. It's precisely because tapes are too slow for practical random access that we would want a tape file object to provide an iterator-style interface and not provide a container-style interface. > If you're too young to remember Hee hee. I've used tapes. I've used *cassette* tapes, even. :) > > The issue is, should "for" be non-destructive? > > I don't see the benefit. We've done this for years and the only > conceptual problem was the abuse of __getitem__, not the > destructiveness of the for-loop. [...] > > The issue is, should "in" be non-destructive? > > If it can't be helped otherwise, sure, why not? Obviously we see these "problems" differently. Having "x in y" possibly destroy y is scary to me, but no big deal to you. All right. > > still produces "KeyError: 0"! This oughta be fixed...) > > Check the CVS logs. At one point before 2.2 was released, UserDict > has a __iter__ method. But then SF bug 448153 was filed, presenting > evidence that this broke previously working code. So a separate > class, IterableUserDict, was added that has the __iter__ method. Oh. :( Okay. Thanks for explaining. > There are a lot of objects that > have a way to return an iterators (old style using fake __getitem__, > and new ones using __iter__ and next) that are intended to be looped > over, once. 
I have no desire to deprecate this behavior, since (a) it > would be a major upheaval for the user community (a lot worse than > integer division), and (b) I don't see that "fixing" this prevents a > particular category of programming errors. As you can tell by now, i think it does prevent a certain category of errors. The general description is "mixing up mutating and non-mutating interfaces". The closest analogy i can think of is an alternate world in which "+" and "+=" had the same name, and the only way you could tell if the left operand would get mutated is by knowing the implementation of the left-hand object at runtime. Of course, in real Python you have to trust that the implementation "+" does not mutate. But at least we are able to set a convention, because "+" and "+=" are distinct operators. In the weird alternate world where "+" and "+=" are both written "+", you would have no hope of telling the difference. We'd look at "x + y" and say "Will x change? I don't know." And so it is with "for x in y": we'd look at that and say "Will y change? I don't know." We have no way of telling whether y is a container or an iterator, thus no way to establish a convention about what this should do. "for x in y" is polymorphic on y, but this is not how i think polymorphism is supposed to work. You could say you don't care whether y changes. (Well, you *are* saying you don't care.) Well, okay. I just want to make sure we both understand each other and see the issue at hand. If we do, then it just comes down to a difference of opinion about how significant a mixup this is, and so be it. > > I believe __iter__ is not a type flag. [...] > And I never said it was a type flag. I'm tired of repeating myself, > but you keep repeating this broken argument, so I have to keep > correcting you. I know you didn't say this. Please don't be offended. 
I apologize if i seemed to be wilfully ignoring you -- you don't have to repeat things many times in order to "drive home" your position to me. I was trying to summarize all the positions (not just yours), organize them, and explain them all at once. -- ?!ng From ping@zesty.ca Sat Jul 20 13:45:48 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Sat, 20 Jul 2002 05:45:48 -0700 (PDT) Subject: [Python-Dev] Re: The iterator story In-Reply-To: <20020719120043.A21503@glacier.arctrix.com> Message-ID: On Fri, 19 Jul 2002, Neil Schemenauer wrote: > First, people could implement __iter__ such that it returns an iterator > the mutates the original object (e.g. a file object __iter__ that > returns xreadlines). Yes, but then they would be violating the convention. The way things currently stand, we aren't even able to say what the convention *is*. > Second, it will be confusing to have two different ways of looping over > things. It's a difference in perspective. To me it seems confusing to have only one way of looping that might do two different things. But Guido basically agrees with you. (As in, destructive and non-destructive looping are not really that different; or, they are different but it's not worth the bother.) > Now I want to use this library but I have an iterator, not something > that implements __iter__. I would need to create a little wrapper with > a __iter__ method that returns my object. Yeah, that's seq(). > To summarize, I agree that "for" mutating the object can be surprising. The rub is, the only way for it to *not* be surprising is to have a way to *say* "loop destructively". If you can't express your expectations, there's no way to meet them. -- ?!ng From ping@zesty.ca Sat Jul 20 13:58:39 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Sat, 20 Jul 2002 05:58:39 -0700 (PDT) Subject: [Python-Dev] Single- vs. 
Multi-pass iterability In-Reply-To: Message-ID: On Fri, 19 Jul 2002, Tim Peters wrote: > "for" did and does work in accord with a simple protocol, and whether that's > "destructive" depends on how the specific objects involved implement their > pieces of the protocol, not on the protocol itself. The same is true of all > of Python's hookable protocols. Name any protocol for which the question "does this mutate?" has no answer. (I ask you to accept that __call__ is a special case.) > What's so special about "for" that it > should pretend to deliver purely functional behavior in a highly > non-functional language? Who said anything about functional behaviour? I'm not requiring that looping *never* mutate. I just want to be able to tell *whether* it will. -- ?!ng From oren-py-d@hishome.net Sat Jul 20 13:58:51 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Sat, 20 Jul 2002 08:58:51 -0400 Subject: [Python-Dev] The iterator story In-Reply-To: <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net> References: <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020720125850.GA5862@hishome.net> > > Based on Guido's positive response, in which he asked me to make > > an addition to the PEP, i believe Guido agrees with me that > > __iter__ is distinct from the protocol of an iterator. This > > surprised me because it runs counter to the philosophy previously > > expressed in the PEP. > > I recognize that they are separate protocols. But because I like the > for-loop as a convenient way to get all of the elements of an > iterator, I want iterators to support __iter__. Is this the only reason iterators are required to support __iter__? It seems like a strange design decision to put the burden on all iterator implementers to write a dummy method returning self instead of just checking if tp_iter==NULL in PyObject_GetIter. 
It's like requiring all class writers to write a dummy __str__ method that calls __repr__ instead of implementing the automatic fallback to __repr__ in PyObject_Str when no __str__ is available. Oren From aahz@pythoncraft.com Sat Jul 20 14:00:01 2002 From: aahz@pythoncraft.com (Aahz) Date: Sat, 20 Jul 2002 09:00:01 -0400 Subject: [Python-Dev] Sorting In-Reply-To: References: Message-ID: <20020720130000.GA11845@panix.com> On Sat, Jul 20, 2002, Tim Peters wrote: > > If it weren't for the ~sort column, I'd seriously suggest replacing the > samplesort with this. 2*N extra bytes isn't as bad as it might sound, given > that, in the absence of massive object duplication, each list element > consumes at least 12 bytes (type pointer, refcount and value) + 4 bytes for > the list pointer. Add 'em all up and that's a 13% worst-case temp memory > overhead. Any reason the list object can't grow a .stablesort() method? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From oren-py-d@hishome.net Sat Jul 20 14:28:57 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Sat, 20 Jul 2002 09:28:57 -0400 Subject: [Python-Dev] The iterator story In-Reply-To: <20020719162226.A22929@glacier.arctrix.com> References: <20020719120043.A21503@glacier.arctrix.com> <3D389369.948547E0@metaslash.com> <20020719162226.A22929@glacier.arctrix.com> Message-ID: <20020720132857.GB5862@hishome.net> On Fri, Jul 19, 2002 at 04:22:26PM -0700, Neil Schemenauer wrote: > Neal Norwitz wrote: > > In what context? Were you iterating over a file or something else? > > I'm wondering if this is a problem, perhaps pychecker could generate > > a warning? > > I was switching between implementing something as a generator and > returning a list. I was curious why I was getting different behavior > until I realized I was iterating over the result twice. I don't > think pychecker could warn about such a bug. That's the scenario that bit me too. 
For me it was a little more difficult to find because it was wrapped in a few layers of chained transformations. I can't tell by the last element in the chain whether the first one is re-iterable or not. One approach to solve this is Ka-Ping Yee's proposal to specify in advance whether you are expecting an iterator or a re-iterable container using either 'for x in y' or 'for x from y'. I don't think this will work. There's already too much code that uses for x in y where y is an iterator. Another problem is that a transformation shouldn't care whether its upstream source is an iterator or an iterable - it's a generic reusable building block. My suggestion (which was rejected by Guido) was to raise an error when an iterator's .next() method is called after it raises StopIteration. This way, if I try to iterate over the result again at least I'll get an error like "IteratorExhaustedError" instead of something that is indistinguishable from an iterator of an empty container. I hate silent errors. This shouldn't be required from all iterator implementers but if all built-in iterators supported this (especially generators) it would help a lot to find such errors. Oren P.S. My definition of a transformation is a function taking one iterable argument and returning an iterator. It is usually implemented as a generator function. From oren-py-d@hishome.net Sat Jul 20 14:39:26 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Sat, 20 Jul 2002 09:39:26 -0400 Subject: [Python-Dev] Re: The iterator story In-Reply-To: References: <20020719120043.A21503@glacier.arctrix.com> Message-ID: <20020720133926.GC5862@hishome.net> On Sat, Jul 20, 2002 at 05:45:48AM -0700, Ka-Ping Yee wrote: > > To summarize, I agree that "for" mutating the object can be surprising. > > The rub is, the only way for it to *not* be surprising is to have a > way to *say* "loop destructively". If you can't express your > expectations, there's no way to meet them.
It doesn't seem very useful to say "loop destructively" - in these cases I don't usually care whether it's destructive or not. It is useful, though, to be able to say "loop INdestructively". That's how I do it:

def reiter(obj):
    """
    Return an object's iterator, raise exception if object does not
    appear to support multiple iterations
    """
    assert not isinstance(obj, file)
    itr = iter(obj)
    assert itr is not obj
    return itr

Oren From aahz@pythoncraft.com Sat Jul 20 15:09:23 2002 From: aahz@pythoncraft.com (Aahz) Date: Sat, 20 Jul 2002 10:09:23 -0400 Subject: [Python-Dev] The iterator story In-Reply-To: <20020720132857.GB5862@hishome.net> References: <20020719120043.A21503@glacier.arctrix.com> <3D389369.948547E0@metaslash.com> <20020719162226.A22929@glacier.arctrix.com> <20020720132857.GB5862@hishome.net> Message-ID: <20020720140923.GA18716@panix.com> On Sat, Jul 20, 2002, Oren Tirosh wrote: > On Fri, Jul 19, 2002 at 04:22:26PM -0700, Neil Schemenauer wrote: >> Neal Norwitz wrote: >>> >>> In what context? Were you iterating over a file or something else? >>> I'm wondering if this is a problem, perhaps pychecker could generate >>> a warning? >> >> I was switching between implementing something as a generator and >> returning a list. I was curious why I was getting different behavior >> until I realized I was iterating over the result twice. I don't >> think pychecker could warn about such a bug. > > That's the scenario that bit me too. For me it was a little more difficult > to find because it was wrapped in a few layers of chained transformations. > I can't tell by the last element in the chain whether the first one is > re-iterable or not. > > My suggestion (which was rejected by Guido) was to raise an error when an > iterator's .next() method is called after it raises StopIteration.
This > way, if I try to iterate over the result again at least I'll get and error > like "IteratorExhaustedError" instead something that is indistinguishable > from an iterator of an empty container. I hate silent errors. I'm still not understanding how this would help. When a chainable transformer gets StopIteration, it should immediately return. What else do you want to do? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From guido@python.org Sat Jul 20 15:10:57 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 20 Jul 2002 10:10:57 -0400 Subject: [Python-Dev] The iterator story In-Reply-To: Your message of "Sat, 20 Jul 2002 05:32:41 PDT." References: Message-ID: <200207201410.g6KEAvY29349@pcp02138704pcs.reston01.va.comcast.net> > If you only have ten seconds read this: > --------------------------------------- > > Guido, i believe i understand your position. My interpretation is: > > I'd like "iterate destructively" and "iterate non-destructively" > to be spelled differently. You don't. > > I'd like to be able to establish conventions so that "x in y" > doesn't destroy y. This isn't so important to you. > > We have a difference of opinion. I don't think we have a failure in > understanding. If the opinions won't change, we might as well move on. > I did not mean to waste your time, only to achieve understanding. Aye, aye, Sir. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sat Jul 20 15:13:34 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 20 Jul 2002 10:13:34 -0400 Subject: [Python-Dev] The iterator story In-Reply-To: Your message of "Sat, 20 Jul 2002 08:58:51 EDT." 
<20020720125850.GA5862@hishome.net> References: <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net> <20020720125850.GA5862@hishome.net> Message-ID: <200207201413.g6KEDYh29370@pcp02138704pcs.reston01.va.comcast.net> > > > Based on Guido's positive response, in which he asked me to make > > > an addition to the PEP, i believe Guido agrees with me that > > > __iter__ is distinct from the protocol of an iterator. This > > > surprised me because it runs counter to the philosophy previously > > > expressed in the PEP. > > > > I recognize that they are separate protocols. But because I like the > > for-loop as a convenient way to get all of the elements of an > > iterator, I want iterators to support __iter__. > > Is this the only reason iterators are required to support __iter__? Yes. > It seems like a strange design decision to put the burden on all iterator > implementers to write a dummy method returning self instead of just checking > if tp_iter==NULL in PyObject_GetIter. It's like requiring all class writers > to write a dummy __str__ method that calls __repr__ instead of implementing > the automatic fallback to __repr__ in PyObject_Str when no __str__ is > available. I suppose you meant "check for tp_iter==NULL and tp_iternext!=NULL. --Guido van Rossum (home page: http://www.python.org/~guido/) From cce@clarkevans.com Sat Jul 20 17:21:01 2002 From: cce@clarkevans.com (Clark C . 
Evans) Date: Sat, 20 Jul 2002 12:21:01 -0400 Subject: [Python-Dev] The iterator story In-Reply-To: <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jul 19, 2002 at 05:10:45PM -0400 References: <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020720122101.A38901@doublegemini.com> On Fri, Jul 19, 2002 at 05:10:45PM -0400, Guido van Rossum wrote: | > The __iter__-On-Iterators Issue: | > | > Some people have mentioned that the presence of an __iter__() | > method is a way of signifying that an object supports the | > iterator protocol. It has been said that this is necessary | > because the presence of a "next()" method is not sufficiently | > distinguishing. | | Not me. As I remember the debate last year, Ping is expressing the consensus which was reached. This issue was tied directly, although not so articulately, to the namespace collision issue. I remember being concerned about next() not having leading and trailing __ but my concerns were put to rest knowing that every iterator had to have a __iter__ such that __iter__ returned self. I wasn't on the list for that long due to time constraints, but this linkage was there at least for me. | > The iteration method is currently called "next()". | > | > Previous candidates for the name of this method were "next", | > "__next__", and "__call__". After some previous debate, | > it was pronounced to be "next()". | > | > There are concerns that "next()" might collide with existing | > methods named "next()". There is also a concern that "next()" | > is inconsistent because it is the only type-slot-method that | > does not have a __special__ name. | > | > The issue is, should it be called "next" or "__next__"? | | That's a separate issue, and cleans up only a small wart that in | practice hasn't hurt anybody AFAIK.
Today/tomorrow I'll finish piecing together the survey so that it clearly articulates the issue (and I'll be sure to note that you are against the idea). Best, Clark -- Clark C. Evans Axista, Inc. http://www.axista.com 800.926.5525 XCOLLA Collaborative Project Management Software From neal@metaslash.com Sat Jul 20 17:52:49 2002 From: neal@metaslash.com (Neal Norwitz) Date: Sat, 20 Jul 2002 12:52:49 -0400 Subject: [Python-Dev] Where's time.daylight??? References: <15672.18628.831787.897474@anthem.wooz.org> <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net> <200207191910.g6JJAUJ32606@pcp02138704pcs.reston01.va.comcast.net> <200207200043.g6K0hMJ27043@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D399561.A474C77A@metaslash.com> "Martin v. Loewis" wrote: > > Can you check that in? I'm about to disappear to OSCON for a week. > > Done. I have no OSF/1 (aka whatever) system, so I can't really test > whether it still helps on these systems. It doesn't work on dec^w alpha^w compaq ... I've got an autoconf patch which works on Linux & OSF: http://python.org/sf/584245 There are some test failures I will look at later: test test_dl crashed -- exceptions.SystemError: module dl requires sizeof(int) == sizeof(long) == sizeof(char*) test test_nis crashed -- exceptions.SystemError: error return without exception set test_pwd may have hung which is the last test run Neal From tim.one@comcast.net Sun Jul 21 04:26:44 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 20 Jul 2002 23:26:44 -0400 Subject: [Python-Dev] Sorting In-Reply-To: Message-ID: Quick update.
I left off here:

samplesort
 i     2**i  *sort  \sort  /sort  3sort  ~sort  =sort  !sort
15    32768   0.13   0.01   0.01   0.10   0.04   0.01   0.11
16    65536   0.24   0.02   0.02   0.23   0.08   0.02   0.24
17   131072   0.54   0.05   0.04   0.49   0.18   0.04   0.53
18   262144   1.18   0.09   0.09   1.08   0.37   0.09   1.16
19   524288   2.58   0.19   0.18   2.34   0.76   0.17   2.52
20  1048576   5.58   0.37   0.36   5.12   1.54   0.35   5.46

timsort
15    32768   0.16   0.01   0.02   0.05   0.14   0.01   0.02
16    65536   0.24   0.02   0.02   0.06   0.19   0.02   0.04
17   131072   0.55   0.04   0.04   0.13   0.42   0.04   0.09
18   262144   1.19   0.09   0.09   0.25   0.91   0.09   0.18
19   524288   2.60   0.18   0.18   0.46   1.97   0.18   0.37
20  1048576   5.61   0.37   0.35   1.00   4.26   0.35   0.74

With a lot of complication (albeit principled complication), timsort now looks like

15    32768   0.14   0.01   0.01   0.04   0.10   0.01   0.02
16    65536   0.24   0.02   0.02   0.05   0.17   0.02   0.04
17   131072   0.54   0.05   0.04   0.13   0.38   0.04   0.09
18   262144   1.18   0.09   0.09   0.24   0.81   0.09   0.18
19   524288   2.57   0.18   0.18   0.46   1.77   0.18   0.37
20  1048576   5.55   0.37   0.35   0.99   3.81   0.35   0.74

on the same data (tiny improvements in *sort and 3sort, significant improvement in ~sort, huge improvements for some patterns that aren't touched by this test). For contrast and a sanity check, I also implemented Edelkamp and Stiegeler's "Next-to-m" refinement of weak heapsort. If you know what heapsort is, this is weaker . In the last decade, Dutton had the bright idea that a heap is stronger than you need for sorting: it's enough if you know only that a parent node's value dominates the right child's values, and then ensure that the root node has no left child. That implies the root node has the maximum value in the (weak) heap. It doesn't matter what's in the left child for the other nodes, provided only that they're weak heaps too. The weaker requirements allow faster (but trickier) code for maintaining the weak-heap invariant as sorting proceeds, and in particular it requires far fewer element comparisons than a (strong)heap sort.
Edelkamp and Stiegeler complicated this algorithm in several ways to cut the comparisons even more. I stopped at their first refinement, which does a worst-case number of comparisons

    N*k - 2**k + N - 2*k    where k = ceiling(logbase2(N))

so that even the worst case is very good. They have other gimmicks to cut it more (we're close to the theoretical limit here, so don't read too much into "more"!), but the first refinement proved so far from being promising that I dropped it:

weakheapsort
 i     2**i  *sort  \sort  /sort  3sort  ~sort  =sort  !sort
15    32768   0.19   0.12   0.11   0.11   0.11   0.11   0.12
16    65536   0.31   0.26   0.23   0.23   0.24   0.23   0.26
17   131072   0.71   0.55   0.49   0.49   0.51   0.48   0.56
18   262144   1.59   1.15   1.03   1.04   1.08   1.02   1.19
19   524288   3.57   2.43   2.18   2.18   2.27   2.14   2.51
20  1048576   8.01   5.08   4.57   4.58   4.77   4.50   5.29

The number of compares isn't the problem with this. The problem appears to be heapsort's poor cache behavior, leaping around via multiplying and dividing indices by 2. This is exacerbated in weak heapsort because it also requires allocating a bit vector, to attach a "which of my children should I think of as being 'the right child'?" flag to each element, and that also gets accessed in the same kinds of cache-hostile ways at the same time. The samplesort and mergesort variants access memory sequentially. What I haven't accounted for is why weakheapsort appears to get a major benefit from *any* kind of regularity in the input -- *sort is always the worst case on each line, and by far (note that this implementation does no special-casing of any kind, so it must be an emergent property of the core algorithm). If I were a researcher, I bet I could get a good paper out of that . From tim.one@comcast.net Sun Jul 21 06:19:03 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 21 Jul 2002 01:19:03 -0400 Subject: [Python-Dev] Sorting In-Reply-To: <20020720130000.GA11845@panix.com> Message-ID: [Aahz] > Any reason the list object can't grow a .stablesort() method? I'm not sure.
Python's samplesort implementation is right up there among the most complicated (by any measure) algorithms in the code base, and the mergesort isn't any simpler anymore. Yet another large mass of difficult code can make for a real maintenance burden after I'm dead. Here, guess what this does:

static int
gallop_left(PyObject *pivot, PyObject** p, int n, PyObject *compare)
{
    int k;
    int lo, hi;
    PyObject **pend;

    assert(pivot && p && n);
    pend = p+(n-1);
    lo = 0;
    hi = -1;
    for (;;) {
        IFLT(*(pend - lo), pivot)
            break;
        hi = lo;
        lo = (lo << 1) + 1;
        if (lo >= n) {
            lo = n;
            break;
        }
    }
    lo = n - lo;
    hi = n-1 - hi;
    while (lo < hi) {
        int m = (lo + hi) >> 1;
        IFLT(p[m], pivot)
            lo = m+1;
        else
            hi = m;
    }
    return lo;
fail:
    return -1;
}

There are 12 other functions that go into this, some less obscure, some more. Change "hi = -1" to "hi = 0" and you'll get a core dump, etc; it's exceedingly delicate, and because truly understanding it essentially requires doing a formal correctness proof, it's difficult to maintain; fight your way to that understanding, and you'll know why it sorts, but still won't have a clue about why it's so fast. I'm disinclined to add more code of this nature unless I can use it to replace code at least as difficult (which samplesort is). An irony is that stable sorts are, by definition, pointless unless you *do* have equal elements, and the many-equal-elements case is the one known case where the new algorithm is much slower than the current one (indeed, I have good reason to suspect it's the only such case, and reasons beyond just that God loves a good joke ). It's OK by me if this were to become Python's only sort. Short of that, I'd be happier contributing the code to a sorting extension module. There are other reasons the latter may be a good idea; e.g., if you know you're sorting C longs, it's not particularly difficult to do that 10x faster than Python's generic list.sort() can do it; ditto if you know you're comparing strings; etc.
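One way to read the gallop_left routine quoted above is as an exponential search anchored at the right end of the run, followed by a binary search inside the bracket it finds. The sketch below is a Python transcription for illustration only, not code from the patch. It assumes that IFLT(a, b) stands for "if a < b", and it leaves out the error path and the compare argument. On that reading, the result is the leftmost index at which pivot could be inserted while keeping the run sorted, which is the same answer bisect.bisect_left gives.

```python
from bisect import bisect_left

def gallop_left(pivot, a):
    """Return i such that all of a[:i] are < pivot and all of a[i:] are >= pivot.

    a must be sorted and non-empty (the C original asserts n != 0).
    """
    n = len(a)
    assert n > 0
    lo, hi = 0, -1                    # offsets counted back from the last element
    while True:
        if a[n - 1 - lo] < pivot:     # probe offsets 0, 1, 3, 7, ... from the right
            break
        hi = lo
        lo = 2 * lo + 1
        if lo >= n:                   # ran off the left end of the run
            lo = n
            break
    # Convert right-end offsets into ordinary indices that bracket the answer.
    lo, hi = n - lo, n - 1 - hi
    while lo < hi:                    # plain binary search inside the bracket
        m = (lo + hi) // 2
        if a[m] < pivot:
            lo = m + 1
        else:
            hi = m
    return lo

run = [1, 3, 3, 5, 7]
positions = [gallop_left(x, run) for x in range(0, 9)]
expected = [bisect_left(run, x) for x in range(0, 9)]
```

The galloping phase costs only a handful of comparisons when the answer lies near the right end of the run, which is presumably the case the routine is tuned for; when it lies far away, the cost stays logarithmic.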
Exposing the binary insertion sort (which both samplesort and mergesort use) would also be useful to some people (it's a richer variant of bisect.insort_right). I'd prefer that Python-the-language have just one "really good general sort" built in. From oren-py-d@hishome.net Sun Jul 21 06:33:40 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Sun, 21 Jul 2002 08:33:40 +0300 Subject: [Python-Dev] The iterator story In-Reply-To: <20020720140923.GA18716@panix.com>; from aahz@pythoncraft.com on Sat, Jul 20, 2002 at 10:09:23AM -0400 References: <20020719120043.A21503@glacier.arctrix.com> <3D389369.948547E0@metaslash.com> <20020719162226.A22929@glacier.arctrix.com> <20020720132857.GB5862@hishome.net> <20020720140923.GA18716@panix.com> Message-ID: <20020721083340.A13156@hishome.net> On Sat, Jul 20, 2002 at 10:09:23AM -0400, Aahz wrote: > > That's the scenario that bit me too. For me it was a little more difficult > > to find because it was wrapped in a few layers of chained transformations. > > I can't tell by the last element in the chain whether the first one is > > re-iterable or not. > > > > My suggestion (which was rejected by Guido) was to raise an error when an > > iterator's .next() method is called after it raises StopIteration. This > > way, if I try to iterate over the result again at least I'll get an error > > like "IteratorExhaustedError" instead of something that is indistinguishable > > from an iterator of an empty container. I hate silent errors. > > I'm still not understanding how this would help. When a chainable > transformer gets StopIteration, it should immediately return. What else > do you want to do? The transformations are fine the way they are. The problem is the source - if the source is an exhausted iterator and you ask it for a new iterator it will happily return itself and report StopIteration on each .next(). This behavior is indistinguishable from a valid iterator on an empty container.
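The failure mode described here is easy to reproduce. The following is a minimal sketch, assuming a present-day Python 3 interpreter; squares is a made-up transformation used only for illustration and does not come from the thread.

```python
def squares(numbers):
    """A 'transformation' in the thread's sense: iterable in, iterator out."""
    for n in numbers:
        yield n * n

source = iter([1, 2, 3])               # a one-shot iterator, not a container

first_pass = list(squares(source))     # consumes the source
second_pass = list(squares(source))    # the source is already exhausted here
empty_pass = list(squares([]))         # a genuinely empty container

# Asking the exhausted iterator for a new iterator just hands back itself.
returns_itself = iter(source) is source
```

The second pass and the pass over the empty list produce the same empty result, so a consumer at the end of a chain has no way to tell the two situations apart.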
What I would like is for iterators to raise StopIteration exactly once and then switch to a different exception. This way the transformations will not need to care whether their upstream source is restartable or not - the exception will propagate through the entire chain and notify the consumer at the end of the chain that the source at the beginning of the chain is not re-iterable. I'm not suggesting that all iterator implementers must do this - having it on just the builtin iterators will be a great help. Right now I am using tricks like special-casing files and checking if iter(x) is x. It works but I hate it. Oren From oren-py-d@hishome.net Sun Jul 21 06:40:14 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Sun, 21 Jul 2002 08:40:14 +0300 Subject: [Python-Dev] The iterator story In-Reply-To: <200207201413.g6KEDYh29370@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Sat, Jul 20, 2002 at 10:13:34AM -0400 References: <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net> <20020720125850.GA5862@hishome.net> <200207201413.g6KEDYh29370@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020721084014.A13189@hishome.net> On Sat, Jul 20, 2002 at 10:13:34AM -0400, Guido van Rossum wrote: > > It seems like a strange design decision to put the burden on all iterator > > implementers to write a dummy method returning self instead of just checking > > if tp_iter==NULL in PyObject_GetIter. It's like requiring all class writers > > to write a dummy __str__ method that calls __repr__ instead of implementing > > the automatic fallback to __repr__ in PyObject_Str when no __str__ is > > available. > > I suppose you meant "check for tp_iter==NULL and tp_iternext!=NULL. Yes. Any comments on my analogy of __iter__/next with __str__/__repr__ and the burden of implementation? Oren From tim.one@comcast.net Sun Jul 21 06:38:17 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 21 Jul 2002 01:38:17 -0400 Subject: [Python-Dev] Single- vs.
Multi-pass iterability In-Reply-To: Message-ID: [Ping] > Name any protocol for which the question "does this mutate?" has > no answer. Heh -- you must not use Zope much <0.6 wink>. I'm hard pressed to think of a protocol where that does have a reliable answer. Here: x1 = y.z x2 = y.z Are x1 and x2 the same object after that? At least equal? Did either line mutate y? You simply can't know without knowing how y's type implements __getattr__, and with the introduction of computed attributes (properties) it's just going to get muddier. > (I ask you to accept that __call__ is a special case.) It's not to me -- if a protocol invokes user-defined Python code, there's nothing you can say about mutability "in general", and people do both use and abuse that. >> What's so special about "for" that it should pretend to deliver >> purely functional behavior in a highly non-functional language? > Who said anything about functional behaviour? I'm not requiring that > looping *never* mutate. I just want to be able to tell *whether* it > will. I don't blame you, and sometimes I'd like to know whether y.z (or "y += z", etc) mutates y too. It cuts deeper than loops, so a loop-focused gimmick seems inadequate to me (provided "something needs to be done about it" at all -- I'm not sure, but doubt it). From tim.one@comcast.net Sun Jul 21 06:55:00 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 21 Jul 2002 01:55:00 -0400 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: <554A408C-9B5F-11D6-9B6B-003065517236@oratrix.com> Message-ID: [Jack Jansen] > Oh, it's MarkH appreciation that's wanted! In that case I'll > gladly chime in, I was was afraid it was __declspec(dllexport) > appreciation. Mark is one cool dude who knows where his towel is! > > 199998 to go. Should we start taking a poll who'll be the next > python-devver we start appreciating when the counter hits zero? It would have been you, Jack, except Mark was much cleverer about this. 
You make the Mac support so invisible to the rest of us that the only thing we can ever thank you for is stopping refcount abuse of immortal strings. Mark put some sort of Windows gimmick on 79% of the lines in the whole code base, thus ensuring a never-ending supply of reasons to thank him for getting rid of it one line at a time . i-demand-that-everyone-appreciate-jack-more-too-ly y'rs - tim From smurf@noris.de Sun Jul 21 09:29:30 2002 From: smurf@noris.de (Matthias Urlichs) Date: Sun, 21 Jul 2002 10:29:30 +0200 Subject: [Python-Dev] Priority queue (binary heap) python code Message-ID: Oren Tirosh : > When I want to sort a list I just use .sort(). I don't care which algorithm > is used. The point in this discussion, though, is that frequently you don't need a sorted list. You just need a list which yields all elements in order when you pop them. Heaps are a nice low-overhead implementation of that idea, and therefore should be in the standard library. -- Matthias Urlichs From pinard@iro.umontreal.ca Sun Jul 21 11:26:55 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 21 Jul 2002 06:26:55 -0400 Subject: [Python-Dev] Re: Priority queue (binary heap) python code In-Reply-To: References: Message-ID: [Matthias Urlichs] > Oren Tirosh : > > When I want to sort a list I just use .sort(). I don't care which > > algorithm is used. > The point in this discussion, though, is that frequently you don't need > a sorted list. You just need a list which yields all elements in order > when you pop them. Heaps are a nice low-overhead implementation of that > idea, and therefore should be in the standard library. This is especially true when you need only the first few elements from the sorted set, which is a pretty common case in practice. A blind sort is not always the optimal solution, when you want to spare some CPU time. A caricatural example of abuse would be to implement `max' as `sort' followed by peeking at the first element of the result. 
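As a sketch of the "first few elements" case: the example below assumes the heapq module of present-day Python's standard library, which this thread does not itself mention, and smallest_few is a hypothetical helper name chosen for illustration.

```python
import heapq

def smallest_few(items, k):
    """Return the k smallest items in ascending order without sorting them all."""
    heap = list(items)               # copy, so the caller's data is left alone
    heapq.heapify(heap)              # linear-time pass to establish the heap invariant
    count = min(k, len(heap))
    return [heapq.heappop(heap) for _ in range(count)]   # each pop is O(log n)

data = [27, 3, 14, 8, 41, 1, 19]
first_three = smallest_few(data, 3)
```

Only the popped elements pay the logarithmic cost, so asking for a few elements out of many does noticeably less work than a full sort followed by a slice.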
Heaps are also an efficient enough representation if you insert while sorting, as often happens in simulations. Someone I know studied this intensely, and came up with algorithms that were better on average on his reference benchmark, but with much worse worst cases -- so it depends on the characteristics of the simulation. Heaps do quite well on average, and do acceptably well also in their worst cases. -- François Pinard http://www.iro.umontreal.ca/~pinard From aahz@pythoncraft.com Sun Jul 21 14:25:50 2002 From: aahz@pythoncraft.com (Aahz) Date: Sun, 21 Jul 2002 09:25:50 -0400 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: References: <554A408C-9B5F-11D6-9B6B-003065517236@oratrix.com> Message-ID: <20020721132550.GC25525@panix.com> On Sun, Jul 21, 2002, Tim Peters wrote: > > i-demand-that-everyone-appreciate-jack-more-too-ly y'rs - tim My iBook and OSCON class members thank Jack. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From ping@zesty.ca Sun Jul 21 14:51:30 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Sun, 21 Jul 2002 06:51:30 -0700 (PDT) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Message-ID: On Sun, 21 Jul 2002, Tim Peters wrote: > x1 = y.z > x2 = y.z > > Are x1 and x2 the same object after that? At least equal? Did either line > mutate y? You simply can't know without knowing how y's type implements > __getattr__, and with the introduction of computed attributes (properties) > it's just going to get muddier. That's not the point. You could claim that *any* polymorphism in Python is useless by the same argument. But Python is not useless; Python code really is reusable; and that's because there are good conventions about what the behaviour *should* be. People who do really find this upsetting should go use a strongly-typed language. In general, getting "y.z" should be idempotent, and should not mutate y.
I think everyone would agree on the concept. If it does mutate y with visible effects, then the implementor is breaking the convention. Sure, Python won't prevent you from writing a file-like class where you write the string "blah" to the file by fetching f.blah and you close the file by mentioning f[42]. But when users of this class then come running after you with pointed sticks, i'm not going to fight them off. :) This is a list of all the type slots accessible from Python, before iterators (i.e. pre-2.2). Beside each is the answer to the question: Suppose you look at the value of x, then do this operation to x, then look at the value of x. Should we expect the two observed values to be the same or different? nb_add same nb_subtract same nb_multiply same nb_divide same nb_remainder same nb_divmod same nb_power same nb_negative same nb_positive same nb_absolute same nb_nonzero same nb_invert same nb_lshift same nb_rshift same nb_and same nb_xor same nb_or same nb_coerce same nb_int same nb_long same nb_float same nb_oct same nb_hex same nb_inplace_add different nb_inplace_subtract different nb_inplace_multiply different nb_inplace_divide different nb_inplace_remainder different nb_inplace_power different nb_inplace_lshift different nb_inplace_rshift different nb_inplace_and different nb_inplace_xor different nb_inplace_or different nb_floor_divide same nb_true_divide same nb_inplace_floor_divide different nb_inplace_true_divide different sq_length same sq_concat same sq_repeat same sq_item same sq_slice same sq_ass_item different sq_ass_slice different sq_contains same sq_inplace_concat different sq_inplace_repeat different mp_length same mp_subscript same mp_ass_subscript different bf_getreadbuffer same bf_getwritebuffer same bf_getsegcount same bf_getcharbuffer same tp_print same tp_getattr same tp_setattr different tp_compare same tp_repr same tp_hash same tp_call ? 
tp_str same tp_getattro same tp_setattro different In every case except for __call__, there exists a canonical answer. We all rely on these conventions every time we write a Python program. And learning these conventions is a necessary part of learning Python. You can argue, as Guido has, that in the particular case of for-loops distinguishing between mutating and non-mutating behaviour is not worth the trouble. But you can't say that we should give up on the whole concept *in general*. -- ?!ng From aahz@pythoncraft.com Sun Jul 21 15:41:08 2002 From: aahz@pythoncraft.com (Aahz) Date: Sun, 21 Jul 2002 10:41:08 -0400 Subject: [Python-Dev] The iterator story In-Reply-To: <20020721083340.A13156@hishome.net> References: <20020719120043.A21503@glacier.arctrix.com> <3D389369.948547E0@metaslash.com> <20020719162226.A22929@glacier.arctrix.com> <20020720132857.GB5862@hishome.net> <20020720140923.GA18716@panix.com> <20020721083340.A13156@hishome.net> Message-ID: <20020721144108.GA5608@panix.com> On Sun, Jul 21, 2002, Oren Tirosh wrote: > On Sat, Jul 20, 2002 at 10:09:23AM -0400, Aahz wrote: >>Oren: >>> >>> That's the scenario that bit me too. For me it was a little more >>> difficult to find because it was wrapped in a few layers of chained >>> transformations. I can't tell by the last element in the chain >>> whether the first one is re-iterable or not. >>> >>> My suggestion (which was rejected by Guido) was to raise an >>> error when an iterator's .next() method is called afer it raises >>> StopIteration. This way, if I try to iterate over the result again >>> at least I'll get and error like "IteratorExhaustedError" instead >>> something that is indistinguishable from an iterator of an empty >>> container. I hate silent errors. >> >> I'm still not understanding how this would help. When a chainable >> transformer gets StopIteration, it should immediately return. What >> else do you want to do? > > The tranformations are fine the way they are. 
The problem is the > source - if the source is an exhausted iterator and you ask it for a > new iterator it will happily return itself and report StopIteration > on each .next(). This behavior is indistringuishable from a valid > iterator on an empty container. So the problem lies in asking the source for a new iterator, not in trying to use it. Making the iterator consumer responsible for handling this seems like the wrong approach to me -- the consumer *shouldn't* be able to tell the difference. If you're breaking that paradigm, you don't actually have an iterator consumer, you've got something else that wants to use the iterator interface, *plus* some additional features. The way Python normally handles issues like this is through documentation. (I.e., if your consumer requires an iterable capable of producing multiple iterators rather than an iterator object, you document that.) > Right now I am using tricks like special-casing files and checking if > iter(x) is x. It works but I hate it. You need to write your own wrapper or change the way your consumer works. Special-casing files inside your consumer is a Bad Idea. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From tim.one@comcast.net Sun Jul 21 21:14:43 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 21 Jul 2002 16:14:43 -0400 Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Message-ID: [Ping] > Name any protocol for which the question "does this mutate?" has > no answer. [Tim] >> Heh -- you must not use Zope much <0.6 wink>. I'm hard pressed to >> think of a protocol where that does have a reliable answer. Here: >> >> x1 = y.z >> x2 = y.z >> >> Are x1 and x2 the same object after that? At least equal? Did >> either line mutate y? You simply can't know without knowing how y's >> type implements __getattr__, and with the introduction of computed >> attributes (properties) it's just going to get muddier. 
[Ping] > That's not the point. It answered the question you asked. > You could claim that *any* polymorphism in Python is useless by the > same argument. It's your position that "for" is semi-useless because of the possibility for mutation. That isn't my position, and that some people write mutating __getattr__ (etc) doesn't make y.z (etc) unattractive to me either. > But Python is not useless; Python code really is reusable; Provided you play along with code's often-undocumented preconditions, absolutely. > and that's because there are good conventions about what the behaviour > *should* be. People who do really find this upsetting should go use a > strongly-typed language. Sorry, I couldn't follow this part. It's a fact that mutating __getattr__ (etc) implementations exist, and it's a fact that I'm not much bothered by it. I don't suggest they move to a different language, either (assuming that by "strongly-typed" you meant "statically typed" -- Python is already strongly typed). > In general, getting "y.z" should be idempotent, and should not mutate y. > I think everyone would agree on the concept. If it does mutate y with > visible effects, then the implementor is breaking the convention. No argument, although I have to emphasize that it's *just* "a convention", and repeat my prediction that the introduction of properties is going to make this particular convention less reliable in real life over time. > Sure, Python won't prevent you from writing a file-like class where you > write the string "blah" to the file by fetching f.blah and you close the > file by mentioning f[42]. But when users of this class then come running > after you with pointed sticks, i'm not going to fight them off. :) While properties aren't going to stop you from saying self.transactionid = self.session_manager.newid and get a new result each time you do it. 
Spelling no-argument method calls without parens is popular in some other languages, and it's "a feature" that properties make that easy to spell in Python 2.2 too. > This is a list of all the type slots accessible from Python, before > iterators (i.e. pre-2.2). Beside each is the answer to the question: > > Suppose you look at the value of x, then do this operation to x, > then look at the value of x. Should we expect the two observed > values to be the same or different? > ... I don't know why you're bothering with this, but it's got holes. For example, some people overly fond of C++ enjoy overloading "<<" in highly non-functional ways. For another, the section on the inplace operators seems confused; after x1 = x x += y there's no single best answer to whether x is x1 is true, or to whether the value of x1 before is == to the value of x1 after. The most popular convention*s* for the inplace operators are - If x is of a mutable type, then x is x1 after, and the pre- and post- values of x1 are !=. - If x is of an immutable type, then x is not x1 after, and the pre- and post- values of x1 are ==. The second case is forced, but the first one isn't. In light of all that, the intended meaning of "different" in > nb_inplace_add different is either incorrect, or so weak that it's not worth much. I suppose you mean that, in Python code x += y the object bound to the name "x" before the operation most likely has a different (!=) value than the object bound to the name "x" after the operation. That's true, but relies on what the generated code does *with* the result of nb_inplace_add. If you just call the method x.__iadd__(y) there's simply no guessing whether x is "different" as a result (it never is for x of an immutable type, it usually is for x of a mutable type, and there's no way to tell the difference just by staring at x). > nb_hex same I sure hope so . > ... > In every case except for __call__, there exists a canonical answer. 
If by "canonical" you mean "most common", sure, with at least the exceptions noted above. > We all rely on these conventions every time we write a Python program. > And learning these conventions is a necessary part of learning Python. > > You can argue, as Guido has, that in the particular case of for-loops > distinguishing between mutating and non-mutating behavior is not worth > the trouble. But you can't say that we should give up on the whole > concept *in general*. To the contrary, in a language with state it's crucial for the programmer to know when they're mutating state. If you use a mutating __getattr__, you better be careful that the code you call doesn't rely on __getattr__ not mutating; if you use an iterator object, you better be careful that the code you call doesn't require something stronger than an iterator object. It's all the same to me, and as Guido repeated until he got tired of it, the possibility for "for" and "x in y" (etc) to mutate has always been there, and has always been used. I didn't and still don't have any notable real-life problems dealing with this, although I too have gotten bit when passing a generator-iterator to code that required a sequence. I suppose the difference is that I said "oops! I screwed up!", fixed it, and moved on. It would have helped most if Python had a scheme for declaring and enforcing interfaces, *and* I bothered to use it (doubtful); second-most if the docs for the callee had spelled out its preconditions better; I doubt it would have helped at all if a variant spelling of "for" had been used, because I didn't eyeball the body of the callee first. As is, I just stuffed the generator-iterator object inside tuple() at the call site, and everything was peachy. That took a lot less effort than reading this thread <0.9 wink>. 
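A minimal sketch of two points from the message above: the generator-iterator mishap with its tuple() fix, and the in-place operator conventions. It is written in present-day Python 3 syntax rather than the Python 2.2 of this thread, and the helper names `pairs` and `squares` are hypothetical, chosen only for illustration.

```python
def pairs(seq):
    """Hypothetical callee that silently requires a re-iterable sequence."""
    return [(x, y) for x in seq for y in seq]


def squares(n):
    """Generator-iterator: it can only be walked once."""
    for i in range(n):
        yield i * i


# Passing the generator directly: the inner loop drains it during the
# first outer step, so the result is silently incomplete.
broken = pairs(squares(3))

# Stuffing it inside tuple() at the call site hands the callee a real sequence.
fixed = pairs(tuple(squares(3)))

# In-place operators: mutable types usually mutate and keep identity...
a = [1, 2]
a1 = a
a += [3]          # a is still a1, and a1's value has changed

# ...while immutable types must rebind the name to a new object.
t = (1, 2)
t1 = t
t += (3,)         # t is a new tuple; t1 is unchanged
```

Nothing in `pairs` raises when it is handed the generator; the only symptom is a shorter result, which is why the mistake is easy to miss.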
From tim.one@comcast.net Sun Jul 21 21:17:46 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 21 Jul 2002 16:17:46 -0400 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: <20020721132550.GC25525@panix.com> Message-ID: [Tim] > i-demand-that-everyone-appreciate-jack-more-too-ly y'rs - tim [Aahz] > My iBook and OSCON class members thank Jack. Great! You're the most appreciate guy we've got here, Aahz. I demand that everyone appreciate you more too! starting-now-ly y'rs - tim From tim.one@comcast.net Sun Jul 21 22:04:02 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 21 Jul 2002 17:04:02 -0400 Subject: [Python-Dev] Added platform-specific directories to sys.path In-Reply-To: Message-ID: [Martin v. Loewis] > If that is the platform convention, I see no problem following > it. Windows already does things differently from Unix, by using the > registry to compute sys.path. FYI, this is mostly a myth. In normal operation for most people, Python never gets any info out of the Windows registry. The Python path in the registry is consulted only in unusual situations, when the Python library can't be found under the directory of the executable that called the sys.path-setting code. This can happen when, e.g., Python is embedded in some other app. The process is quite involved; the comment block at the top of PC/getpathp.c is a good summary. When reading it, note that there normally aren't any "application paths" in the registry; e.g., the PLabs Windows installer doesn't create any such beast. From tdelaney@avaya.com Mon Jul 22 00:23:55 2002 From: tdelaney@avaya.com (Delaney, Timothy) Date: Mon, 22 Jul 2002 09:23:55 +1000 Subject: [Python-Dev] Single- vs. Multi-pass iterability Message-ID: > From: Ka-Ping Yee [mailto:ping@zesty.ca] > > It's just not the way i expect for-loops to work. 
Perhaps we would > need to survey people for objective data, but i feel that most people > would be surprised if > > for x in y: print x > for x in y: print x > > did not print the same thing twice, or if > > if x in y: print 'got it' > if x in y: print 'got it' > > did not do the same thing twice. I realize this is my own opinion, > but it's a fairly strong impression i have. > > Well, for a generator, there is no underlying sequence. > > while 1: print next(gen) > > makes it clear that there is no sequence, but > > for x in gen: print x > > seems to give me the impression that there is. I think this is the crux of the matter. You see for: loops as inherently non-destructive - that they operate on containers. I (and presumably Guido, though I would never presume to channel him ;) see for: loops as inherently destructive - that they operate on iterators. That they obtain an iterator from a container (if possible) is a useful convenience. Perhaps the terminology is confusing. Consider a queue. for each person in the queue: service the person Is there anyone who would *not* consider this to be destructive (of the queue)? Tim Delaney From kevin@koconnor.net Mon Jul 22 00:30:57 2002 From: kevin@koconnor.net (Kevin O'Connor) Date: Sun, 21 Jul 2002 19:30:57 -0400 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: <200207200606.g6K66Um28510@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Sat, Jul 20, 2002 at 02:06:29AM -0400 References: <20020624213318.A5740@arizona.localdomain> <200207200606.g6K66Um28510@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020721193057.A1891@arizona.localdomain> On Sat, Jul 20, 2002 at 02:06:29AM -0400, Guido van Rossum wrote: > > Any chance something like this could make it into the standard python > > library? It would save a lot of time for lazy people like myself. :-) > > > > I have read (or at least skimmed) this entire thread now. 
After I > reconstructed the algorithm in my head, I went back to Kevin's code; I > admire the compactness of his code. I believe that this would make a > good addition to the standard library, as a friend of the bisect > module. Thanks! >The only change I would make would be to make heap[0] the > lowest value rather than the highest. I agree this appears more natural, but a priority queue that pops the lowest priority item is a bit odd. > I propose to call it heapq.py. (Got a better name? Now or never.) > > [*] Afterthought: this could be made into an new-style class by adding > something like this to the end of module: Looks good to me. Thanks again, -Kevin -- ------------------------------------------------------------------------ | Kevin O'Connor "BTW, IMHO we need a FAQ for | | kevin@koconnor.net 'IMHO', 'FAQ', 'BTW', etc. !" | ------------------------------------------------------------------------ From tdelaney@avaya.com Mon Jul 22 00:40:24 2002 From: tdelaney@avaya.com (Delaney, Timothy) Date: Mon, 22 Jul 2002 09:40:24 +1000 Subject: [Python-Dev] Priority queue (binary heap) python code Message-ID: > From: Kevin O'Connor [mailto:kevin@koconnor.net] > On Sat, Jul 20, 2002 at 02:06:29AM -0400, Guido van Rossum wrote: > > >The only change I would make would be to make heap[0] the > > lowest value rather than the highest. > > I agree this appears more natural, but a priority queue that pops the > lowest priority item is a bit odd. I'm in two minds about this. My first thought is that the *first* item (heap[0]) should be the highest priority. OTOH, if it were a sorted list, list[0] would return the *lowest* priority. So i think for consistency heap[0] must return the lowest priority. 
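A small sketch of the two orientations being discussed, using the `heapq` module as it exists in later Python releases (at the time of this thread it was only a proposal). It assumes plain integer priorities; negating them is one common way to pop the highest priority first from a min-heap, not something proposed in the thread.

```python
import heapq

priorities = [5, 1, 9, 3]

# Min-heap: heap[0] is the smallest item, the same value sorted(...)[0] gives.
heap = list(priorities)
heapq.heapify(heap)
lowest_first = heap[0]

# To pop the highest priority first, store negated priorities instead.
max_heap = [-p for p in priorities]
heapq.heapify(max_heap)
highest_first = -heapq.heappop(max_heap)
```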
Tim Delaney From greg@cosc.canterbury.ac.nz Mon Jul 22 01:20:12 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 22 Jul 2002 12:20:12 +1200 (NZST) Subject: [Python-Dev] The iterator story In-Reply-To: <20020719120043.A21503@glacier.arctrix.com> Message-ID: <200207220020.g6M0KCM21823@oma.cosc.canterbury.ac.nz> > Should people prefer to write: > > for item from iterator: > do something > > when they only need to loop over something once? This shows up a problem with Ping's proposal, I think: The place where you write the for-loop isn't the place where you know whether something will be iterated over more than once or not. How is a library routine going to know whether a sequence passed to it is going to be used again later? It's impossible -- global knowledge of the whole program is needed. This appears to leave the library writer with two choices: (1) Use for-in, to be on the safe side, in case the user doesn't want the sequence destroyed -- but then it can't be used on a destructive iterator, even if the caller knows he won't be using it again; (2) use for-from, and force everyone who calls it to adapt sequences to iterators before calling. Either way, things get messy and complicated and possibly dangerous. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Mon Jul 22 02:50:49 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 22 Jul 2002 13:50:49 +1200 (NZST) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Message-ID: <200207220150.g6M1onv22234@oma.cosc.canterbury.ac.nz> "Delaney, Timothy" : > for each person in the queue: > service the person If you actually wrote it that way in Python, it would probably be a bug. 
It would be better written: while there is someone at the head of the queue: service that person Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Mon Jul 22 00:35:38 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 22 Jul 2002 11:35:38 +1200 (NZST) Subject: [Python-Dev] Single- vs. Multi-pass iterability In-Reply-To: Message-ID: <200207212335.g6LNZcU21438@oma.cosc.canterbury.ac.nz> Ka-Ping Yee : > I believe this is where the biggest debate lies: whether "for" should be > non-destructive. It's not the for-loop's fault if its argument is of such a nature that iterating over it destroys it. Given suitable values for x and y, it's possible for evaluating "x+y" to be a destructive operation. Does that mean we should revise the "+" protocol somehow to prevent this from happening? I don't think so. This sort of thing is all-pervasive in Python due to its dynamic nature. It's not something that can be easily "fixed", even if it were desirable to do so. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From ping@zesty.ca Mon Jul 22 03:55:22 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Sun, 21 Jul 2002 19:55:22 -0700 (PDT) Subject: [Python-Dev] The iterator story In-Reply-To: <200207220020.g6M0KCM21823@oma.cosc.canterbury.ac.nz> Message-ID: I'm in a bit of a bind. I know at this point that Guido's already made up his mind so there's nothing further to be gained by debating the issue; yet i feel compelled to respond as long as people keep missing the idea or saying things that don't make sense.
So: this is a clarification, not a push. I am going to reply to a few messages at once, to reduce the number of messages that i'm sending on this topic. If you're planning to reply on this thread, please read the whole message before replying. * * * On Mon, 22 Jul 2002, Greg Ewing wrote: > This shows up a problem with Ping's proposal, I think: > The place where you write the for-loop isn't the place > where you know whether something will be iterated over > more than once or not. When you write the for-loop, you decide whether you want to consume the sequence. You use the convention and expect the implementor of the sequence object to adhere to it. > How is a library routine going > to know whether a sequence passed to it is going to > be used again later? You've got this backwards. You write the library routine the way that makes sense, and then you document whether the sequence gets destroyed or not. That declaration becomes part of your interface, and users of your routine can then determine how to use it safely for their needs. (Analogy: how does the implementor of file.close() know whether the caller wants to use the file again later? Answer: it's not the implementor's job to know that. We document what file.close() does, and people only *decide* to call file.close() when they don't need the file anymore.) Without a convention to distinguish between destruction and non-destruction, you can't establish what the library routine does; so you can't document it; so you can't use it safely *even* if you trust the implementor. No implementation would ever make it possible for your library routine to claim that it "does with the elements of a given sequence without destroying the sequence". Now if you do have a convention -- yes, you still have to trust implementors to follow the convention -- but if they do so, you're okay. 
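A minimal sketch of what "documenting whether the sequence gets destroyed" can look like, in present-day Python 3 syntax. The routine names `total` and `spread` are hypothetical; the docstrings carry the convention, and nothing here enforces it.

```python
def total(items):
    """Sum the items in a single pass.

    Accepts any iterable. If `items` is an iterator it is exhausted on
    return, so the caller must not expect to loop over it again.
    """
    result = 0
    for x in items:
        result += x
    return result


def spread(items):
    """Return max minus min, walking `items` twice.

    Requires a re-iterable container (list, tuple, ...), not a bare
    iterator.
    """
    return max(items) - min(items)


data = [3, 1, 2]
it = iter(data)

total_from_list = total(data)   # data is untouched afterwards
total_from_iter = total(it)     # it is now exhausted
leftover = list(it)             # nothing remains to iterate over
data_spread = spread(data)      # fine: a list can be walked twice
```

With the contract written down, the decision moves to the call site: the caller who passes `it` to `total` has chosen to give the iterator up, much as with file.close().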
* * * > This appears to leave the library writer with two > choices: (1) Use for-in, to be on the safe side, > in case the user doesn't want the sequence destroyed -- > but then it can't be used on a destructive iterator, No, it can. The documentation for the library routine will state that it wants a sequence. If the caller wants to use x and x is an iterator, it passes in seq(x). No problem. The caller has thereby declared that it's okay to destroy x. To make it more obvious what is going on, i should have chosen a better name; 'seq' was poor. Let's rename 'seq' to 'consume'. consume(i) returns an object x such that iter(x) is i. So calling 'consume' implies that you are consuming an iterator. All right. Then consider: for x in consume(y): print x The above is clear that y is being destroyed. Now consider: def printout(sequence): for x in sequence: print x If y is an iterator, in my world you would not be able to call "printout(y)". You would say "printout(consume(y))", thus making it clear that y is being destroyed. > (2) use for-from, and force everyone who calls it to > adapt sequences to iterators before calling. Since for-in is non-destructive, it is safer, and it is also more common to have a sequence than an iterator. So i would usually choose option 1 rather than 2. But sure, you can write for-from, if you want. I mean, if you decide to accept strings, then users who want to pass in integers will have to str() them first. If you decide to accept integers, then users who want to pass in strings will have to int() them first. This is no great dilemma. We actually like this. * * * Hereafter i'll stick to existing syntax, because the business of introducing syntax isn't really the main point. I'll use the alternative i proposed, which is to use the built-in instead. So we'd say for i in consume(it): ... instead of for i from it: ... Tim Delaney wrote: > I think this is the crux of the matter. 
You see for: loops as inherently > non-destructive - that they operate on containers. I (and presumably > Guido, though I would never presume to channel him ;) see for: loops as > inherently destructive - that they operate on iterators. That they obtain > an iterator from a container (if possible) is a useful convenience. I believe your interpretation of opinions is correct on all counts. Except i would point out that for-loops are not always destructive; most of the time, they are not, and that is why i consider the destructive behaviour surprising and worth making visible. > Perhaps the terminology is confusing. Consider a queue. > > for each person in the queue: > service the person > > Is there anyone who would *not* consider this to be destructive (of the > queue)? Well, the only reason you can tell is that you can see the context from the meanings of the words "queue" and "service". If you said for person in consume(queue): service(person) then that would truly be clear, even if you used different variable names, because the 'consume' built-in expresses that the queue will be consumed. * * * Greg Ewing wrote: > Given suitable values for x and y, it's possible for evaluating "x+y" > to be a destructive operation. Does that mean we should revise the > "+" protocol somehow to prevent this from happening? I don't think so. Augh! I'm just not getting through here. We all know that the Python philosophy is to trust the implementors of protocols instead of enforcing behaviour. That's not the point. Of course it's POSSIBLE for "x + y" to be destructive. That doesn't mean it SHOULD be. We all know that "x + y" is normally not destructive, and that's what counts. That understanding enables me to implement __add__ in a way that will not screw you over when you use it. All i'm saying is that there should be a way to *express* safe iteration (and safe "element in container" tests). Guido's pronouncement is "Nope. Don't need it." 
Although i disagree, i am willing to respect that. But please don't confuse a lack of enforcement with a lack of convention. Convention is all we have. -- ?!ng From tim.one@comcast.net Mon Jul 22 04:09:11 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 21 Jul 2002 23:09:11 -0400 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: Message-ID: [Guido] > The only change I would make would be to make heap[0] the > lowest value rather than the highest. [Kevin O'Connor] > I agree this appears more natural, but a priority queue that pops the > lowest priority item is a bit odd. So now the fellow who wrote the code to begin with squirms at what will happen if it's actually put in the std library, and sounds like he would continue using his own code. [Delaney, Timothy] > I'm in two minds about this. My first thought is that the *first* item > (heap[0]) should be the highest priority. > > OTOH, if it were a sorted list, list[0] would return the *lowest* > priority. On the third hand, if you're using heaps for sorting (as in a heapsort), it's far more natural to have a max-heap -- else the sort can't be done in-place (with a max-heap you pop the largest value, copy it to the last array slot, pretend the array is one shorter, and trickle what *was* in the last array slot back into the now-one-smaller max-heap; repeat N-1 times and you've sorted the array in-place). On the fourth hand, if you want a *bounded* priority queue, to remember only the N best-scoring (largest-priority) objects for some fixed N, then (perhaps paradoxically) a min-heap is what you need. On the fifth hand, if you want to process items in priority order (highest first) interleaved with entering new items, then you need a max-heap. I suspect that's what Kevin does. > So i think for consistency heap[0] must return the lowest priority.
On the sixth hand, anyone who has implemented a heap in another 0-based language expects the first slot in the array to be unused, in order to simplify the indexing (parent = child >> 1 uniformly if the root is at index 1), and to ensure that all nodes on the same level have indices with the same leading bit (which can be helpful in advanced algorithms -- then, e.g., you know that i and j are on the same level of the tree if and only if i&j > i^j; maybe that's not obvious at first glance). Priority queues just aren't a one-size-fits-all thing. From drifty@bigfoot.com Mon Jul 22 04:23:03 2002 From: drifty@bigfoot.com (Brett Cannon) Date: Sun, 21 Jul 2002 20:23:03 -0700 (PDT) Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? In-Reply-To: Message-ID: [Tim Peters] > [Tim] > > i-demand-that-everyone-appreciate-jack-more-too-ly y'rs - tim > > [Aahz] > > My iBook and OSCON class members thank Jack. > > Great! You're the most appreciate guy we've got here, Aahz. I demand that > everyone appreciate you more too! > I appreciate everyone everywhere for everything. =) my-Berkeley-education-has-turned-me-hippie-ly y'rs -Brett From tim.one@comcast.net Mon Jul 22 05:01:48 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 22 Jul 2002 00:01:48 -0400 Subject: [Python-Dev] Sorting In-Reply-To: Message-ID: Just FYI. I ripped out the complications I added to the mergesort variant that tried to speed many-equal-keys cases, and worked on its core competency (intelligent merging) instead. There's a reason: this kick started while investigating ways to speed Zope B-Tree operations when they're used as sets, equal keys are impossible in that context, but intelligent merging can really help. So whatever the fate of this sort, some of the code will live on in Zope's B-Tree routines. The result is that non-trivial cases of near-order got a nice boost, while ~sort got even slower again.
I added a new test +sort, which replaces the last 10 values of a sorted array with random values. samplesort has a special case for this, limited to a maximum of 15 trailing out-of-order entries. timsort has no special case for this but does it significantly faster than the samplesort hack anyway, has no limit on how many such trailing entries it can exploit, and couldn't care less whether such entries are at the front or the end of the array; I expect it would be (just) a little slower if they were in the middle. As shown below, timsort does a +sort almost as fast as for a wholly-sorted array. Ditto now for 3sort too, which perturbs order by doing 3 random exchanges in a sorted array. It's become a very interesting sort implementation, handling more kinds of near-order at demonstrably supernatural speed than anything else I'm aware of. ~sort isn't an example of near-order. Quite the contrary, it has a number of inversions quadratic in N, and N/4 runs; the only reason ~sort goes faster than *sort now is-- believe it or not --a surprising benefit from a memory optimization. 
Key:
    *sort: random data
    \sort: descending data
    /sort: ascending data
    3sort: ascending, then 3 random exchanges
    +sort: ascending, then 10 random at the end
    ~sort: many duplicates
    =sort: all equal
    !sort: worst case scenario

C:\Code\python\PCbuild>python -O sortperf.py 15 20 1
samplesort
 i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
15   32768   0.18   0.02   0.01   0.14   0.01   0.07   0.01   0.17
16   65536   0.24   0.02   0.02   0.22   0.02   0.08   0.02   0.24
17  131072   0.53   0.05   0.04   0.49   0.05   0.18   0.04   0.52
18  262144   1.16   0.09   0.09   1.06   0.12   0.37   0.09   1.13
19  524288   2.53   0.18   0.17   2.30   0.24   0.74   0.17   2.47
20 1048576   5.47   0.37   0.35   5.17   0.45   1.51   0.35   5.34

timsort
 i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
15   32768   0.17   0.01   0.01   0.01   0.01   0.14   0.01   0.02
16   65536   0.23   0.02   0.02   0.03   0.02   0.21   0.03   0.04
17  131072   0.53   0.04   0.04   0.05   0.04   0.46   0.04   0.09
18  262144   1.16   0.09   0.09   0.12   0.09   1.01   0.08   0.18
19  524288   2.53   0.18   0.17   0.18   0.18   2.20   0.17   0.36
20 1048576   5.48   0.36   0.35   0.36   0.37   4.78   0.35   0.73

From greg@cosc.canterbury.ac.nz Mon Jul 22 05:50:02 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jul 2002 16:50:02 +1200 (NZST)
Subject: [Python-Dev] The iterator story
In-Reply-To:
Message-ID: <200207220450.g6M4o2u23472@oma.cosc.canterbury.ac.nz>

Ka-Ping Yee :

> When you write the for-loop, you decide whether you want
> to consume the sequence.

As someone pointed out, it's pretty rare that you actually *want* to consume the sequence. Usually the choice is between "I don't care" and "The sequence must NOT be consumed". Of the two varieties of for-loop in your proposal, for-in obviously corresponds to the "must not be consumed" case, leading one to suppose that you intend for-from to be used in the don't-care case.
But now you seem to be suggesting that library routines should always use for-in, and that the caller should convert an iterator to a sequence if he knows it's okay to consume it:

> Since for-in is non-destructive, it is safer, and it is also
> more common to have a sequence than an iterator.
> ...
> If y is an iterator, in my world you would not be able to
> call "printout(y)". You would say "printout(consume(y))

Okay, that seems reasonable -- explicit is better than implicit. But... consider the following two library routines:

    def printout1(s):
        for x in s:
            print x

    def printout2(s):
        for x in s:
            for y in s:
                print x, y

Clearly it's okay to call printout1(consume(s)), but it's NOT okay to call printout2(consume(s)). So we need to document these requirements:

    def printout1(s):
        "s may be an iterator or sequence"
        for x in s:
            print x

    def printout2(s):
        "s MUST be a sequence, NOT an iterator!"
        for x in s:
            for y in s:
                print x, y

But now there's nothing to enforce these requirements -- no exception will be raised if you call printout2(consume(s)) by mistake. To get any safety benefit from your proposed arrangement, it seems to me that you'd need to write printout1 as

    def printout1(s):
        "s must be an iterator"
        for x from s:
            print x

and then in the (overwhelmingly most common) case of passing it a sequence, you would need to call it as

    printout1(iter(s))

-- unless you allow the for-from protocol to automatically obtain an iterator from a sequence if possible, the way for-in currently does.

> Greg Ewing wrote:
> > Given suitable values for x and y, it's possible for evaluating "x+y"
> > to be a destructive operation. Does that mean we should revise the
> > "+" protocol somehow to prevent this from happening? I don't think so.
>
> Augh! I'm just not getting through here.

Sorry, I wrote that before I saw your full proposal.
I understand your point of view much better now, and even sympathise with it to some extent -- something like the for-from syntax actually passed through my mind shortly before I saw it in your post. There's no doubt that it's very elegant theoretically, but in thinking through the implications, I'm not sure it would be all that helpful in practice, and might even turn out to be a nuisance if it requires putting in a lot of iter(x) and/or consume(x) calls. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tim.one@comcast.net Mon Jul 22 07:05:14 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 22 Jul 2002 02:05:14 -0400 Subject: [Python-Dev] Sorting In-Reply-To: Message-ID: One more piece of this puzzle. It's possible that one of {samplesort, timsort} would become unboundedly faster as the cost of comparisons increased over that of Python floats (which all the timings I posted used). 
Here's a program that would show this if so, using my local Python, where lists have an .msort() method:

"""
class SlowCmp(object):

    __slots__ = ['val']

    def __init__(self, val):
        self.val = val

    def __lt__(self, other):
        # Burn time in proportion to SLOW so each comparison costs more.
        for i in range(SLOW):
            i*i
        return self.val < other.val

def drive(n):
    from random import randrange
    from time import clock as now

    n10 = n * 10
    L = [SlowCmp(randrange(n10)) for i in xrange(n)]
    L2 = L[:]
    t1 = now()
    L.sort()      # samplesort
    t2 = now()
    L2.msort()    # timsort (local build only)
    t3 = now()
    return t2-t1, t3-t2

for SLOW in 1, 2, 4, 8, 16, 32, 64, 128:
    print "At SLOW value", SLOW
    for n in range(1000, 10001, 1000):
        ss, ms = drive(n)
        print " %6d %6.2f %6.2f %6.2f" % (
              n, ss, ms, 100.0*(ss - ms)/ms)
"""

Here's the tail end of the output, from which I conclude that the number of comparisons done on random inputs is virtually identical for the two methods; times vary by a fraction of a percent both ways, with no apparent pattern (note that time.clock() has better than microsecond resolution on Windows, so the times going into the % calculation have more digits than are displayed here):

At SLOW value 32
   1000   0.22   0.22  -0.05
   2000   0.50   0.50   0.10
   3000   0.80   0.80  -0.64
   4000   1.11   1.10   0.71
   5000   1.44   1.45  -0.12
   6000   1.77   1.76   0.72
   7000   2.10   2.09   0.31
   8000   2.43   2.41   0.79
   9000   2.78   2.80  -0.58
  10000   3.13   3.13  -0.01
At SLOW value 64
   1000   0.37   0.38  -1.00
   2000   0.83   0.83   0.20
   3000   1.33   1.33  -0.15
   4000   1.84   1.84   0.05
   5000   2.40   2.39   0.38
   6000   2.95   2.92   0.97
   7000   3.46   3.47  -0.20
   8000   4.04   4.01   0.87
   9000   4.60   4.63  -0.68
  10000   5.19   5.21  -0.33
At SLOW value 128
   1000   0.68   0.67   0.37
   2000   1.52   1.50   0.99
   3000   2.40   2.41  -0.67
   4000   3.35   3.32   1.03
   5000   4.30   4.32  -0.47
   6000   5.32   5.29   0.54
   7000   6.27   6.27   0.04
   8000   7.29   7.25   0.55
   9000   8.37   8.37  -0.03
  10000   9.39   9.43  -0.49

From oren-py-d@hishome.net Mon Jul 22 07:08:18 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 22 Jul 2002 09:08:18 +0300
Subject: [Python-Dev] The iterator story
In-Reply-To: <20020721144108.GA5608@panix.com>; from aahz@pythoncraft.com on Sun, Jul
21, 2002 at 10:41:08AM -0400
References: <20020719120043.A21503@glacier.arctrix.com> <3D389369.948547E0@metaslash.com> <20020719162226.A22929@glacier.arctrix.com> <20020720132857.GB5862@hishome.net> <20020720140923.GA18716@panix.com> <20020721083340.A13156@hishome.net> <20020721144108.GA5608@panix.com>
Message-ID: <20020722090818.A5576@hishome.net>

On Sun, Jul 21, 2002 at 10:41:08AM -0400, Aahz wrote:
> > The transformations are fine the way they are. The problem is the
> > source - if the source is an exhausted iterator and you ask it for a
> > new iterator it will happily return itself and report StopIteration
> > on each .next(). This behavior is indistinguishable from a valid
> > iterator on an empty container.
>
> So the problem lies in asking the source for a new iterator, not in
> trying to use it. Making the iterator consumer responsible for handling
> this seems like the wrong approach to me -- the consumer *shouldn't* be
> able to tell the difference. If you're breaking that paradigm, you
> don't actually have an iterator consumer, you've got something else that
> wants to use the iterator interface, *plus* some additional features.

Tuples are very much like lists except that they cannot be modified. A lot of code that was written with lists in mind can actually use tuples. If you pass a tuple to a function that tries to use the "additional feature" of mutability you will get an exception.

Pipes are very much like files except that they cannot be seeked. A lot of code that was written with files in mind can actually use pipes. If you pass a pipe to a function that tries to use the "additional feature" of seekability you will get an exception.

Iterators are very much like iterable containers except that they can only be iterated once. A lot of code that was written with containers in mind can actually use iterators. If you pass an iterator to a function that tries to use the "additional feature" of re-iterability you... will not get an exception.
You'll get nonsense results because on the second pass the iterator will fail silently and suddenly pretend to be an empty container.

Would you say that any code that expects a seekable file or a mutable sequence is "breaking the paradigm"? Why should code that expects a re-iterable container be different from code that uses any other protocol that has several variations and subsets/supersets?

> The way Python normally handles issues like this is through
> documentation. (I.e., if your consumer requires an iterable capable of
> producing multiple iterators rather than an iterator object, you document
> that.)

The way Python normally handles issues of code trying to use a protocol that the object does not support is through *exceptions*. When a 5000+ line program produces meaningless results, documentation is not a very helpful place to start looking for the problem. An exception gives you an approximate line number and reason.

If __setitem__ on a tuple was ignored instead of producing an exception or seek on a pipe failed silently I don't think that anyone would find "don't do that, then" or "documentation" to be a satisfactory answer.

Oren

From mwh@python.net Mon Jul 22 11:03:10 2002
From: mwh@python.net (Michael Hudson)
Date: 22 Jul 2002 11:03:10 +0100
Subject: [Python-Dev] Added platform-specific directories to sys.path
In-Reply-To: barry@zope.com's message of "Fri, 19 Jul 2002 17:48:53 -0400"
References: <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com> <200207192123.g6JLN7s15263@pcp02138704pcs.reston01.va.comcast.net> <15672.35141.803094.488541@anthem.wooz.org>
Message-ID: <2m1y9w3wrl.fsf@starship.python.net>

barry@zope.com (Barry A. Warsaw) writes:

> >>>>> "GvR" == Guido van Rossum writes:
>
> GvR> Traditionally, on Unix per-user extensions are done by
> GvR> pointing PYTHONPATH to your per-user directory (-ies) in your
> GvR> .profile.
>
> Or adding them to sys.path via your $PYTHONSTARTUP file.

That only helps for interactive sessions...
> OTOH, it might be nice if the distutils `install' command had some > switches to make installing in some of these common alternative > locations a little easier. That might dovetail nicely if/when we > decide to add a site-updates directory to sys.path. I don't see what's so very difficult about $ python setup.py install --prefix=$HOME but maybe I'm odd. Cheers, M. -- $ head -n 2 src/bash/bash-2.04/unwind_prot.c /* I can't stand it anymore! Please can't we just write the whole Unix system in lisp or something? */ -- spotted by Rich van der Hoff From Jack.Jansen@cwi.nl Mon Jul 22 13:02:15 2002 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Mon, 22 Jul 2002 14:02:15 +0200 Subject: [Python-Dev] Added platform-specific directories to sys.path In-Reply-To: <2m1y9w3wrl.fsf@starship.python.net> Message-ID: On Monday, July 22, 2002, at 12:03 , Michael Hudson wrote: > I don't see what's so very difficult about > > $ python setup.py install --prefix=$HOME This is what you use if you have built Python yourself, and installed it in your home directory. What I was referring to (as the setup that isn't very well supported right now) is the situation where the system admin has built and installed Python in, say, /usr/local, and you want to install a distutils-based packaged for your own private use. Setting PYTHONPATH to be $HOME/lib/python-extensions or something similar is what people customarily do to get access to their private modules, but there is no standard, and hence also no way for distutils to find the pathname and provide an easy interface to do this. 
-- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From mwh@python.net Mon Jul 22 13:56:33 2002 From: mwh@python.net (Michael Hudson) Date: 22 Jul 2002 13:56:33 +0100 Subject: [Python-Dev] Added platform-specific directories to sys.path In-Reply-To: Jack Jansen's message of "Mon, 22 Jul 2002 14:02:15 +0200" References: Message-ID: <2mptxfncou.fsf@starship.python.net> Jack Jansen writes: > On Monday, July 22, 2002, at 12:03 , Michael Hudson wrote: > > I don't see what's so very difficult about > > > > $ python setup.py install --prefix=$HOME > > This is what you use if you have built Python yourself, and installed it > in your home directory. In that case, the --prefix arg is unnecessary. > What I was referring to (as the setup that isn't very well supported > right now) is the situation where the system admin has built and > installed Python in, say, /usr/local, and you want to install a > distutils-based packaged for your own private use. That's when I do the above. > Setting PYTHONPATH to be $HOME/lib/python-extensions or something > similar is what people customarily do to get access to their private > modules, but there is no standard, and hence also no way for distutils > to find the pathname and provide an easy interface to do this. My setup requires setting $PYTHONPATH too, so it's not ideal, but it works. Cheers, M. -- Reading Slashdot can [...] often be worse than useless, especially to young and budding programmers: it can give you exactly the wrong idea about the technical issues it raises. -- http://www.cs.washington.edu/homes/klee/misc/slashdot.html#reasons From sholden@holdenweb.com Mon Jul 22 14:53:04 2002 From: sholden@holdenweb.com (Steve Holden) Date: Mon, 22 Jul 2002 09:53:04 -0400 Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows? 
References: Message-ID: <021001c23187$1aa9b230$6300000a@holdenweb.com> ----- Original Message ----- From: "Brett Cannon" To: "Tim Peters" Cc: Sent: Sunday, July 21, 2002 11:23 PM Subject: RE: [Python-Dev] Is __declspec(dllexport) really needed on Windows? > [Tim Peters] > > > [Tim] > > > i-demand-that-everyone-appreciate-jack-more-too-ly y'rs - tim > > > > [Aahz] > > > My iBook and OSCON class members thank Jack. > > > > Great! You're the most appreciate guy we've got here, Aahz. I demand that > > everyone appreciate you more too! > > > > I appreciate everyone everywhere for everything. =) > > my-Berkeley-education-has-turned-me-hippie-ly y'rs -Brett > I appreciate the set of all things that are insufficiently appreciated, and each of its under-appreciated members but-it-won't-necessarily-make-a-difference-ly y'rs - steve ----------------------------------------------------------------------- Steve Holden http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/pwp/ ----------------------------------------------------------------------- From barry@zope.com Mon Jul 22 15:14:01 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 22 Jul 2002 10:14:01 -0400 Subject: [Python-Dev] Added platform-specific directories to sys.path References: <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com> <200207192123.g6JLN7s15263@pcp02138704pcs.reston01.va.comcast.net> <15672.35141.803094.488541@anthem.wooz.org> <2m1y9w3wrl.fsf@starship.python.net> Message-ID: <15676.4905.813038.253158@anthem.wooz.org> >>>>> "MH" == Michael Hudson writes: >> your GvR> .profile. Or adding them to sys.path via your >> $PYTHONSTARTUP file. MH> That only helps for interactive sessions... Yup, which might or might not be good enough. I'm thinking of the (X)Emacs arrangement that there are system startup files and user startup files that are normally always loaded, unless you use a command line switch to specifically disable them. 
>> OTOH, it might be nice if the distutils `install' command had >> some switches to make installing in some of these common >> alternative locations a little easier. That might dovetail >> nicely if/when we decide to add a site-updates directory to >> sys.path. MH> I don't see what's so very difficult about MH> $ python setup.py install --prefix=$HOME Actually, to do it correctly (and quietly) this appears to be the most accurate way to tell distutils to install a library in an alternative search path: % PYTHONPATH= python setup.py --quiet install --install-lib \ --install-purelib A bit less than intuitive than say, a standard alternative user-centric installation directory and a --userdir option to the install command. -Barry From barry@zope.com Mon Jul 22 16:09:49 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 22 Jul 2002 11:09:49 -0400 Subject: [Python-Dev] Sorting References: <20020720130000.GA11845@panix.com> Message-ID: <15676.8253.741856.171571@anthem.wooz.org> >>>>> "A" == Aahz writes: A> Any reason the list object can't grow a .stablesort() method? Because when a user looks at the methods of a list object and sees both .sort() and .stablesort() you now need to explain the difference, and perhaps give some hint as to why you'd want to choose one over the other. Maybe the teachers-of-Python in this crowd can give some insight into whether 1) they'd actually do this or just hand wave past the difference, or 2) whether it would be a burden to teaching. I'm specifically thinking of the non-programmer crowd learning Python. I would think that most naive uses of list.sort() would expect a stable sort and wouldn't care much about any performance penalties involved. I'd put my own uses squarely in the "naive" camp. 
;) I'd prefer to see - .sort() actually /be/ a stable sort in the default case - list objects not be burdened with additional sorting methods (that way lies a footing-challenged incline) - provide a module with more advanced sorting options, with functions suitable for list.sort()'s cmpfunc, and with derived classes (perhaps in C) of list for better performance. -Barry From xscottg@yahoo.com Mon Jul 22 16:44:15 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Mon, 22 Jul 2002 08:44:15 -0700 (PDT) Subject: [Python-Dev] Sorting In-Reply-To: <15676.8253.741856.171571@anthem.wooz.org> Message-ID: <20020722154415.79981.qmail@web40111.mail.yahoo.com> --- "Barry A. Warsaw" wrote: > > Because when a user looks at the methods of a list object and sees > both .sort() and .stablesort() you now need to explain the difference, > and perhaps give some hint as to why you'd want to choose one over the > other. > Or you could have an optional parameter that defaults to whatever the more sane value should be (probably stable), and when the user stumbles across this parameter they stumble across the docs too. I think Tim's codebloat argument is more compelling. __________________________________________________ Do You Yahoo!? Yahoo! 
Health - Feel better, live better http://health.yahoo.com From skip@mojam.com Mon Jul 22 17:06:58 2002 From: skip@mojam.com (Skip Montanaro) Date: Mon, 22 Jul 2002 11:06:58 -0500 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200207221606.g6MG6wT20010@12-248-11-90.client.attbi.com> Bug/Patch Summary ----------------- 262 open / 2681 total bugs (+5) 143 open / 1613 total patches (+15) New Bugs -------- OSX IDE behaviour (output to console) (2002-06-24) http://python.org/sf/573174 pydoc(.org) does not find file.flush() (2002-06-26) http://python.org/sf/574057 Chained __slots__ dealloc segfault (2002-06-26) http://python.org/sf/574207 convert_path fails with empty pathname (2002-06-26) http://python.org/sf/574235 Automated daily documentation builds (2002-06-26) http://python.org/sf/574241 Tex Macro Error (2002-06-27) http://python.org/sf/574939 multiple inheritance w/ slots dumps core (2002-06-28) http://python.org/sf/575229 Parts of 2.2.1 core use old gc API (2002-06-30) http://python.org/sf/575715 os.spawnv() fails with underscores (2002-06-30) http://python.org/sf/575770 Negative __len__ provokes SystemError (2002-06-30) http://python.org/sf/575773 Inconsistent behaviour in re grouping (2002-07-01) http://python.org/sf/576079 Sig11 in cPickle (stack overflow) (2002-07-01) http://python.org/sf/576084 Infinite recursion in Pickle (2002-07-02) http://python.org/sf/576419 Windows binary missing SSL (2002-07-02) http://python.org/sf/576711 os.path.walk behavior on symlinks (2002-07-03) http://python.org/sf/576975 inheriting from property and docstrings (2002-07-03) http://python.org/sf/576990 Wrong description for PyErr_Restore (2002-07-03) http://python.org/sf/577000 Print line number of string if at EOF (2002-07-04) http://python.org/sf/577295 ** in doc/current/lib/operator-map.html (2002-07-04) http://python.org/sf/577513 del __builtins__ breaks out of rexec (2002-07-04) http://python.org/sf/577530 System Error with slots and multi-inh 
(2002-07-05) http://python.org/sf/577777 resire readonly memory mapped file (2002-07-05) http://python.org/sf/577782 Docs unclear about cleanup. (2002-07-05) http://python.org/sf/577793 Explain how to subclass Exception (2002-07-06) http://python.org/sf/578180 pthread_exit missing in thread_pthread.h (2002-07-09) http://python.org/sf/579116 LibRef 2.2.1, replace zero with False (2002-07-11) http://python.org/sf/579991 Subclassing WeakValueDictionary impossib (2002-07-11) http://python.org/sf/580107 GC Changes not mentioned in What's New (2002-07-12) http://python.org/sf/580462 mimetools module privacy leak (2002-07-12) http://python.org/sf/580495 MacOSX python.app build problems (2002-07-12) http://python.org/sf/580550 import lock should be exposed (2002-07-13) http://python.org/sf/580952 Provoking infinite scanner loops (2002-07-13) http://python.org/sf/581080 smtplib.SMTP.ehlo method esmtp_features (2002-07-13) http://python.org/sf/581165 bug in splituser(host) in urllib (2002-07-14) http://python.org/sf/581529 pty.spawn - wrong error caught (2002-07-15) http://python.org/sf/581698 ''.split() docstring clarification (2002-07-15) http://python.org/sf/582071 pickle error message unhelpful (2002-07-16) http://python.org/sf/582297 lib-dynload/*.so wrong permissions (2002-07-17) http://python.org/sf/583206 ConfigParser spaces in keys not read (2002-07-18) http://python.org/sf/583248 wrong dest size (2002-07-18) http://python.org/sf/583477 gethostbyaddr lag (2002-07-19) http://python.org/sf/583975 add way to detect bsddb version (2002-07-21) http://python.org/sf/584409 os.getlogin() fails (2002-07-21) http://python.org/sf/584566 no doc for os.fsync and os.fdatasync (2002-07-21) http://python.org/sf/584695 New Patches ----------- Deprecate bsddb (2002-05-06) http://python.org/sf/553108 Executable .pyc-files with hashbang (2002-06-23) http://python.org/sf/572796 (?(id/name)yes|no) re implementation (2002-06-23) http://python.org/sf/572936 cgi.py and rfc822.py unquote 
fixes (2002-06-24) http://python.org/sf/573197 Changing owner of symlinks (2002-06-25) http://python.org/sf/573770 makesockaddr, use addrlen with AF_UNIX (2002-06-27) http://python.org/sf/574707 Make python-mode.el use jython (2002-06-27) http://python.org/sf/574747 Make python-mode.el use "jython" interp (2002-06-27) http://python.org/sf/574750 list.extend docstring fix (2002-06-27) http://python.org/sf/574867 PyTRASHCAN slots deallocation (2002-06-28) http://python.org/sf/575073 python-mode patch for ipython support (2002-06-30) http://python.org/sf/575774 SSL release GIL (2002-06-30) http://python.org/sf/575827 Alternative implementation of interning (2002-07-01) http://python.org/sf/576101 Extend PyErr_SetFromWindowsErr (2002-07-02) http://python.org/sf/576458 Remove PyArg_Parse() and METH_OLDARGS (2002-07-03) http://python.org/sf/577031 Merge xrange() into slice() (2002-07-05) http://python.org/sf/577875 fix for problems with test_longexp (2002-07-06) http://python.org/sf/578297 Put IDE scripts in ~/Library (2002-07-08) http://python.org/sf/578667 incompatible, but nice strings improveme (2002-07-08) http://python.org/sf/578688 Solaris openpty() and forkpty() addition (2002-07-09) http://python.org/sf/579433 Shadow Password Support Module (2002-07-09) http://python.org/sf/579435 Build MachoPython with 2level namespace (2002-07-10) http://python.org/sf/579841 xreadlines caching, file iterator (2002-07-11) http://python.org/sf/580331 less restrictive HTML comments (2002-07-12) http://python.org/sf/580670 Fix for seg fault on test_re on mac osx (2002-07-12) http://python.org/sf/580869 new version of Set class (2002-07-13) http://python.org/sf/580995 Canvas "select_item" always returns None (2002-07-14) http://python.org/sf/581396 info reader bug (2002-07-14) http://python.org/sf/581414 fix to pty.spawn error on Linux (2002-07-15) http://python.org/sf/581705 Alternative PyTRASHCAN subtype_dealloc (2002-07-15) http://python.org/sf/581742 smtplib.py patch for 
macmail esmtp auth (2002-07-17) http://python.org/sf/583180 make file object an iterator (2002-07-17) http://python.org/sf/583235 get python to link on OSF1 (Dec Unix) (2002-07-20) http://python.org/sf/584245 yield allowed in try/finally (2002-07-21) http://python.org/sf/584626 Closed Bugs ----------- ihooks on windows and pythoncom (PR#294) (2000-07-31) http://python.org/sf/210637 httplib does not check if port is valid (easy to fix?) (2000-12-13) http://python.org/sf/225744 httplib problem with '100 Continue' (2001-01-02) http://python.org/sf/227361 [windows] os.popen doens't kill subprocess when interrupted (2001-02-06) http://python.org/sf/231273 += not assigning to same var it reads (2001-04-21) http://python.org/sf/417930 httplib: multiple Set-Cookie headers (2001-06-12) http://python.org/sf/432621 [win32] KeyboardInterrupt Not Caught (2001-07-10) http://python.org/sf/439992 Evaluating func_code causing core dump (2001-07-23) http://python.org/sf/443866 HTTPSConnect.__init__ too tricky (2001-09-04) http://python.org/sf/458463 base n integer to string conversion (2001-09-25) http://python.org/sf/465045 Tut: Dict used before dicts explained (2001-11-10) http://python.org/sf/480337 SAX Attribute/AttributesNS class missing (2001-11-22) http://python.org/sf/484603 Error building info docs (2001-12-20) http://python.org/sf/495624 'lambda' documentation in strange place (2001-12-27) http://python.org/sf/497109 unicode() docs don't mention LookupError (2002-02-06) http://python.org/sf/513666 bogus URLs cause exception in httplib (2002-03-07) http://python.org/sf/527064 Nested Scopes bug (Confirmed) (2002-03-10) http://python.org/sf/528274 Build unable to import w/gcc 3.0.4 (2002-04-11) http://python.org/sf/542737 buffer slice type inconsistant (2002-04-20) http://python.org/sf/546434 urllib/httplib vs corrupted tcp/ip stack (2002-04-22) http://python.org/sf/547093 Unicode encoders appears to leak references (2002-04-28) http://python.org/sf/549731 email.Utils.encode 
doesn't obey rfc2047 (2002-05-06) http://python.org/sf/552957 unittest.TestResult documentation (2002-05-20) http://python.org/sf/558278 HTTPSConnection memory leakage (2002-05-22) http://python.org/sf/559117 Getting traceback in embedded python. (2002-06-01) http://python.org/sf/563338 urllib2 can't cope with error response (2002-06-02) http://python.org/sf/563665 compile traceback must include filename (2002-06-05) http://python.org/sf/564931 Misleading string constant. (2002-06-12) http://python.org/sf/568269 minor improvement to Grammar file (2002-06-13) http://python.org/sf/568412 Broken pre.subn() (and pre.sub()) (2002-06-17) http://python.org/sf/570057 glob() fails for network drive in cgi (2002-06-19) http://python.org/sf/571167 imaplib fetch is broken (2002-06-19) http://python.org/sf/571334 Numeric Literal Anomoly (2002-06-19) http://python.org/sf/571382 Segmentation fault in Python 2.3 (2002-06-20) http://python.org/sf/571885 python-mode IM parses code in docstrings (2002-06-21) http://python.org/sf/572341 Memory leak in object comparison (2002-06-22) http://python.org/sf/572567 Closed Patches -------------- Optional memory profiler (2000-08-18) http://python.org/sf/401229 Pure Python strptime() (PEP 42) (2001-10-23) http://python.org/sf/474274 Unicode support in email.Utils.encode (2001-12-07) http://python.org/sf/490456 httplib.py screws up on 100 response (2001-12-31) http://python.org/sf/498149 make python-mode play nice with gdb (2002-01-28) http://python.org/sf/509975 imputil.py can't import "\r\n" .py files (2002-02-28) http://python.org/sf/523944 urllib2.py: fix behavior with proxies (2002-03-08) http://python.org/sf/527518 Better AttributeError formatting (2002-03-20) http://python.org/sf/532638 RFC 2231 support for email package (2002-04-26) http://python.org/sf/549133 Fix for httplib bug with 100 Continue (2002-05-01) http://python.org/sf/551273 Py_AddPendingCall doesn't unlock on fail (2002-05-03) http://python.org/sf/552161 os.uname() on 
Darwin space in machine (2002-05-24) http://python.org/sf/560311 Remove UserDict from cookie.py (2002-05-31) http://python.org/sf/562987 email Parser non-strict mode (2002-06-06) http://python.org/sf/565183 Expose _Py_ReleaseInternedStrings (2002-06-06) http://python.org/sf/565378 Rationalize DL_IMPORT and DL_EXPORT (2002-06-07) http://python.org/sf/566100 Convert slice and buffer to types (2002-06-13) http://python.org/sf/568544 Remove support for Win16 (2002-06-16) http://python.org/sf/569753 Changes (?P=) with optional backref (2002-06-20) http://python.org/sf/571976 From barry@zope.com Mon Jul 22 17:05:58 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 22 Jul 2002 12:05:58 -0400 Subject: [Python-Dev] Sorting References: <15676.8253.741856.171571@anthem.wooz.org> <20020722154415.79981.qmail@web40111.mail.yahoo.com> Message-ID: <15676.11622.589953.460393@anthem.wooz.org> >>>>> "SG" == Scott Gilbert writes: SG> Or you could have an optional parameter that defaults to SG> whatever the more sane value should be (probably stable), and SG> when the user stumbles across this parameter they stumble SG> across the docs too. SG> I think Tim's codebloat argument is more compelling. Except that in http://mail.python.org/pipermail/python-dev/2002-July/026837.html Tim says: "Back on Earth, among Python users the most frequent complaint I've heard is that list.sort() isn't stable." and here http://mail.python.org/pipermail/python-dev/2002-July/026854.html Tim seems to be arguing against stable sort as being the default due to code bloat. As Tim's Official Sysadmin, I'm only good at channeling him on one subject, albeit probably one he'd deem most important to his life: lunch. So I'm not sure if he's arguing for or against stable sort being the default. 
;) -Barry From skip@pobox.com Mon Jul 22 17:19:40 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 22 Jul 2002 11:19:40 -0500 Subject: [Python-Dev] Weekly bug report summary Message-ID: <15676.12444.402113.866101@12-248-11-90.client.attbi.com> Neal Norwitz asked me what happened to the weekly bug summary mailing. I've been off-net a lot and was running it via cron on my laptop Sunday mornings. I just ran the bug reporter manually and migrated the database and script over to the Mojam web server. With any luck, the script will run properly next Sunday morning. Skip From barry@zope.com Mon Jul 22 18:24:52 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 22 Jul 2002 13:24:52 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py,1.1,1.2 References: Message-ID: <15676.16356.112688.518256@anthem.wooz.org> [Diverting to python-dev... -BAW] timmie> Update of /cvsroot/python/python/dist/src/Lib/email/test timmie> In directory timmie> usw-pr-cvs1:/tmp/cvs-serv3289/python/lib/email/test | Modified Files: | test_email_codecs.py | Log Message: | Changed import from | from test.test_support import TestSkipped, run_unittest | to | from test_support import TestSkipped, run_unittest timmie> Otherwise, if the Japanese codecs aren't installed, timmie> regrtest doesn't believe the TestSkipped exception raised timmie> by this test matches the timmie> except (ImportError, test_support.TestSkipped), msg: timmie> it's looking for, and reports the skip as a crash failure timmie> instead of as a skipped test. timmie> I suppose this will make it harder to run this test timmie> outside of regrtest, but under the assumption only Barry timmie> does that, better to make it skip cleanly for everyone timmie> else. 
A better fix, IMO, is to recognize that the `test' package has become a full fledged standard lib package (a Good Thing, IMO), heed our own admonitions not to do relative imports, and change the various places in the test suite that "import test_support" (or equiv) to "import test.test_support" (or equiv).

I've twiddled the test suite to do things this way, and all the (expected Linux) tests pass, so I'd like to commit these changes. Unit test writers need to remember to use test.test_support instead of just test_support. We could do something wacky like remove '' from sys.path if we really cared about enforcing this. It would also be good for folks on other systems to make sure I haven't missed a module.

-Barry

From tim.one@comcast.net Mon Jul 22 18:28:11 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 22 Jul 2002 13:28:11 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <15676.11622.589953.460393@anthem.wooz.org>
Message-ID:

[Barry Warsaw]
> Except that in
>
> http://mail.python.org/pipermail/python-dev/2002-July/026837.html
>
> Tim says:
>
> "Back on Earth, among Python users the most frequent complaint
> I've heard is that list.sort() isn't stable."

Yes, and because the current samplesort falls back to a stable sort when lists are small, almost everyone who cares about this and tries to guess about stability via trying small examples comes to a wrong conclusion.

> and here
>
> http://mail.python.org/pipermail/python-dev/2002-July/026854.html
>
> Tim seems to be arguing against stable sort as being the
> default due to code bloat.

I'm arguing there against having two highly complex and long-winded sorting algorithms in the core. Pick one.

In favor of samplesort:

+ It can be much faster in very-many-equal-elements cases (note that ~sort lists have only 4 distinct values, each repeated N/4 times and spread uniformly across the whole list).

+ While it requires some extra memory, that lives on the stack and is O(log N). As a result, it can never raise MemoryError unless a comparison function does.

+ It's never had a bug reported against it (so is stable in a different sense ).

In favor of timsort:

+ It's stable.

+ The code is more uniform and so potentially easier to grok, and because it has no random component is easier to predict (e.g., it's certain that it has no quadratic-time cases).

+ It's incredibly faster in the face of many more kinds of mild disorder, which I believe are very common in the real world. As obvious examples, you add an increment of new data to an already-sorted file, or paste together several sorted files. timsort screams in those cases, but they may as well be random to samplesort, and the difference in runtime can easily exceed a factor of 10. A factor of 10 is a rare and wonderful thing in algorithm development.

Against timsort:

+ It can require O(N) temp storage, although the constant is small compared to object sizes. That means it can raise MemoryError even if a comparison function never does.

+ Very-many-equal-elements cases can be much slower, but that's partly because it *is* stable, and preserving the order of equal elements is exactly what makes stability hard to achieve in a fast sort (samplesort can't be made stable efficiently).

> As Tim's Official Sysadmin, I'm only good at channeling him on one
> subject, albeit probably one he'd deem most important to his life:
> lunch. So I'm not sure if he's arguing for or against stable sort
> being the default. ;)

All else being equal, a stable sort is a better choice. Alas, all else isn't equal. If Python had no sort method now, I'd pick timsort with scant hesitation. Speaking of which, is it time for lunch yet ?
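
[Illustration, not part of the original message. The two properties weighed above, stability and cheap handling of mildly disordered input, can be observed with a minimal sketch. It assumes a present-day Python 3 interpreter, where list.sort() and sorted() are stable; count_comparisons is a hypothetical helper written for this note, not code from the thread, and no timings are claimed.]

```python
import random
from functools import cmp_to_key


def count_comparisons(data):
    """Sort a copy of data; return (sorted_list, number_of_comparisons)."""
    calls = 0

    def cmp(a, b):
        nonlocal calls
        calls += 1                      # one call per element comparison
        return (a > b) - (a < b)

    result = sorted(data, key=cmp_to_key(cmp))
    return result, calls


# Stability: records with equal keys keep their original relative order.
records = [("b", 2), ("a", 1), ("b", 1), ("a", 2)]
by_letter = sorted(records, key=lambda r: r[0])

# Mild disorder: an already-sorted list with a small increment of new
# data appended, compared against fully random data of the same size.
rng = random.Random(0)
n = 10000
random_data = [rng.random() for _ in range(n)]
nearly_sorted = sorted(random_data[:-10]) + random_data[-10:]

_, random_cmps = count_comparisons(random_data)
_, nearly_cmps = count_comparisons(nearly_sorted)

print("stable result:", by_letter)
print("comparisons on random input:       ", random_cmps)
print("comparisons on nearly sorted input:", nearly_cmps)
```

[Counting comparisons rather than wall-clock time keeps the sketch independent of machine speed, in the same spirit as the SlowCmp experiment earlier in the thread, where comparison cost dominates.]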
From nas@python.ca Mon Jul 22 19:18:47 2002 From: nas@python.ca (Neil Schemenauer) Date: Mon, 22 Jul 2002 11:18:47 -0700 Subject: [Python-Dev] Sorting In-Reply-To: ; from tim.one@comcast.net on Mon, Jul 22, 2002 at 01:28:11PM -0400 References: <15676.11622.589953.460393@anthem.wooz.org> Message-ID: <20020722111847.A3095@glacier.arctrix.com> Tim Peters wrote: > Pick one. I pick timsort. Stability is nice to have. It sounds like if you want a stable sort you will have to pay for it (e.g. ~sort is slower). The fact that timsort is faster on partially sorted inputs more than makes up for it. Neil From tim.one@comcast.net Mon Jul 22 19:20:25 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 22 Jul 2002 14:20:25 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py,1.1,1.2 In-Reply-To: <15676.16356.112688.518256@anthem.wooz.org> Message-ID: [Barry] > A better fix, IMO, is to recognize that the `test' package has become > a full fledged standard lib package (a Good Thing, IMO), heed our own > admonitions not to do relative imports, and change the various places > in the test suite that "import test_support" (or equiv) to "import > test.test_support" (or equiv). > > I've twiddled the test suite to do things this way, and all the > (expected Linux) tests pass, so I'd like to commit these changes. > Unit test writers need to remember to use test.test_support instead of > just test_support. We could do something wacky like remove '' from > sys.path if we really cared about enforcing this. It would also be > good for folks on other systems to make sure I haven't missed a > module. Note test/README, which says in part: """ NOTE: Always import something from test_support like so: from test_support import verbose or like so: import test_support ... use test_support.verbose in the code ... 
Never import anything from test_support like this: from test.test_support import verbose "test" is a package already, so can refer to modules it contains without "test." qualification. If you do an explicit "test.xxx" qualification, that can fool Python into believing test.xxx is a module distinct from the xxx in the current package, and you can end up importing two distinct copies of xxx. This is especially bad if xxx=test_support, as regrtest.py can (and routinely does) overwrite its "verbose" and "use_large_resources" attributes: if you get a second copy of test_support loaded, it may not have the same values for those as regrtest intended. """ I don't have a deep understanding of these miserable issues, so settled for a one-line patch that worked. The admonition to never import from test.test_support was a BDFL Pronouncement at the time. Note that Jack runs tests in ways nobody else does, via importing something or other from an interactive Python session (Mac Classic doesn't have a cmdline shell -- something like that). It's always an adventure trying to guess how things will break for him, although I'm not sure your suggestion is (or isn't) relevant to Jack. I imagine things will work provided that all imports "are the same". I'm not sure fiddling all the code is worth it just to save a line of typing in the email package's test suite. From tim.one@comcast.net Mon Jul 22 20:32:09 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 22 Jul 2002 15:32:09 -0400 Subject: [Python-Dev] Sorting In-Reply-To: Message-ID: If you have access to a good library, you'll enjoy reading the original paper on samplesort; or a scan can be purchased from the ACM: Samplesort: A Sampling Approach to Minimal Storage Tree Sorting W. D. Frazer, A. C. McKellar JACM, Vol. 17, No. 3, July 1970 As in many papers of its time, the algorithm description is English prose and raises more questions than it answers, but the mathematical analysis is extensive. 
Two things made me laugh out loud: 1. The largest array they tested had 50,000 elements, because that was the practical upper limit given storage sizes at the time. Now that's such a tiny case that even in Python it's hard to time it accurately. 2. They thought about using a different sort method for small buckets, However, the additional storage required for the program would reduce the size of the input sequence which could be accommodated, and hence it is an open question as to whether or not the efficiency of the total sorting process could be improved in this way. In some ways, life was simpler then . for-example-i-had-more-hair-ly y'rs - tim From barry@zope.com Mon Jul 22 20:38:16 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 22 Jul 2002 15:38:16 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py,1.1,1.2 References: <15676.16356.112688.518256@anthem.wooz.org> Message-ID: <15676.24360.88972.449273@anthem.wooz.org> >>>>> "TP" == Tim Peters writes: TP> Note test/README, which says in part: TP> """ TP> NOTE: Always import something from test_support like so: TP> from test_support import verbose TP> or like so: | import test_support | ... use test_support.verbose in the code ... TP> Never import anything from test_support like this: TP> from test.test_support import verbose TP> "test" is a package already, so can refer to modules it TP> contains without "test." qualification. If you do an explicit TP> "test.xxx" qualification, that can fool Python into believing TP> test.xxx is a module distinct from the xxx in the current TP> package, and you can end up importing two distinct copies of TP> xxx. This is especially bad if xxx=test_support, as TP> regrtest.py can (and routinely does) overwrite its "verbose" TP> and "use_large_resources" attributes: if you get a second copy TP> of test_support loaded, it may not have the same values for TP> those as regrtest intended. 
""" Yep, but I think those recommendations are out-of-date. You added them to the file almost 2 years ago. ;) Note that the warnings in that README go away when regrtest also imports test_support from the test package. TP> I don't have a deep understanding of these miserable issues, TP> so settled for a one-line patch that worked. The admonition TP> to never import from test.test_support was a BDFL TP> Pronouncement at the time. Hmm, I don't know if he considers that admonition to still be in effect, but I'd like to hope not. We're discouraging relative imports these days, and I don't see any deep reason why the regression tests need to break this rule to function (and indeed, on Unix at least it doesn't seem to). TP> Note that Jack runs tests in ways nobody else does, via TP> importing something or other from an interactive Python TP> session (Mac Classic doesn't have a cmdline shell -- something TP> like that). It's always an adventure trying to guess how TP> things will break for him, although I'm not sure your TP> suggestion is (or isn't) relevant to Jack. I wouldn't presume to know! So I'll generate a patch, upload it to SF, and assign it to Jack for review. TP> I imagine things will work provided that all imports "are the TP> same". Yes. TP> I'm not sure fiddling all the code is worth it just to TP> save a line of typing in the email package's test suite. It's a bit uglier than that because since Lib/test gets magically added to sys.path during regrtest by virtue of running "python Lib/test/regrtest.py". So to find the "same" test_support module, you'd probably have to do something more along the lines of >>> import os >>> import test.regrtest >>> testdir = os.path.dirname(test.regrtest.__file__) >>> sys.path.insert(0, testdir) >>> import test_support blechi-ly y'rs, -Barry From Rick Farrer" This is a multi-part message in MIME format. 
------=_NextPart_000_000D_01C23192.61006FC0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Please remove me from your mailing list. Thanks ------=_NextPart_000_000D_01C23192.61006FC0 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
------=_NextPart_000_000D_01C23192.61006FC0--

From tim.one@comcast.net Tue Jul 23 03:07:57 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 22 Jul 2002 22:07:57 -0400
Subject: [Python-Dev] Sorting
In-Reply-To:
Message-ID:

This is a multi-part message in MIME format.

--Boundary_(ID_G6Ak++OVPb3/j8dlAUI+lw)
Content-type: text/plain; charset=Windows-1252
Content-transfer-encoding: 7BIT

In an effort to save time on email (ya, right ...), I wrote up a pretty
detailed overview of the "timsort" algorithm. It's attached.

all-will-be-revealed-ly y'rs - tim

--Boundary_(ID_G6Ak++OVPb3/j8dlAUI+lw)
Content-type: text/plain; name=timsort.txt
Content-transfer-encoding: quoted-printable
Content-disposition: attachment; filename=timsort.txt

/*---------------------------------------------------------------------------
A stable natural mergesort with excellent performance on many flavors
of lightly disordered arrays, and as fast as samplesort on random arrays.

In a nutshell, the main routine marches over the array once, left to right,
alternately identifying the next run, and then merging it into the previous
runs. Everything else is complication for speed, and some measure of memory
efficiency.

Runs
----
count_run() returns the # of elements in the next run. A run is either
"ascending", which means non-decreasing:

    a0 <= a1 <= a2 <= ...

or "descending", which means strictly decreasing:

    a0 > a1 > a2 > ...

Note that a run is always at least 2 long, unless we start at the array's
last element.

The definition of descending is strict, because the main routine reverses
a descending run in-place, transforming a descending run into an ascending
run. Reversal is done via the obvious fast "swap elements starting at each
end, and converge at the middle" method, and that can violate stability if
the slice contains any equal elements. Using a strict definition of
descending ensures that a descending run contains distinct elements.
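
A minimal sketch of the run detection just described, in present-day Python rather than the C of the real implementation. The name count_run comes from the text; the lo/hi arguments, the (length, descending) return shape and the helper reverse_slice are assumptions made for illustration only. Comparisons use "<" exclusively.

```python
def count_run(a, lo, hi):
    """Return (n, descending) for the natural run starting at a[lo] in a[lo:hi].

    Hypothetical signature: the write-up only says count_run() returns the
    number of elements in the next run.
    """
    if hi - lo <= 1:
        # Zero or one element left: the run cannot be 2 long.
        return hi - lo, False
    i = lo + 1
    if a[i] < a[i - 1]:
        # Strictly decreasing: a0 > a1 > a2 > ...
        i += 1
        while i < hi and a[i] < a[i - 1]:
            i += 1
        return i - lo, True
    # Non-decreasing: a0 <= a1 <= a2 <= ...
    i += 1
    while i < hi and not (a[i] < a[i - 1]):
        i += 1
    return i - lo, False


def reverse_slice(a, lo, hi):
    """Reverse a[lo:hi] in place by swapping from both ends toward the middle."""
    hi -= 1
    while lo < hi:
        a[lo], a[hi] = a[hi], a[lo]
        lo += 1
        hi -= 1
```

Because a descending run is strictly decreasing, it holds no equal elements, so reversing it cannot reorder equal elements relative to each other.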

If an array is random, it's very unlikely we'll see long runs. Much of the
rest of the algorithm is geared toward exploiting long runs, and that takes
a fair bit of work. That work is a waste of time if the data is random, so
if a natural run contains fewer than MIN_MERGE_SLICE elements, the main loop
artificially boosts it to MIN_MERGE_SLICE elements, via binary insertion
sort applied to the right number of array elements following the short
natural run. In a random array, *all* runs are likely to be MIN_MERGE_SLICE
long as a result, and merge_at() short-circuits the expensive stuff in that
case.

The Merge Pattern
-----------------
In order to exploit regularities in the data, we're merging on natural
run lengths, and they can become wildly unbalanced. But that's a Good Thing
for this sort!

Stability constrains permissible merging patterns. For example, if we have
3 consecutive runs of lengths

    A:10000 B:20000 C:10000

we dare not merge A with C first, because if A, B and C happen to contain
a common element, it would get out of order wrt its occurrence(s) in B. The
merging must be done as (A+B)+C or A+(B+C) instead.

So merging is always done on two consecutive runs at a time, and in-place,
although this may require some temp memory (more on that later).

When a run is identified, its base address and length are pushed on a stack
in the MergeState struct. merge_collapse() is then called to see whether it
should merge it with preceding run(s). We would like to delay merging as
long as possible in order to exploit patterns that may come up later, but we
would like to do merging as soon as possible to exploit that the run just
found is still high in the memory hierarchy. We also can't delay merging
"too long" because it consumes memory to remember the runs that are still
unmerged, and the stack has a fixed size.

What turned out to be a good compromise maintains two invariants on the
stack entries, where A, B and C are the lengths of the three rightmost
not-yet merged slices:

    1. A > B+C
    2. B > C

Note that, by induction, #2 implies the lengths of pending runs form a
decreasing sequence. #1 implies that, reading the lengths right to left,
the pending-run lengths grow at least as fast as the Fibonacci numbers.
Therefore the stack can never grow larger than about log_base_phi(N) entries,
where phi = (1+sqrt(5))/2 ~= 1.618. Thus a small # of stack slots suffice
for very large arrays.

If A <= B+C, the smaller of A and C is merged with B, and the new run
replaces the A,B or B,C entries; e.g., if the last 3 entries are

    A:30 B:20 C:10

then B is merged with C, leaving

    A:30 BC:30

on the stack. Or if they were

    A:500 B:400 C:1000

then A is merged with B, leaving

    AB:900 C:1000

on the stack.

In both examples, the stack configuration still violates invariant #2, and
merge_at() goes on to continue merging runs until both invariants are
satisfied. As an extreme case, suppose we didn't do the MIN_MERGE_SLICE
gimmick, and natural runs were of lengths 128, 64, 32, 16, 8, 4, 2, and 2.
Nothing would get merged until the final 2 was seen, and that would trigger
7 perfectly balanced (both runs involved have the same size) merges.

The thrust of these rules when they trigger merging is to balance the run
lengths as closely as possible, while keeping a low bound on the number of
runs we have to remember. This is maximally effective for random data,
where all runs are likely to be of (artificially forced) length
MIN_MERGE_SLICE, and then we get a sequence of perfectly balanced merges.

OTOH, the reason this sort is so good for lightly disordered data has to do
with wildly unbalanced run lengths.

Merge Memory
------------
Merging adjacent runs of lengths A and B in-place is very difficult.
Theoretical constructions are known that can do it, but they're too
difficult and slow for practical use. But if we have temp memory equal to
min(A, B), it's easy.

If A is smaller, copy A to a temp array, leave B alone, and then we can do
the obvious merge algorithm left to right, from the temp area and B,
starting the stores into where A used to live. There's always a free area
in the original area comprising a number of elements equal to the number
not yet merged from the temp array (trivially true at the start; proceed by
induction). The only tricky bit is that if a comparison raises an
exception, we have to remember to copy the remaining elements back in from
the temp area, lest the array end up with duplicate entries from B.

If B is smaller, much the same, except that we need to merge right to left,
starting the stores at the right end of where B used to live.

In all, then, we need no more than N/2 temp array slots.

A refinement: When we're about to merge adjacent runs A and B, we first do
a form of binary search (more on that later) to see where B[0] should end up
in A. Elements in A preceding that point are already in their final
positions, effectively shrinking the size of A. Likewise we also search to
see where A[-1] should end up in B, and elements of B after that point can
also be ignored. This cuts the amount of temp memory needed by the same
amount. It may not pay, though.

Merge Algorithms
----------------
When merging runs of lengths A and B, if A/2 <= B <= 2*A (i.e., they're
within a factor of two of each other), we do the usual straightforward
one-at-a-time merge. This can take up to A+B comparisons. If the data is
random, there's very little potential for doing better than that.
If there are a great many equal elements, we can do better than that, but
there's no way to know whether there *are* a great many equal elements short
of doing a great many additional comparisons (we only use "<" in sort), and
that's too expensive when it doesn't pay.

If the sizes of A and B are out of whack, we can do much better. The
Hwang-Lin merging algorithm is very good at merging runs of mismatched
lengths if the data is random, but I believe it would be a mistake to
try that here. As explained before, if we really do have random data, we're
almost certainly going to stay in the A/2 <= B <= 2*A case.

Instead we assume that wildly different run lengths correspond to *some*
sort of clumpiness in the data. Without loss of generality, assume A is
the shorter run. We first look for A[0] in B. We do this via "galloping",
comparing A[0] in turn to B[0], B[1], B[3], B[7], ..., B[2**j - 1], ...,
until finding the k such that B[2**(k-1) - 1] < A[0] <= B[2**k - 1]. This
takes at most log2(B) comparisons, and, unlike a straight binary search,
favors finding the right spot early in B. Why that's important may become
clear later.

After finding such a k, the region of uncertainty is reduced to 2**(k-1) - 1
consecutive elements, and a straight binary search requires exactly k-1
comparisons to nail it.

Now we can copy all the B's up to that point in one chunk, and then copy
A[0]. If the data really is clustered, the new A[0] (what was A[1] at the
start) is likely to belong near the start of what remains of the B run.
That's why we gallop first instead of doing a straight binary search: if the
new A[0] really is near the start of the remaining B run, galloping will
find it much quicker. OTOH, if we're wrong, galloping + binary search never
takes more than 2*log2(B) compares, so can't become a disaster. If the
clumpiness comes in distinct clusters, gallop + binary search also adapts
nicely to that.
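
The gallop-then-binary-search step can be sketched as below. This is an illustration in present-day Python, not the C code of the sort itself: gallop_left is a hypothetical name, and the standard library's bisect.bisect_left stands in for the "straight binary search" over the region of uncertainty. Both the probing loop and bisect_left compare with "<" only.

```python
from bisect import bisect_left


def gallop_left(key, b):
    """Return i such that every b[:i] < key and key <= every b[i:].

    b must already be sorted. Probes b[0], b[1], b[3], b[7], ...,
    b[2**j - 1] until one is not less than key, then finishes with a
    binary search between the last two probes.
    """
    n = len(b)
    prev = -1   # index of the last probe known to be < key (-1: none yet)
    probe = 0   # successive probes: 0, 1, 3, 7, ..., 2**j - 1
    while probe < n and b[probe] < key:
        prev = probe
        probe = 2 * probe + 1
    hi = min(probe, n)
    # Here b[prev] < key, and either hi == n or key <= b[hi], so the
    # answer lies somewhere in prev+1 .. hi inclusive.
    return bisect_left(b, key, prev + 1, hi)
```

When key belongs near the front of b, the loop stops after one or two probes, which is the case the clustering assumption is betting on; when it belongs near the end, the cost is roughly twice that of a plain binary search.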

I first learned about the galloping strategy in a related context; do a
Google search to find this paper available online:

    "Adaptive Set Intersections, Unions, and Differences" (2000)
    Erik D. Demaine, Alejandro López-Ortiz, J. Ian Munro

and its followup(s).
---------------------------------------------------------------------------*/

--Boundary_(ID_G6Ak++OVPb3/j8dlAUI+lw)--

From neal@metaslash.com Tue Jul 23 04:19:32 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 22 Jul 2002 23:19:32 -0400
Subject: [Python-Dev] More Sorting
Message-ID: <3D3CCB44.4F2592ED@metaslash.com>

Sebastien Keim posted a patch (http://python.org/sf/544113) of a merge sort.
I didn't really review it, but it included test and doc. So if the bisect
module is being added to, perhaps someone should review this patch.

Neal

From ping@zesty.ca Tue Jul 23 05:57:24 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Mon, 22 Jul 2002 21:57:24 -0700 (PDT)
Subject: [Python-Dev] Re: The iterator story
In-Reply-To: <200207220450.g6M4o2u23472@oma.cosc.canterbury.ac.nz>
Message-ID:

SYNOPSIS: a slight adjustment to the definition of consume() yields a
simple solution that addresses both the destruction issue and the
multiple-iteration issue, without introducing any new syntax.

On Mon, 22 Jul 2002, Greg Ewing wrote:
> As someone pointed out, it's pretty rare that you actually *want* to
> consume the sequence. Usually the choice is between "I don't care" and
> "The sequence must NOT be consumed".

Sure, i'll go for that. What i'm after is the ability to say "i would
like this sequence not to be consumed."

> Of the two varieties of for-loop in your proposal, for-in
> obviously corresponds to the "must not be consumed" case,
> leading one to suppose that you intend for-from to be used in
> the don't-care case.

Right.
> But now you seem to be suggesting that library routines > should always use for-in, and that the caller should > convert an iterator to a sequence if he knows it's okay > to consume it: The two are semantically equivalent proposals. I explained them both in the original message that i posted proposing the solution. The 'consume()' library routine is just another way to express 'for-from' without using new syntax. However, it is true that 'consume()' is more generally useful. It would be good to have, whether or not we had new syntax. I acknowledge that i did not realize this at the time i wrote the earlier message, or i would have stated the 'consume()' (then called 'seq()') proposal first and the for-from proposal second, instead of the opposite. That is why i am sticking to talking about the no-new-syntax version of the proposal for now. I apologize if it seems that i am asking you to follow a moving target. I would like you to recognize, though, that the underlying concept is the same -- the programmer has to signal when an iterator is being used like a sequence. > Okay, that seems reasonable -- explicit is better than > implicit. But... consider the following two library > routines: > > def printout1(s): > for x in s: > print x > > def printout2(s): > for x in s: > for y in s: > print x, y [...] > no exception will be raised if you call printout2(consume(s)) > by mistake. Good point! Clearly my proposal did not take care of this case. (But there are solutions below; read on.) Upon some reflection, though, it seems to me that this problem is orthogonal to the proposal: forcing the programmer to declare when destruction is allowed neither solves nor exacerbates the problem of printout2(). consume() is about destruction, whereas printout2() is about multiple iteration. 
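
The multiple-iteration problem with printout2() is easy to reproduce. The sketch below is written for present-day Python 3 (the thread predates it) and, as an assumption made so the result can be inspected, collects the pairs into a list instead of printing them.

```python
def printout2(s):
    """Nested iteration over the same object, as in Greg's example.

    Returns the (x, y) pairs instead of printing them (an illustrative
    change, not part of the original routine).
    """
    pairs = []
    for x in s:
        for y in s:
            pairs.append((x, y))
    return pairs


# A container can be iterated any number of times: all 9 pairs appear.
from_list = printout2([1, 2, 3])

# A one-shot iterator is shared by both loops: the outer loop takes 1,
# the inner loop drains 2 and 3, and the outer loop then finds nothing left.
from_iterator = printout2(iter([1, 2, 3]))
```

No exception is raised in the second call; the routine just quietly returns fewer pairs than a caller passing a sequence would expect.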
> To get any safety benefit from your proposed arrangement, > it seems to me that you'd need to write printout1 as > > def printout1(s): > "s must be an iterator" > for x from s: > print x I'm afraid i don't see how this bears on the problem you just described. It still would not be possible to write a safe version of printout2() in either (a) the world of the current Python with iterators or (b) a world where for-in does not accept iterators and consume() has been introduced. One real solution to this problem is what Oren has been suggesting all along -- raise an IteratorExhausted exception if you try to fetch an element from an iterator that has already thrown StopIteration. In printout2(), this exception would occur on the second time through the inner loop. This works, but we can do even better. After some thought today, i realized that there is a second solution. Thanks for leading me to it, Greg! With consume(), the programmer has declared that the iterator is okay to destroy. But my definition of consume() was incomplete. One slight change solves the problem: consume(y) returns x such that iter(x) returns y the first time, and raises IteratorConsumedException thereafter. Now we're all set! If consume(it) is passed to printout2(), an exception is raised immediately before any damage is done. This detects whether you attempt to *start* the iterator twice, which makes more sense than detecting whether you hit the *end* of the iterator twice. The insight is that protection against multiple iteration belongs in the implementation of __iter__, not in the iterator itself -- because the iterator doesn't know whether it can be restarted. The *provider* of the iterator does. > There's no doubt that it's very elegant theoretically, > but in thinking through the implications, I'm not sure it > would be all that helpful in practice, and might even > turn out to be a nuisance if it requires putting in a > lot of iter(x) and/or consume(x) calls. It's not so bad. 
You only have to say iter() or consume() in exceptional cases, where you are specifically writing code to manipulate iterators. Everything else looks the same -- except it's safe. More importantly, neither iter() nor consume() need to be taught on the first day of Python. I think it all comes together quite nicely. Here it is in summary: - Iterators just implement __next__. - Containers, and other things that want to be iterated over, just implement __iter__. - The new built-in routine consume(y) returns x such that iter(x) returns y the first time, and raises IteratorConsumedException thereafter. - (Other objects that only allow one-shot iteration can also raise IteratorConsumedException when their __iter__ is called twice.) Advantages: 1. "for-in" and "in" are safe to use -- no fear of destruction. 2. One-shot iterators are safe against multiple iteration. 3. Iterators don't have to implement a dummy __iter__ method returning self. 4. The implementation of "for" stays exactly as it is now. 5. Current implementations of iterators continue to work fine, if unsafely (but they're already unsafe). 6. No new syntax. 7. For-loops continue to work on containers exactly as they always have. 8. Iterators don't have to maintain extra state to know that it's time to start throwing IteratorExhausted instead of StopIteration. Items 1, 2, and 3 are distinct improvements over the current state of affairs. The only inconvenience is the case where an iterator is being passed to a routine that expects a container; this is still pretty rare yet, and this situation is easy to detect (hence, the error message from "for" can explain what to do). In this case, you have to wrap consume() around the iterator to declare it okay to consume. And that's all. 
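
A minimal sketch of the adjusted consume() in present-day Python 3. The names consume and IteratorConsumedException are taken from the proposal above; writing consume as a small class, and the helper nested_pairs, are assumptions made for illustration.

```python
class IteratorConsumedException(Exception):
    """Raised when a one-shot iterable is asked for its iterator again."""


class consume:
    """Wrap iterator y so that iter(x) returns y once and raises afterwards."""

    def __init__(self, iterator):
        self._iterator = iterator
        self._handed_out = False

    def __iter__(self):
        if self._handed_out:
            raise IteratorConsumedException(
                "this iterator has already been handed out once")
        self._handed_out = True
        return self._iterator


def nested_pairs(s):
    """Hypothetical stand-in for printout2(): iterates over s twice."""
    return [(x, y) for x in s for y in s]
```

With this wrapper, nested_pairs(consume(iter([1, 2]))) fails as soon as the inner loop asks for a second iterator, rather than silently returning a short result. The check lives in __iter__, not in the iterator, which is the point made above about the provider knowing whether a restart is possible.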
The fact that it takes only a slight adjustment to the earlier proposal to solve *both* the destruction problem and the multiple-iteration problem has led me to be even more convinced that this is the "right answer" -- in the sense that this is how i would design the protocol if we were starting from scratch. Now, i know we are not starting from scratch. And i know Guido has already said he doesn't want to solve this problem. But, just in case you are wondering, the migration path from here to there seems pretty straightforward to me: 1. When __next__() is not present, call next() and issue a warning. 2. In the next version, deprecate next() in favour of __next__(). 3. Add consume() and IteratorConsumedException to built-ins. 4. Deprecate the dummy __iter__() method on iterators. 5. Throw a party and consume(mass_quantities). -- ?!ng "Most things are, in fact, slippery slopes. And if you start backing off from one thing because it's a slippery slope, who knows where you'll stop?" -- Sean M. Burke From xscottg@yahoo.com Tue Jul 23 07:22:12 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Mon, 22 Jul 2002 23:22:12 -0700 (PDT) Subject: [Python-Dev] Sorting In-Reply-To: Message-ID: <20020723062212.25747.qmail@web40102.mail.yahoo.com> --- Tim Peters wrote: > In an effort to save time on email (ya, right ...), I wrote up a pretty > detailed overview of the "timsort" algorithm. It's attached. > > all-will-be-revealed-ly y'rs - tim > > [Interesting stuff deleted.] > I'm curious if there is any literature that you've come across, or if you've done any experiments with merging more than two parts at a time. So instead of merging like such: A B C D E F G H I J K L AB CD EF GH IJ KL ABCD EFGH IJKL ABCDEFGH IJKL ABCDEFGHIJKL You were to merge A B C D E F G H I J K L ABC DEF GHI JKL ABCDEF GHIJKL ABCDEFGHIJKL (I realize that your merges are based on the lengths of the subsequences, but you get the point.) 
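
To make the pass-count difference concrete, here is a small sketch in present-day Python. It groups runs by position rather than by length, exactly as in the simplified picture above, and uses the standard library's heapq.merge to merge k sorted runs at once; merge_passes is a hypothetical helper, and nothing here reflects how timsort itself schedules merges.

```python
import heapq


def merge_passes(runs, k):
    """Merge sorted runs k at a time; return (merged_list, number_of_passes)."""
    runs = [list(r) for r in runs]
    passes = 0
    while len(runs) > 1:
        # One pass: merge each group of k neighbouring runs into one run.
        runs = [list(heapq.merge(*runs[i:i + k]))
                for i in range(0, len(runs), k)]
        passes += 1
    merged = runs[0] if runs else []
    return merged, passes


twelve_runs = [[c] for c in "LKJIHGFEDCBA"]   # 12 one-element runs
two_way = merge_passes(twelve_runs, 2)        # 12 -> 6 -> 3 -> 2 -> 1
three_way = merge_passes(twelve_runs, 3)      # 12 -> 4 -> 2 -> 1
```

Fewer passes means each element is moved fewer times, at the price of more bookkeeping per output element to pick the smallest head among k runs.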
My thinking is that many machines (probably yours for instance) have a cache that is 4-way associative, so merging only 2 blocks at a time might not be using the cache as well as it could. Also, changing from merging 2 blocks to 3 or 4 blocks at a time would change the number of passes you have to make (the log part of N*log(N)). It's quite possible that this isn't worth the trade off in complexity and space (or your time :-). Keeping track of comparisons that you've already made could get ugly, and your temp space requirement would go from N/2 to possibly 3N/4... But since you're diving so deeply into this problem, I figured I'd throw it out there. OTOH, this could be close to the speedup that heavily optimized FFT algs get when they go from radix-2 to radix-4. Just thinking out loud... __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com From xscottg@yahoo.com Tue Jul 23 07:36:11 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Mon, 22 Jul 2002 23:36:11 -0700 (PDT) Subject: [Python-Dev] PEP 296 - The Buffer Problem Message-ID: <20020723063611.26677.qmail@web40102.mail.yahoo.com> --0-1908127438-1027406171=:26257 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline The latest version of this PEP will be in CVS, but the most recent copy as of this message is attached. I'm posting this to python-dev first to shave off the rough edges. I'll post to comp.lang.python after that. Please don't hesitate to email me directly if you have any questions on it. Cheers, -Scott Gilbert __________________________________________________ Do You Yahoo!? Yahoo! 
Health - Feel better, live better http://health.yahoo.com --0-1908127438-1027406171=:26257 Content-Type: text/plain; name="pep-0296.txt" Content-Description: pep-0296.txt Content-Disposition: inline; filename="pep-0296.txt" PEP: 296 Title: The Buffer Problem Version: $Revision: 1.1 $ Last-Modified: $Date: 2002/07/22 21:03:34 $ Author: xscottg at yahoo.com (Scott Gilbert) Status: Draft Type: Standards Track Created: 12-Jul-2002 Python-Version: 2.3 Post-History: Abstract This PEP proposes the creation of a new standard type and builtin constructor called 'bytes'. The bytes object is an efficiently stored array of bytes with some additional characteristics that set it apart from several implementations that are similar. Rationale Python currently has many objects that implement something akin to the bytes object of this proposal. For instance the standard string, buffer, array, and mmap objects are all very similar in some regards to the bytes object. Additionally, several significant third party extensions have created similar objects to try and fill similar needs. Frustratingly, each of these objects is too narrow in scope and is missing critical features to make it applicable to a wider category of problems. Specification The bytes object has the following important characteristics: 1. Efficient underlying array storage via the standard C type "unsigned char". This allows fine grain control over how much memory is allocated. With the alignment restrictions designated in the next item, it is trivial for low level extensions to cast the pointer to a different type as needed. Also, since the object is implemented as an array of bytes, it is possible to pass the bytes object to the extensive library of routines already in the standard library that presently work with strings. For instance, the bytes object in conjunction with the struct module could be used to provide a complete replacement for the array module using only Python script. 

        If an unusual platform comes to light, one where there isn't a
        native unsigned 8 bit type, the object will do its best to
        represent itself at the Python script level as though it were an
        array of 8 bit unsigned values. It is doubtful whether many
        extensions would handle this correctly, but Python script could be
        portable in these cases.

    2.  Alignment of the allocated byte array is whatever is promised by
        the platform implementation of malloc. An extension can supply a
        bytes object with any alignment the extension author sees fit.

        This alignment restriction should allow the bytes object to be
        used as storage for all standard C types - including PyComplex
        objects or other structs of standard C types. Further alignment
        restrictions can be provided by extensions as necessary.

    3.  The bytes object implements a subset of the sequence operations
        provided by string/array objects, but with slightly different
        semantics in some cases. In particular, a slice always returns a
        new bytes object, but the underlying memory is shared between the
        two objects. This type of slice behavior has been called creating
        a "view". Additionally, repetition and concatenation are
        undefined for bytes objects and will raise an exception.

        As these objects are likely to find use in high performance
        applications, one motivation for the decision to use view slicing
        is that copying between bytes objects should be very efficient and
        not require the creation of temporary objects. The following code
        illustrates this:

            # create two 10 Meg bytes objects
            b1 = bytes(10000000)
            b2 = bytes(10000000)

            # copy from part of one to another without creating a 1 Meg temporary
            b1[2000000:3000000] = b2[4000000:5000000]

        Slice assignment where the rvalue is not the same length as the
        lvalue will raise an exception. However, slice assignment will
        work correctly with overlapping slices (typically implemented with
        memmove).

    4.
The bytes object will be recognized as a native type by the pickle and cPickle modules for efficient serialization. (In truth, this is the only requirement that can't be implemented via a third party extension.) Partial solutions to address the need to serialize the data stored in a bytes-like object without creating a temporary copy of the data into a string have been implemented in the past. The tofile and fromfile methods of the array object are good examples of this. The bytes object will support these methods too. However, pickling is useful in other situations - such as in the shelve module, or implementing RPC of Python objects, and requiring the end user to use two different serialization mechanisms to get an efficient transfer of data is undesirable. XXX: Will try to implement pickling of the new bytes object in such a way that previous versions of Python will unpickle it as a string object. When unpickling, the bytes object will be created from memory allocated from Python (via malloc). As such, it will lose any additional properties that an extension supplied pointer might have provided (special alignment, or special types of memory). XXX: Will try to make it so that C subclasses of bytes type can supply the memory that will be unpickled into. For instance, a derived class called PageAlignedBytes would unpickle to memory that is also page aligned. On any platform where an int is 32 bits (most of them), it is currently impossible to create a string with a length larger than can be represented in 31 bits. As such, pickling to a string will raise an exception when the operation is not possible. At least on platforms supporting large files (many of them), pickling large bytes objects to files should be possible via repeated calls to the file.write() method. 5. 
The bytes type supports the PyBufferProcs interface, but a bytes object provides the additional guarantee that the pointer will not be deallocated or reallocated as long as a reference to the bytes object is held. This implies that a bytes object is not resizable once it is created, but allows the global interpreter lock (GIL) to be released while a separate thread manipulates the memory pointed to if the PyBytes_Check(...) test passes. This characteristic of the bytes object allows it to be used in situations such as asynchronous file I/O or on multiprocessor machines where the pointer obtained by PyBufferProcs will be used independently of the global interpreter lock. Knowing that the pointer can not be reallocated or freed after the GIL is released gives extension authors the capability to get true concurrency and make use of additional processors for long running computations on the pointer. 6. In C/C++ extensions, the bytes object can be created from a supplied pointer and destructor function to free the memory when the reference count goes to zero. The special implementation of slicing for the bytes object allows multiple bytes objects to refer to the same pointer/destructor. As such, a refcount will be kept on the actual pointer/destructor. This refcount is separate from the refcount typically associated with Python objects. XXX: It may be desirable to expose the inner refcounted object as an actual Python object. If a good use case arises, it should be possible for this to be implemented later with no loss to backwards compatibility. 7. It is also possible to signify the bytes object as readonly, in this case it isn't actually mutable, but does provide the other features of a bytes object. 8. The bytes object keeps track of the length of its data with a Python LONG_LONG type. Even though the current definition for PyBufferProcs restricts the length to be the size of an int, this PEP does not propose to make any changes there. 
Instead, extensions can work around this limit by making an explicit PyBytes_Check(...) call, and if that succeeds they can make a PyBytes_GetReadBuffer(...) or PyBytes_GetWriteBuffer(...) call to get the pointer and full length of the object as a LONG_LONG.

The bytes object will raise an exception if the standard PyBufferProcs mechanism is used and the size of the bytes object is greater than can be represented by an integer.

From Python scripting, the bytes object will be subscriptable with longs so the 32 bit int limit can be avoided.

There is still a problem with the len() function, as it is implemented via PyObject_Size(), which returns an int as well. As a workaround, the bytes object will provide a .length() method that will return a long.

9. The bytes object can be constructed at the Python scripting level by passing an int/long to the bytes constructor with the number of bytes to allocate. For example:

    b = bytes(100000) # alloc 100K bytes

The constructor can also take another bytes object. This will be useful for the implementation of unpickling, and in converting a read-write bytes object into a read-only one. An optional second argument will be used to designate creation of a readonly bytes object.

10. From the C API, the bytes object can be allocated using any of the following signatures:

    PyObject* PyBytes_FromLength(LONG_LONG len, int readonly);
    PyObject* PyBytes_FromPointer(void* ptr, LONG_LONG len, int readonly,
            void (*dest)(void *ptr, void *user), void* user);

In the PyBytes_FromPointer(...) function, if the dest function pointer is passed in as NULL, it will not be called. This should only be used for creating bytes objects from statically allocated space.

The user pointer has been called a closure in other places. It is a pointer that the user can use for whatever purposes. It will be passed to the destructor function on cleanup and can be useful for a number of things. If the user pointer is not needed, NULL should be passed instead.

11.
The bytes type will be a new style class as that seems to be where all standard Python types are headed.

Contrast to existing types

The most common way to work around the lack of a bytes object has been to simply use a string object in its place. Binary files, the struct and array modules, and several other places are examples of this. Putting aside the style issue that these uses typically have nothing to do with text strings, there is the real problem that strings are not mutable, so direct manipulation of the data returned in these cases is not possible. Also, numerous optimizations in the string module (such as caching the hash value or interning the pointers) mean that extension authors are on very thin ice if they try to break the rules with the string object.

The buffer object seems like it was intended to address the purpose that the bytes object is trying to fulfill, but several shortcomings in its implementation [1] have made it less useful in many common cases. The buffer object made a different choice for its slicing behavior (it returns new strings instead of buffers for slicing and other operations), and it doesn't make many of the promises on alignment or being able to release the GIL that the bytes object does.

Also, with regard to the buffer object, it is not possible to simply replace the buffer object with the bytes object and maintain backwards compatibility. The buffer object provides a mechanism to take the PyBufferProcs supplied pointer of another object and present it as its own. Since the behavior of the other object can not be guaranteed to follow the same set of strict rules that a bytes object does, it can't be used in places that a bytes object could.

The array module supports the creation of an array of bytes, but it does not provide a C API for supplying pointers and destructors to extension supplied memory.
This makes it unusable for constructing objects out of shared memory, or memory that has special alignment or locking for things like DMA transfers. Also, the array object does not currently pickle. Finally, since the array object allows its contents to grow, via the extend method, the pointer can be changed if the GIL is not held while using it.

Creating a buffer object from an array object has the same problem of leaving an invalid pointer when the array object is resized.

The mmap object caters to its particular niche, but does not attempt to solve a wider class of problems.

Finally, no third party extension can implement pickling without creating a temporary object of a standard Python type. For example, in the Numeric community, it is unpleasant that a large array can't pickle without creating a large binary string to duplicate the array data.

Backward Compatibility

The only possibility for backwards compatibility problems that the author is aware of is in previous versions of Python that try to unpickle data containing the new bytes type.

Reference Implementation

XXX: Actual implementation is in progress, but changes are still possible as this PEP gets further review.

The following new files will be added to the Python baseline:

    Include/bytesobject.h  # C interface
    Objects/bytesobject.c  # C implementation
    Lib/test/test_bytes.py # unit testing
    Doc/lib/libbytes.tex   # documentation

The following files will also be modified:

    Include/Python.h       # adding bytesobject.h include file
    Python/bltinmodule.c   # adding the bytes type object
    Modules/cPickle.c      # adding bytes to the standard types
    Lib/pickle.py          # adding bytes to the standard types

It is possible that several other modules could be cleaned up and implemented in terms of the bytes object. The mmap module comes to mind first, but as noted above it would be possible to reimplement the array module as a pure Python module.
While it is attractive that this PEP could actually reduce the amount of source code by some amount, the author feels that this could cause unnecessary risk for breaking existing applications and should be avoided at this time.

Additional Notes/Comments

- Guido van Rossum wondered whether it would make sense to be able to create a bytes object from a mmap object. The mmap object appears to support the requirements necessary to provide memory for a bytes object. (It doesn't resize, and the pointer is valid for the lifetime of the object.) As such, a method could be added to the mmap module such that a bytes object could be created directly from a mmap object. An initial stab at how this would be implemented would be to use the PyBytes_FromPointer() function described above and pass the mmap_object as the user pointer. The destructor function would decref the mmap_object for cleanup.

- Todd Miller notes that it may be useful to have two new functions: PyObject_AsLargeReadBuffer() and PyObject_AsLargeWriteBuffer() that are similar to PyObject_AsReadBuffer() and PyObject_AsWriteBuffer(), but support getting a LONG_LONG length in addition to the void* pointer. These functions would allow extension authors to work transparently with bytes objects (which support LONG_LONG lengths) and most other buffer like objects (which only support int lengths). These functions could be in lieu of, or in addition to, creating the specific PyBytes_GetReadBuffer() and PyBytes_GetWriteBuffer() functions.

  XXX: The author thinks this is a very good idea as it paves the way for other objects to eventually support large (64 bit) pointers, and it should only affect abstract.c and abstract.h. Should this be added above?

- It was generally agreed that abusing the segment count of the PyBufferProcs interface is not a good hack to work around the 31 bit limitation of the length. If you don't know what this means, then you're in good company.
Most code in the Python baseline, and presumably in many third party extensions, punt when the segment count is not 1. References [1] The buffer interface http://mail.python.org/pipermail/python-dev/2000-October/009974.html Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: --0-1908127438-1027406171=:26257-- From tim.one@comcast.net Tue Jul 23 09:30:11 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 23 Jul 2002 04:30:11 -0400 Subject: [Python-Dev] Sorting In-Reply-To: <20020723062212.25747.qmail@web40102.mail.yahoo.com> Message-ID: [Scott Gilbert] > I'm curious if there is any literature that you've come across, or if > you've done any experiments with merging more than two parts at a > time. There's a literal mountain of research on the topic. I recommend "A Meticulous Analysis of Mergesort Programs" Jyrki Katajainen, Jesper Larsson Traff for a careful accounting of all operations that go into one of these beasts. They got the best results (and much better than quicksort) out of a 4-way bottom-up mergesort via very tedious code (e.g., it effectively represents which input run currently has the smallest next key via the program counter, by way of massive code duplication and oodles of gotos); they were afraid to write actual code for an 8-way version . OTOH, they were sorting random integers, and, e.g., were delighted to increase the # of comparisons when that could save a few other "dirt cheap" operations. > ... > My thinking is that many machines (probably yours for instance) have a > cache that is 4-way associative, so merging only 2 blocks at a time might > not be using the cache as well as it could. Also, changing from merging 2 > blocks to 3 or 4 blocks at a time would change the number of passes you > have to make (the log part of N*log(N)). 
> > It's quite possible that this isn't worth the trade off in complexity and > space (or your time :-). The real reason it's uninteresting to me is that it has no clear applicability to the cases this sort aims at: exploiting significant pre-existing order of various kinds. That leads to unbalanced run lengths when we're lucky, and if I'm merging a 2-element run with a 100,000-element run, high cache associativity isn't of much use. From the timings I showed before, it's clear that "good cases" of pre-existing order take time that depends almost entirely on just the number of comparisons needed; e.g., 3sort and +sort were as fast as /sort, where the latter does nothing but N-1 comparisons in a single left-to-right scan of the array. Comparisons are expensive enough in Python that doing O(log N) additional comparisons in 3sort and +sort, then moving massive amounts of the array around to fit the oddballs in place, costs almosts nothing more in percentage terms. Since these cases are already effectively as fast as a single left-to-right scan, there's simply no potential remaining for significant gain (unless you can speed a single left-to-right scan! that would be way cool). If you think you can write a sort for random Python arrays faster than the samplesort hybrid, be my guest: I'd love to see it! You should be aware that I've been making this challenge for years . Something to note: I think you have an overly simple view of Python's lists in mind. When we're merging two runs in the timing test, it's not *just* the list memory that's getting scanned. The lists contain pointers *to* float objects. The float objects have to get read up from memory too, and there goes the rest of your 4-way associativity. 
Indeed, if you read the comments in Lib/test/sortperf.py, you'll find that it performs horrid trickery to ensure that =sort and =sort work on physically distinct float objects; way back when, these particular tests ran much faster, and that turned out to be partly because, e.g., [0.5] * N constructs a list with N pointers to a single float object, and that was much easier on the memory system. We got a really nice slowdown by forcing N distinct copies of 0.5. In earlier Pythons the comparison also got short-circuited by an early pointer-equality test ("if they're the same object, they must be equal"), but that's not done anymore. A quick run just now showed that =sort still runs significantly quicker if given a list of identical objects; the only explanation left for that appears to be cache effects.

> Keeping track of comparisons that you've already made could get ugly,

Most researchers have found that a fancy data structure for this is counter-productive: so long as the m in m-way merging isn't ridiculously large, keeping the head entries in a straight vector with m elements runs fastest. But they're not worried about Python's expensive-comparison case. External sorts using m-way merging with large m typically use a selection tree much like a heap to reduce the expense of keeping track (see, e.g., Knuth for details).

> and your temp space requirement would go from N/2 to possibly 3N/4...
> But since you're diving so deeply into this problem, I figured I'd
> throw it out there.
>
> OTOH, this could be close to the speedup that heavily optimized FFT algs
> get when they go from radix-2 to radix-4. Just thinking out loud...

I don't think that's comparable. Moving to radix 4 cuts the total number of non-trivial complex multiplies an FFT has to do, and non-trivial complex multiplies are the expensive part of what an FFT does.
In contrast, boosting the m in m-way merging doesn't cut the number of comparisons needed at all (to the contrary, if you're not very careful it increases them), and comparisons are what kill sorting routines in Python. The elaborate gimmicks in timsort for doing merges of unbalanced runs do cut the total number of comparisons needed, and that's where the huge wins come from. From ark@research.att.com Tue Jul 23 14:58:30 2002 From: ark@research.att.com (Andrew Koenig) Date: 23 Jul 2002 09:58:30 -0400 Subject: [Python-Dev] The iterator story In-Reply-To: References: Message-ID: Ping> - Iterators provide just one method, __next__(). Ping> - The built-in next() calls tp_iternext. For instances, Ping> tp_iternext calls __next__. Ping> - Objects wanting to be iterated over provide just one method, Ping> __iter__(). Some of these are containers, but not all. Ping> - The built-in iter(foo) calls tp_iter. For instances, Ping> tp_iter calls __iter__. Ping> - "for x in y" gets iter(y) and uses it as an iterator. Ping> - "for x from y" just uses y as the iterator. +1. Ping> - We have a nice clean division between containers and iterators. Ping> - When you see "for x in y" you know that y is a container. What if y is a file? You already said that files are not containers. Ping> - When you see "for x from y" you know that y is an iterator. Ping> - "for x in y" never destroys y. Ping> - "if x in y" never destroys y. What if y is a file? Ping> Other notes: Ping> - The file problem has a consistent solution. Instead of writing Ping> "for line in file" you write Ping> for line from file: Ping> print line Ping> Being forced to write "from" signals to you that the file is Ping> eaten up. There is no expectation that "for line from file" Ping> will work again. Ah. So you want to break "for line in file:", which works now? I'm still +1 as long as there is a transition scheme. Ping> My Not-So-Ideal Protocol Ping> ------------------------ Ping> All right. 
So new syntax may be hard to swallow. An alternative Ping> is to introduce an adapter that turns an iterator into something Ping> that "for" will accept -- that is, the opposite of iter(). Ping> - The built-in seq(it) returns x such that iter(x) yields it. Ping> Then instead of writing Ping> for x from it: Ping> you would write Ping> for x in seq(it): Ping> and the rest would be the same. The use of "seq" here is what Ping> would flag the fact that "it" will be destroyed. I prefer "for x from it: -- Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark From thomas.heller@ion-tof.com Tue Jul 23 15:18:31 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Tue, 23 Jul 2002 16:18:31 +0200 Subject: [Python-Dev] PEP 296 - The Buffer Problem References: <20020723063611.26677.qmail@web40102.mail.yahoo.com> Message-ID: <003c01c23253$d2f80860$e000a8c0@thomasnotebook> > PEP: 296 > Title: The Buffer Problem IMO should better be 'The bytes Object' > 6. In C/C++ extensions, the bytes object can be created from a supplied > pointer and destructor function to free the memory when the > reference count goes to zero. > > The special implementation of slicing for the bytes object allows > multiple bytes objects to refer to the same pointer/destructor. > As such, a refcount will be kept on the actual > pointer/destructor. This refcount is separate from the refcount > typically associated with Python objects. > Why is this? Wouldn't it be sufficient if views keep references to the 'viewed' byte object? > 8. The bytes object keeps track of the length of its data with a Python > LONG_LONG type. Even though the current definition for PyBufferProcs > restricts the length to be the size of an int, this PEP does not propose > to make any changes there. Instead, extensions can work around this limit > by making an explicit PyBytes_Check(...) call, and if that succeeds they > can make a PyBytes_GetReadBuffer(...) 
or PyBytes_GetWriteBuffer call to > get the pointer and full length of the object as a LONG_LONG. > > The bytes object will raise an exception if the standard PyBufferProcs > mechanism is used and the size of the bytes object is greater than can be > represented by an integer. > > From Python scripting, the bytes object will be subscriptable with longs > so the 32 bit int limit can be avoided. > > There is still a problem with the len() function as it is PyObject_Size() > and this returns an int as well. As a workaround, the bytes object will > provide a .length() method that will return a long. > Is this worth the trouble? (Hm, 64-bit platforms with 32-bit integers remind my of the broken DOS/Windows 3.1 platforms with near/far/huge pointers). > 9. The bytes object can be constructed at the Python scripting level by > passing an int/long to the bytes constructor with the number of bytes to > allocate. For example: > > b = bytes(100000) # alloc 100K bytes > > The constructor can also take another bytes object. This will be useful > for the implementation of unpickling, and in converting a read-write bytes > object into a read-only one. An optional second argument will be used to > designate creation of a readonly bytes object. > > 10. From the C API, the bytes object can be allocated using any of the > following signatures: > > PyObject* PyBytes_FromLength(LONG_LONG len, int readonly); > PyObject* PyBytes_FromPointer(void* ptr, LONG_LONG len, int readonly > void (*dest)(void *ptr, void *user), void* user); > > In the PyBytes_FromPointer(...) function, if the dest function pointer is > passed in as NULL, it will not be called. This should only be used for > creating bytes objects from statically allocated space. > > The user pointer has been called a closure in other places. It is a > pointer that the user can use for whatever purposes. It will be passed to > the destructor function on cleanup and can be useful for a number of > things. 
If the user pointer is not needed, NULL should be passed instead.

Shouldn't there be constructors to create a view of a bytes/view object, or are we supposed to create them by slicing?

> 11. The bytes type will be a new style class as that seems to be where all
> standard Python types are headed.

Good.

Thanks,

Thomas

From xscottg@yahoo.com Tue Jul 23 16:40:55 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Tue, 23 Jul 2002 08:40:55 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <003c01c23253$d2f80860$e000a8c0@thomasnotebook>
Message-ID: <20020723154055.54251.qmail@web40106.mail.yahoo.com>

--- Thomas Heller wrote:
> > PEP: 296
> > Title: The Buffer Problem
>
> IMO should better be 'The bytes Object'
>

Part of the title was just me being cute, but apparently this problem has a long history and has been referred to as "The Buffer Problem" many times in the past. Plus when I first submitted it, I wasn't sure the name "bytes" was going to stick.

>
> Why is this? Wouldn't it be sufficient if views keep references
> to the 'viewed' byte object?
>

They do, but the referenced "inner-thing" needs its own reference count to know how many "bytes-views" are sharing it. When a bytes-view gets cleaned up, it decrefs the reference count of the inner-thing it is referring to, and if the reference count goes to zero, the bytes-view calls the destructor for the inner-thing.

>
> > and this returns an int as well. As a workaround, the bytes object
> > will provide a .length() method that will return a long.
> >
> Is this worth the trouble?
> (Hm, 64-bit platforms with 32-bit integers remind my of the broken
> DOS/Windows 3.1 platforms with near/far/huge pointers).
>

I think most 64 bit platforms actually have a 32 bit int. Some of them (like the Alpha) have a 64 bit long, but Python has made extensive use of the int type in the PyBufferProcs interface and elsewhere.
So if we want to make full use of large memory machines (I do), something has to be done. The only way to reliably get a 64 bit integer on these platforms is to use the "long long" type or __int64 on Windows (spelled LONG_LONG in Python). Note that the .length() method will return a Python long, not a C long. > > Shouldn't there be constructors to create a view of a bytes/view object, > or are we supposed to create them by slicing? > Item 9 in the PEP talks about this. Maybe I'll add some text to make this more clear. Cheers, -Scott __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com From thomas.heller@ion-tof.com Tue Jul 23 19:04:00 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Tue, 23 Jul 2002 20:04:00 +0200 Subject: [Python-Dev] PEP 296 - The Buffer Problem References: <20020723154055.54251.qmail@web40106.mail.yahoo.com> Message-ID: <027f01c23273$528d1240$e000a8c0@thomasnotebook> > > > > Why is this? Wouldn't it be sufficient if views keep references > > to the 'viewed' byte object? > > > > They do, but the referenced "inner-thing" needs it's own reference count to > know how many "bytes-views" are sharing it. When a bytes-view gets cleaned > up, it decrefs the reference count of the inner-thing it is referring to, > and if the reference count goes to zero, the bytes-view calls the > destructor for the inner-thing. > Hm, I thought the 'inner-thing' is a python object (with it's own refcount) itself. Isn't the 'inner-thing' the bytes object owning the allocated memory? And the 'outer-things' (the views) simply viewing slices of this memory? 
Thomas

From xscottg@yahoo.com Tue Jul 23 19:33:02 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Tue, 23 Jul 2002 11:33:02 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <027f01c23273$528d1240$e000a8c0@thomasnotebook>
Message-ID: <20020723183302.98153.qmail@web40102.mail.yahoo.com>

--- Thomas Heller wrote:
> >
> > They do, but the referenced "inner-thing" needs it's own reference count to
> > know how many "bytes-views" are sharing it. When a bytes-view gets cleaned
> > up, it decrefs the reference count of the inner-thing it is referring to,
> > and if the reference count goes to zero, the bytes-view calls the
> > destructor for the inner-thing.
> >
> Hm, I thought the 'inner-thing' is a python object (with it's own
> refcount) itself. Isn't the 'inner-thing' the bytes object owning
> the allocated memory? And the 'outer-things' (the views) simply
> viewing slices of this memory?
>

The outer-thing is definitely the "bytes object", since that's what people will work with directly. It has to be a true Python object in all its glory.

The inner-thing _could_ be a Python object (and Guido suggested that maybe it should be), but that's an implementation detail. I don't know why anyone would want to work with the inner-thing directly. However, one good use case and I'll be sold on the idea.

I'll definitely add some verbiage to clarify this in the next revision.

Cheers,
-Scott

__________________________________________________
Do You Yahoo!?
Yahoo!
Health - Feel better, live better http://health.yahoo.com From jmiller@stsci.edu Tue Jul 23 19:47:02 2002 From: jmiller@stsci.edu (Todd Miller) Date: Tue, 23 Jul 2002 14:47:02 -0400 Subject: [Python-Dev] PEP 296 - The Buffer Problem References: <20020723183302.98153.qmail@web40102.mail.yahoo.com> Message-ID: <3D3DA4A6.6040802@stsci.edu> Scott Gilbert wrote: >--- Thomas Heller wrote: > >>>They do, but the referenced "inner-thing" needs it's own reference >>> >>count to >> >>>know how many "bytes-views" are sharing it. When a bytes-view gets >>> >>cleaned >> >>>up, it decrefs the reference count of the inner-thing it is referring >>> >>to, >> >>>and if the reference count goes to zero, the bytes-view calls the >>>destructor for the inner-thing. >>> >>Hm, I thought the 'inner-thing' is a python object (with it's own >>refcount) itself. Isn't the 'inner-thing' the bytes object owning >>the allocated memory? And the 'outer-things' (the views) simply >>viewing slices of this memory? >> > >The outer-thing is definitely the "bytes object", since that's what people >will work with directly. It has to be a true Python object in all its >glory. > >The inner-thing _could_ be a Python object (and Guido suggested that maybe >it should be), but that's an implementation detail. I don't know why > > >anyone would want to work with the inner-thing directly. However, one good >use case and I'll be sold on the idea. > Letting the inner-thing be a mmap would enable slices of a mmap as views as opposed to strings. We'd certainly like this for numarray, especially if it meant pickling efficiency for mmap based arrays. > > >I'll definitely add some verbage to clarify this in the next revision. > >Cheers, > -Scott > > > >__________________________________________________ >Do You Yahoo!? >Yahoo! 
Health - Feel better, live better >http://health.yahoo.com > >_______________________________________________ >Python-Dev mailing list >Python-Dev@python.org >http://mail.python.org/mailman/listinfo/python-dev > -- Todd Miller jmiller@stsci.edu STSCI / SSG (410) 338 4576 From thomas.heller@ion-tof.com Tue Jul 23 20:59:18 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Tue, 23 Jul 2002 21:59:18 +0200 Subject: [Python-Dev] PEP 296 - The Buffer Problem References: <20020723183302.98153.qmail@web40102.mail.yahoo.com> Message-ID: <030901c23283$6e2be430$e000a8c0@thomasnotebook> > > > > > > They do, but the referenced "inner-thing" needs it's own reference > > count to > > > know how many "bytes-views" are sharing it. When a bytes-view gets > > cleaned > > > up, it decrefs the reference count of the inner-thing it is referring > > to, > > > and if the reference count goes to zero, the bytes-view calls the > > > destructor for the inner-thing. > > > > > Hm, I thought the 'inner-thing' is a python object (with it's own > > refcount) itself. Isn't the 'inner-thing' the bytes object owning > > the allocated memory? And the 'outer-things' (the views) simply > > viewing slices of this memory? > > > > The outer-thing is definitely the "bytes object", since that's what people > will work with directly. It has to be a true Python object in all its > glory. > > The inner-thing _could_ be a Python object (and Guido suggested that maybe > it should be), but that's an implementation detail. I don't know why > anyone would want to work with the inner-thing directly. However, one good > use case and I'll be sold on the idea. > > I'll definitely add some verbage to clarify this in the next revision. > I've quickly read the pep again. I see no mentioning of an 'inner object' and an 'outer object' there, so I would recommend you try to explain this (if you want to stay with this decision). 
OTOH, your 'inner thing' has a refcount, an (optional) destructor which is a kind of closure, instance variables (memory pointer, readonly flag), so there is not too much missing for a full python object. Could the 'inner thing' have the same type as the 'outer thing': the inner thing being a full view of itself, and the outer thing probably a view viewing only a slice of the inner thing? Thomas From greg@cosc.canterbury.ac.nz Tue Jul 23 23:29:22 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 24 Jul 2002 10:29:22 +1200 (NZST) Subject: [Python-Dev] PEP 296 - The Buffer Problem In-Reply-To: <20020723183302.98153.qmail@web40102.mail.yahoo.com> Message-ID: <200207232229.g6NMTM609792@oma.cosc.canterbury.ac.nz> Scott Gilbert : > The inner-thing _could_ be a Python object (and Guido suggested that > maybe it should be), but that's an implementation detail. In that case, unless there's some reason for it *not* to be a Python object, you might as well make it one and take advantage of all the Python refcount machinery. If you use Pyrex for the implementation, making Python objects will be dead easy! Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From jason-exp-1028157503.aebc46@mastaler.com Wed Jul 24 00:18:58 2002 From: jason-exp-1028157503.aebc46@mastaler.com (jason-exp-1028157503.aebc46@mastaler.com) Date: Tue, 23 Jul 2002 17:18:58 -0600 Subject: [Python-Dev] Re: Where's time.daylight??? References: <15672.18628.831787.897474@anthem.wooz.org> <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net> <200207191910.g6JJAUJ32606@pcp02138704pcs.reston01.va.comcast.net> <200207200043.g6K0hMJ27043@pcp02138704pcs.reston01.va.comcast.net> Message-ID: martin@v.loewis.de (Martin v. 
Loewis) writes: > I have no OSF/1 (aka whatever) system http://www.testdrive.compaq.com/ -- (http://tmda.net/) From xscottg@yahoo.com Wed Jul 24 09:13:26 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Wed, 24 Jul 2002 01:13:26 -0700 (PDT) Subject: [Python-Dev] PEP 296 - The Buffer Problem In-Reply-To: <030901c23283$6e2be430$e000a8c0@thomasnotebook> Message-ID: <20020724081326.995.qmail@web40107.mail.yahoo.com> --- Thomas Heller wrote: > > I've quickly read the pep again. > I see no mentioning of an 'inner object' and an 'outer object' > there, so I would recommend you try to explain this (if you want to stay > with this decision). > This is just the terminology I was using to try and communicate with you. The outer thing is the bytes object (which is generally interesting to users), and the inner thing is an implementation detail. Like I said, I'll add more text on this in the next revision since it seems to be causing confusion. > > OTOH, your 'inner thing' has a refcount, an (optional) destructor > which is a kind of closure, instance variables (memory pointer, > readonly flag), so there is not too much missing for a full > python object. > I still haven't heard a good reason to expose the inner thing to user code yet though. So even if the inner thing is a PyObject, who would know? It's probably better for maintenance to use something everyone is already familiar with, so I'll probably do it for that reason. > > Could the 'inner thing' have the same type as the 'outer thing': > the inner thing being a full view of itself, and the outer thing > probably a view viewing only a slice of the inner thing? > It might be. However, I'm afraid this will lead to some ugly special cases when the view is the inner thing versus when the view is referring to some other thing. It's probably cleaner to make a clear distinction between the two and stick with it throughout. (I'm growing to dislike this "thing" terminology....) 
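
The inner-thing/outer-thing split discussed in this thread can be modelled in a few lines of Python. This is only a rough sketch under stated assumptions, not the PEP's implementation: the names InnerBuffer, BytesView, and release() are hypothetical, a bytearray stands in for the raw C pointer, and an explicit release() call stands in for deallocation of the outer object. It shows the one point under debate: slices share one inner buffer, and the destructor runs only when the last view lets go.

```python
class InnerBuffer:
    """Hypothetical 'inner thing': shared memory plus an optional destructor."""

    def __init__(self, memory, dest=None, user=None):
        self.memory = memory    # a bytearray stands in for the raw C pointer
        self.dest = dest        # called as dest(memory, user), or None
        self.user = user        # opaque value handed back to dest
        self.refcount = 0       # counts views, separate from Python's refcount

    def incref(self):
        self.refcount += 1

    def decref(self):
        self.refcount -= 1
        if self.refcount == 0 and self.dest is not None:
            self.dest(self.memory, self.user)


class BytesView:
    """Hypothetical 'outer thing': a window onto an InnerBuffer."""

    def __init__(self, inner, start=0, stop=None):
        self._inner = inner
        self._start = start
        self._stop = len(inner.memory) if stop is None else stop
        self._released = False
        inner.incref()

    def __len__(self):
        return self._stop - self._start

    def _offset(self, index):
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("index out of range")
        return self._start + index

    def __getitem__(self, index):
        if isinstance(index, slice):
            start, stop, step = index.indices(len(self))
            if step != 1:
                raise ValueError("only contiguous slices are sketched")
            # Slicing shares the inner buffer instead of copying it.
            return BytesView(self._inner, self._start + start,
                             self._start + max(start, stop))
        return self._inner.memory[self._offset(index)]

    def __setitem__(self, index, value):
        self._inner.memory[self._offset(index)] = value

    def release(self):
        """Stand-in for deallocation of the outer object."""
        if not self._released:
            self._released = True
            self._inner.decref()


freed = []
inner = InnerBuffer(bytearray(b"abcdefgh"),
                    dest=lambda mem, user: freed.append(user), user="tag")
whole = BytesView(inner)
part = whole[2:5]      # shares memory with `whole`
part[0] = ord("X")     # the write is visible through `whole` as well
whole.release()        # destructor not called yet: `part` still holds on
part.release()         # last view gone, so the destructor runs
```

Whether the inner object is a full Python object or a private struct does not change this behaviour, which may be why it reads as an implementation detail above.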
__________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com From xscottg@yahoo.com Wed Jul 24 09:22:29 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Wed, 24 Jul 2002 01:22:29 -0700 (PDT) Subject: [Python-Dev] PEP 296 - The Buffer Problem In-Reply-To: <3D3DA4A6.6040802@stsci.edu> Message-ID: <20020724082229.94975.qmail@web40105.mail.yahoo.com> --- Todd Miller wrote: > > Letting the inner-thing be a mmap would enable slices of a mmap as views > as opposed to strings. We'd certainly like this for numarray, > especially if it meant pickling efficiency for mmap based arrays. > The first version of the PEP I sent to you directly didn't have this, but the latest version I posted to python-dev mentions it briefly. It seems both you and Guido came up with the same idea regarding mmap. The current strategy is to add a method to the mmap module that would return a bytes object from an mmap object. I would like it to be able to pickle too. (Which probably means the new method in the mmap module will probably return a class derived from bytes, and not the bytes base class.) However, this is sort of orthogonal to the PEP. If the bytes object makes it in, but the mmap enhancements get left out, a third party extension could implement the mmap_to_bytes function and still make use of the efficient pickling by deriving from the bytes object. Cheers, -Scott __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com From xscottg@yahoo.com Wed Jul 24 10:05:12 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Wed, 24 Jul 2002 02:05:12 -0700 (PDT) Subject: [Python-Dev] Sorting In-Reply-To: Message-ID: <20020724090512.2485.qmail@web40110.mail.yahoo.com> --- Tim Peters wrote: > > "A Meticulous Analysis of Mergesort Programs" > Jyrki Katajainen, Jesper Larsson Traff > Thanks for the cool reference. I read a bit of it last night. 
I ought to know by now that there really isn't much new under the sun... > > The real reason it's uninteresting to me is that it has no clear > applicability to the cases this sort aims at: exploiting significant > pre-existing order of various kinds. > [...] > (unless you can speed a single left-to-right scan! that would be way > cool). > Do a few well placed prefetch instructions buy you anything? The MMU could be grabbing your next pointer while you're doing your current comparison. And of course you could implement it as a macro that evaporates for whatever platforms you didn't care to implement it on. (I need to look it up, but I'm pretty sure you could do this for both VC++ and gcc on recent x86s.) > > If you think you can write a sort for random Python arrays faster than > the > samplesort hybrid, be my guest: I'd love to see it! You should be aware > that I've been making this challenge for years . > You're remarkably good at taunting me. :-) I've spent a little time on a few of these optimization challenges that get posted. One of these days I'll best you... (not this day though) > > Something to note: I think you have an overly simple view of Python's > lists in mind. > No, I think I understand the model. I just assumed the objects pointed to would be scattered pretty randomly through memory. So statistically they'll step on the same cache lines as your list once in a while, but that it would average out to being less interesting than the adjacent slots in the list. I'm frequently wrong about stuff like this though... Cheers, -Scott __________________________________________________ Do You Yahoo!? Yahoo! 
Health - Feel better, live better
http://health.yahoo.com

From jmiller@stsci.edu Wed Jul 24 12:21:41 2002
From: jmiller@stsci.edu (Todd Miller)
Date: Wed, 24 Jul 2002 07:21:41 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
References: <20020724082229.94975.qmail@web40105.mail.yahoo.com>
Message-ID: <3D3E8DC5.7040906@stsci.edu>

Scott Gilbert wrote:

>--- Todd Miller wrote:
>
>>Letting the inner-thing be a mmap would enable slices of a mmap as views
>>as opposed to strings. We'd certainly like this for numarray,
>>especially if it meant pickling efficiency for mmap based arrays.
>>
>
>The first version of the PEP I sent to you directly didn't have this, but
>the latest version I posted to python-dev mentions it briefly. It seems
>both you and Guido came up with the same idea regarding mmap.
>

Yeah, I saw that in your response. Sorry. FWIW, anything I say here
should be regarded as a reflection of STSCI's current technical goals as
channeled by me, and not necessarily "my ideas". Exploiting mmapping has
been a pretty long standing goal here at STSCI.

>
>
>The current strategy is to add a method to the mmap module that would
>return a bytes object from an mmap object. I would like it to be able to
>pickle too. (Which probably means the new method in the mmap module will
>probably return a class derived from bytes, and not the bytes base class.)
>

This runs pretty wide of my current mental ruts, but it sounds like
conservative design, so great.

>
>However, this is sort of orthogonal to the PEP. If the bytes object makes
>it in, but the mmap enhancements get left out, a third party extension
>could implement the mmap_to_bytes function and still make use of the
>efficient pickling by deriving from the bytes object.
>

I understand. That sounds excellent.

>
>
>Cheers,
> -Scott
>
>
>__________________________________________________
>Do You Yahoo!?
>Yahoo!
Health - Feel better, live better >http://health.yahoo.com > Back to numarray, Todd From thomas.heller@ion-tof.com Wed Jul 24 12:38:00 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 24 Jul 2002 13:38:00 +0200 Subject: [Python-Dev] PEP 296 - The Buffer Problem References: <20020723063611.26677.qmail@web40102.mail.yahoo.com> Message-ID: <048301c23306$90992090$e000a8c0@thomasnotebook> Let me ask some questions and about platforms with 32-bit integers and 64-bit longs: > 2. Alignment of the allocated byte array is whatever is promised by the > platform implementation of malloc. On these platforms, does malloc() accept an unsigned long argument for the requested size? > [...] > 8. The bytes object keeps track of the length of its data with a Python > LONG_LONG type. > [...] > From Python scripting, the bytes object will be subscriptable with longs > so the 32 bit int limit can be avoided. How is indexing done in C? Can you index these byte arrays by longs? > 9. The bytes object can be constructed at the Python scripting level by > passing an int/long to the bytes constructor with the number of bytes to > allocate. For example: > > b = bytes(100000) # alloc 100K bytes > > The constructor can also take another bytes object. This will be useful > for the implementation of unpickling, and in converting a read-write bytes > object into a read-only one. An optional second argument will be used to > designate creation of a readonly bytes object. > > 10. From the C API, the bytes object can be allocated using any of the > following signatures: > > PyObject* PyBytes_FromLength(LONG_LONG len, int readonly); > PyObject* PyBytes_FromPointer(void* ptr, LONG_LONG len, int readonly > void (*dest)(void *ptr, void *user), void* user); Hm, if 'bytes' is a new style class, these functions should require a 'PyObject *type' parameter as well. OTOH, new style classes are usually created by calling their *type*, so you should describe the signature of the byte type's tp_call. 
(It may be possible to supply variations of the above functions for convenience as well.) > The array module supports the creation of an array of bytes, but it does > not provide a C API for supplying pointers and destructors to extension > supplied memory. This makes it unusable for constructing objects out of > shared memory, or memory that has special alignment or locking for things > like DMA transfers. Also, the array object does not currently pickle. > Finally since the array object allows its contents to grow, via the extend > method, the pointer can be changed if the GIL is not held while using it. ...or if any code is executed which may change the array object, even if the GIL is held! Thomas From xscottg@yahoo.com Wed Jul 24 17:01:58 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Wed, 24 Jul 2002 09:01:58 -0700 (PDT) Subject: [Python-Dev] PEP 296 - The Buffer Problem In-Reply-To: <048301c23306$90992090$e000a8c0@thomasnotebook> Message-ID: <20020724160158.34860.qmail@web40112.mail.yahoo.com> --- Thomas Heller wrote: > Let me ask some questions and about platforms with 32-bit > integers and 64-bit longs: > > > 2. Alignment of the allocated byte array is whatever is promised by > > the platform implementation of malloc. > > On these platforms, does malloc() accept an unsigned long argument > for the requested size? > At the moment, the only 64 bit platform that I have easy access to is Tru64/Alpha. That version of malloc takes a size_t which is a 64 bit quantity. I believe most semi-sane platforms will use a size_t as argument for malloc, and I believe most semi-sane platforms will have a size_t that is the same number of bits as a pointer for that platform. > > [...] > > 8. The bytes object keeps track of the length of its data with a > > Python LONG_LONG type. > > [...] > > From Python scripting, the bytes object will be subscriptable with > > longs so the 32 bit int limit can be avoided. 
> > How is indexing done in C?> Indexing is done by grabbing the pointer and length via a call like: int PyObject_AsLargeReadBuffer(PyObject* bo, unsigned char** ptr, LONG_LONG* len); Note that the name could be different depending on whether it ends up in abstract.h or bytesobject.h. > Can you index these byte arrays by longs? You could index it via a long, but using a LONG_LONG is safer. My understanding is that on Win64 a long will only be 32 bits even though void* is 64 bits. So for that platform, LONG_LONG will be a typedef for __int64 which is 64 bits. None of this matters for 32 bit platforms. All 32 bit platforms that I know of have sizeof(int) == sizeof(long) == sizeof(void*) == 4. So even if you wanted to subscript with a long or LONG_LONG, the pointer could only point to something about 2 Gigs (31 bits) in size. > > > > 10. From the C API, the bytes object can be allocated using any of > > the following signatures: > > > > PyObject* PyBytes_FromLength(LONG_LONG len, int readonly); > > PyObject* PyBytes_FromPointer(void* ptr, LONG_LONG len, > > int readonly void (*dest)(void *ptr, void *user), > > void* user); > > Hm, if 'bytes' is a new style class, these functions should > require a 'PyObject *type' parameter as well. OTOH, new style > classes are usually created by calling their *type*, so you > should describe the signature of the byte type's tp_call. > (It may be possible to supply variations of the above functions > for convenience as well.) > I consider these to be the minimum convenience functions that are necessary for the functionality I'd like to see. I'll follow the conventions for creating a new style class for PyBytesObject to the letter, and any other variations of the above convenience functions can be added as needed. (It's easier to add stuff than take it away...) Cheers, -Scott __________________________________________________ Do You Yahoo!? Yahoo! 
Health - Feel better, live better
http://health.yahoo.com

From tim@zope.com Wed Jul 24 18:49:10 2002
From: tim@zope.com (Tim Peters)
Date: Wed, 24 Jul 2002 13:49:10 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <20020724160158.34860.qmail@web40112.mail.yahoo.com>
Message-ID:

[Scott Gilbert]
> At the moment, the only 64 bit platform that I have easy access to is
> Tru64/Alpha. That version of malloc takes a size_t which is a 64 bit
> quantity.
>
> I believe most semi-sane platforms will use a size_t as argument for
> malloc,

That much is required by the C standard, so you can rely on it.

> and I believe most semi-sane platforms will have a size_t that is
> the same number of bits as a pointer for that platform.
The std is silent on this; it's true on 64-bit Linux and Win64, so "good enough". >> Can you index these byte arrays by longs? > You could index it via a long, but using a LONG_LONG is safer. My > understanding is that on Win64 a long will only be 32 bits even though > void* is 64 bits. Right. > So for that platform, LONG_LONG will be a typedef for __int64 which is 64 > bits. Also on Win32: LONG_LONG is a 64-bit integral type on Win32 and Win64. > None of this matters for 32 bit platforms. ? Win32 has always supported "large files" and "large mmaps" (where large means 64-bit capacity), and most 32-bit flavors of Unix do too. It's a x-platform mess, though. > All 32 bit platforms that I know of have sizeof(int) == sizeof(long) == > sizeof(void*) == 4. Same here. > So even if you wanted to subscript with a long or LONG_LONG, the pointer > could only point to something about 2 Gigs (31 bits) in size. That depends on how it's implemented; on a 32-bit box, supporting a LONG_LONG subscript may require some real pain, but isn't impossible. For example, Python manages to support 64-bit "subscripts" to f.seek() on the major 32-bit boxes right now. From tim.one@comcast.net Thu Jul 25 00:01:32 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 24 Jul 2002 19:01:32 -0400 Subject: [Python-Dev] Sorting In-Reply-To: Message-ID: FYI, I've been poking at this in the background. The ~sort regression is vastly reduced, via removing special-casing and adding more general adaptivity (if you read the timsort.txt file, the special case for run lengths within a factor of 2 of each other went away, replaced by a more intelligent mix of one-pair-at-a-time versus galloping modes). *sort lost about 1% as a result (one-pair-at-a-time is maximally effective for *sort, but in a random mix every now again the "switch to the less efficient (for it) galloping mode" heuristic triggers by blind luck). 
There's also a significant systematic regression in timsort's +sort case,
although it remains faster (and much more general) than samplesort's
special-casing of it; also a mix of small regressions and speedups in
3sort. These are because, to simplify experimenting, I threw out the
"copy only the shorter run" gimmick, always copying the left run instead.
That hurts +sort systematically, as instead of copying just the 10 oddball
elements at the end, it copies the very long run of N-10 elements instead
(and as many as N-1 temp pointers can be needed, up from N/2). That's all
repairable, it's just a PITA to do it.

C:\Code\python\PCbuild>python -O sortperf.py 15 20 1

samplesort
 i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
15   32768   0.18   0.01   0.02   0.11   0.01   0.04   0.01   0.11
16   65536   0.24   0.02   0.02   0.25   0.02   0.08   0.02   0.24
17  131072   0.53   0.05   0.04   0.49   0.05   0.18   0.04   0.52
18  262144   1.16   0.09   0.09   1.06   0.12   0.37   0.09   1.14
19  524288   2.53   0.18   0.17   2.30   0.24   0.75   0.17   2.47
20 1048576   5.48   0.37   0.35   5.18   0.45   1.52   0.35   5.35

timsort
 i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
15   32768   0.17   0.01   0.02   0.01   0.01   0.05   0.01   0.02
16   65536   0.24   0.02   0.02   0.02   0.02   0.09   0.02   0.04
17  131072   0.54   0.05   0.04   0.05   0.05   0.19   0.04   0.09
18  262144   1.17   0.09   0.09   0.10   0.10   0.38   0.09   0.18
19  524288   2.56   0.18   0.17   0.20   0.20   0.79   0.17   0.36
20 1048576   5.54   0.37   0.35   0.37   0.41   1.62   0.35   0.73

In short, there's no real "speed argument" against this anymore (as I said
in the first msg of this thread, the ~sort regression was serious -- it's
an important case; turns out galloping is very effective at speeding it
too, provided that dumbass premature special-casing doesn't stop galloping
from trying).
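
For readers unfamiliar with the "one-pair-at-a-time versus galloping" mix
mentioned in this message, the idea can be sketched roughly as below. This
is only an illustration in Python, not the code from the patch or from
timsort.txt; the names merge, gallop and MIN_GALLOP, and the threshold
value, are assumptions made up for the sketch.

```python
from bisect import bisect_left, bisect_right

MIN_GALLOP = 7  # assumed threshold, chosen only for illustration


def gallop(key, run, start, right):
    """Return the index in run (>= start) where key would be inserted.

    Probes start, start+1, start+3, start+7, ... (exponential search),
    then finishes with a binary search inside the bracketed range.
    right=True skips items <= key; right=False skips items < key.
    """
    n = len(run)
    lo, ofs = start, 1
    while True:
        probe = start + ofs - 1
        if probe >= n:
            hi = n
            break
        item = run[probe]
        if (item <= key) if right else (item < key):
            lo = probe + 1      # everything up to probe precedes key
            ofs *= 2
        else:
            hi = probe          # insertion point is somewhere in [lo, probe]
            break
    search = bisect_right if right else bisect_left
    return search(run, key, lo, hi)


def merge(a, b):
    """Stable merge of two sorted lists; ties take the item from a first."""
    out = []
    i = j = 0
    a_wins = b_wins = 0
    while i < len(a) and j < len(b):
        if a_wins >= MIN_GALLOP:
            # a keeps winning: copy every a item <= b[j] in one step.
            k = gallop(b[j], a, i, right=True)
            out.extend(a[i:k])
            i, a_wins = k, 0
        elif b_wins >= MIN_GALLOP:
            # b keeps winning: copy every b item < a[i] in one step.
            k = gallop(a[i], b, j, right=False)
            out.extend(b[j:k])
            j, b_wins = k, 0
        elif b[j] < a[i]:
            # Plain one-pair-at-a-time step, b wins.
            out.append(b[j])
            j += 1
            b_wins, a_wins = b_wins + 1, 0
        else:
            # Plain one-pair-at-a-time step, a wins (also on ties).
            out.append(a[i])
            i += 1
            a_wins, b_wins = a_wins + 1, 0
    out.extend(a[i:])
    out.extend(b[j:])
    return out
```

When one run keeps winning, the galloping branch finds the whole stretch to
copy with a logarithmic number of comparisons; on random data the streak
counter only occasionally reaches the threshold, which matches the "blind
luck" remark above.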
From xscottg@yahoo.com Thu Jul 25 00:22:54 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Wed, 24 Jul 2002 16:22:54 -0700 (PDT) Subject: [Python-Dev] PEP 296 - The Buffer Problem In-Reply-To: Message-ID: <20020724232254.86946.qmail@web40101.mail.yahoo.com> --- Tim Peters wrote: > > > So for that platform, LONG_LONG will be a typedef for __int64 which is > > 64 bits. > > Also on Win32: LONG_LONG is a 64-bit integral type on Win32 and Win64. > Yep. I was trying to contrast that on most platforms LONG_LONG is an alias for "long long", but on Windows (32 or 64) it's going to be an __int64. > > > So even if you wanted to subscript with a long or LONG_LONG, the > > pointer could only point to something about 2 Gigs (31 bits) in size. > > That depends on how it's implemented; on a 32-bit box, supporting a > LONG_LONG subscript may require some real pain, but isn't impossible. > For > example, Python manages to support 64-bit "subscripts" to f.seek() on the > major 32-bit boxes right now. > I should have been more clear. I was referring specifically to working with pointers: datum = *(pointer + offset); or: datum = pointer[offset]; Just so there is no confusion, you aren't suggesting that the bytes PEP should provide a mechanism to support chunks of memory larger than 4 Gigs on 32 bit platforms right? I think the bytes object could be a part of the solution to that problem, at least I know how I would do that under Win32, but I'd rather not kluge up the interface to the bytes object to support it directly. Cheers, -Scott __________________________________________________ Do You Yahoo!? Yahoo! 
Health - Feel better, live better http://health.yahoo.com From guido@python.org Thu Jul 25 01:04:51 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 24 Jul 2002 20:04:51 -0400 Subject: [Python-Dev] Powerpoint slide for keynotes available Message-ID: <200207250004.g6P04pP20522@pcp02138704pcs.reston01.va.comcast.net> I've put the powerpoint slides for my keynotes at EuroPython and OSCON on the web. If someone can donate PDF that would be great (the HTML generated by Powerpoint sucks too much to be worth it IMO). http://www.python.org/doc/essays/ppt/ (scroll to end) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Thu Jul 25 05:37:00 2002 From: tim.one@comcast.net (Tim Peters) Date: Thu, 25 Jul 2002 00:37:00 -0400 Subject: [Python-Dev] PEP 296 - The Buffer Problem In-Reply-To: <20020724232254.86946.qmail@web40101.mail.yahoo.com> Message-ID: [Scott Gilbert] > ... > I should have been more clear. I was referring specifically to working > with pointers: > > datum = *(pointer + offset); > or: > datum = pointer[offset]; Na, my fault -- I fit in the email between other things, and hadn't read the whole thread up to that point. It was clear enough in context. > Just so there is no confusion, you aren't suggesting that the bytes PEP > should provide a mechanism to support chunks of memory larger than 4 Gigs > on 32 bit platforms right? It depends on how insane you are. It sure as heck doesn't *sound* like this is the bytes object's problem to solve, but then if people want their data sorted they shouldn't let it get out of order to begin with either . > I think the bytes object could be a part of the solution to that problem, > at least I know how I would do that under Win32, but I'd rather not kluge > up the interface to the bytes object to support it directly. I agree. 
From thomas.heller@ion-tof.com Thu Jul 25 08:45:44 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 25 Jul 2002 09:45:44 +0200 Subject: [Python-Dev] PEP 296 - The Buffer Problem References: Message-ID: <011201c233af$4883dd50$e000a8c0@thomasnotebook> > [Scott Gilbert] > > At the moment, the only 64 bit platform that I have easy access to is > > Tru64/Alpha. That version of malloc takes a size_t which is a 64 bit > > quantity. > > > > I believe most semi-sane platforms will use a size_t as argument for > > malloc, > [Tim] > That much is required by the C standard, so you can rely on it. > > > and I believe most semi-sane platforms will have a size_t that is > > the same number of bits as a pointer for that platform. > > The std is silent on this; it's true on 64-bit Linux and Win64, so "good > enough". > > >> Can you index these byte arrays by longs? > > > You could index it via a long, but using a LONG_LONG is safer. My > > understanding is that on Win64 a long will only be 32 bits even though > > void* is 64 bits. > > Right. So isn't the conclusion that sizeof(size_t) == sizeof(void *) on any platform, and so the index should be of type size_t instead of int, long, or LONG_LONG (aka __int64 in some places)? Thomas From thomas.heller@ion-tof.com Thu Jul 25 09:07:43 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 25 Jul 2002 10:07:43 +0200 Subject: [Python-Dev] PEP 296 - The Buffer Problem References: <20020723063611.26677.qmail@web40102.mail.yahoo.com> Message-ID: <014b01c233b2$5a9bf240$e000a8c0@thomasnotebook> What if we would 'fix' the buffer interface? 
Extend the PyBufferProcs structure by new fields:

typedef size_t (*getlargereadbufferproc)(PyObject *, void **);
typedef size_t (*getlargewritebufferproc)(PyObject *, void **);

typedef struct {
        getreadbufferproc bf_getreadbuffer;
        getwritebufferproc bf_getwritebuffer;
        getsegcountproc bf_getsegcount;
        getcharbufferproc bf_getcharbuffer;
        /* new fields */
        getlargereadbufferproc bf_getlargereadbufferproc;
        getlargewritebufferproc bf_getlargewritebufferproc;
} PyBufferProcs;

The new fields are present if the Py_TPFLAGS_HAVE_GETLARGEBUFFER flag
is set in the object's type. Py_TPFLAGS_HAVE_GETLARGEBUFFER implies the
Py_TPFLAGS_HAVE_GETCHARBUFFER flag.

These functions have the same semantics Scott describes: they must only
be implemented by types that only return addresses which remain valid as
long as the Python 'source' object is alive.

Python strings, unicode strings, mmap objects, and maybe other types
would expose the large buffer interface, but the array type would *not*.
We could also change the name from 'large buffer interface' to something
more sensible; currently I don't have a better name.

Thomas

From oren-py-d@hishome.net Thu Jul 25 11:01:58 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Thu, 25 Jul 2002 06:01:58 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <011201c233af$4883dd50$e000a8c0@thomasnotebook>
References: <011201c233af$4883dd50$e000a8c0@thomasnotebook>
Message-ID: <20020725100157.GA34465@hishome.net>

On Thu, Jul 25, 2002 at 09:45:44AM +0200, Thomas Heller wrote:
> > >> Can you index these byte arrays by longs?
> >
> > > You could index it via a long, but using a LONG_LONG is safer. My
> > > understanding is that on Win64 a long will only be 32 bits even though
> > > void* is 64 bits.
> >
> > Right.
>
> So isn't the conclusion that sizeof(size_t) == sizeof(void *) on
> any platform, and so the index should be of type size_t instead of
> int, long, or LONG_LONG (aka __int64 in some places)?

The obvious type to index byte arrays would be ptrdiff_t.

If (char*)-(char*)==ptrdiff_t then (char*)+ptrdiff_t==(char*)

    Oren

From tim@zope.com Thu Jul 25 16:23:05 2002
From: tim@zope.com (Tim Peters)
Date: Thu, 25 Jul 2002 11:23:05 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <011201c233af$4883dd50$e000a8c0@thomasnotebook>
Message-ID:

[Thomas Heller]
> So isn't the conclusion that sizeof(size_t) == sizeof(void *) on
> any platform,

Last I knew, there were dozens of platforms besides Linux and Windows.
Like I said, no relationship is defined here. C99 standardizes a
uintptr_t typedef for an unsigned integer type with "enough bits" so that

    (void*)(uintptr_t)p == p

for any legit pointer p of type void*, but only standardizes its name, not
its existence (a conforming implementation isn't required to supply a
typedef with this name). Such a type *is* required to compile Python,
though, and pyport.h defines our own Py_uintptr_t (as a synonym for the
platform uintptr_t if it exists, else to the smallest integer type it can
find that looks big enough, else a compile-time #error).

> and so the index should be of type size_t instead of
> int, long, or LONG_LONG (aka __int64 in some places)?

Try to spell out exactly what it is you think this index should be capable
of representing; e.g., what's your most extreme use case?

From tim.one@comcast.net Thu Jul 25 16:44:22 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 25 Jul 2002 11:44:22 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <20020725100157.GA34465@hishome.net>
Message-ID:

[Oren Tirosh]
> The obvious type to index byte arrays would be ptrdiff_t.
>
> If (char*)-(char*)==ptrdiff_t then (char*)+ptrdiff_t==(char*)

Alas, the standard only says that ptrdiff_t *is* the type of the result of
pointer subtraction, not that it *suffices* for that purpose; it
explicitly warns that the true result of subtracting two pointers may not
be representable in that type (in which case the behavior is undefined).
In a similar way, C says the result of adding int to int *is* int, but
doesn't guarantee the result type (int) is sufficient to represent the
true result (and, indeed, in the int case it often isn't).

It may be safer to stick with size_t, since size_t isn't as obscure
(lightly used and/or misunderstood) as ptrdiff_t.

From jeremy@alum.mit.edu Thu Jul 25 12:04:39 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jul 2002 07:04:39 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To:
References: <20020725100157.GA34465@hishome.net>
Message-ID: <15679.56135.465947.542871@slothrop.zope.com>

We could have an #if test on PTRDIFF_MIN and PTRDIFF_MAX and refuse to
compile if they don't have reasonable values.

Jeremy

From yozh@mx1.ru Thu Jul 25 17:03:38 2002
From: yozh@mx1.ru (Stepan Koltsov)
Date: Thu, 25 Jul 2002 20:03:38 +0400
Subject: [Python-Dev] PEP 295 - Interpretation of multiline string constants
Message-ID: <20020725160337.GA8999@banana.mx1.ru>

--/9DWx/yDrRhgMJTb
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline

Hi, all.

I wrote a PEP, its number is 295, it is in attachment.
It should be posted somewhere to be discussed so it is here.
Please, look at it and say what you think.

--
mailto: Stepan Koltsov

--/9DWx/yDrRhgMJTb
Content-Type: text/plain; charset=koi8-r
Content-Disposition: attachment; filename="pep-0295.txt"

PEP: 295
Title: Interpretation of multiline string constants
Version: $Revision: 1.1 $
Last-Modified: $Date: 2002/07/22 20:45:07 $
Author: yozh@mx1.ru (Stepan Koltsov)
Status: Draft
Type: Standards Track
Created: 22-Jul-2002
Python-Version: 3.0
Post-History:

Abstract

    This PEP describes an interpretation of multiline string
    constants for Python. It suggests stripping spaces after
    newlines and stripping a newline if it is the first character
    after an opening quotation.

Rationale

    This PEP proposes an interpretation of multiline string
    constants in Python. Currently, the value of a string constant
    is all the text between quotations, maybe with escape sequences
    substituted, e.g.:

        def f():
            """
            la-la-la
            limona, banana
            """

        def g():
            return "This is \
            string"

        print repr(f.__doc__)
        print repr(g())

    prints:

        '\n\tla-la-la\n\tlimona, banana\n\t'
        'This is \tstring'

    This PEP suggests two things:

     - ignore the first character after the opening quotation, if it
       is a newline

     - ignore in string constants all spaces and tabs up to the
       first non-whitespace character, but no more than the current
       indentation.

    After applying this, the previous program will print:

        'la-la-la\nlimona, banana\n'
        'This is string'

    To get this result, the previous program could be rewritten for
    current Python as (note, this gives the same result with the new
    string meaning):

        def f():
            """\
        la-la-la
        limona, banana
        """

        def g():
            return "This is \
        string"

    Or stripping can be done with library routines at runtime (as
    pydoc does), but this decreases program readability.

Implementation

    I'll say nothing about CPython, Jython or Python.NET.

    In original Python, there is no info about the current
    indentation (in spaces) at compile time, so space and tab
    stripping should be done at parse time. Currently no flags can
    be passed to the parser in program text (like from __future__
    import xxx). I suggest enabling or disabling this feature at
    Python compile time depending on the CPP flag
    Py_PARSE_MULTILINE_STRINGS.

Alternatives

    The new interpretation of string constants can be implemented
    with flags 'i' and 'o' on string constants, like

        i"""
        SELECT * FROM car
        WHERE model = 'i525'
        """ is in new style,

        o"""SELECT * FROM employee
        WHERE birth < 1982
        """ is in old style, and

        """
        SELECT employee.name, car.name, car.price
        FROM employee, car
        WHERE employee.salary * 36 > car.price
        """ is in new style after Python-x.y.z and in old style
        otherwise.

    Also this feature can be disabled if the string is raw, i.e. if
    flag 'r' is specified.

Copyright

    This document has been placed in the Public Domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:

--/9DWx/yDrRhgMJTb--

From thomas.heller@ion-tof.com Thu Jul 25 17:22:15 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 25 Jul 2002 18:22:15 +0200
Subject: [Python-Dev] PEP 296 - The Buffer Problem
References:
Message-ID: <04c501c233f7$70a0a4b0$e000a8c0@thomasnotebook>

From: "Tim Peters"
> [Thomas Heller]
> > So isn't the conclusion that sizeof(size_t) == sizeof(void *) on
> > any platform,
>
> Last I knew, there were dozens of platforms besides Linux and Windows.
> Like I said, no relationship is defined here. C99 standardizes a
> uintptr_t typedef for an unsigned integer type with "enough bits" so that
>
> (void*)(uintptr_t)p == p
>
> for any legit pointer p of type void*, but only standardizes its name, not
> its existence (a conforming implementation isn't required to supply a
> typedef with this name). Such a type *is* required to compile Python,
> though, and pyport.h defines our own Py_uintptr_t (as a synonym for the
> platform uintptr_t if it exists, else to the smallest integer type it can
> find that looks big enough, else a compile-time #error).
>
> > and so the index should be of type size_t instead of
> > int, long, or LONG_LONG (aka __int64 in some places)?
>
> Try to spell out exactly what it is you think this index should be capable
> of representing; e.g., what's your most extreme use case?
>

*I* have no use for this at the moment.
I was just trying to understand the (let's call it) large
byte-array support in Scott's proposal on 64-bit platforms,
and how to program portably on 64-bit and 32-bit platforms.

Assuming we have a large enough byte array
    unsigned char *ptr;
and want to use it in C, for example get a certain byte:

    unsigned char mybyte = ptr[my_index];

What should the type of my_index be? IIRC, Scott proposed LONG_LONG,
but wouldn't this be a pain on 32-bit platforms?

Thomas

From guido@python.org Thu Jul 25 17:32:24 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 25 Jul 2002 12:32:24 -0400
Subject: [Python-Dev] Powerpoint slide for keynotes available
References: <200207250004.g6P04pP20522@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <009b01c233f8$de6bade0$7f00a8c0@pacbell.net>

I wrote:
> If someone can donate PDF that would be great (the HTML
> generated by Powerpoint sucks too much to be worth it IMO).
>
> http://www.python.org/doc/essays/ppt/
>
> (scroll to end)

I've received about 5 offers of PDF. The first one is now on the web.
Mark Hadfield won the race. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Thu Jul 25 17:41:02 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 25 Jul 2002 12:41:02 -0400
Subject: [Python-Dev] PEP 295 - Interpretation of multiline string constants
References: <20020725160337.GA8999@banana.mx1.ru>
Message-ID: <00e301c233fa$121fa6e0$7f00a8c0@pacbell.net>

> I wrote a PEP, its number is 295, it is in attachment.
> It should be posted somewhere to be discussed so it is here.
> Please, look at it and say what you think.

This is an incompatible change. Your PEP does not address
how to deal with this at all. I will be forced to reject it unless
you come up with a transition strategy (in fact, I don't even want
to consider your proposal unless you deal with this).

>

--Guido van Rossum (home page: http://www.python.org/~guido/)

From xscottg@yahoo.com Thu Jul 25 18:00:03 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Thu, 25 Jul 2002 10:00:03 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <04c501c233f7$70a0a4b0$e000a8c0@thomasnotebook>
Message-ID: <20020725170003.93924.qmail@web40107.mail.yahoo.com>

--- Thomas Heller wrote:
>
> *I* have no use for this at the moment.
> I was just trying to understand the (let's call it) large
> byte-array support in Scott's proposal on 64-bit platforms,
> and how to program portably on 64-bit and 32-bit platforms.
>
> Assuming we have a large enough byte array
> unsigned char *ptr;
> and want to use it in C, for example get a certain byte:
>
> unsigned char mybyte = ptr[my_index];
>
> What should the type of my_index be? IIRC, Scott proposed LONG_LONG,
> but wouldn't this be a pain on 32-bit platforms?
>

Ok, now I understand where you're coming from. If nobody has an
objection or can point to a supported platform where it won't work, I'll
switch it to size_t.

From yozh@mx1.ru Thu Jul 25 18:06:40 2002
From: yozh@mx1.ru (Stepan Koltsov)
Date: Thu, 25 Jul 2002 21:06:40 +0400
Subject: [Python-Dev] PEP 295 - Interpretation of multiline string constants
In-Reply-To: <00e301c233fa$121fa6e0$7f00a8c0@pacbell.net>
References: <20020725160337.GA8999@banana.mx1.ru> <00e301c233fa$121fa6e0$7f00a8c0@pacbell.net>
Message-ID: <20020725170640.GA10350@banana.mx1.ru>

On Thu, Jul 25, 2002 at 12:41:02PM -0400, Guido van Rossum wrote:
> > I wrote a PEP, its number is 295, it is in attachment.
> > It should be posted somewhere to be discussed so it is here. > > Please, look at it and say what you think. > > This is an incompatible change. Your PEP does not address > how to deal with this at all. I will be forced to reject it unless > you come up with a transition strategy (in fact, I don't even want > to consider your proposal unless you deal with this). For most strings this change will not change program result (for example number of spaces doesn't matter in SQL queries). For others I suggested (in section 'Alternatives') flags 'i' and 'o' for string constants. -- mailto: Stepan Koltsov From fredrik@pythonware.com Thu Jul 25 18:27:07 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 25 Jul 2002 19:27:07 +0200 Subject: [Python-Dev] PEP 295 - Interpretation of multiline string constants References: <20020725160337.GA8999@banana.mx1.ru> <00e301c233fa$121fa6e0$7f00a8c0@pacbell.net> <20020725170640.GA10350@banana.mx1.ru> Message-ID: <009f01c23400$8259c200$ced241d5@hagrid> Stepan Koltsov wrote: > > This is an incompatible change. Your PEP does not address > > how to deal with this at all. I will be forced to reject it unless > > you come up with a transition strategy (in fact, I don't even want > > to consider your proposal unless you deal with this). > > For most strings this change will not change program result and how on earth do you know that? > (for example number of spaces doesn't matter in SQL queries). so why do all your examples use SQL queries? > For others I suggested (in section 'Alternatives') flags 'i' and 'o' > for string constants. if you want to interpret multiline strings in a different way, why cannot you just do like everyone else, and use a function? mystring = SQL(""" blablabla """) (as a bonus, that approach makes it trivial to embed files, images, xml structures, etc...) a big -1 from here. 
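[Editor's sketch] A minimal illustration of the function-based approach suggested in the message above. The helper name SQL comes from the example call; its body here is an assumption, not code from the thread. It relies on textwrap.dedent from the current standard library to strip the indentation shared by all lines, so the literal can stay indented with the surrounding code.

```python
import textwrap


def SQL(text):
    """Hypothetical helper: strip common indentation and outer blank lines."""
    return textwrap.dedent(text).strip()


def build_query():
    # The literal stays indented along with the surrounding code.
    return SQL("""
        SELECT *
        FROM car
        WHERE model = 'i525'
    """)
```

Calling build_query() yields the three query lines with no leading spaces, which is the effect the PEP asks for, obtained without any change to how string literals are parsed.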
From guido@python.org Thu Jul 25 18:51:01 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 25 Jul 2002 13:51:01 -0400 Subject: [Python-Dev] PEP 295 - Interpretation of multiline string constants References: <20020725160337.GA8999@banana.mx1.ru> <00e301c233fa$121fa6e0$7f00a8c0@pacbell.net> <20020725170640.GA10350@banana.mx1.ru> Message-ID: <004a01c23403$d9494880$7f00a8c0@pacbell.net> > > > I wrote a PEP, its number is 295, it is in attachment. > > > It should be posted somewhere to be discussed so it is here. > > > Please, look at it and say what you think. > > > > This is an incompatible change. Your PEP does not address > > how to deal with this at all. I will be forced to reject it unless > > you come up with a transition strategy (in fact, I don't even want > > to consider your proposal unless you deal with this). > > For most strings this change will not change program result (for > example number of spaces doesn't matter in SQL queries). For others > I suggested (in section 'Alternatives') flags 'i' and 'o' for string > constants. You are proposing a language change. Because of the grave consequences of such changes you have to explain why you cannot obtain the desired results with the existing language. You have completely failed to provide a motivation for your PEP so far. If you want your PEP to be considered you must provide a motivation first. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From yozh@mx1.ru Thu Jul 25 18:55:28 2002 From: yozh@mx1.ru (Stepan Koltsov) Date: Thu, 25 Jul 2002 21:55:28 +0400 Subject: [Python-Dev] PEP 295 - Interpretation of multiline string constants In-Reply-To: <009f01c23400$8259c200$ced241d5@hagrid> References: <20020725160337.GA8999@banana.mx1.ru> <00e301c233fa$121fa6e0$7f00a8c0@pacbell.net> <20020725170640.GA10350@banana.mx1.ru> <009f01c23400$8259c200$ced241d5@hagrid> Message-ID: <20020725175528.GA11100@banana.mx1.ru> On Thu, Jul 25, 2002 at 07:27:07PM +0200, Fredrik Lundh wrote: > > > This is an incompatible change. Your PEP does not address > > > how to deal with this at all. I will be forced to reject it unless > > > you come up with a transition strategy (in fact, I don't even want > > > to consider your proposal unless you deal with this). > > > > For most strings this change will not change program result > > and how on earth do you know that? I've seen output of `grep -rwC '"""' Python/Lib/` and `egrep -rwC '= *"""' Python/Lib/`. Most strings are docstrings ;-) > > (for example number of spaces doesn't matter in SQL queries). > > so why do all your examples use SQL queries? Because I saw this defect of Python first when I wrote SQL queries. f(): q = """my query""" % vars if debug: print q # looks bad > > For others I suggested (in section 'Alternatives') flags 'i' and 'o' > > for string constants. > > if you want to interpret multiline strings in a different way, why > cannot you just do like everyone else, and use a function? > > mystring = SQL(""" > blablabla > """) Functions don't know current indentation. > (as a bonus, that approach makes it trivial to embed files, images, > xml structures, etc...) > > a big -1 from here. 
:-(

--
mailto: Stepan Koltsov

From mcherm@destiny.com Thu Jul 25 19:01:42 2002
From: mcherm@destiny.com (Michael Chermside)
Date: Thu, 25 Jul 2002 14:01:42 -0400
Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline string constants
Message-ID: <3D403D06.6010802@destiny.com>

Stepan Koltsov writes:
> I wrote a PEP, its number is 295, it is in attachment.
[... PEP on stripping newline and preceding spaces from multi-line string literals ...]

I see two motivations for the proposals in this PEP, and propose
alternative solutions for each. NONE of my alternative solutions
requires ANY modification to the Python language.

--------
Motivation 1 -- Lining up line 1 of multi-line quotes:

Scenario:
   - Use of string with things "lined up" neatly

>>> def someFunction():
...     aMultiLineString = """Foo   X    1.0
... Bar   Y    2.5
... Baz   Z   15.0
... Spam  Q   38.9
... """

Notice how line 1 doesn't line up neatly with lines 2-4 because of
the indenting as well as the text assigning it to a variable. This
is annoying, and makes it awkward to read.

Solution:
   - Use a backslash to escape an initial newline

>>> def someFunction():
...     aMultiLineString = """\
... Foo   X    1.0
... Bar   Y    2.5
... Baz   Z   15.0
... Spam  Q   38.9
... """

Notice that now everything lines up neatly. And we don't need to
modify Python at all for this to work.

--------
Motivation 2 -- Maintaining Indentation

Scenario:
   - Outdenting misleads the eye

>>> class SomeClass:
...     def visitFromWaiter(self):
...         if self.seated:
...             self.silverware = ['fork','spoon']
...             self.menu = """Spam
... Spam and Eggs
... Spam on Rye
... """
...             self.napkin = DirtyNapkin()

Notice how the indentation makes it quite clear when we are inside
a class, a method, or a flow-control statement by merely watching
the left-hand margin. But this is crudely interrupted by the
multi-line string.

Solution:
   - Process the multi-line string through a function

>>> class SomeClass:
...     def visitFromWaiter(self):
...         if self.seated:
...             self.silverware = ['fork','spoon']
...             self.menu = stripIndent( """\
...                 Spam
...                 Spam and Eggs
...                 Spam on Rye
...                 """ )
...             self.napkin = DirtyNapkin()

where stripIndent() has been defined as:

>>> def stripIndent( s ):
...     indent = len(s) - len(s.lstrip())
...     sLines = s.split('\n')
...     resultLines = [ line[indent:] for line in sLines ]
...     return '\n'.join( resultLines )

Notice how it is now NICELY indented, at the expense of a tiny
little 4-line function. Of course, there are faster and safer ways
to write stripIndent() (I, personally, would use a version that
checked that each line started with identical indentation and
raised an exception otherwise), but this version illustrates the
idea while being very, very readable.

----

In conclusion, I propose you use simpler methods available WITHIN the
language for solving this problem, rather than proposing a PEP to modify
the language itself.

-- Michael Chermside

From xscottg@yahoo.com Thu Jul 25 18:59:50 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Thu, 25 Jul 2002 10:59:50 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <014b01c233b2$5a9bf240$e000a8c0@thomasnotebook>
Message-ID: <20020725175950.8766.qmail@web40103.mail.yahoo.com>

--- Thomas Heller wrote:
> What if we would 'fix' the buffer interface?
>

This gets us part of the way there, but still has shortcomings. For one,
I, and people more significant than me, would still need a type that
implemented the bytes object behavior. Everything but efficient pickling
_could_ be done with third party extensions, but ignoring pickling (which I
don't want to do), then we'd still have several significant third parties
reinventing the same wheel. To me at least, this feels like a battery that
should be included.

> Extend the PyBufferProcs structure by new fields:
>
> typedef size_t (*getlargereadbufferproc)(PyObject *, void **);
> typedef size_t (*getlargewritebufferproc)(PyObject *, void **);
>

How would you designate failure/exceptions?
size_t is unsigned everywhere I can find it, so it can't return a negative number on failure. I guess the void** could be filled in with NULL. > > typedef struct { > getreadbufferproc bf_getreadbuffer; > getwritebufferproc bf_getwritebuffer; > getsegcountproc bf_getsegcount; > getcharbufferproc bf_getcharbuffer; > /* new fields */ > getlargereadbufferproc bf_getlargereadbufferproc; > getlargewritebufferproc bf_getlargewritebufferproc; > } PyBufferProcs; > > > The new fields are present if the Py_TPFLAGS_HAVE_GETLARGEBUFFER flag > is set in the object's type. Py_TPFLAGS_HAVE_GETLARGEBUFFER implies > the Py_TPFLAGS_HAVE_GETCHARBUFFER flag. > > These functions have the same semantics Scott describes: they must > only be implemented by types only return addresses which are valid as > long as the Python 'source' object is alive. > > Python strings, unicode strings, mmap objects, and maybe other types > would expose the large buffer interface, but the array type would > *not*. We could also change the name from 'large buffer interface' > to something more sensible, currently I don't have a better name. > I've been trying to keep the proposal as unintrusive as possible while still implementing the functionality needed. Adding more flags/members to PyObjects and modifying string, unicode, mmap, ... feels like a more intrusive change to me. I'm open to the idea, but I'm not ready to retract the current proposal. Then there is still the problem of needing something like a bytes object as mentioned above. Cheers, -Scott __________________________________________________ Do You Yahoo!? Yahoo! 
Health - Feel better, live better http://health.yahoo.com From thomas.heller@ion-tof.com Thu Jul 25 20:47:56 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 25 Jul 2002 21:47:56 +0200 Subject: [Python-Dev] PEP 296 - The Buffer Problem References: <20020725175950.8766.qmail@web40103.mail.yahoo.com> Message-ID: <05d501c23414$2c15c650$e000a8c0@thomasnotebook> From: "Scott Gilbert" > --- Thomas Heller wrote: > > What if we would 'fix' the buffer interface? > > > > This gets us part of the way there, but still has shortcomings. For one I, > and people more significant than me, would still need a type that > implemented the bytes object behavior. Sure, the extension of the buffer interface is only part of the picture. The bytes type is still needed as well. The extension I proposed is motivated by these thoughts: It would enable some of Python's builtin objects to expose the interface extension by supplying two trivial functions for each in the extended tp_as_buffer slot. The new functions expose a 'safe buffer interface', where there are guarantees about the lifetime of the pointer. So your bytes object can be a view of these builtin objects as well. It dismisses the segment count of the normal buffer interface. > Everything but efficient pickling > _could_ be done with third party extensions, but ignoring pickling (which I > don't want to do), then we'd still have several significant third parties > reinventing the same wheel. To me at least, this feels like a battery that > should be included. > I don't think my proposal prevents this. > > > Extend the PyBufferProcs structure by new fields: > > > > typedef size_t (*getlargereadbufferproc)(PyObject *, void **); > > typedef size_t (*getlargewritebufferproc)(PyObject *, void **); > > > > How would you designate failure/exceptions? size_t is unsigned everywhere > I can find it, so it can't return a negative number on failure. I guess > the void** could be filled in with NULL. 
>

Details, not yet fleshed out completely.
Store NULL in the void **, use ptrdiff_t instead of size_t, or
something else. Or return ((size_t)-1) on failure.
Or return -1 on failure, and fill out a size_t pointer:

    typedef int (*getlargereadwritebufferproc)(PyObject *, size_t *, void **);

> > Python strings, unicode strings, mmap objects, and maybe other types
> > would expose the large buffer interface, but the array type would
> > *not*. We could also change the name from 'large buffer interface'
> > to something more sensible, currently I don't have a better name.

Maybe it should be renamed 'safe buffer interface extension'
instead of 'large buffer interface' (it could be large as well)?

>
> I've been trying to keep the proposal as unintrusive as possible while
> still implementing the functionality needed. Adding more flags/members to
> PyObjects and modifying string, unicode, mmap, ... feels like a more
> intrusive change to me. I'm open to the idea, but I'm not ready to retract
> the current proposal. Then there is still the problem of needing something
> like a bytes object as mentioned above.

The advantage (IMO) is that it defines a new protocol to get the
pointer to the internal byte array on objects instead of
requiring that these objects are instances of a special type
or subtype thereof.

>
> __________________________________________________
> Do You Yahoo!?

No, I google. ;-)

Thomas

From tim.one@comcast.net Thu Jul 25 20:51:44 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 25 Jul 2002 15:51:44 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <20020725175950.8766.qmail@web40103.mail.yahoo.com>
Message-ID:

[Scott Gilbert]
> ...
> How would you designate failure/exceptions? size_t is unsigned everywhere
> I can find it,

Right, and the std requires that size_t resolve to an unsigned type, so
that's reliable.

> so it can't return a negative number on failure.

The usual dodge is to return (and test against) (size_t)-1 in that case.
If the caller sees that the result is (size_t)-1, then it also needs to call
PyErr_Occurred() to see whether it's a normal, or error, return value (and
if it is an error case, the routine had to have set a Python exception, so
that PyErr_Occurred() returns true then).

> I guess the void** could be filled in with NULL.

Sounds easier to me.

From tdelaney@avaya.com Thu Jul 25 23:27:23 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Fri, 26 Jul 2002 08:27:23 +1000
Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline string constants
Message-ID:

> From: Michael Chermside [mailto:mcherm@destiny.com]
>
> In conclusion, I propose you use simpler methods available WITHIN the
> language for solving this problem, rather than proposing a
> PEP to modify
> the language itself.

In fact, the simplest mechanism is to declare all multi-line string literals
at module scope. Presumably all such literals are supposed to be constants
(docstrings are a special exception, but there are already rules for those
in terms of how they should be displayed).

This is a highly incompatible change with very high risk of breaking code.
This is not a -1 or some such - this is a "cannot even be considered unless
you can make it backwards compatible with all uses of multiline strings"
which is of course impossible (since the whole purpose of the PEP is to
modify such strings).

When I first read this PEP I thought it was something that had been
suggested to someone, and it was being proposed in order to be rejected.
It's obvious from later posts that that is not the case, and Stepan is
having trouble understanding why such a PEP would be rejected out of hand.

You might find support for a library function which performed the
transformation that you desire (if there's a good enough use case for it).
Personally, I don't think there is - too many times that one particular
transformation will be "almost, but not quite what I want" in which case I
need to roll my own anyway.
Tim Delaney From ping@zesty.ca Thu Jul 25 22:03:51 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Thu, 25 Jul 2002 14:03:51 -0700 (PDT) Subject: [Python-Dev] Sorting In-Reply-To: Message-ID: On Wed, 24 Jul 2002, Tim Peters wrote: > In short, there's no real "speed argument" against this anymore (as I said > in the first msg of this thread, the ~sort regression was serious -- it's an > important case; turns out galloping is very effective at speeding it too, > provided that dumbass premature special-casing doesn't stop galloping from > trying ). This is fantastic work, Tim. I'm all for switching over to timsort as the one standard sort method. -- ?!ng "Most things are, in fact, slippery slopes. And if you start backing off from one thing because it's a slippery slope, who knows where you'll stop?" -- Sean M. Burke From python-dev@zesty.ca Thu Jul 25 23:33:38 2002 From: python-dev@zesty.ca (Ka-Ping Yee) Date: Thu, 25 Jul 2002 15:33:38 -0700 (PDT) Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline string constants In-Reply-To: Message-ID: On Fri, 26 Jul 2002, Delaney, Timothy wrote: > You might find support for a library function which performed the > transformation that you desire (if there's a good enough use case for it). inspect.getdoc(object) provides this, for docstrings. There's no function in the library to do this in general to any string, though. -- ?!ng "Mathematics isn't about what's true. It's about what can be concluded from what." From tim.one@comcast.net Fri Jul 26 02:05:54 2002 From: tim.one@comcast.net (Tim Peters) Date: Thu, 25 Jul 2002 21:05:54 -0400 Subject: [Python-Dev] Sorting In-Reply-To: Message-ID: [Tim] > ... > There's also a significant systematic regression in timsort's +sort case, > ... also a mix of small regressions and speedups in 3sort. > These are because, to simplify experimenting, ...(and as many as > N-1 temp pointers can be needed, up from N/2). That's all repairable, > it's just a PITA to do it. 
It's repaired, and those glitches went away:

> timsort
>  i     2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
> 15    32768   0.17   0.01   0.02   0.01   0.01   0.05   0.01   0.02
> 16    65536   0.24   0.02   0.02   0.02   0.02   0.09   0.02   0.04
> 17   131072   0.54   0.05   0.04   0.05   0.05   0.19   0.04   0.09
> 18   262144   1.17   0.09   0.09   0.10   0.10   0.38   0.09   0.18
> 19   524288   2.56   0.18   0.17   0.20   0.20   0.79   0.17   0.36
> 20  1048576   5.54   0.37   0.35   0.37   0.41   1.62   0.35   0.73

Now at

  15    32768   0.17   0.01   0.01   0.01   0.02   0.09   0.01   0.03
  16    65536   0.24   0.02   0.02   0.02   0.02   0.09   0.02   0.04
  17   131072   0.53   0.05   0.04   0.05   0.05   0.18   0.04   0.09
  18   262144   1.17   0.09   0.09   0.10   0.09   0.38   0.09   0.18
  19   524288   2.56   0.18   0.18   0.19   0.19   0.78   0.17   0.36
  20  1048576   5.53   0.37   0.35   0.36   0.37   1.60   0.35   0.74

In other news, an elf revealed that Perl is moving to an adaptive stable
mergesort(!!!harmonic convergence!!!), and sent some cleaned-up source code.
The comments reference a non-existent paper, but if I change the title and
the year I find it here:

    Optimistic sorting and information theoretic complexity.
    Peter McIlroy.
    SODA (Fourth Annual ACM-SIAM Symposium on Discrete Algorithms),
    pp 467-474, Austin, Texas, 25-27 January 1993.

Jeremy got that for me, and it's an extremely relevant paper. What I've
been calling galloping he called "exponential search", and the paper has
some great analysis, pretty much thoroughly characterizing the set of
permutations for which this kind of approach is helpful, and even optimal.
It's a large set. Amazingly, citeseer finds only one reference to this
paper, also from 1993, and despite all the work done on adaptive sorting
since then. So it's either essentially unknown in the research community,
was shot full of holes (but then people would have delighted in citing it
just to rub that in <0.5 wink>), or was quickly superseded by a better
result (but then ditto!).

I'll leave that a mystery. I haven't had time yet to study the Perl code.
The timsort algorithm is clearly more frugal with memory: worst-case N/2 temp pointers needed, and, e.g., in +sort it only needs (at most) 10 temp pointers (independent of N). That may or may not be good, though, depending on whether the Perl algorithm makes more effective use of the memory hierarchy; offhand I don't think it does. OTOH, timsort has 4 flavors of galloping and 2 flavors of binary search and 2 merge routines, because the memory-saving gimmick can require merging "from the left" or "from the right", depending on which run is smaller. Doubling the number of helper routines is what "PITA" meant in the quote at the start . One more bit of news: cross-box performance of this stuff is baffling. Nobody else has tried timsort yet (unless someone who asked for the code tried an earlier version), but there are Many Mysteries just looking at the numbers for /sort under current CVS Python. Recall that /sort is the case where the data is already sorted: it does N-1 compares in one scan, and that's all. For an array with 2**20 distinct floats that takes 0.35 seconds on my Win98SE 866MHz Pentium box, compiled w/ MSVC6. On my Win2K 866MHz Pentium box, compiled w/ MSVC6, it takes 0.58(!) seconds, and indeed all the sort tests take incredibly much longer on the Win2K box. On Fred's faster Pentium box (I forget exactly how fast, >900MHz and <1GHz), using gcc, the sort tests take a lot less time than on my Win2K box, but my Win98SE box is still faster. Another Mystery (still with the current samplesort): on Win98SE, !sort is always a bit faster than *sort. On Win2K and on Fred's box, it's always a bit slower. I'm leaving that a mystery too. I haven't tried timsort on another box yet, and given that my home machine may be supernaturally fast, I'm never going to . 
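[Editor's sketch] The "galloping" (exponential search) idea named in the message above can be illustrated in a few lines of Python. This is a simplified helper written for illustration (gallop_left is a name chosen here), not the C code in timsort or the Perl implementation: it probes at offsets 1, 3, 7, ... until it reaches an element that is not smaller than the key, then finishes with a binary search over the last gap.

```python
from bisect import bisect_left


def gallop_left(key, a, lo=0):
    """Return the leftmost index i >= lo with a[i] >= key.

    Assumes a[lo:] is sorted.  Returns len(a) if every element is
    smaller than key.
    """
    n = len(a)
    if lo >= n or a[lo] >= key:
        return lo
    # Invariant: a[lo + last] < key.
    last, ofs = 0, 1
    while lo + ofs < n and a[lo + ofs] < key:
        last = ofs
        ofs = ofs * 2 + 1          # offsets 1, 3, 7, 15, ...
    hi = min(lo + ofs, n)
    # The answer lies in (lo + last, hi]; finish with a binary search.
    return bisect_left(a, key, lo + last + 1, hi)
```

When the key belongs near the start of the run, the search costs only a handful of comparisons, and it never costs much more than a plain binary search; that is the property that makes the approach attractive when merging runs of very different lengths.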
From xscottg@yahoo.com Fri Jul 26 02:33:30 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Thu, 25 Jul 2002 18:33:30 -0700 (PDT) Subject: [Python-Dev] PEP 296 - The Buffer Problem In-Reply-To: <05d501c23414$2c15c650$e000a8c0@thomasnotebook> Message-ID: <20020726013330.31053.qmail@web40111.mail.yahoo.com> --- Thomas Heller wrote: > From: "Scott Gilbert" > > --- Thomas Heller wrote: > > > What if we would 'fix' the buffer interface? > > > > > For one I, and people more significant than me, would still need a > > type that implemented the bytes object behavior. > > Sure, the extension of the buffer interface is only part of the > picture. The bytes type is still needed as well. > > The extension I proposed is motivated by these thoughts: > > It would enable some of Python's builtin objects to > expose the interface extension by supplying two > trivial functions for each in the extended tp_as_buffer slot. > > The new functions expose a 'safe buffer interface', where > there are guarantees about the lifetime of the pointer. So > your bytes object can be a view of these builtin objects > as well. > > It dismisses the segment count of the normal buffer interface. > [...] > > > > I've been trying to keep the proposal as unintrusive as possible while > > still implementing the functionality needed. Adding more flags/members > > to PyObjects and modifying string, unicode, mmap, ... feels like a more > > intrusive change to me. I'm open to the idea, but I'm not ready to > > retract the current proposal. Then there is still the problem of > > needing something like a bytes object as mentioned above. > > The advantage (IMO) is that it defines a new protocol to get the > pointer to the internal byte array on objects instead of > requiring that these objects are instances of a special type > or subtype thereof. > I like your idea for adding the flags and methods to create a "safe buffer interface". 
As you note, string, unicode, mmap, and possibly other things could
implement these methods and return a (possibly large) pointer that could be
manipulated after the GIL is released. Of course the pickleable bytes
object falls into that category too.

It seems to me that we have two independent proposals. Do you see any
reason why they shouldn't be two separate PEPs? I don't see any reason to
piggyback them into one. They're related in topic, but neither seems to
rely on the other in any way.

__________________________________________________
Do You Yahoo!?
Yahoo! Health - Feel better, live better
http://health.yahoo.com

From guido@python.org Fri Jul 26 04:16:36 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 25 Jul 2002 23:16:36 -0400
Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline stringconstants
References:
Message-ID: <00a901c23452$daae16c0$7f00a8c0@pacbell.net>

My mails to Stepan Koltsov have been bouncing (after the first one
apparently went through). Assuming he's not subscribed to python-dev,
he may not be aware of our responses. What to do? Simply reject it
in absentia?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From Rick Farrer"
This is a multi-part message in MIME format.

------=_NextPart_000_0007_01C2342A.123F3C00
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Please remove me from the mailing list.

rf@avisionone.com

Thanks,
Rick

------=_NextPart_000_0007_01C2342A.123F3C00
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
------=_NextPart_000_0007_01C2342A.123F3C00-- From cce@clarkevans.com Fri Jul 26 05:37:10 2002 From: cce@clarkevans.com (Clark C . Evans) Date: Fri, 26 Jul 2002 00:37:10 -0400 Subject: [Python-Dev] PEP 296 - The Buffer Problem In-Reply-To: <20020723063611.26677.qmail@web40102.mail.yahoo.com>; from xscottg@yahoo.com on Mon, Jul 22, 2002 at 11:36:11PM -0700 References: <20020723063611.26677.qmail@web40102.mail.yahoo.com> Message-ID: <20020726003709.C17944@doublegemini.com> | Abstract | | This PEP proposes the creation of a new standard type and builtin | constructor called 'bytes'. The bytes object is an efficiently | stored array of bytes with some additional characteristics that | set it apart from several implementations that are similar. This is great. Python currently lacks two "standard" programming objects which most languages have: (a) timestamp, and (b) binary. This addresses the second. This will greatly help YAML data interoperability among other programming languages such as Java, Ruby, etc. Best, Clark Yo! Check out YAML Serialization for the masses! http://yaml.org -- Clark C. Evans Axista, Inc. http://www.axista.com 800.926.5525 XCOLLA Collaborative Project Management Software From sholden@holdenweb.com Fri Jul 26 14:15:33 2002 From: sholden@holdenweb.com (Steve Holden) Date: Fri, 26 Jul 2002 09:15:33 -0400 Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline stringconstants References: <00a901c23452$daae16c0$7f00a8c0@pacbell.net> Message-ID: <127c01c234a6$867957f0$6300000a@holdenweb.com> ----- Original Message ----- From: "Guido van Rossum" To: Sent: Thursday, July 25, 2002 11:16 PM Subject: Re: [Python-Dev] Re: PEP 295 - Interpretation of multiline stringconstants > My mails to Stepan Koltsov have been bouncing (after the first one > apparently went through). Assuming he's not subscribed to python-dev, > he may not be aware of our responses. What to do? Simple reject it > in absentia? 
> Well, at least that way he'll see it's been rejected from the PEP listing. You can always direct him to the Mailman archives when his mail comes back on line. regards ----------------------------------------------------------------------- Steve Holden http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/pwp/ ----------------------------------------------------------------------- From mal@lemburg.com Fri Jul 26 08:35:07 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 26 Jul 2002 09:35:07 +0200 Subject: [Python-Dev] Sorting References: Message-ID: <3D40FBAB.8090909@lemburg.com> Tim Peters wrote: > One more bit of news: cross-box performance of this stuff is baffling. > Nobody else has tried timsort yet (unless someone who asked for the code > tried an earlier version), but there are Many Mysteries just looking at the > numbers for /sort under current CVS Python. Recall that /sort is the case > where the data is already sorted: it does N-1 compares in one scan, and > that's all. For an array with 2**20 distinct floats that takes 0.35 seconds > on my Win98SE 866MHz Pentium box, compiled w/ MSVC6. On my Win2K 866MHz > Pentium box, compiled w/ MSVC6, it takes 0.58(!) seconds, and indeed all the > sort tests take incredibly much longer on the Win2K box. On Fred's faster > Pentium box (I forget exactly how fast, >900MHz and <1GHz), using gcc, the > sort tests take a lot less time than on my Win2K box, but my Win98SE box is > still faster. > > Another Mystery (still with the current samplesort): on Win98SE, !sort is > always a bit faster than *sort. On Win2K and on Fred's box, it's always a > bit slower. I'm leaving that a mystery too. I haven't tried timsort on > another box yet, and given that my home machine may be supernaturally fast, > I'm never going to . I can give it a go on my AMD boxes if you send me the code. 
They tend to show surprising results as you know :-)

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From thomas.heller@ion-tof.com Fri Jul 26 15:28:50 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 26 Jul 2002 16:28:50 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
Message-ID: <082b01c234b0$c33564e0$e000a8c0@thomasnotebook>

Here is the draft PEP for the ideas posted here.

Regards,

Thomas

--------

PEP: xxx
Title: The Safe Buffer Interface
Version: $Revision: $
Last-Modified: $Date: 2002/07/26 14:19:38 $
Author: theller@python.net (Thomas Heller)
Status: Draft
Type: Standards Track
Created: 26-Jul-2002
Python-Version: 2.3
Post-History: 26-Jul-2002

Abstract

    This PEP proposes an extension to the buffer interface called the
    'safe buffer interface'.

    The safe buffer interface fixes the flaws of the 'old' buffer
    interface as defined in Python versions up to and including 2.2:

        The lifetime of the retrieved pointer is clearly defined.

        The buffer size is returned as a 'size_t' data type, which
        allows access to 'large' buffers on platforms where
        sizeof(int) != sizeof(void *).

Specification

    The 'safe' buffer interface exposes new functions which return the
    size and the pointer to the internal memory block of any Python
    object which chooses to implement this interface.

    The size and pointer returned must be valid as long as the object
    is alive (has a positive reference count). So, only objects which
    never reallocate or resize the memory block are allowed to
    implement this interface.

    The safe buffer interface omits the memory segment model which is
    present in the old buffer interface - only a single memory block
    can be exposed.
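[Editor's sketch] The lifetime rule in the Specification above (a pointer handed out must not be invalidated by a later resize) can be observed from Python code with the buffer protocol in current Python versions. That protocol is a later design, not the interface proposed in this draft, and the function name below is chosen here for illustration: while a memoryview holds an export of a bytearray, the bytearray refuses to resize.

```python
def try_resize_while_exported():
    """Show that a bytearray cannot be resized while its buffer is exported."""
    data = bytearray(b"spam")
    view = memoryview(data)          # exports a pointer to data's memory block
    try:
        data.extend(b" and eggs")    # would have to reallocate the block
    except BufferError:
        resized = False
    else:
        resized = True
    view.release()                   # drop the export
    data.extend(b" and eggs")        # resizing is allowed again
    return resized, bytes(data)
```

The draft takes a different route to the same guarantee: instead of blocking resizes while a pointer is outstanding, it restricts the interface to objects that never reallocate their memory block at all.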
Implementation Define a new flag in Include/object.h: #define Py_TPFLAGS_HAVE_GETSAFEBUFFER /* PyBufferProcs contains bf_getsafereadbuffer and bf_getsafewritebuffer */ #define Py_TPFLAGS_HAVE_GETSAFEBUFFER (1L<<15) This flag would be included in Py_TPFLAGS_DEFAULT: #define Py_TPFLAGS_DEFAULT ( \ .... Py_TPFLAGS_HAVE_GETCHARBUFFER | \ .... 0) Extend the PyBufferProcs structure by new fields in Include/object.h: typedef size_t (*getlargereadbufferproc)(PyObject *, void **); typedef size_t (*getlargewritebufferproc)(PyObject *, void **); typedef struct { getreadbufferproc bf_getreadbuffer; getwritebufferproc bf_getwritebuffer; getsegcountproc bf_getsegcount; getcharbufferproc bf_getcharbuffer; /* safe buffer interface functions */ getsafereadbufferproc bf_getsafereadbufferproc; getsafewritebufferproc bf_getsafewritebufferproc; } PyBufferProcs; The new fields are present if the Py_TPFLAGS_HAVE_GETLARGEBUFFER flag is set in the object's type. XXX Py_TPFLAGS_HAVE_GETLARGEBUFFER implies the Py_TPFLAGS_HAVE_GETCHARBUFFER flag. The getsafereadbufferproc and getsafewritebufferproc functions return the size in bytes of the memory block on success, and fill in the passed void * pointer on success. If these functions fail - either because an error occurs or no memory block is exposed - they must set the void * pointer to NULL and raise an exception. The return value is undefined in these cases and should not be used. Backward Compatibility There are no backward compatibility problems. Reference Implementation Will be uploaded to the sourceforge patch manager by the author. Additional Notes/Comments It may be a good idea to expose the following convenience functions: int PyObject_AsSafeReadBuffer(PyObject *obj, void **buffer, size_t *buffer_len); int PyObject_AsSafeWriteBuffer(PyObject *obj, void **buffer, size_t *buffer_len); These functions return 0 on success, set buffer to the memory location and buffer_len to the length of the memory block in bytes. 
On failure, they return -1 and set an exception.

    Python strings, unicode strings, mmap objects, and maybe other types would expose the safe buffer interface, but the array type would *not*, because its memory block may be reallocated during its lifetime.

References

    [1] The buffer interface
        http://mail.python.org/pipermail/python-dev/2000-October/009974.html

    [2] The Buffer Problem
        http://www.python.org/peps/pep-0296.html

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:

From ville.vainio@swisslog.com Fri Jul 26 08:11:41 2002
From: ville.vainio@swisslog.com (Ville Vainio)
Date: Fri, 26 Jul 2002 10:11:41 +0300
Subject: [Python-Dev] Multiline string constants, include in the standard library?
References: <20020725194802.22949.82629.Mailman@mail.python.org>
Message-ID: <3D40F62D.7000106@swisslog.com>

> where stripIndent() has been defined as:
>
> >>> def stripIndent( s ):
> ...     indent = len(s) - len(s.lstrip())
> ...     sLines = s.split('\n')
> ...     resultLines = [ line[indent:] for line in sLines ]
> ...     return ''.join( resultLines )

Something like this should really be available somewhere in the standard library (string module [yeah, predeprecation, I know], string method). Everybody needs this kind of functionality, and probably more often than many of the other string methods (title, swapcase come to mind).

-- Ville

From xscottg@yahoo.com Fri Jul 26 16:01:09 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Fri, 26 Jul 2002 08:01:09 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <082b01c234b0$c33564e0$e000a8c0@thomasnotebook>
Message-ID: <20020726150109.4104.qmail@web40111.mail.yahoo.com>

--- Thomas Heller wrote:
> Here is the draft PEP for the ideas posted here.
> [...]

I like it.
:-) > > typedef size_t (*getlargereadbufferproc)(PyObject *, void **); > typedef size_t (*getlargewritebufferproc)(PyObject *, void **); > I'm sure this is a cut-and-pasto for typedef size_t (*getsafereadbufferproc)(PyObject *, void **); typedef size_t (*getsafewritebufferproc)(PyObject *, void **); __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com From thomas.heller@ion-tof.com Fri Jul 26 16:06:55 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 26 Jul 2002 17:06:55 +0200 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020726150109.4104.qmail@web40111.mail.yahoo.com> Message-ID: <089d01c234b6$15385220$e000a8c0@thomasnotebook> From: "Scott Gilbert" > > Here is the draft PEP for the ideas posted here. > > > [...] > > I like it. :-) :-) > > > typedef size_t (*getlargereadbufferproc)(PyObject *, void **); > > typedef size_t (*getlargewritebufferproc)(PyObject *, void **); > > I'm sure this is a cut-and-pasto for > > typedef size_t (*getsafereadbufferproc)(PyObject *, void **); > typedef size_t (*getsafewritebufferproc)(PyObject *, void **); > Exactly. Everything is named safebuffer instead of largebuffer. Thanks, Thomas From mwh@python.net Fri Jul 26 10:44:45 2002 From: mwh@python.net (Michael Hudson) Date: 26 Jul 2002 10:44:45 +0100 Subject: [Python-Dev] Sorting In-Reply-To: Tim Peters's message of "Thu, 25 Jul 2002 21:05:54 -0400" References: Message-ID: <2meldq3jsi.fsf@starship.python.net> Tim Peters writes: > One more bit of news: cross-box performance of this stuff is baffling. > Nobody else has tried timsort yet (unless someone who asked for the code > tried an earlier version), but there are Many Mysteries just looking at the > numbers for /sort under current CVS Python. If you put the code somewhere, I'll try it on my PPC iBook (not today, as it's at home, but soon). 
I'd thank you for working on this, but you're clearly enjoying it an unhealthy amount already . Cheers, M. -- ZAPHOD: You know what I'm thinking? FORD: No. ZAPHOD: Neither do I. Frightening isn't it? -- The Hitch-Hikers Guide to the Galaxy, Episode 11 From jacobs@penguin.theopalgroup.com Fri Jul 26 16:18:58 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Fri, 26 Jul 2002 11:18:58 -0400 (EDT) Subject: [Python-Dev] Sorting In-Reply-To: Message-ID: On Thu, 25 Jul 2002, Tim Peters wrote: > One more bit of news: cross-box performance of this stuff is baffling. I'll run tests on the P4 Xeon, Alpha (21164A, 21264), AMD Elan 520, and maybe a few Sparcs, and whatever else I can get my hands on. Just let me know where I can snag the code + test script. -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From pinard@iro.umontreal.ca Fri Jul 26 16:05:39 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 26 Jul 2002 11:05:39 -0400 Subject: [Python-Dev] Re: Multiline string constants, include in the standard library? In-Reply-To: <3D40F62D.7000106@swisslog.com> References: <20020725194802.22949.82629.Mailman@mail.python.org> <3D40F62D.7000106@swisslog.com> Message-ID: [Ville Vainio] > > where stripIndent() has been defined as: > > > > >>> def stripIndent( s ): > > ... indent = len(s) - len(s.lstrip()) > > ... sLines = s.split('\n') > > ... resultLines = [ line[indent:] for line in sLines ] > > ... return ''.join( resultLines ) > Something like this should really be available somewhere in the standard > library (string module [yeah, predeprecation, I know], string > method). Everybody needs this kind of functionality, and probably more often > than many of the other string methods (title, swapcase come to mind). Strange. I did a lot of Python programming, and never needed this. 
In fact, I like my doc-strings and other triple-quoted strings flushed left. So, I can see them in the code exactly as they will appear on the screen. If I used artificial margins in Python so my doc-strings appeared to be indented more than the surrounding, and wrote my code this way, it would appear artificially constricted on the left once printed. It's not worth. For me, best is to use """\ always while the opening triple-quote, and write flushed left until the closing """. As most long strings end with a new line, the closing """ is usually flushed left just as well. My opinion is that it is nice this way. Don't touch the thing! :-) -- François Pinard http://www.iro.umontreal.ca/~pinard From oren-py-d@hishome.net Fri Jul 26 09:15:07 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 26 Jul 2002 11:15:07 +0300 Subject: [Python-Dev] Iteration - my summary Message-ID: <20020726111507.A28836@hishome.net> There has been some lively discussion about the iteration protocols lately. My impression of the opinions on the list so far is this: It could have been semantically cleaner. There is a blurred boundary between the iterable-container and iterator protocols. Perhaps next should have been called __next__. Perhaps iterators should not have been required to implement an __iter__ method returning self. With the benefit of hindsight the protocols could have been designed better. But there is nothing fundamentally broken about iteration. Nothing that justifies any serious change that would break backward compatibility and require a transition plan. A remaining sore spot is re-iterability. Iterators being their own iterators is ok by itself. StopIteration being a sink state is ok by itself. When they are combined they result in hard-to-trace silent errors because an exhausted iterator is indistinguishable from an empty container. This happens in real code, not in some contrived examples. 
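The silent failure described in the message above (an exhausted iterator looking exactly like an empty container) can be reproduced in a few lines. This is only a minimal sketch, written for present-day Python 3 rather than the Python 2.2 under discussion; the name cartprod_naive is made up for the illustration and does not come from the thread.

```python
def cartprod_naive(a, b):
    """Cartesian product that silently assumes b can be iterated repeatedly."""
    for x in a:
        for y in b:
            yield x, y


# With a list as the second source, every x is paired with every y.
from_list = list(cartprod_naive([1, 2], ["p", "q"]))

# With an iterator as the second source, b is exhausted after the first x.
# No error is raised; the pairs for the later x values are simply missing.
from_iter = list(cartprod_naive([1, 2], iter(["p", "q"])))

print(from_list)
print(from_iter)
```

The second result is shorter than the first, yet nothing signals that anything went wrong, which is the hard-to-trace behaviour the message complains about.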
It is clear to me that this issue needs to be addressed in some way, but without a complete redesign of the iteration protocols. My proposal of raising an exception on calling .next() after StopIteration has been rejected by Guido. Here's another approach:

Proposal: new built-in function reiter()

def reiter(obj):
    """reiter(obj) -> iterator

    Get an iterator from an object. If the object is already an iterator
    a TypeError exception will be raised. For all Python built-in types it
    is guaranteed that if this function succeeds the next call to reiter()
    will return a new iterator that produces the same items unless the
    object is modified. Non-builtin iterable objects which are not
    iterators SHOULD support multiple iteration returning the same items."""

    it = iter(obj)
    if it is obj:
        raise TypeError('Object is not re-iterable')
    return it

Example:

def cartprod(a,b):
    """ Generate the cartesian product of two sources. """
    for x in a:
        for y in reiter(b):
            yield x,y

This function should raise an exception if object b is a generator or some other non re-iterable object. List comprehensions should use the C API equivalent of reiter for sources other than the first.

This solution is less than perfect. It requires explicit attention by the programmer and is less comprehensive than the other solutions proposed but I think it's better than nothing.

A related issue is iteration of files. It's an exception to the guarantee made in the docstring above. My impression is that people generally agree that file objects are more iterator-like than container-like because they are stateful cursors. However, making files into iterators is not as simple as adding a next method that calls readline and raises StopIteration on EOF. This implementation would lose the performance benefit from the readahead buffering done in the xreadlines object.
The way I see file object iteration is that the file object and xreadlines object abuse the iterable-container<->iterator relationship to produce a cursor-without-readahead-buffer<->cursor-with-readahead-buffer relationship. I don't like objects pretending to be something they're not. I can finish my xreadlines caching patch that makes a file into an iterator with an embedded xreadlines object. Perhaps it's not the most elegant solution but I don't see any real problems with it. I am also thinking about implementing line buffering inside the file object that can finally get rid of the whole fgets/getc_unlocked multiplatform mess and make xreadlines unnecessary. The problem here is that readahead is not exactly a transparent operation. More on this later. Oren From yozh@mx1.ru Fri Jul 26 17:05:59 2002 From: yozh@mx1.ru (Stepan Koltsov) Date: Fri, 26 Jul 2002 20:05:59 +0400 Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline string constants In-Reply-To: <00a901c23452$daae16c0$7f00a8c0@pacbell.net> References: <00a901c23452$daae16c0$7f00a8c0@pacbell.net> Message-ID: <20020726160559.GA24120@banana.mx1.ru> On Thu, Jul 25, 2002 at 11:16:36PM -0400, Guido van Rossum wrote: > My mails to Stepan Koltsov have been bouncing (after the first one > apparently went through). Assuming he's not subscribed to python-dev, > he may not be aware of our responses. What to do? Simple reject it > in absentia? I don't understand, what happens with my DNS, but I am subscriber of this maillist and I read it sometimes. So... What you (and others) think about just adding flag 'i' to string constants (that will strip indentation etc.)? This doesn't affect existing code, but it will be useful (at least for me ;-) Motivation was posted here by Michael Chermside, but I don't like his solutions. 
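For reference, the indentation stripping being argued about can be written as an ordinary function, with no new string-literal syntax. The sketch below is an illustration for present-day Python 3, not code from the thread: strip_indent and usage are hypothetical names, and the sketch assumes space-only indentation. It rejoins the lines with newlines and ignores blank lines when measuring the margin. The standard library's textwrap.dedent is shown alongside it for comparison.

```python
import textwrap


def strip_indent(s):
    """Remove the leading whitespace shared by all non-blank lines of s.

    Assumes indentation uses spaces only (no tab handling in this sketch).
    """
    lines = s.split("\n")
    # Measure indentation on non-blank lines only, so that empty or
    # whitespace-only lines do not force the common margin down to zero.
    margins = [len(line) - len(line.lstrip()) for line in lines if line.strip()]
    margin = min(margins) if margins else 0
    # Rejoin with "\n" so the line structure of the text is preserved.
    return "\n".join(line[margin:] for line in lines)


def usage():
    # An indented triple-quoted string, as it would appear inside a function.
    text = """\
        usage: prog [options]
          -h  show help
        """
    return strip_indent(text), textwrap.dedent(text)


by_hand, by_stdlib = usage()
print(by_hand)
```

Both calls remove the eight-space margin and keep the extra two-space indent of the second line.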
-- mailto: Stepan Koltsov From tim.one@comcast.net Fri Jul 26 17:02:34 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 26 Jul 2002 12:02:34 -0400 Subject: [Python-Dev] Sorting In-Reply-To: Message-ID: Apart from fine-tuning and rewriting the doc file, I think the mergesort is done. I'm confident that if any bugs remain, I haven't seen them . A patch against current CVS listobject.c is here: http://www.python.org/sf/587076 Simple instructions for timing exactly the same data I've posted times against are in the patch description (you already have sortperf.py -- it's in Lib/test). This patch doesn't replace samplesort, it adds a new .msort() method, to make comparative timings easier. It also adds an .hsort() method for weak heapsort, because I forgot to delete that code after I gave up on it . X-platform samplesort timings are interesting as well as samplesort versus mergesort timings. Timings against "real life" sort jobs are especially interesting. Attaching results to the bug report sounds like a good idea to me, so we get a coherent record in one place. Thanks in advance! From mal@lemburg.com Fri Jul 26 17:58:49 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 26 Jul 2002 18:58:49 +0200 Subject: [Python-Dev] Sorting References: Message-ID: <3D417FC9.6030308@lemburg.com> Tim Peters wrote: > Apart from fine-tuning and rewriting the doc file, I think the mergesort is > done. I'm confident that if any bugs remain, I haven't seen them . A > patch against current CVS listobject.c is here: > > http://www.python.org/sf/587076 > > Simple instructions for timing exactly the same data I've posted times > against are in the patch description (you already have sortperf.py -- it's > in Lib/test). This patch doesn't replace samplesort, it adds a new .msort() > method, to make comparative timings easier. It also adds an .hsort() method > for weak heapsort, because I forgot to delete that code after I gave up on > it . 
> > X-platform samplesort timings are interesting as well as samplesort versus > mergesort timings. Timings against "real life" sort jobs are especially > interesting. Attaching results to the bug report sounds like a good idea to > me, so we get a coherent record in one place. > > Thanks in advance! Here's the result for AMD Athlon 1.2GHz/Linux/gcc: Python/Tim-Python> ./python -O Lib/test/sortperf.py 15 20 1 i 2**i *sort \sort /sort 3sort +sort ~sort =sort !sort 15 32768 0.07 0.00 0.01 0.09 0.01 0.03 0.01 0.08 16 65536 0.18 0.02 0.02 0.19 0.03 0.07 0.02 0.20 17 131072 0.43 0.05 0.04 0.46 0.05 0.18 0.05 0.48 18 262144 0.99 0.09 0.10 1.04 0.13 0.40 0.09 1.11 19 524288 2.23 0.19 0.21 2.32 0.24 0.83 0.20 2.46 20 1048576 4.96 0.40 0.40 5.41 0.47 1.72 0.40 5.46 without patch: Python/Tim-Python> ./python -O Lib/test/sortperf.py 15 20 1 i 2**i *sort \sort /sort 3sort +sort ~sort =sort !sort 15 32768 0.08 0.01 0.01 0.09 0.01 0.03 0.00 0.09 16 65536 0.20 0.02 0.01 0.20 0.03 0.07 0.02 0.20 17 131072 0.46 0.06 0.02 0.45 0.05 0.20 0.04 0.49 18 262144 0.99 0.09 0.10 1.09 0.11 0.40 0.12 1.12 19 524288 2.33 0.20 0.20 2.30 0.24 0.83 0.19 2.47 20 1048576 4.89 0.40 0.41 5.37 0.48 1.71 0.38 6.22 -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... 
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim.one@comcast.net Fri Jul 26 18:22:04 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 26 Jul 2002 13:22:04 -0400 Subject: [Python-Dev] Sorting In-Reply-To: <3D417FC9.6030308@lemburg.com> Message-ID: [MAL] > Here's the result for AMD Athlon 1.2GHz/Linux/gcc: > > Python/Tim-Python> ./python -O Lib/test/sortperf.py 15 20 1 > i 2**i *sort \sort /sort 3sort +sort ~sort =sort !sort > 15 32768 0.07 0.00 0.01 0.09 0.01 0.03 0.01 0.08 > 16 65536 0.18 0.02 0.02 0.19 0.03 0.07 0.02 0.20 > 17 131072 0.43 0.05 0.04 0.46 0.05 0.18 0.05 0.48 > 18 262144 0.99 0.09 0.10 1.04 0.13 0.40 0.09 1.11 > 19 524288 2.23 0.19 0.21 2.32 0.24 0.83 0.20 2.46 > 20 1048576 4.96 0.40 0.40 5.41 0.47 1.72 0.40 5.46 > > without patch: > > Python/Tim-Python> ./python -O Lib/test/sortperf.py 15 20 1 > i 2**i *sort \sort /sort 3sort +sort ~sort =sort !sort > 15 32768 0.08 0.01 0.01 0.09 0.01 0.03 0.00 0.09 > 16 65536 0.20 0.02 0.01 0.20 0.03 0.07 0.02 0.20 > 17 131072 0.46 0.06 0.02 0.45 0.05 0.20 0.04 0.49 > 18 262144 0.99 0.09 0.10 1.09 0.11 0.40 0.12 1.12 > 19 524288 2.33 0.20 0.20 2.30 0.24 0.83 0.19 2.47 > 20 1048576 4.89 0.40 0.41 5.37 0.48 1.71 0.38 6.22 I assume you didn't read the instructions in the patch description: http://www.python.org/sf/587076 The patch doesn't change anything about how list.sort() works, so what you've shown us is the timing variance on your box across two identical runs. To time the new routine, you need to (temporarily) change L.sort() to L.msort() in sortperf.py's doit() function. It's a one-character change, but an important one . From tim.one@comcast.net Fri Jul 26 18:50:30 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 26 Jul 2002 13:50:30 -0400 Subject: [Python-Dev] Sorting In-Reply-To: <3D418884.2090509@lemburg.com> Message-ID: [MAL] > Dang. Why don't you distribute a ZIP file which can be dumped > onto the standard Python installation ? 
A zip file containing what? And which "standard Python installation"? If someone is on Python-Dev but can't deal with a one-file patch against CVS, I'm not sure what to conclude, except that I don't want to deal with them at this point . > Here's the .msort() version: > > Python/Tim-Python> ./python -O sortperf.py 15 20 1 > i 2**i *sort \sort /sort 3sort +sort ~sort =sort !sort > 15 32768 0.08 0.01 0.01 0.01 0.01 0.03 0.00 0.02 > 16 65536 0.17 0.02 0.02 0.02 0.02 0.07 0.02 0.06 > 17 131072 0.41 0.05 0.04 0.05 0.04 0.16 0.04 0.09 > 18 262144 0.95 0.10 0.10 0.10 0.10 0.33 0.10 0.20 > 19 524288 2.17 0.20 0.21 0.20 0.21 0.66 0.20 0.44 > 20 1048576 4.85 0.42 0.40 0.41 0.41 1.37 0.41 0.84 Thanks! That's more like it. So far I've got the only known box were ~sort is slower under msort (two other sets of timings were attached to the patch; I'll paste yours in too, merging in the smaller numbers from your first report). From mal@lemburg.com Fri Jul 26 18:36:04 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 26 Jul 2002 19:36:04 +0200 Subject: [Python-Dev] Sorting References: Message-ID: <3D418884.2090509@lemburg.com> Tim Peters wrote: > [MAL] > >>Here's the result for AMD Athlon 1.2GHz/Linux/gcc: >> >>without patch: >> >>Python/Tim-Python> ./python -O Lib/test/sortperf.py 15 20 1 >> i 2**i *sort \sort /sort 3sort +sort ~sort =sort !sort >>15 32768 0.08 0.01 0.01 0.09 0.01 0.03 0.00 0.09 >>16 65536 0.20 0.02 0.01 0.20 0.03 0.07 0.02 0.20 >>17 131072 0.46 0.06 0.02 0.45 0.05 0.20 0.04 0.49 >>18 262144 0.99 0.09 0.10 1.09 0.11 0.40 0.12 1.12 >>19 524288 2.33 0.20 0.20 2.30 0.24 0.83 0.19 2.47 >>20 1048576 4.89 0.40 0.41 5.37 0.48 1.71 0.38 6.22 > > > I assume you didn't read the instructions in the patch description: > > http://www.python.org/sf/587076 > > The patch doesn't change anything about how list.sort() works, so what > you've shown us is the timing variance on your box across two identical > runs. 
To time the new routine, you need to (temporarily) change L.sort() to
> L.msort() in sortperf.py's doit() function. It's a one-character change,
> but an important one .

Dang. Why don't you distribute a ZIP file which can be dumped onto the standard Python installation ?

Here's the .msort() version:

Python/Tim-Python> ./python -O sortperf.py 15 20 1
 i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
15   32768   0.08   0.01   0.01   0.01   0.01   0.03   0.00   0.02
16   65536   0.17   0.02   0.02   0.02   0.02   0.07   0.02   0.06
17  131072   0.41   0.05   0.04   0.05   0.04   0.16   0.04   0.09
18  262144   0.95   0.10   0.10   0.10   0.10   0.33   0.10   0.20
19  524288   2.17   0.20   0.21   0.20   0.21   0.66   0.20   0.44
20 1048576   4.85   0.42   0.40   0.41   0.41   1.37   0.41   0.84

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From thomas.heller@ion-tof.com Fri Jul 26 19:17:10 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 26 Jul 2002 20:17:10 +0200
Subject: [Python-Dev] PEP 296 - The Buffer Problem
References: <20020723063611.26677.qmail@web40102.mail.yahoo.com>
Message-ID: <0a7101c234d0$a8c463c0$e000a8c0@thomasnotebook>

[sorry if you see this twice, didn't seem to get through the first time]

If the safe buffer PEP would be accepted and implemented, here's my proposal for the bytes object.

The bytes object uses the safe buffer interface to gain access to the byte array it exposes.

The bytes type would probably accept the following arguments:

    PyObject *type - the (bytes) type or subtype to create
    PyObject *obj - the object exposing the safe buffer interface
    size_t offset - starting offset of obj's memory block
    size_t length - number of bytes to use (0 for all)

and maybe a flag requesting read or read/write access.
A convention could be that if a NULL is passed for obj, then the bytes object itself allocates a memory block of length length. Of course the bytes object itself would also expose the safe buffer interface. And slicing, but not repetition. Isn't the above sufficient (provided that we somehow add the pickle stuff into this picture)? Thomas From thomas.heller@ion-tof.com Fri Jul 26 18:46:33 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 26 Jul 2002 19:46:33 +0200 Subject: [Python-Dev] PEP 296 - The Buffer Problem References: <20020723063611.26677.qmail@web40102.mail.yahoo.com> Message-ID: <098c01c234cc$621a78f0$e000a8c0@thomasnotebook> If the safe buffer PEP would be accepted and implemented, here's my proposal for the bytes object. The bytes object uses the safe buffer interface to gain access to the byte array it exposes. The bytes type would probably accept the following arguments: PyObject *type - the (bytes) type or subtype to create PyObject *obj - the object exposing the safe buffer interface size_t offset - starting offset of obj's memory block size_t length - number of bytes to use (0 for all) and maybe a flag requesting read or read/write access. A convention could be that if a NULL is passed for obj, then the bytes object itself allocates a memory block of length length. Of course the bytes object itself would also expose the safe buffer interface. And slicing, but not repetition. Isn't the above sufficient (provided that we somehow add the pickle stuff into this picture)? Thomas From guido@python.org Fri Jul 26 21:48:30 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 26 Jul 2002 16:48:30 -0400 Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline string constants In-Reply-To: Your message of "Fri, 26 Jul 2002 20:05:59 +0400." 
<20020726160559.GA24120@banana.mx1.ru> References: <00a901c23452$daae16c0$7f00a8c0@pacbell.net> <20020726160559.GA24120@banana.mx1.ru> Message-ID: <200207262048.g6QKmU123924@pcp02138704pcs.reston01.va.comcast.net> > So... What you (and others) think about just adding flag 'i' to string > constants (that will strip indentation etc.)? This doesn't affect > existing code, but it will be useful (at least for me ;-) Motivation > was posted here by Michael Chermside, but I don't like his solutions. And I don't like your proposal. Sorry, but I really don't think the syntax should be changed for something that's so trivial to code if you need it. --Guido van Rossum (home page: http://www.python.org/~guido/) From nhodgson@bigpond.net.au Sat Jul 27 01:51:39 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Sat, 27 Jul 2002 10:51:39 +1000 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <082b01c234b0$c33564e0$e000a8c0@thomasnotebook> Message-ID: <028501c23507$c46ebcb0$3da48490@neil> Thomas Heller: > The size and pointer returned must be valid as long as the object > is alive (has a positive reference count). So, only objects which > never reallocate or resize the memory block are allowed to > implement this interface. I'd prefer an interface that allows for reallocation but has an explicit locked state during which the buffer must stay still. My motivation comes from the data structures implemented in Scintilla (an editor component), which could be exposed through this buffer interface to other code. The most important type in Scintilla (as in many editors) is a split (or gapped) buffer. Upon receiving a lock call, it could collapse the gap and return a stable pointer to its contents and then revert to its normal behaviour on receiving an unlock. 
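The lock/unlock idea in the message above can be sketched in Python. This is a toy model only, not Scintilla's data structure and not the C-level interface of the draft PEP: the GapBuffer class and its insert, lock, unlock and tobytes methods are hypothetical names, and a memoryview stands in for the "stable pointer". The sketch assumes present-day Python 3.

```python
class GapBuffer:
    """Toy gap buffer: storage is laid out as [text before gap][gap][text after gap]."""

    def __init__(self, initial=b"", gap_size=16):
        self._buf = bytearray(initial) + bytearray(gap_size)
        self._gap_start = len(initial)
        self._gap_end = len(self._buf)
        self._locks = 0

    def __len__(self):
        return len(self._buf) - (self._gap_end - self._gap_start)

    def _move_gap(self, pos):
        """Slide the gap so that it starts at logical position pos."""
        if pos < self._gap_start:
            n = self._gap_start - pos
            self._buf[self._gap_end - n:self._gap_end] = self._buf[pos:self._gap_start]
            self._gap_start -= n
            self._gap_end -= n
        elif pos > self._gap_start:
            n = pos - self._gap_start
            self._buf[self._gap_start:self._gap_start + n] = self._buf[self._gap_end:self._gap_end + n]
            self._gap_start += n
            self._gap_end += n

    def insert(self, pos, data):
        if self._locks:
            raise BufferError("buffer is locked; it must not move or resize")
        if not 0 <= pos <= len(self):
            raise IndexError("insert position out of range")
        self._move_gap(pos)
        gap = self._gap_end - self._gap_start
        if len(data) > gap:
            # Grow the gap by inserting zero bytes at its end.
            extra = len(data) - gap + 16
            self._buf[self._gap_end:self._gap_end] = bytearray(extra)
            self._gap_end += extra
        self._buf[self._gap_start:self._gap_start + len(data)] = data
        self._gap_start += len(data)

    def lock(self):
        """Collapse the gap to the end and return a view of the contiguous text."""
        if self._locks == 0:
            self._move_gap(len(self))
        self._locks += 1
        return memoryview(self._buf)[:len(self)]

    def unlock(self, view):
        """Give the view back; normal (gapped) editing may resume."""
        view.release()
        self._locks -= 1

    def tobytes(self):
        return bytes(self._buf[:self._gap_start] + self._buf[self._gap_end:])


buf = GapBuffer(b"hello world")
buf.insert(5, b",")            # the gap now sits in the middle of the text
view = buf.lock()              # gap collapsed; view covers one contiguous block
snapshot = bytes(view)
try:
    buf.insert(0, b"X")        # refused while the lock is held
    locked_insert_failed = False
except BufferError:
    locked_insert_failed = True
buf.unlock(view)
buf.insert(len(buf), b"!")     # editing resumes after unlock
print(snapshot, buf.tobytes())
```

The design choice illustrated here is that the object keeps its freedom to rearrange memory, but only between a matching unlock and the next lock; callers get a contiguous block for exactly as long as they hold the lock.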
Neil From xscottg@yahoo.com Sat Jul 27 03:26:38 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Fri, 26 Jul 2002 19:26:38 -0700 (PDT) Subject: [Python-Dev] PEP 296 - The Buffer Problem In-Reply-To: <098c01c234cc$621a78f0$e000a8c0@thomasnotebook> Message-ID: <20020727022638.86727.qmail@web40101.mail.yahoo.com> --- Thomas Heller wrote: > If the safe buffer PEP would be accepted and implemented, > here's my proposal for the bytes object. > > The bytes object uses the safe buffer interface to gain > access to the byte array it exposes. > > The bytes type would probably accept the following arguments: > > PyObject *type - the (bytes) type or subtype to create > PyObject *obj - the object exposing the safe buffer interface > size_t offset - starting offset of obj's memory block > size_t length - number of bytes to use (0 for all) > > and maybe a flag requesting read or read/write access. > > A convention could be that if a NULL is passed for obj, > then the bytes object itself allocates a memory block > of length length. > > Of course the bytes object itself would also expose the safe > buffer interface. And slicing, but not repetition. > > Isn't the above sufficient (provided that we somehow > add the pickle stuff into this picture)? > It's probably sufficient but more than necessary. In particular, supporting the safe buffer protocol makes sense to me (if that gets accepted), but I'm not eager to immediately support the obj pointer as you describe above. We've gotten side-tracked a bit when describing the "view behavior" for the slicing operations on a bytes object. It was not my intent that the bytes object typically be used to create views into other Python objects. That whole discussion was an attempt to describe the slicing behavior. From my perspective, describing the whole inner-thing and outer-thing stuff was to explain the implementation. Think of the bytes object as a mutable string with some additional restrictions, and that's what I have in mind. 
The mmap example is sort of a retrofit since mmap should probably have been implemented via something like bytes in the first place (to get the bytes style slicing among other things), not because I think there are a lot of objects that you would want to wrap up in bytes views. The existing buffer object is ok for creating views, and truthfully I don't know how often it is really used for that. What I (and I think others) need is more like a pickleable-mutable-reliable-byte-string. I'm not eager to grow bytes into a superset object. Even if I'm wrong about the need for this, at the very least, the additional functionality can be added later. I really just want to push through a simple, usable, bytes object for the time being. We can easily add, we can't easily take away. __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com From xscottg@yahoo.com Sat Jul 27 03:40:12 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Fri, 26 Jul 2002 19:40:12 -0700 (PDT) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <028501c23507$c46ebcb0$3da48490@neil> Message-ID: <20020727024012.85905.qmail@web40107.mail.yahoo.com> --- Neil Hodgson wrote: > Thomas Heller: > > > The size and pointer returned must be valid as long as the object > > is alive (has a positive reference count). So, only objects which > > never reallocate or resize the memory block are allowed to > > implement this interface. > > I'd prefer an interface that allows for reallocation but has an explicit > locked state during which the buffer must stay still. My motivation comes > from the data structures implemented in Scintilla (an editor component), > which could be exposed through this buffer interface to other code. The > most important type in Scintilla (as in many editors) is a split (or > gapped) buffer. 
Upon receiving a lock call, it could collapse the gap and
> return a stable pointer to its contents and then revert to its normal
> behaviour on receiving an unlock.
>

A couple of questions come to mind:

First, could this be implemented by a gapped_buffer object that implements the locking functionality you want, but that returns simple buffers to work with when the object is locked? In other words, do we need to add this extra functionality up in the core protocol when it can be implemented specifically the way Scintilla (cool editor by the way) wants it to be in the Scintilla specific extension?

Second, if you are using mutexes to do this stuff, you'll have to be very careful about deadlock. I imagine:

    thread 1:
        grab the object lock
        grab the object pointer
        release the GIL
        do some work
        acquire the GIL   # deadlock

    thread 2:
        acquire the GIL
        try to resize the object   # requires no outstanding locks

Thread 2 needs to make sure no objects are holding the object lock when it does the resize, but thread 1 can't acquire the GIL until thread 2 gives it up. Both are stuck. If you choose not to implement the locks with true mutexes, then you're probably going to end up polling and that's bad too. Is there a way out of this?

This is part of the reason I didn't want to put a lock state into the bytes object.

__________________________________________________
Do You Yahoo!?
Yahoo! Health - Feel better, live better
http://health.yahoo.com

From ask@perl.org Sat Jul 27 06:40:33 2002
From: ask@perl.org (Ask Bjoern Hansen)
Date: Fri, 26 Jul 2002 22:40:33 -0700 (PDT)
Subject: [Python-Dev] python.org/switch/
Message-ID: <20020726223911.T70962-100000@onion.valueclick.com>

As presented on the Perl Lightning talks here at OSCON: Switch movies. You guys will dig Nathan's (nat.mov and nat.mpg).
http://www.perl.org/tpc/2002/movies/switch/

;-)

- ask

--
ask bjoern hansen, http://askbjoernhansen.com/ !try; do();

From tim.one@comcast.net Sat Jul 27 09:02:48 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 27 Jul 2002 04:02:48 -0400
Subject: [Python-Dev] Sorting
In-Reply-To:
Message-ID:

http://www.python.org/sf/587076 has collected timings on 5 boxes so far. I also noted that msort() gets a 32% speedup on my box when sorting a 1.33-million line snapshot of the Python-Dev archive. This is a puzzler to account for, since you wouldn't think there's significant pre-existing lexicographic order in a file like that. McIlroy noted similar results from experiments on text, PostScript and C source files in his adaptive mergesort (which is why I tried sorting Python-Dev to begin with), but didn't offer a hypothesis.

Performance across platforms is a hoot so far, with Neal's box even seeing a ~6% speedup on *sort. Skip's Pentium III acts most like my Pentium III, which shouldn't be surprising. Ours are the only reports where !sort is faster than *sort for samplesort, and also where ~sort under samplesort is faster than ~sort under timsort.

~sort (only 4 distinct values, repeated N/4 times) remains the most puzzling of the tests by far. Relative to its performance under samplesort,

    sf userid   ~sort speedup under timsort (negative means slower)
    ---------   ---------------------------------------------------
    montanaro   -23%
    tim_one     - 6%
    jacobs99    +18%
    lemburg     +25%
    nascheme    +30%

Maybe it's a big win for AMD boxes, and a mixed bag for Intel boxes. Or maybe it's a win for newer boxes, and a loss for older boxes. Or maybe it's a bigger win the higher the clock rate (it hurt the most on the slowest box, and helped the most on the fastest).
Since it ends up doing a sequence of perfectly balanced merges from start to
finish, I thought perhaps it has to do with OS and/or chip intelligence in
read-ahead cache optimizations -- but *sort also ends up doing a sequence of
perfectly balanced merges, and doesn't behave at all like ~sort across boxes.
~sort does exercise the galloping code much more than other tests (*sort
rarely gets into galloping mode; ~sort never gets out of galloping mode), so
maybe it really has most to do with cache design.

Whatever, it's starting to look like a no-brainer -- except for the
extremely mixed ~sort results, the numbers so far are great.

From mal@lemburg.com Sat Jul 27 09:54:35 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 27 Jul 2002 10:54:35 +0200
Subject: [Python-Dev] Sorting
References:
Message-ID: <3D425FCB.2010104@lemburg.com>

This is a multi-part message in MIME format.
--------------080204010802000409070906
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit

Tim Peters wrote:
> [MAL]
>
>>Dang. Why don't you distribute a ZIP file which can be dumped
>>onto the standard Python installation ?
>
>
> A zip file containing what? And which "standard Python installation"? If
> someone is on Python-Dev but can't deal with a one-file patch against CVS,
> I'm not sure what to conclude, except that I don't want to deal with them at
> this point .

Point taken ;-) I meant something like this:

Here's a ZIP file. To install take your standard Python CVS
download, unzip it on top of it, then run

    echo "With .sort()"
    ./python -O sortperf.py 15 20 1
    echo "With .msort()"
    ./python -O timsortperf.py 15 20 1

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ --------------080204010802000409070906 Content-Type: application/zip; name="tim.zip" Content-Transfer-Encoding: base64 Content-Disposition: inline; filename="tim.zip" UEsDBBQAAAAIABSc+izfCZtEKlMAAOEeAQAUABUAT2JqZWN0cy9saXN0b2JqZWN0LmNVVAkA A7iHQT3BqUE9VXgEAPQBZADMPGtzGzeSn8lfASu1NkmRMinbsmPa2lJk2quLIvskba1yjouF GYIkzOFgMg9SzMb726+7gZnBPEhJyV7dqlIxOQAajX53o4dPO+xcRjFTzlfhxkwuA08shR/z WCqfdZ42m99J3/WSiWB7nzbxXPkH8z18OJ2IKbu6fnc6/tvo5N3o8iqf+CaKJzB6MD9ufie8 SNgjm+hpvAlEBIONxtMOe69CFsnfxDjGzb4T/kROm80I93eZ9ONmqBJ/kgQ4pwXfmd9u/rPZ SPxIznwxYfTIkXHE3rL+sDxwCE9b9qO2z46P2Yths4mbXyJslgSvmw3WYWdT5rM3TP8dvjjq slgxzpaJF0ugClNTM8ZeHVQWHPafv9q64Oh5ZcHg6Nmr59sWvBgclhcMng36Lw+3LHje//6o tGAA+Lx4ueUMzw5fHpXP8OrZq1dH/fozHB4dDp6XznD0ctCHFbVnOOx//3LwoniGF8+OXr3s fz+oO8Pg6OXLl4cDc4aDg8LCw06n9WL/WUe2a1biIA7RClp1PZcRUysR9rjnKZfHImJBqAIV okRzD2HEc8E8lHqUqi5b8oX0ZyxUakkgpiCTfDKRZsEsVOt4fsAAtChARg2BzZbSm3SZk4D2 RLRe+CqZzXGjmVzhTr7gYS+WS8H4EvH4DYTREXO+krATQoRTecqf0epI/JoI36Xj8SAAjYha bZBdwjoIRZQNskCp0Nv0AhECykuZAthEsViyUBCasLa1nkt3jqhyeirjTZeJg9lBl3E3VBE8 9zw29fhKhfoEAP0f0p+oddRlaxnP8dv3tznOjkCCBRzO4iYeBySYwyes12PcnxCEtXgCR49i CZBnKkZ6AuoRiwIO2E9DPsutDLDHAasDbPP1RpqKK+HrzWNkaeTOwTABxh4YISSvjOEwvyYS wIJIwPmQgkBKZGwExCcgAHqlFgJpt4T/cR+XJRHQP1ZaaJ42GxPFwKQ0wFocH79lz4b4mUzK vv72jQEBQd5a/mEbvoYiTkKftVpkS2hmm+2zQZu9eaO/DpvsG9hIsIHAenZxObo6+59Ra8VD kGAwfl3mS+BQ1G7A3y9N3L5h/n5pNow1HPtiPcbPYMJsG2iWDvVkOWWtfOYbsHatf7U0hHa/ zZ6ShKtpC7dtt9u0pvFp85NYjqtYZYAQOkxE253j1YCZgMvF38/Ph+ZZRph+u9n8tPmofUgH PqJPGV+INZlsAolWWz9Pp6lgmJ3WdzagqEN9IH0WBvj/k7AdheH4Bz4582MRgkKegrS2kBEp JwglYFOzocEAlgSik54+Rw2XgeE/nQt3QXqO2jf11JoEATc3IJ7WrGWP3rKUuHQmws8godG8 UEBaFW4IP0BIBYBMCmL84ZRoYpOhyx4bal0jiwwFcJkmdWGP/KA5md6mdFJB71g5YxSPlE00 FblYM8E6GJxMi8RPJ+fnH08NCYjCGhlrnY3VjqPjxo0lCKqI7fVd1u+mhO1Yz9uGVxnVaMRI P/4Dz8c2Ga8vT05/BMi2PtqcQtkCFcQAwpD3CpUnnwJLUSLxfI/MDBIKhHlfuesNcgrnT1ut 
opi3ASLrHTNzIMLLxDc5OmBsxa0Iw2GdHn0Q8RkQqYB9lwIc+ecPUZQpiXrHfv+dSXb89q6j 6C1omUE/kw+UjvwhHPQqDsFljN+Haqk/tnBKY4/cMM1kCjwoOJ6QgxnfIxQ19lfCIADYjG7d 8RnOhgEVdlm6R/2J7uYICt9n+aUiLIbioZjBV3AqRdJnj4kH3SYr/tWsAstKco7sqgPqTXB4 WDvYCYZ38Hh88250ejl630q3GT5Ifv8g43fuCxQ0fK4wzWI8jzA0x0hgqwyUcEV7egc/wRdL mGxoCvOJgB1cafAcNm3kzUTLlPRtJUWxkH40KG0aCW+qlXA9FyF4z5xhK62VeCjaKWMkuW7D zFXRlj6MW7i5ZSLfsrOLa7DeNzasMgc+Gk+XM2HP5b6vYzMGlluYJBCDJ4qiahlAZ0CrnKJg CJpGOTRuU6PLCtjuD1InZyBVqVD0JDXHJ4pThADj+ovO/vLB4+Ku9sTCACzCMADkvzwA0Szp Ai0bMoRAGH+W+4MvMNt8+aKF6eyCZGlFh6MRWocTVxjm2NRKF9vPiTT7ZRG0TNKZH4kwrvEB FfGzbc2/xb+Zr6QEVdXrphjkRqCI+QklMSXM///QhQUYlbZrzUjGi9JxIGb8SUDiMYmoJmJM w0rJSRM1ZTwxyVY5wq21BBjA/N2/Drk+nxYgiGiu/nZ6cjG+Onk/Gv8w+nB2gYPNSgj2yFYY wOudwpzGAWBrHk5A9VCeT+ch4CEh37mW0VKEkOo0GGWwoXiCeWBElR4GcV7MXA4RotaOSC3F HKJhSrlCMUlcEeml8Tzk0RzTPpjpA4QO2JNNh0H2NzPZNCxxIb+MIbuiNBCXyeVSTCQ8gxRx IjwBgwcUazdQ46xAb0gaStmEUTwT1xZMdU4H0Lws0NTR6/vL0cieUQwmMcXpHcfBeBoK0bKD xS0sGF28IwYUPQFxOwCzGld43WXvz85HrDM1igkJ9SyyBaBJZ4aNLkUQjlCMczxSNLQvfpSd PvPNbSvgRmFqTAmLaQu32/t8cHDwpWCv+/r0xVk0I7N3/SFD0DYTmNzftzc+NhsXwMB/WXKQ yfQnokmRQSCLML/fts7TMMc/F3xVx4aiNuMJyofQx7wDTJ0jtwJr4mII60tM3Oa5Qas6cKSg 8DCQwqVkE9URJoQiSrw4y712c3tVx+yUvUh29tfaqDnlNHudx7kmoCgEBBlEg1M9LCMyMwUe /53yhQGXncxK5PspwulgHulby5vGIhFt21jQEdydM6GLygeMXagYqzE81iZmyTdsmQB/RFaU AwMNdiNS8DzCEs9UxO68WLLL4IKchVRD0jalLNkrS7D396URbKxIwI4J+t9GZOfnyClDR9vE UMRVSG1sgjU0sJxaxt9pOkH0Y6LisbFh8J0xoJKu7eUWEx0OHtdYR9xTA06Vv8wlgHECkRty EUyrqzQlorTAOZUhRtiwhcfROGPEockEUbcOJXR2OboeU+SmEW6TzmNoslVq9lJR2CYFDdSV nCC4w9n16KeMJH2twQb2qfJdHp/4k3fCaz2GYVytZ+hkrLLcEHUXCjvw//Kn8d9GuR4b7DgZ AjWYbzvZDsA5WRBzjeB2FfwELDdVT6ztxmoG0QtklVRPRQPOHBGvhTCKs5VWxtTvJFZmY8YZ iP9S0m/BeQz+2l7n8t9s4tLXhSysMLXetK8sy643LedpZNc94c/iecmyc5N602JuRR0AoArC VX7MIYAsA7HjVuEVfXvZ+PBtxsddFoqCl9Kdn6plwEPxg1JeS3jdiuDBxrKtE2eKqoBEo//O bBMCPC5GCOQ8qfKXTiiFEKVgWTvKbZ5S6upTiRTFAlSxgJAf/j++SJQnb7xs+QviYpeKttAp 8qQrthAKMm/zaS5n85pyuE9lHhN6EL2wLG3YRp91jptxlZ4d26TOJ9oSroHhrgANx2kefX9L 
3wtA6XkFqpldAOsHuoxcyKHskEGv6uk9DR5+wWwVmZFpECFFSkQgrKjUKrJk6GSMaZQTcUDR mmEwSVPyb7WVY18XjmuLsFeGvfdgbTWR5X+gImseWLJVJjeioLenrXcJp0vOaJdFc5zMoqWS k0XCVUmtnBCW20d8r8IlT9URLzfywhNjWHqCKA6SQo2W8DEEJD1vYUnql72/HBz2+9EvEOmm 1Siz1HGK+ZzPl6KWfOn9m1PN9PGkDXO9kMs022fOgwQ8vbIoCXb9lc0u//BvkW9LsMt7OXfv 5TxMl/Yt/O/WqO8SH9tFnN3ZmKgTz7znQxfb2ddhQURrRLOmdO6nhtQ3VrTK/A7z/wTPKxyn cnNOspq827d4QWNf9djXknB8TSfpGrYlFF+JURanOiaHDqiUaLLnuvsxY+bKcQ9kBg9wYtWa N0SePwiXJ+BMPt980ZEdQ1UPhZuEkVxhGUj6dCVPyq4CkaYtym/qOhOkhZQHsrXQOWCgojiA gJFC2Rwud2O5kvGGJX4sPVrMp3h1kuWKAAnrYtg8gdf5gIjypcs9Fs15IA5MPQxoL2h1ul/a OALJk90HwsOQb7rsCZxl43riCVFCMWquSNdjXA6JWCjJsgUbwkUXuinphSAvLYOxKYQ5OE5r Ed8D9rrXonjckmCzW9cIc+k2wegCiBKQ/opS4ykm3x53KePWdKC7bfgyoWmncwyekCha7s3g ggbPlQpMqIX9WNtN6KpdvsewtCsLJwoOYmX8g5+Zm7Twh1M5AnKMpONpAuFKIDtVJ/f4Z/n6 KxoavodtJkRbx+S4lDDTGUDUSf7RoFle06Gc0c/qS+loLu5FP9pl2uRZWcuqUJwy+3wrXrDf w+/B356WMbr52u7xCK29fNVqt9MrXOX9h8eN5gqE27ZxgtYSAjQNp2dHjCnkFAO0gUYpKGnA 0u/F6B+WeesWwLSH2Q19cakx1LjHJO2jQBV4R/rJehNzS5MtyYEalSbBI9O9yKLWRR61LsBw a7O9v2/ufD4vvqTyPslbXDSMpx2Cgms7TzUg2w2k0Boazv7kSxFmww5k3oKu4zP7Vq54KWcx SlsF7j+J2ZRLz+hTzh+zT1nicZW+lWL3JNV9sIHIYpJltjU9J/g43eWRlUiaS4B3o/N0OFXh 6pVipcacM9GmeK+3wFzWcFOnq/W0L/OvN9AcPE6lAkBtEYVaMldZmQZ1Cx0jLHT8ADKhDVc1 plsXY7pFGtPdmFBhrYlM0ZzJita5DUlJSECza5kAD5QOla5mTOhRywUD1DoVVqjZ48eWESje a1k3OvmUdoVelVSpX4w1iy0d98vfam7y/0QyV6qzlNzO7nROu6HtIbP0yc/Xh855f4Jvig3V NgQ/7YDVYXUzC4vLdUgEltUBizcMefCZTqoLOHFsaO4q6hsI8jh9oAGXr8wNrxul3gfybZXY Ok91SmKa3tqXhNU0Vf6R82xpfOjoRreO/8B2B6qsQtAFgrTRorOtdyA786CUT1AsGHOwy67C 2NifMQg9BxABQZzRt7yWlXCUUw3rtr6mEK6F62sWJBmKqfRBsbqtZ5daHbpMpfe3hSJcSvWt RGcpdV7Dx2LStSWj2VG+rCp7oSlrvOJekoVCOyuc/7dtT5VIu96ecFJluT8wEWwph89OVFNb KHw3CX0e/mYr73G5Wq2b390rReXhaGBExcxamYvjSm6dHws+XUBmaGFlnuywm9K3rr+KKFq+ ISze3dvVkrz6dRLOxp94GInrJPCA/LCoy/bkx9d6j70uewz8eAyJT+UMeZeKObS8w9zztI1m F9or+5ojh60bXgoaWN4t1RhxG8M24Fu0Q7tjP2enb4HJnvB3+ROY5cCUodX4mDdL2O0F1L4L 2WHueMDIRXOFRk6GbiJj3ZfiYNIPSXi80XYuF2Gn2h2RN9Mh3MJRO20n24ZHmCyXVK2dlQsi 
O1GlfJ7ewpBxQpWN14wfaKJC2EBjesLNzc1rdob1Dn+hqx5rvmH4akOMCaAjIBiPdDmD+yyJ 6PUOXGgltm2qL+C6CnYTJSLq+qklg1O83DecyuLuR07lrqjqYvUaq4h1L6ch7+k0HNIIy0c0 G04qTEWZMFeeoaCAgGtSMTmRaqmLGw/rWDQNi7jXPgnnAx34FokzlnyrO6czBCpIvLQZArsD wCmQeNK9rfMksrKpSo23yIwyL67Ma0Xj99xmipNypMyQhzjxb83yqe93rWji19qbibKlUXh/ TdaGPlXP1KLnYH7BCifkW0GLIHsiHXWE7hUBImUX2o8MyJJl1oN1plCjpFelBZbS0gfEMVsN vTEW9zC8TaPHRToASynMSK1Om2UU2U4L58F0cO6mwR/xzqACD3LNQABSrQf459/la9iFnHON Y67tr859jilL9sjaLxW2T6rlUvlUOklCwXT521jdO2JCtgeI6EKw9lmVtuua1zOosIaSXW5k rgapxR7o+8WpGqc74tIcrVXZsta3RBtRslxUFvVQuGrrh7avj+wUs1CF3XZPukpbdS/FSgDT sc0Vt9IvS5omAkY2p6uJ7imWBGgnWuLW9RK8oGizuaSmGLvHN9QAs7uRLNzxlC2ZnbnuhzAN VgD98WMARx6q15tjGGlqKTD0BkdKphor0wATDwn/4Bda1IB/4AvVnPf39biB980cGbw4+zWR 7iLCkKj0Gje90Iq3FxGSQnf5g68hhYYggRqDub+g1rGJjNwE8hO8kiGvcy2X7JOgmdT7jLVC aleREYJOfFc34LFrvhB4yRISwV1I0SjzxNctw16UBIEnxQQ3c6vLWYv7G1qDZonpN80Npm2A jXWViIIr7A2MuT/h4aQOUM6OSmNNu4uVavSshBFBMljleEAYhnJFRLkksYrAvlDDodaOAQK5 Be5tuqBo9BkUbXNg94XTKyORZ3fr39pysrG/6EOIUiAd0tuQRaM3tDuJsa1Hr6xJC7fSANCg zcfn12ncdEoNYoYqEGHUcQe7CGOwAlEWnDzrQdRJr9ia9i+6C4tDyC1B2KYcK8Qt+ERUy960 xVOQxyJ7TKHnYWoe9FjlKMa4p9bktpjWbvRXDS6LVbSd73fZ7Y5hMDob7ZkKPaBUwTOU7TI9 lVAq5MPkg4ZprXQr3sYNnflp4RDmtstGLRRp5anOMhdvkBp7dewxnbIm6dvy3g4dEjE5ic4V gA/LjXnmQdaE/MaEb/juw9kFhsqoHcj8aAlEwjcHyKjom801vekt4Bng5Xk9x1Nrn0UcLRHa JFSoGHtd0SwNDYjMLHF69yDEa1EdvjvS5yHeFaMl1fblTMcvCAhCGNjSE9jY+pLMG6GVbVY2 gHhNq8IFAPmB4CKMDDSlSmwq1oCOYXyk3+fHm+somU5h4GProtM5hLSKxxzc/kr3M2vTKfTb U+IWEnR0IRYUuorGNyTCjIaQtCYevv9/YN+wpqODfj+l+KeTy+uz67OPu0iv/Zt1cGKDo0Pk WCF+9JY+njMnOq3SRA/QV00o4STKIJCc1prQCEXT+rBLpsBX6aEoNzW4wzIQXio3QjaOBWyY YN4hIdy/Q0DoGgx5AA+19jbmtxN87ISOzFS+xMolTi4QXLs4b5MRXlez6NWVNaTp7lxF+HJK FCVL7XxSG+kq/FWAfx2+uNXMQsTJA6x5EOipAZch7rikZK6H+eQE90wdJd47I6Z4vxCa63zc KCC/Q44Nb9YRWJyh5mxQIcA5T9RvgJl+zS4EckYRmwPGfDYLxQytqgA5c4nWeoPBX5CgyoJF 79ckGH1qkmE5N1aJO9+wN2vpL47LElWUoOdGsk5ufhpdfsgkijgJvPWTpSOIAKZbP8IfcQBp 4t4aVXQpQuoa0FLF2cIHBe/FqudkquvOE3/BVpJX1JfemQWXTedW08yTm44FIiathjN/kCt6 
wUhDg9GLFH6KmJZCjr/MkQRGqH4EshFKttbBsX4k82B+ASMG/+aAjJJyAEtyBBFKi6S5aL9g 47b+5QyUJNAbatbPLC9hh8j8SLmAZyzLSnmrVA7QLLi4KCdwpgC4MhPwyDAe32wgEgdBqOCz Pi2CCgX4X21goljTDeS92ESDQS+mKFPpEpDQnUNU7kJmYn7BZCpvARlLcVCsFNJhRcD1dpmV YpwkG/Qfu8f9Eh9T8HGi5fqiKIKpqA1ekOhdXZ+c/liwZob/KgnJQGNo56IMnOBvXszmDDgo l6gdUjfQNNMOIezRWQPFKGYly5cHuAT1QleRccEFe8o8v3XRBhdY0YoO/oZMhlgXXzw56tPP uqRmwvyeC0IqBtK0D6w+eo6OxfQ9rcGmJpF+YcWRM5LBzAQTDOxZ0TIKR05fv1MuNkiZd+tA 6aLEsVyjUcEMjo6jxS13Y2qoYvFaFSifU/qob4Xsgt0gvX4mHaUIFYPrD3jltIf5614aIVtR RshlRKJKW2IkAktGGN/tLfaQUJGgUgvFfnI6ZTdvfjYK6rM9jIAW7T0G6oSNWJGJ30N6D5Dc DGUNRGww3RH1JWH7vbiNM6sGWCO8n4kWWZ0OMhfrxGfvz69bN132c5v6UXRHBp4Qn3VTlWvr ogXTt2xw4iFj+Nsm/9velza3cSRpfyZ+RYsTSwMkQJOUZMmkpQmOTHm0q2sszYw3NFpGg2iK EIEGBgCvHWt/+5vPk1lXd4OiZGuO3VfhMNHddWZlZeVd2LUcKYEVjkKHqH0s45gRoEQCfMMG 54lqKLGLYw+8BFdZSH6dmRCKUz3P8sHMZAZ25IkEoW30jsfdG8iaEBmNGpWL4buzydm8KuMq 7LEuSjRlqcky1Rkqpkpys5RFEcmLPb0Ayi9mcjCTBzjJkcRoAhcZrhx+nAwdVRLCzZHxkw5u JAf5gJTQhtAGl2GVhVOWtgTYV5MzgQecTkC2b3XY9ZNjQ0slt4jByDwj283oeLLFkgc4JIAr 3D3HTjwDDCdni6kAXOGiDJFGtmZCKBltpqRbiQkwDsXhhcX4Vvgfkr7OssGZI/kdIlss4AUk uVYhkDwSBI3CH528BPUcv0IftzqnrxJxl6RW5A3iomyAHBKYjO9vFTvK6V5zRokR3Qnxv1lz ienwfLKAgBhUGX7519ZiPIDIQBsJHYEqqGDr73LrQKVhrXSoyuBPZ8rec81KqwiZsWacrUeo DJN3aaCyglIWFlmfzL1oBL8+KkegHnZtr3Au0KVgrit0WjoXuOaysXbVoqIlYNQVXBGc4ixG nc34K7yx7OvMTdEK4FSdF7Ijue3O8yNsS6HLpIiMdCy0QrDHOLBKqxqWYjmo6KI2yjaEgM2y nowBKaa21UJCCsfByMJN1ekJE53yq7m4KRSmG+rZ5PIzhX58x7IQ+uZrncDQA8Wydp1MkNZM jsMl0Akx3logwGfk4BNVdwsFsc3CQFsWVy6v9NTBQQc5yrGfhVBHLrlNeqSsuItw1PrqAiqS xCJ4/46xfdDwVzhRryyBWIXQsfarkfD4moJNGsVZxDRw9vVRfqYWPJVJRTAA4SZzgIRjEF+k onJtz1796RESC66oJ/BXMhHw52fzM/gqgOKzCai5MFgLxVTMn3p0zRDJNYK/2NT8xYC47WlP cWCdi6t7M/V6avE0223FYr/K7kE4VILoWC87wcB2nQgfUnziIaMmStvgTq36kePjRHPl/DMT eZ5akR4BHuRz+IzLGgb9qrQMgjkZDuTUZyK+C+Xsd4RTFpacnU6y8mthO8uOcthj6b8YRMcm ccEje5ss9XZHpXLpCDHB5HdnUmMy9u7iJvmTSZ8YHyUbeE5W9BShoqHRnCxsQdaNWe/yuaYQ HKlIQlSaK1tb2n4LYoqfb6JGINuzKKYI74zfEakUw6AK5GAows01VluXQm20fRgWCBs/eOWk 
6NrjqrnR+00kzLhIHuR9Ra5OFBYhrpws9OkDBMxi2yOPBdj30ZXTERWAgrKzk9lA2aKwtLrV cwd1Iz2AtjW8U21XxWGmuSh7k+Pe7dCaDOyxLpsM7HxYXHSt/dPsf4TOv2sbhnSUDQ7STbaq 7WEZrMxq0ux/CvcEjjIfCYlFy2A+Ma/sdFgOTCbunwkaLYgm5lTDZeltuy99JBYUsu+UQYp2 g6syHyvO64Z47eTpCDlskxj/ngdOVhNdcsfmIwjFFInH08l8PuyPTLzM51fj6WKysJ01mA3P neZHaHH+rlD7WsINe6mZO8EDQ2j8joh4o1L+B43z9s7X99yz0wyGxUFDtzM76uAlcUHtoJR/ J8IrZ/vnQqEqlOkqA6nGdIXU90bFeSHn0Qwdo9LR2SLl2f0QtaE/FYJZk/4cwreQidcTmZyQ DEMBEXBmuSmblKS9fGmK0IVIQwOTCFSI9zv+5Oz4eISNzJyQ3M/xJpzOJkdFMZhDhI4YDxLu xl3cNSUPqbQgx1iQo60SjqziuMNBDUtH6GXDFDMS8XzwXgi+8JjtYkht40QFmFFxTKpKfcXw 3cniVkfTtnp1zk+ePrXc+c+dzgMlEnF3s5dSVpr6KXsJWVEIlPak+WoE7gU1lyQrChd604Ad dAfSOjgQSzfB+bVlgh2vCHUQn3hQY/hIoQP1ArK0XtnK2PBIRBWryEoUU2qNgITEs+FACJTt PlvSTtrcT/XW/bBMqSqVdl1OuQACZDh7qVRSWSp5xwpcDdMQcX2izvUILieqyR3JmeXkbhCP BfidXJlEO7KYH/a/C0fjIJ4JwosMObqVVmN6Vku5qnK6LQXZIZrojPcqXLKho8mgICN4Qa5j 7ogh40DD2lzk07lX8J4WV073U59gwCQbaokcAnOGXxHbcdTNzoFKUvaIqhzke8a2IX7DQ13E uZRjYKzYRHAXZ5GdEsWMBxUX+sV/iEBv2p6fMLiz0naHHOgX+dUmOZxksD+hmZN84JDcjUue J90IREHJ8IRSMeoqjF60tykkzo+EyIL3FBohZDKkwB0VZfsn5KFNASks+BTiuPK4O6rvV/C+ GiKRsNUDjK+mRo7nMHeD9RZKPk9U/a7xrtfGXVBPCFbYbT7QVCctz85Emnzl+alXUO89Bxr8 TRNwF5rQeDGPbfWeyTErfXst0wTi0nxHbfYaKjjJ2tCs0HavH2G7z1ZJWlfj2L21QK888WHA 4Iyr0atyRG3duPNpXpJUCl/cY6sihUygYWJtJeEPYWmUPfoGwv7JUMtRC6XfvxNmVpHZyI32 rFoSI6xeoe+YU6qyjVEiKpHyJWN3opiw65vZz+zrZ9W/UFmIgEnylKbVDjzqJqv/GeyMjlAq bXVNaDGXDxlq0DvryphByNLzEOH7V7qPdHTjIi9Nrkk0IokhWz0aoJxg3yKl7FFMeZ0cpIDU BWw6JXlwx9nGXAynOld+20hPUPTiXDSGQ60ojuHyRZxXAeA9GvbH4OZEIFImbKi6+PmQXlIU Y+Wwnxwfq95cDw1kT/THdLaqBVYzipdqpJRNIZRHP8CrumrDe57Nz5jbSIjbKSTyR398/eLx 49/tvzpAEstEoxt9uuPUUCC/rnn4bMumunO769KQQvZb2tMdLtP21jfdULpYHPHtzt2tbmsF 2eOlxM639+XPzv37d+XPN7dvf4uXt+9v4VG+SX/y+s72Ngvfvrd1e4sftu/evYMv2/e+tU9b d7f1270793e+vcse7m1/880Oe9mSYvfwc+fuzr1v7rFT+XX7zjfbKLq99e39na27t++hxM79 2zt3tnbw+86929/eu3d/G6OQEnfv3L9z5073Y5Pf+UbnuXX7zvbdb7fubmkFZzJ6rkWPcQ4w Spa3B9ze6fWHiz1r4R5aAOpGGsGKpH1jteAXVAJWdX5L1H0LZCZySr+6CpFRRN3MCOACqe74 
+0X5Izg8KXENpadJ5403SLytaRVVhwjP3hPk6CC9cLFAZlqO/CoitasSP1DkoN83p5beZ/+j T8tzxPCaR/dc+p3uQu9BCUh5cXhGkVL2z4dQj0xzGBVLvbjgc7vGkYnBK1HefZAqFoR+mSTY Vbq+rXT8MRVgQiMGKXHRVuabIi0CkkYm+PZtcCDWUhpFRfRzZ5Zl6dZqUSZ5a2eLEVSssbFh J97fzHu81h1VWX05705N4Vh6aV3KX8DTXbhaO3koREyAY8HyLQLgSjhwVT+pguc6jUbren4i m1y/8Nw9bfymjzM+ZTQU1pVMQ9CkwAwfjH0CZjldbm9bh56/LJUlBkqIRAP4feBiPRfGhyRB xwb1y0ZErO3k96ui8znF4UHwUe6SdkMNqMDmE+URVKnufEMecmjtuChTg3E0biHb27iGoN08 mg7LJxp7z8OoZwc60VcOG18JGxkprOwAtCRzXrr04NQJK5NApsqY6ohTz6bz4mww6amKRZfO 62YsDQQ40KtsdlZCvlJ1C7wwSuXnboE1pZ5m9E449cXJWH0virnewZGpO0dJxbKWPDc1h154 IQ3PwQEAckDb5NqYOeRKxKN/bTyK5uubCfmfQVlExW1SY9gUIOEvnulYKz4LGMD6HhvclExs Qri5IZIUWPh10v57qvhtXPyznn3z7dY338rq3uM3hPIhKws//hv2YM+FWiyYb2w0gWOu/tHH 9/rIrAoL5rZ0gRbkzEPCDurF/O0jCaO86W071cOPRoKJjMhODUegnQdzsMBKhwvekyBw8/9k COaCfJzl51KMfqE8S1TZj35HdPB3a2TVaCPIGEQJwmheIWGP2KGl/Un5WPNiQiZ1GBAV3R74 hcfJS6fU8Ay78tYauKv5Sxz5o5/aLz1GzKC2p+jWcNqmx60S5T8XJKBQFoDvJn/hXkB7QE4Y 1hg3m12lkbynB9oQ0LiuOtlQO6dGFbU4b/qyS0UOxSiFzDqEjXWoW7ReXjoRA9IxGvNZMo1N SL06fv7Z+gm++1b44QPPVOhrB56oPLeNGsZM00tn0DkF5khJyTOxZTn5aBtR/bVzdlDNBoU4 V0oOcKdayKD91s2+dAeBvanunJVk86xoAoOQhYQD/13F2h/589jSdN2IqKgpIn3Q4FxWH7ML aiRf1oRWvf/HmR+iOSQ7bG0tizgDHdhcTxPdcu4qpmi7sRx8N0w65LMzjnpiZj7y+Jf4yfOF +sorQcM/eswL8fVFvFkUWRn03Qc/g4S/lP1h5Mv4zJbLwJgFhfR1C2PE/vGQ7jCKwypNm0tq 5nLXau+9Hkihb095JqVr5US5FBEKBiLb37JanL7y2FL17aZBhkCIXtvkHZcQfQmArpFHNyqv x3ArWSm5vRfeIh4nNEm40oxYMo5aQYIEpcLOF8ayOuqreRoHIAzCA21ubpqY0dsO6ctL+orV DiIqAxY2UmGLvjZP1QujH1QkwMIL+63rs8/UT6ZbrKCi4zi9+iNoQzhcMzWHmtR1o2oYqyl6 uVWMEXJcitcldSNXGhur+fQ6kdWobEUD5TKURjw/FcfW+7q6WgBWL1/au7az7SZ13fkYfI20 O1VmOdY+qRLU1J4Q+x3LK7S2lWMqz8ZumxPunmTH6BMlfHKkIdbOV9YFHWJ4lTTSkBeQD8zb 2gXEpxvbGi25a3V06JoWTAFjW8iTl0BcskBWsgpBqZATT0xOez0XrBoR43heAW51pLGhRG3Z mHo9dGfRPjcc4wcbhtblWqCszJs65Wju3jcmsAEQaGhYhO2u12Ah9ec/7R+GL2FBxt43taAC mazBFJpXqO5uafEXOBgng0HmFZ5qu40MQGpG/o574+GDrjnpldJnPnDt0zAMrkYLOZ+2k2I0 VV9Kv4nNt8SZ87zNX5+G8PHHEFW6sc6ckn+4oLXDsAtka+69ReZn/WNZMrskYXV6NitWszbE 
2dBFx9EiFyquxk0hPlorGgzia2absTsTbpZzLk1k2LZjNx4AaISDlh4++DHTwDKupkNsnLt0 dBkpxoGETHjnIfhEmFgemltJdUfQ5wiqI361Q052gBF853MUhPu6z1EygJkZ2Kr9f5d0n7Rg I2nQWs1ktejX1evthdHirRtvlfGg6Rn8zVHhmY3ZG17nsm5zEoaCKZrQzF51cnqe8Qxrmme8 Km4xHoT1Mdb763WVbhX1YwIHOzP2wneR24HpIWoVlHQgOBM14vUzImsuXgoCXcdZso5uGf0q 9nozR8ASkUGnIuT0QZjMg2g2pkmYmW8nnEnezN6ma2oN2rmko0irc7yC4aw+eltByZiO2okP 8rAarK6rVNMbJ3uCcGhBrsWJ0Y13JY764cL2lpFOznd9FpyqWt4fLpzdIE7w5CuRmGnN+xoF 46J2sR+842bqxWMASD/PuqOOBwFMW7AtyIgRlc1BDjqV9ro4s9FeZZGTewxTguahtAlS6wOS qBxRdmhOt63Y7U74y9BIcFmGz39/+M7qgQHzVRynA4eIq10T5uYLeTI8Xf1uNXKClokOGA1k KA1vI6TvFLZvOCvgyb5J5wsaPs2nEPuyqzb84uqrWVyVvuaQTZ374frIsyPR1jyJdC5Ro1nQ ClQ8K0cx653sD2nBjpqSHcJkO9TgPyMmtqU+tJq3UESs4/3j1QPKBlNHMKN/JX0pEVTMae8i ruAYmmYqfGfzBRLujQsazS7cbavUKCovppKbWkpjIuN/q08Guc+UqgTmV9+w4K1kK6ug4vX6 PncI/VZN1aAurP4M8j6yDIyYGdQS4QVn3l7lLXkb43+qgksqbrDkbKkwU2PNVEX00QFNGkc0 WzagMJ5oPk0iE8azsbHAZZW4ZVaNcJr1I3HUIXvLTX8Dh8+Whi3bIuMqBLfk0Kb25RhVkzY8 cSddZ75UoSPYkzOLUG/xHij6GAwYBQ3PTtjY99Hcqg8igyMIM1sd4QpvSjyak0IvEKY/FLSN W295JE5w6OqPHf7Apc8tcxtyjQ2KG7T20Bp7aG091KYQ7VwI+VqPGgmxKltu2rg32TyKqO+S b9vu24g2nc0WrihXUyBYP0aoMkolN1dijYxTJzvViQtBXJQh0g3GObqirobBEHJwYCoGiYvS kYo+jDzNj6F9nflkBqF2y18S7QxR58MJYsPo1ZX3hySsbR2JgKQo52czdbAL1J5UtOX18riu WqBjyji6rsAfNgqCrzri0skjsn8y29yhoMWnWj41N1i0VD69h7c+VqyGRj3j1dU9bukRQrSB Us9wLYZaGnekjPG2Mr51KdvbtsDstM2Q+shIwcb2ntsbkAJRfWOjTNisuEkeC+kx4k+WSvrg X9pL2q4nDAK6ZZTiqaa5VtU+ogjhNDF0Fhu6XhHNLZBIlmwy23MhYvroNQOt3OsQguO+NNF1 rt3sxTUfXwuXHk4t95typLYUro968w6Bf9NDnkqW5SrYjorLo2LqIuHnraTHKtvsH+r9ZG2Z Yl5edTbftlqr+ao6wydQ0NQYZRSRSkuY0dVcCJMMtvQx1A/hz95aPUFwPlsrzbpBf2WEptIx 7J0jxgXiKpFfgqdoCWajNEsi/V1mLb4dRpc9QaGsqRiUCvnkYUO1dkn/rxlCqqlS8uGZDwXg RsNSb21ulkHxpFQ2f3PaA61WMe0BngUqU1Xfqd+3TGK9Da0bIyOG5RlopYwINIgS15uSvi/T UfQFXnIv/txtodkoOETBcrqXwc8fBWx6OKdPgxGRynbTactYjoSOtohsTklHRU/ZO3WFlJ3H JFKa5hJ7MHi9QFK2zcXlgiy1jHTivFqd3SEleIaMwNaI5HEcEclzlxjoHyzcsvQfQ4a8uzwf mEJ4Oo0oIGCG5Ln4X0nbLfPMlIzSCQ/fMeW51IKVDG+kKWvUqKX+2vbEUIYqbYfEd29Qyy1+ 
r5fphJ0SwfLuw1alJbMNN+gEYewTXrdcABQijinDZeP8UocBkyVHia7XBGmAd9DmIJOAS/Nu nD1qfGdVY6qYv0Ev0SxWohkrMOWfPmojzDe74VTYvGwTr2EGWqERcpHeER/qa++mcMuMwqNO 08Rd3ZhI+94eupm0GhqHPOpTr/A6A9nzk+NjYWXmISJc3q0p2VHdkU06LPxK+hgfPNUhv42W WtU19ZXuZY2r3PMIcN1KG1y23UpvvdV4nIubLjPIDV7pQlfktdoK9OIVqKPDUmy4ETJUl+uX raseV+fFRxcYCvdAIKJZhRm7RffvTnXd86zn0MBTFKGE3/kGKaLaz+8ye2GAKC3ek1riN027 XTfgnIe/p+zwElXXTRe7h8xB7ix2nakPrBxdMxrK9Du9cuSrUOjvEUegBjo24M5JHsY+gDGM q3JwYWQuE5HwVB56TlsQTzlc1zYOkGZcJv56NPfxmS1PgMZvA1aGZRlvKLbju42J6+i4Qiu1 l2Dv+K1zNHKsr2uP2NvZU7F6onNrmC2r25HPuS5lBQ8slwAdpOJzrdN1bJXz10IPzhBcXA7n 6h0luLlbvu224Ig2/yizdy0bRiX8L+VXHpheWfkVOc57zNxlggvbZjyC03lZ8q8xcjMjCYY6 Jan5XxNARGaCd0xRwk6v03rBbdMsG/SKYNToSc6oANiJlScZHrcuZkORNZG4hm7b5oqseH2F nEpmOl510TUE2W9XecfwEq5EWeT/DWwJR76ed5LD6jPOqgdJzb/XWaWjtwPr78uURDxJIyh+ XZ7k73F2xTyL504ffA57WkEGz55+Se5UJRRS5gamJeUXkyX652Zafg1m9IswJWGN/9mYknRk X5opMbwbv43XdC8mpcZpeHZkGeMSkPJTmJJ0tjdmSjS1Xn45HJ+N4/Rk5YKuvNRNPYPu9RUu EJfzdaoqiR70HZZKqpUFLz47xpckjNJEsOqsJ5Xw79mT54dMYHX46umTR8gVNT0ZytGJxFb2 4eXB8++fPP8BNRSVUOJ/5PDa/Gb7PqJwXSq6alsynts73ez+XfoXDgfDo7MR04agKQ010PF2 s3cTyzoEtwwffqCZp+IcQWnmrXSA0hFB+gSKkzIXQENfdBYcN6NkfvE4u9CzMOi0mkAI43Q5 EuCQ5lQlwXCqY40VzP2ZZQcksNFCpTvnwbEKhwu46SxWnZM0EidZ8PgYuSv+2zR/RwwxX6fR BNHPcxfAL5Nj0GOOoFbhxPI++FtClom3XIh56WPM1wM0VdnGDFtxrjRcwpDPEB2GiPbZAjsb uUZmV4y+RrQGMWoHjTF4cfsbGaUmM9veuc8g0PkRqo1lUGeML7PUf9Ln0yjUUkpT+dXK1jj3 nk8axexvw7K2Irocll0L4D45m1nuAksp+A2GpSy0ZluspZ0KUUZHTLlbaa2CZRWcvr0TcAzg v4DHGAr9sP/06YuXFk/KfTuDMR02S0FlVX/q4c0kKcKXdzFSRrwvmGEUwFc/aSS3pEaTpmlL RLdQdryMekMDzGt3dKaHEjpvUPPRqRK6vvrcbNj3OSvY0QO1cZKPIb06pk8ZM3U+OfKhxN40 xKBrIVHvzxRbmVxFcHyIBcUFDnSs1ixZuHnSwvvHE8usAZcmad2tTnW7YxFevd5/fXD4+uDZ S16yke3c/YaEmzkHxrzUUnOXcTvjpkrmTcKdgrgwy0K4/tZyHhBJ3uamQLRNH4E2hCVexkG5 qBoBaqKERWZ9lX9FOQ4X2LsgZiCXzE/JmoLMgtSd2yGS8qjDFGCHqSAdpB4E7vSMZRseWXBe 99EAJiDNp/mRXtOp1EnD0d0dmNayDXXflhU5ONzZoph3Vbg7OjhahNT/KPj+m6HuF2iydaSD wQwI2pethrgLILRm09ekK6OixOuI6hBJNIKXLdDzpC2nKWM9cycAD3wCV9AB4Mq8o/Io7xex 
88t1u+E6wg2bfLex/daXNK8ZqsldSgu3Mi7VgeVo5eCg4sdhxBStOk2Z1mlRaB5VyyZN7Xlx CeF6uNDw9uHx0NPsQeGWzcx78fJxjLUj7K0TQGUuTR/jJvwCv2naGrjEPdrLe5YhsYSaQ538 KowF5QFsj8mslhueGHAIm0Y7Ig/r8ZUiqThtbNN47q+8k4mN572HPp935rcM38OnAH/9pNx7 2wcPGve/FbIrWZWhejwrCr9C3IKafySbXJQhH0WYhjuOnR2L1umBMTslUvhB+cHTn3s3Bpse PBaSpxWDxzXSOvHARRZA55x9jNFVRybHLfP9NkIdNaRcBfDLoQwZRyF6qwJScMZ26aC0qYU6 vxD+CvMD2t498xlNDZvrK/gAfOWzOU80QM5HPJF/deaqLeadOjsCj0PgqkpL7VNs8YiXeAq8 302gT0rABtWQQu1dsagDzbRBBfzPrgMfk658pwAxAEQG9i0VxL5nXkfkEpECtzRzBn0ayLrw XkjNoOOuSJ6MNBEuyYnNSaNo0J86J5BlABm6KCxxJO8auNCsHMbIsJIjLSmS6AV/bkFjH4WO rvwzDldnuO7iXX260U6CQCqKpViAinutyh1TTXcEJYJOcobjYp5nB8/az151s+cHB993RLbT vwJyedsJ/f1W8GF35S+U59N1dXU1mSmXWZm8PDDjsbl6muvuZZn+kjJ9YGLs9XKB/BzDsqeJ 3zK0Tte4fmz1ViIgHWzgO9we+0TpZ3rPthw9dFdTG/K0Dy++aZ5kIOZ3qQsxv19n39BYwsHZ btG7GWyzHJ+Nuqn+t3FfjCbXUHCOi1skT1/37XU/cVppYH/iE25QzBd2nNn9CbzGBnvn3dlw 5C41h0RzzhjQcsLkJXQCC3tzbQ2wxf/7VLrmTusqq2C/Etg7FE5wDfMs8+ptPrw3IJxI0fmE TVSMj6ZXuhOQBg4NLNsymCocbOERN3WkNDfnnQV9rQWIvLC+1yv7jsz0NRSvZTFVr7CSypkl sY3k2uh4ROD9hjYFihn7IuWVXtiw4BPcN9dY+ne10qrt+t7FOc9yKH6k7wvcPKKu2LpG8BaB j1ZHdfdnpekcTVREoJfmNJ7Dgb2En6tXOOvQBQR990OoRraSzDFTjSlTJhLdiIoh4YEp/U6d bpkP1Zi0xy4mrQHsdFw1sPApjwajjuFcF2s7WpqV6uq4MjYdDbY0OcoqRKECTkFtA49HloeR 5dHI+vWR5dHI8o+PLL/ZyLIQqvZCVxXn8oX5TsJnO1pPpV+RAIurUYADCJsVuRkah2Fp6fxn WlyJnAuNiypHauuAWaWlPsM4IsTqF35I8XgMAcsrUMaAbC6qA9iUGImIVbqLeW1KglseGU4/ GdWMTGBZtf3TOpHIlErIPxKKDdcPaEV4KKkfPr3xUhO1mjDdo3INkytkJksBRVsoNx5gVfYr cMpi3PwsQEG74yDV/0RI9WNI9auQun671iCVB0jlAVL5UkhlUSxM0+ZCLHYjOUDVDy1rh+pf Owhlbz9WjbB2jV5rqLT8wLEDTJvzN9j863BB5HW4JWtsUJ9sUP5l2aCT4b8CG1TVF+R1FUJ/ 7wsxS7WrD69jlgIH1EfjfQvhq7BQJCrLMJrzczwUZxaxUVP/EDev1LPM9dGYrV5Pdzjj5Wx/ V3b3/2e2fn1mKwH7P46laWC2bGT9MLJ/CBv4f4/Z4o5WfqvMkfGpjmEBqVKcQoPc16cfZc9I eHoxRxUeYpYDF3DKub/9Ebbj85mwCp79ciaMRFD5sLL/GfDre/hdx7VV4ddP4ecYEoKv/3Hw fTZnViEg/zScWb/CmfXaWIxOtDyfyaMtLiZqaMkXZoNxJo+h3m6zsb1UM9pN3PqbWZy8rqtn PEX1NnugGngb41G4Xd3DEvZmierUv4RGHoncope8qncrefFAj3Q4He3obb7Rm9t07fC6FLPy 
xKyAN/LIy9KVUxsQ3vTjN1rKeq4yRdGgqryRSxM2mQ0aYg0FHv0hbApYRsYMDZ375u3ZoAef EXW3ERrLwGqywfPoOg4XxMEccHZ/G5M86/VqxUB1xZaXj+vacTEyZzO6afIg2djO3iFKKIeV FTwHL5qdB/sTb01OwOt1uw5ACbx2AK+VBMwp2FngQyu08FZJ9oYiT6/Hnq7jGJGNiyZvxjf1 Lc0eBv9b3L0VchLmSRIbjSNXuGgsOVJRqKeuy4jedvP+FC1AQlBjnjcI6eFwqFClKGS1Mq+c wa4wYFZm1Y/z69xkUv6CWaJ2fFpM8zc4YN8Gub12XhgpbphdhUhXZqLkCuYHRGHTTB0CbX1y 81xNPsFvhzMZD8u2UpNOnHnHI6RTeEfdeh31eB6WSSeFWZjnVlJYJLnGwmaYuszHMDwE/wVk k6BjRT5cmDOzN3F3+Zczcpm0URZUWDnz0Ex0lRCwe1b0CkrLw/kJjpNsHZW2N2nI7d1m6C5+ 7ThDtSwXSuxs+vcPo/f4dK0Pxy85GI4mo1E+nRdNFkU7SY2Ew+fSyURlFpFrEAdLNaeU1JGB 0nzh/DPm6x+EimiDMNx7O1Ool5Z05cALeFbAn20UUqO0ZDFOhyD4pI90ZH4wK9l1DWfNLbci 9j65pMjdzh7fIBmQpWtXjMI+TTRMfGssSSLzBpVFixdPc9M5GzVydvOKG/VgsDh6NY46J5xf iB2Cakd/DxyprbVb5usWo3LfcwJ1m8nLK9wmbSzLcIw7D/ojoSkiyhzCDUgdIfZNtyVn8iCf 0ncqdmNavvcimAXGqIXWx3ZN3lP57T7Mi9FxrELildqAYQTWcXoJusY8hwzypaKAECUWiya3 Ps/PC51SnUnLlC3bq9y3riyufvFcD0ZZNYHzpnL3zmc9vvXyan/27vClSIsFLxy3e8ZXf36x y/mvdrM1fw1ltGI2lg/OXk1XkjWsbpzU2k0oQwLY0XHv4aRva7aSPMv3tealXQngihsBf+7O ulDgO+FQnTpoHtRBmm3QVUV6NanKbBjMVJUsiEnAWKgQUr/ntgO9k0v6urisAZWElwKuKCNA 2DIO4ZNci/gU5w8geJk44bCSj7ZMmqq4MBrVC/7/3PJg32LQVP0ef6sSXBaX2nXvKoU9ta4k MufY2JsbZ1PmzzS/JMDHKo76eiljk3OrOkl1lHHdJJ+KQm99dhN5AsFxL0ltNjbwiF+aaJc/ oyn27FWgSp40Cvp2GhfKC6ShHaJ3aKNCZOOW4pzB87p0+vLq8LmcDS7fQVbbGBFRSF1D1tQ3 RBr46cnzRz8ePG5ro8tE1RYcJt9hbUqR1jp6W49NbJitge6DafS3+Ijww7yIfLYGh/TSr8Ru nRT59NA37EVSazpuG/CwVsMgtc14kJxnWzbVexslUhtAIoE3/vu3YYR42eW77AH/dPnKf7fd NNACSIAUvYkv4v3x4E+/e/K6/QSuK74IHmVoYPYeynt5Wsu2Lu91OpyQr/v46ZOXoX5D7f+S 6TK/eWii5Wv//mD/JfG9/aSb/XsHyqO/KBniVvyLkZ5DkCupvxe/Q/7s9r/7dxbLeagwOXwP dki/RGfG4YKgkkJ7+om/9d17eedfEmSHC/cmmuXh+469/WBD9Btkq5MuIsS8uextW8fLzOGT lt8OS3k6hDdS+9JHBnlAXob1RCDnEEkTs3GoSadRcG7+DbqRcg51LikoC6cELcylrMBYOJjL 7L/iRi4jXA/ju0x7UbwcQxS77CRfqtU1olb+39tW2I4pzuE1y4Dj3L1RZ0u7atiDMcjrQZNj RJdZfvajk3wG/kFBfF0MpUtmc2nBjTXbSm3hDEsu/ZHllqwT89grtkjKRKCYXyZhLdtcJtlA +CULZTGIka+sNuFIiSM2l3YWhl1lq8WjJyqPg0d3wpiaEUYKuYWyF7p21cQDy3ta0k/C1zZG 
DbktcyGgwUIOj6/atmPejSZ9JknRlWrFVE0G9+ZxLuN6i+ibomzniinMAwI1w+UMiXvb+glw 2ILA0Al4pjgWzoQuSHcyFB717qyIh8c37q74B8we85HOt5s6N2wdanu/yeYnw+OFptXZttDo XdeELtD27lv7oG9sDPFN1xVG/qRy3/U5kXvJXrgp723d2u5wzH2qX63w5yupLDFVh874Ved8 rxISYFzGOfYGoxZ+/jm7pZUOmUixfe73Blw2f5cPnpS41SUf4aIoem7WeXY0OfWMNPlma+PQ WAljS+LqnlPR0McH2TTipzX2AF6ejDNRv2GNWwBGbMsyLRhqksRXbVpYncfndrIq8HFVWDg3 13jUG9k9PWGdiBPaMTGnNnVhm+bFoh3onmyIZU22fNwo8q1p5s6ceyDDJtgM19EMPSzYCIzX YGpwVUOvx0xegUikvBI3nHX1ipIqGk131d5HetnWXrxIVzsJQP67WZhzjUFPWF0bTLQNN80t WU1vcHSF1aidjqWz1IiUv0lLRnt2r9XECDsa6deezu1u/J/A8jZTg0+R6peaTJooAtPEra9P rxHrP1cUv5Ekfq24zbsaYuk5bPpTZywnHgzcFr+piG5Z+2Pxmle6JJK6IlUhJMBnTrN7yzRO Mdx1xmyOU2blsqsaWpqZnvcYLY5O3JW7qk3vWYYzTYaoH0Y9+dtDcjVW1U8uxxZs40iI2Mt7 x8VFz+mxe4uJvNHGepidN7f4izGVckVXZ7qrQHjrD3KQaKqQkJCw5TO4Ixcmkpu5tLAZ06Fi StiutDQtaF+ArWgq3UyOj6VzT2KmDSDexsXrLtneVNdSEw3hXjRev55wXHr1Uw2+3ezNVFPs W1b9zSz7wabgR+dzMHJKAjhXeM4MlHiR3Do1WeSjcM0Oc6pOoawV6Z50EMdYFR2TG05OSSW8 siGZuypcpqn9vIq8SaLhsHzpZSBGVHoOEhMz35jJAzcyJVODUldTihoW+yz4n7FKXCH46ly/ StUx1lZrOZR98pEfLR0n7TzulJzORAK97FpgE+NxFTWF/Ko7SKqMqqzBlGD/NdZp/0a31Hn7 X/XKmYY+Ix0kutrNLDoUgM/GuE4r3EGLO/L6xSgrhV4MslUcPqvo7EYaGbUxPvT39dycgUpN X0rEBR6/QdDzcfbk1dPXrRYEO+P1XjWxsL8aZ+j07+dQrbkzssabduP4o06sXD5vYLnYpkz5 +wPC4zw6o7euOZ4N5xpOaD/llHDAYPExXL32dDKOor5u1WX7kKzIj36gX3hRYqjU1qQG1Qic 1u/+XNmJ6kDD80XKx0ydpaK6ur+O8AEFcMNEYr3+BflBDvvweXHRLh2iXSzn7afWLGv5di86 sdLf3H7ajC5f70y7LXctknu1bGDEI1+6XG/mdf0VTWUjTVhXmjmls3pQC1xcsx2YWvQj7Oq5 51WH3vs1utIvZcWy4cZGUN4c8WIWz2n/ODw6eaS0E7mo021EXes5Oj48+INX7qCFh6k17+WV YIFw7ZPx00n5rt1GFHhHE8d7Gy6qVYyAAUMUm4RdfLVA7g2Z/sHl0eGfwCkewNIpTDFgs6nA uezsZpdU7iK1grxfjTaFJ6zN4KUd58bgjV1L/wnAzeFo4MPN4Hrd+rCxzrWEmfLfvwAqhpvz KKbk83l0KlAYHooE60xd9UMNslnqhGAgXHa+1w/41GHhV0F2g/8nYTuOKxXVZnnTsSpM5vlw PlwgB43+6mZKCIUhC0vZhYNBckZcJqs7ida011PPwL1M1+HSf7eFc6t1mYrAK9KFFOUg2sKV YgB7bh3lW7IeOpyacrVh4kejIp9VZj2acmok+Z3MMViyAESSGB1GU6qIRkGD0VX3whswMocz wWBjROOzN94uF6qZn0ztQI5HeY67ry9GMakxNUFy/tbP5ItOIy+6eAK2GXKMKlESpI0/GmN+ 
Plqin7xo+nDhNZWjwJrJAl9Ej9Dr487UB7pvMXL/+Pyg46WVVyfCfh6dLXZdUL16ac4tl6WK +WRT7Y1KK4ne1WFZ1J9tv7nqmqg4d6Qz+fJ6prf+BfjJpxhi2rzTlDFzmpcy7XIOJgTX3Fq8 zo0+ZSEVZ6I6NNp4nkIKryLgVYjl6XWk0lpyhFIJ3YprzdPPlHieLqFQinSnFRnVeaDKTo/H Taffh8mq+1V9bldBKjyYR6FyYbxeTToPgTjnWJGo/T17f4H3F/X3R3qVXA0RLF+TIIOOhjm0 ZPpPX+/asXOOZHXS7l7m7VO+0EFc6EFzoYM/RIUeLCn0PG7p1pJCP8RjerhkTD/ELT2stCTy Y342kt0TLyQk4KO8xLlxQhWYQtnlUJTGOo2boLY7/L652fb4c6FRgLg8ALIX1Su6EeZY+bnt dU0tdPCHr58feIVGvHfrFI0DqRAyN7gPlQaeHzQ0gClW6tusnf7IsDLcVcAp5O9wSbc6yka3 UOB/OW5ZiDIFNm3R6vas7cvJ1Hu77sOXrRhoUrayuMAlK8j/27/KXg/HFXc/HjrHQ5G7Utrs LlhYLv8NF+EmahnGDMdKyHJEJonBjfB+xDKZzp5qUu57V3jorirREjUFcHyTIGq3c2qSO13N 8TUt3M28WL1GQdPLo/ExbUaAiHjpLdnzyRkcoZDrr4AoxQxnTmvLBECm5mKdCrPowCbnv9Sh oG14cTCeLq6gPlMFFSK0qB30ftDKQPrxXtvwNdxnqgdg11DMYoVyJnWCTmtWuHRFyAyKHL+9 UXFejJCfb3g0LMqjK3OJBtpaAEIf0QRHJw6DF1Ng36wsLhd6T3l0Y6XiuYgMUkBk8ctFu8Mm eOmjpraZwitqgXxBkykurv9vn5ia+saJNFgsjryn+CI+u2RCTxTjTMLH50Zdks6fOJjXMFCx prRIXKx9fIW6YtMru34osE5rayrRn3vtXu+hwCKXlbKyvYfzvx4q+6HrWXLwvqlX0reOveLn p1z9I3KfnUCRfAG0c19zlM76Q2HPZ5qh9PmPB1CCt6MNpCosH4rUzbw2pFLIAy5SySQpfFT5 SmfhPfPdNHtopbOtrjPmrVe+wc5XKgec7HB1v9PwmrPS4yhQr7g8yc/mmnuvyvhEbE1MjjCX BOeGCw/j+kxtfTHdF0cMoxm0O5HroZtwQ9LoIRN764KpGHDw+vDJ64NnfpdCWsS8laDIrhhB JX8cMs/aEEoGfS3OcEzK9tn2DVQA5ZozadK0o9YF+lADVYZUlwuqp4Xuhx5VAaodpQ6WfpYf wgEmm5P5juPtAltTOVwIHTaKfoEEfZOJKuED+SVoeAOgVq4ZLJeQteHHyFoVCWMopLJVK9Ny u0sLmQNN9RSkN/LHbbzx8+lF5VCUAsHKax2oq8jpBTp5A0P231YdpViVXfMhEtJSW+5+OfiP 4upiMhvMzax7eqGmXYrweMSPbrYGubcppAfDuVUjjOHM1/mZ1NykJd+qVqvqRba6qX4mWsSg 7W8SfKE/UriXk5N8fhKtvdfbN2s5YCWPlRzZhPVUXDor0RjMzav1FX959f3k6PDV6x//tP9j W7fM4WBy1G2tPt3Ux7a21QGXaXtK35AmlQM0mrSiOS59K/pILGcb+qx7SM5PbZIXHeKNMxHW WlXDmm9VH9sUELtZNEQzwNkQLUCOxWpNTidT3578br9hsbfSzEPlUKU51RbR6u3cbMm7Otm0 bVICwxU7tS60uu/FdE+01Xei5lXanSjRxc1+wjCwUAMYpNcIClDcuuYe8pZC5BBmyzpajhKp fW/chfrkuy5Us3tdFyF9cGh9vrx5Z2AKMFGNmsFDDazrT55nL5/uPzpYr9WHTc1XpoHtjUhd SEv7lk0wD3Koz/BTpqbtZlYOarErTsZ8+bbRSVA/PeN9X9/LUcHNqNd/zZVOtVb+tqoou9pd 
kd336LHlw+2grH7pZtmzg9e/P3zRzcKO+tBFVcXOpqr6xVWVie7/+MOrbhbQXhvQ3SOETq0t 1Vb0czSAsBm1vuC59F7vXt53s5Vq77ZDtKriatPY9UvUa0B7N23sv8ZZc/+Gmh6/tSKxr6ki P8QVPda6wRKRmkfLT12t+vyFTjVCS21BfYXq1fWuzxqoHFpq5fGy2vzQrdTWOifL6pwkdXTA rILDRBYTfz7YJbtIclHQ7eBDjNKO0VbUnruDy/PnitpCXWDMv2Kvxq5L+2zZse/kayx+BZtJ y4rsdpTLGK2oPmpRIRhymoais2JaREX1sbkomWjfPwmvK1ctygPYN6oX2rqychZIWejmO/7A tqa1uHsRt95UyToJlaJ+pLz8F8q77NUxTDSddSMAhyVjmT0gbcrJ2yVAsjIOrmnNAF7gQ0JI WVsJKXUIpIhgkvH0l3LVXjsUST471lfYG9OquGLIxgrNZEpSvcnAlnWW8DZ/bSyrrND8rD8/ mg2nKR+6nqV86LpKA85vQC1zKp/yg3LbTKM9VJFIvu+/Onz64vkPbS9IOHZd+eshwp5MnPlB xBnKk2TE9iocIBpwRjETUr3NShqQXq8dCwvsz2lErIzlAQRxCA+NYlmDhvnXHz6tOfXxm6AG Wjhf4PCQwwf/R2mjHEjF0KVL90pYJ+dXGr8b0oN1Je4P+gzN+nFw2XZvHRNdcUYhk00BLluz Ia3pmNZ0UGvRqNRdV4XNCvw+uEFExfUqnaS4wRT+FLRhVW4Wjxxxfbm4f2+Mu2XqtkS/jCGo ZC/Ag4+Uzie2/Ya22BQZARTeeGBr4FUBK05J5JGAIrkutVSx3DRBm2vqgY9I8VqG0Kp6C9fu Nf6YvMLeVo2kaJYXlzvPOE1Skrq9t0FaJUX+BHIRPyuD+w+iH+78CYZ1Hc7fn5RoLP8Xn8dn 05R/KJ1wiQ7UDreAYil7+kZF/7zb70BAOM8OC71N8VCvU3z6Jt/t6xe1x1HvVNCioiFaTaPu 9B66Qs5BIhpIo9ohAZyDu9P00zk8VfbJFAYy10WRBS4m1h6uv8tn/fxdAV/8oSaFo22QNL2b vVdS1UguW5GLwVZUDjMKUF3BUB2JCxfe6eMDTgQuhlJpPemCURK9UBxgynr4G9ElG3wWGTjW 1ztG25jvHdmuHsUNV33R6IrmIDXLeavRdHgEBT1uusZNYLi4xGgwL9mRsat7/FzDdeCvbnbY pUQ9oeHfcdrX0HV8UejZDDXB0HUknqXZ+3s9Rd6zGwFX9j6cFDWS75uQRXgv8B46A3hjX/Zt ZcVVkPV0uc8+tNz/qjDA+sYLgLV0iBDDJHV3kpdh3MuH3UuYkesGT0C5Uw04laj8eg9qZy6P 1WomhBVvdBguEjtDzUs26sF09MYB1R29oo7jcz2ocQMaRBOwyyx+PDhw3w0PotsJKmwLNOZz xIx9nByIEFMhB2GPV9xqSIbCoHHwPJ7MxvmikRHgkq/y/LexkDFow9r+l9V/29zZ2pr/ZbXD G30wyNXgfJYpyUsNUHBFNxxMSLjRiMg46s81U3TdStb8mglE/mVJdPEqvO3HUypJ47m4C+f+ baAmnQWDaAzm4WMyM4fitJFxfMmZ+JEZ3oqLNlBmWXiRWhdYZFrkAfw3u7sWJhpbS5AN5UHF ZWm9YxBTCNlBkwWZvN1UHFo325BhfpU1iNA5ylwaeFXPWnwJkv/rceCfRqahH6gXM5gN6/x6 OV/CsPuWqUKMwfRrkJilNCapFzEhVcLzJUUEs/PHDoXP8ikyvla1X2N9fY3yq0FT4yWMrtf6 1BVFroxpXaIoRluo1xoCE1nKDn9/sP+9LOyT1+01rcFCgsDYK6tq32qteIwNm6rT1TJMw6kX VelQBnoHj6qghB7as+qRpgLsRZgWH03ZJkX5yIJbXe5Tff2uEJAuZvUP82UfnEsa+5wV01mi 
--------------080204010802000409070906--

From skip@pobox.com Sat Jul 27 16:32:20 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sat, 27 Jul 2002 10:32:20 -0500
Subject: [Python-Dev] Sorting
In-Reply-To:
References:
Message-ID: <15682.48388.755636.474915@12-248-11-90.client.attbi.com>

    Tim> Skip's Pentium III acts most like my Pentium III, which shouldn't
    Tim> be surprising.
    ...
    Tim> sf userid  ~sort speedup under timsort (negative means slower)
    Tim> ---------  ---------------------------------------------------
    Tim> montanaro  -23%
    Tim> tim_one    - 6%
    Tim> jacobs99   +18%
    Tim> lemburg    +25%
    Tim> nascheme   +30%

I should point out that my PIII is in a laptop. I don't know if it's a
so-called mobile Pentium or not.
/proc/cpuinfo reports:

    processor       : 0
    vendor_id       : GenuineIntel
    cpu family      : 6
    model           : 8
    model name      : Pentium III (Coppermine)
    stepping        : 1
    cpu MHz         : 451.030
    cache size      : 256 KB
    fdiv_bug        : no
    hlt_bug         : no
    f00f_bug        : no
    coma_bug        : no
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 2
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse
    bogomips        : 897.84

It also has separate 16KB L1 I and D caches. From what I was able to
glean from a quick glance at a Katmai vs. Coppermine article, the
Coppermine's L2 cache is full-speed, on-chip, with a 256-bit wide
connection and 8-way set associative cache.

Does any of that help explain why my results are similar to Tim's?

Skip

From tim.one@comcast.net Sat Jul 27 21:22:47 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 27 Jul 2002 16:22:47 -0400
Subject: [Python-Dev] Sorting
In-Reply-To:
Message-ID:

[Tim]
> ...
> I also noted that msort() gets a 32% speedup on my box when sorting a
> 1.33-million line snapshot of the Python-Dev archive. This is a puzzler
> to account for, since you wouldn't think there's significant pre-existing
> lexicographic order in a file like that. McIlroy noted similar results
> from experiments on text, PostScript and C source files in his adaptive
> mergesort (which is why I tried sorting Python-Dev to begin with), but
> didn't offer a hypothesis.

Just a note to clarify what "the puzzle" here is. msort() may or may not
be faster than sort() on random input on a given box due to platform
quirks, but that isn't relevant in this case. What McIlroy noted is that
the total # of compares done in these cases was significantly less than
log2(N!). That just can't happen (except with vanishingly small
probability) if the input data is randomly ordered, and under any
comparison-based sorting method.

The only hypothesis I have is that, for a stable sort, all the instances
of a given element are, by definition of stability, already "in sorted
order".
So, e.g., "\n" is a popular line in text files, and all the
occurrences of "\n" are already sorted. msort can exploit that -- and
seemingly does. This doesn't necessarily contradict that ~sort happens to
run slower on my box under msort, because ~sort is such an extreme case.

OK, if I remove all but the first occurrence of each unique line, the #
of lines drops to about 600,000. The speedup msort enjoys also drops, to
6.8%. So exploiting duplicates does appear to account for the bulk of it,
but not all of it.

If, after removing duplicates, I run that through random.shuffle() before
sorting, msort suffers an 8% slowdown(!) relative to samplesort.

If I shuffle first but don't remove duplicates, msort enjoys a 10%
speedup.

So it's clear that msort is getting a significant advantage out of the
duplicates, but it's not at all clear what else it's exploiting -- only
that there is something else, and that it's significant. How many times
has someone posted an alphabetical list of Python's keywords?

From guido@python.org Sat Jul 27 22:56:30 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 27 Jul 2002 17:56:30 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py,1.1,1.2
In-Reply-To: Your message of "Mon, 22 Jul 2002 15:38:16 EDT." <15676.24360.88972.449273@anthem.wooz.org>
References: <15676.16356.112688.518256@anthem.wooz.org> <15676.24360.88972.449273@anthem.wooz.org>
Message-ID: <200207272156.g6RLuU826463@pcp02138704pcs.reston01.va.comcast.net>

> It's a bit uglier than that because since Lib/test gets magically
> added to sys.path during regrtest by virtue of running "python
> Lib/test/regrtest.py".

Perhaps regrtest.py can specifically remove its own directory from
sys.path? (Please don't just remove sys.path[0] or ''; look in
sys.argv[0] and deduce from there.)
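
One way the sys.argv[0] suggestion above might look, as a minimal sketch only. The helper name without_script_dir is hypothetical, nothing here is taken from regrtest.py, and a real version would probably also want os.path.normcase on case-insensitive file systems.

```python
import os


def without_script_dir(path, argv0):
    """Return a copy of `path` minus the directory that holds `argv0`.

    Hypothetical helper: the script directory is deduced from argv0
    rather than assumed to be path[0] or ''.
    """
    script_dir = os.path.abspath(os.path.dirname(argv0) or os.curdir)
    return [entry for entry in path
            # An empty entry means "current directory", so resolve it
            # the same way before comparing.
            if os.path.abspath(entry or os.curdir) != script_dir]


search_path = ["/src/Lib/test", "/src/Lib", "/usr/lib/python"]
cleaned = without_script_dir(search_path, "/src/Lib/test/regrtest.py")
```

In a script this would be applied as sys.path[:] = without_script_dir(sys.path, sys.argv[0]), which leaves an unrelated '' entry alone unless the current directory really is the script's directory.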

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Sat Jul 27 22:51:50 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 27 Jul 2002 17:51:50 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py,1.1,1.2
In-Reply-To: Your message of "Mon, 22 Jul 2002 13:24:52 EDT." <15676.16356.112688.518256@anthem.wooz.org>
References: <15676.16356.112688.518256@anthem.wooz.org>
Message-ID: <200207272151.g6RLpoi26443@pcp02138704pcs.reston01.va.comcast.net>

> A better fix, IMO, is to recognize that the `test' package has become
> a full fledged standard lib package (a Good Thing, IMO), heed our own
> admonitions not to do relative imports, and change the various places
> in the test suite that "import test_support" (or equiv) to "import
> test.test_support" (or equiv).

Good idea.

> I've twiddled the test suite to do things this way, and all the
> (expected Linux) tests pass, so I'd like to commit these changes.

You've done this by now, right? Fine.

> Unit test writers need to remember to use test.test_support instead of
> just test_support. We could do something wacky like remove '' from
> sys.path if we really cared about enforcing this. It would also be
> good for folks on other systems to make sure I haven't missed a
> module.

Perhaps it would be a good idea for test_support (and perhaps some other
crucial testing support modules) to add something at the top like this?

    if __name__ != "test.test_support":
        raise ImportError, "test_support must be imported from the test package"

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Sat Jul 27 23:17:39 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 27 Jul 2002 18:17:39 -0400
Subject: [Python-Dev] More Sorting
In-Reply-To: Your message of "Mon, 22 Jul 2002 23:19:32 EDT."
<3D3CCB44.4F2592ED@metaslash.com> References: <3D3CCB44.4F2592ED@metaslash.com> Message-ID: <200207272217.g6RMHdA00500@pcp02138704pcs.reston01.va.comcast.net> > Sebastien Keim posted a patch (http://python.org/sf/544113) > of a merge sort. I didn't really review it, but it included > test and doc. So if the bisect module is being added to, > perhaps someone should review this patch. It doesn't strike me as a "fundamental" algorithm like bisection or heap sort. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Sun Jul 28 06:48:26 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 28 Jul 2002 01:48:26 -0400 Subject: [Python-Dev] RE: companies data for sorting comparisons In-Reply-To: Message-ID: Kevin Altis kindly forwarded a 1.5MB XML database with about 6600 company records. A record looks like this after running his script to turn them into Python dicts: {'Address': '395 Page Mill Road\nPalo Alto, CA 94306', 'Company': 'Agilent Technologies Inc.', 'Exchange': 'NYSE', 'NumberOfEmployees': '41,000', 'Phone': '(650) 752-5000', 'Profile': 'http://biz.yahoo.com/p/a/a.html', 'Symbol': 'A', 'Web': 'http://www.agilent.com'} It appears to me that the XML file is maintained by hand, in order of ticker symbol. But people make mistakes when alphabetizing by hand, and there are 37 indices i such that data[i]['Symbol'] > data[i+1]['Symbol'] So it's "almost sorted" by that measure, with a few dozen glitches -- and msort should be able to exploit this! I think this is an important case of real-life behavior. The proper order of Yahoo profile URLs is also strongly correlated with ticker symbol, while both the company name and web address look weakly correlated, so there's hope that msort can get some benefit on those too. Here are runs sorting on all field names, building a DSU tuple list to sort via values = [(x.get(fieldname), x) for x in data] Each field sort was run 5 times under sort, and under msort. 
So 5 times are reported for each sort, reported in milliseconds, and
listed from quickest to slowest:

Sorting on field 'Address' -- 6589 records
    via  sort:   43.03   43.35   43.37   43.54   44.14
    via msort:   45.15   45.16   45.25   45.26   45.30

Sorting on field 'Company' -- 6635 records
    via  sort:   40.41   40.55   40.61   42.36   42.63
    via msort:   30.68   30.80   30.87   30.99   31.10

Sorting on field 'Exchange' -- 6579 records
    via  sort:  565.28  565.49  566.70  567.12  567.45
    via msort:  573.29  573.61  574.55  575.34  576.46

Sorting on field 'NumberOfEmployees' -- 6531 records
    via  sort:  120.15  120.24  120.26  120.31  122.58
    via msort:  134.25  134.29  134.50  134.74  135.09

Sorting on field 'Phone' -- 6589 records
    via  sort:   53.76   53.80   53.81   53.82   56.03
    via msort:   56.05   56.10   56.19   56.21   56.86

Sorting on field 'Profile' -- 6635 records
    via  sort:   58.66   58.71   58.84   59.02   59.50
    via msort:    8.74    8.81    8.98    8.99    8.99

Sorting on field 'Symbol' -- 6635 records
    via  sort:   39.92   40.11   40.19   40.38   40.62
    via msort:    6.49    6.52    6.53    6.72    6.73

Sorting on field 'Web' -- 6632 records
    via  sort:   47.23   47.29   47.36   47.45   47.45
    via msort:   37.12   37.27   37.33   37.42   37.89

So the hopes are realized: msort gets huge benefit from the
nearly-sorted Symbol field, also huge benefit from the correlated Profile
field, and highly significant benefit from the weakly correlated Company
and Web fields. K00L!

The Exchange field sort is so bloody slow because there are few distinct
Exchange values, and whenever there's a tie on those the tuple comparison
routine tries to break it by comparing the dicts. Note that I warned
about this kind of thing a week or two ago, in the context of trying to
implement priority queues by storing and comparing (priority, object)
tuples -- it can be a timing disaster if priorities are ever equal.

The other fields (Phone, etc) are in essentially random order, and msort
is systematically a bit slower on all of those. Note that these are all
string comparisons.
I don't think it's a coincidence that msort went from a major speedup on the Python-Dev task, to a significant slowdown, when I removed all duplicate lines and shuffled the corpus first. Only part of this can be accounted for by # of comparisons. On a given random input, msort() may do fewer or more comparisons than sort(), but doing many trials suggests that sort() has a small edge in # of compares on random data, on the order of 1 or 2% This is easy to believe, since msort does a few things it *knows* won't repay the cost if the order happens to be random. These are well worth it, since they're what allow msort to get huge wins when the data isn't random. But that's not enough to account for things like the >10% higher runtime in the NumberOfEmployees sort. I can't reproduce this magnitude of systematic slowdown when doing random sorts on floats or ints, so I conclude it has something to do with string compares. Unlike int and float compares, a string compare takes variable time, depending on how long the common prefix is. I'm not aware of specific research on this topic, but it's plausible to me that partitioning may be more effective than merging at reducing the number of comparisons specifically involving "nearly equal" elements. Indeed, the fastest string-sorting methods I know of move heaven and earth to avoid redundant prefix comparisons, and do so by partitioning. Turns out that doesn't account for it, though. 
Here are the total number of comparisons (first number on each line) done
for each sort, and the sum across all string compares of the number of
common prefix characters (second number on each line):

Sorting on field 'Address' -- 6589 records
    via  sort:  76188   132328
    via msort:  76736   131081

Sorting on field 'Company' -- 6635 records
    via  sort:  76288   113270
    via msort:  56013   113270

Sorting on field 'Exchange' -- 6579 records
    via  sort:  34851   207185
    via msort:  37457   168402

Sorting on field 'NumberOfEmployees' -- 6531 records
    via  sort:  76167   116322
    via msort:  76554   112436

Sorting on field 'Phone' -- 6589 records
    via  sort:  75972   278188
    via msort:  76741   276012

Sorting on field 'Profile' -- 6635 records
    via  sort:  76049  1922016
    via msort:   8750   233452

Sorting on field 'Symbol' -- 6635 records
    via  sort:  76073    73243
    via msort:   8724    16424

Sorting on field 'Web' -- 6632 records
    via  sort:  76207   863837
    via msort:  58811   666852

Contrary to prediction, msort always got the smaller "# of equal prefix
characters" total, even in the Exchange case, where it did nearly 10%
more total comparisons.

Python's string compare goes fastest if the first two characters aren't
the same, so maybe sort() gets a systematic advantage there? Nope. Turns
out msort() gets out early 17577 times on that basis when doing
NumberOfEmployees, but sort() only gets out early 15984 times.

I conclude that msort is at worst only a tiny bit slower when doing
NumberOfEmployees, and possibly a little faster. The only measure that
doesn't agree with that conclusion is time.clock() -- but time is such a
subjective thing I won't let that bother me.

From tim.one@comcast.net Sun Jul 28 09:07:12 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 28 Jul 2002 04:07:12 -0400
Subject: [Python-Dev] RE: companies data for sorting comparisons
In-Reply-To:
Message-ID:

[Tim]
> ...
> Sorting on field 'NumberOfEmployees' -- 6531 records > via sort: 120.15 120.24 120.26 120.31 122.58 > via msort: 134.25 134.29 134.50 134.74 135.09 > ... > [where the # of comparisons done is] > Sorting on field 'NumberOfEmployees' -- 6531 records > via sort: 76167 ... > via msort: 76554 ... > ... > [and various hypotheses for why it's >10% slower anyway don't pan out] > ... > I conclude that msort is at worst only a tiny bit slower when doing > NumberOfEmployees, and possibly a little faster. The only measure that > doesn't agree with that conclusion is time.clock() -- but time is such a > subjective thing I won't let that bother me . It's the dicts again. NumberOfEmployees isn't always unique, and in particular it's missing in 6635-6531 = 104 records, so that values = [(x.get(fieldname), x) for x in data] includes 104 tuples with a None first element. Comparing a pair of those gets resolved by comparing the dicts, and dict comparison ain't cheap. Building the DSU tuple-list via values = [(x.get(fieldname), i, x) for i, x in enumerate(data)] instead leads to Sorting on field 'NumberOfEmployees' -- 6531 records via sort: 47.47 47.50 47.54 47.66 47.75 via msort: 48.21 48.23 48.43 48.81 48.85 which gives both methods a huge speed boost, and cuts .sort's speed advantage much closer to its small advantage in total # of comparisons. I expect it's just luck of the draw as to which method is going to end up comparing tuples with equal first elements more often, and msort apparently does in this case (and those comparisons are more expensive, because they have to go on to invoke int compare too). A larger lesson: even if Python gets a stable sort and advertises stability (we don't have to guarantee it even if it's there), there may *still* be strong "go fast" reasons to include an object's index in its DSU tuple. 
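
A minimal sketch of the indexed DSU idiom just described, written for a current Python 3 interpreter rather than the 2.3-era one under discussion. The function name, the sample records, and the "" default for a missing field are assumptions made for illustration; they are not Kevin's data or script.

```python
def sort_by_field(records, fieldname):
    """Sort dict records by one field using an indexed DSU tuple list."""
    # Decorate: (key, original index, record).  The "" default for a
    # missing field is an assumption of this sketch; the thread's data
    # had None there, which Python 3 will not order against strings.
    decorated = [(rec.get(fieldname, ""), i, rec)
                 for i, rec in enumerate(records)]
    # Sort: ties on the key are settled by the unique index, so the
    # comparison never falls through to the dict itself.
    decorated.sort()
    # Undecorate.
    return [rec for _key, _index, rec in decorated]


companies = [
    {"Symbol": "B", "Exchange": "NYSE"},
    {"Symbol": "A", "Exchange": "NASDAQ"},
    {"Symbol": "C", "Exchange": "NYSE"},
    {"Symbol": "D"},  # no Exchange field at all
]
by_exchange = sort_by_field(companies, "Exchange")
symbols = [rec["Symbol"] for rec in by_exchange]
```

Because every index is distinct, records with equal keys come out in their original relative order whether or not the underlying sort is stable. In Python 3 the index is also what keeps the sort from raising TypeError, since dicts no longer support ordering comparisons.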
tickledly y'rs - tim From fredrik@pythonware.com Sun Jul 28 09:30:32 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sun, 28 Jul 2002 10:30:32 +0200 Subject: [Python-Dev] RE: companies data for sorting comparisons References: Message-ID: <008b01c23611$0bc24460$ced241d5@hagrid> tim wrote: > A larger lesson: even if Python gets a stable sort and advertises stability > (we don't have to guarantee it even if it's there) if we guarantee it, all python implementors must provide one. how hard is it to implement a reasonably good stable sort from scratch? I can think of lots of really stupid ways to do it on top of existing sort code, which might be a reason to provide two different sort methods: sort (fast) and stablesort (guaranteed, but maybe not as fast as sort). in CPython, both names can map to timsort. (shouldn't you be writing a paper on this, btw? or start a sort blog ;-) From nhodgson@bigpond.net.au Sun Jul 28 14:00:06 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Sun, 28 Jul 2002 23:00:06 +1000 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020727024012.85905.qmail@web40107.mail.yahoo.com> Message-ID: <003701c23636$b25b0a80$3da48490@neil> Scott Gilbert: > First, could this be implemented by a gapped_buffer object that implements > the locking functionality you want, but that returns simple buffers to work > with when the object is locked. In other words, do we need to add this > extra functionality up in the core protocol when it can be implemented > specifically the way Scintilla (cool editor by the way) wants it to be in (Thanks) > the Scintilla specific extension. Would this mean that the explicit locking completely defines the validity of the address or is the address valid until the 'view' buffer object is garbage collected? 
I would like the gapped_buffer to be put back into gapped mode as soon as possible and depending on the lifetime of a view buffer object is not that robust in the face of alternate Python implementations that use non-reference-counted GC implementations (Jython / Python .Net). > Second, if you are using mutexes to do this stuff, you'll have to be very > careful about deadlock. By locking, I want to change state on the buffer from having a gap and allowing resizes to having a static size and address which will remain valid until an unlock. The lock and unlock are not treating the buffer as a mutex (I'd call the operations 'acquire' and 'release' then) although mutexes may be needed for safety in the lock and unlock implementations. It is likely that the lock and unlock would be counted (it can be locked twice and then won't be expandable until it is unlocked twice) and that exceptions would be thrown for length changing operations while locked. If you think my particular use is out of the scope of what you are trying to achieve then that is fine. Neil From neal@metaslash.com Sun Jul 28 15:03:13 2002 From: neal@metaslash.com (Neal Norwitz) Date: Sun, 28 Jul 2002 10:03:13 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py,1.1,1.2 References: <15676.16356.112688.518256@anthem.wooz.org> <200207272151.g6RLpoi26443@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D43F9A1.C4D491A3@metaslash.com> Guido van Rossum wrote: > > > A better fix, IMO, is to recognize that the `test' package has become > > a full fledged standard lib package (a Good Thing, IMO), heed our own > > admonitions not to do relative imports, and change the various places > > in the test suite that "import test_support" (or equiv) to "import > > test.test_support" (or equiv). > > Good idea. Shouldn't this also be done for from XXX import YYY? 
grep test_support `find Lib -name '*.py'` | \ egrep -v '(from test |test\.test_support)' | grep import Neal From guido@python.org Sun Jul 28 16:17:17 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 28 Jul 2002 11:17:17 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py,1.1,1.2 In-Reply-To: Your message of "Sun, 28 Jul 2002 10:03:13 EDT." <3D43F9A1.C4D491A3@metaslash.com> References: <15676.16356.112688.518256@anthem.wooz.org> <200207272151.g6RLpoi26443@pcp02138704pcs.reston01.va.comcast.net> <3D43F9A1.C4D491A3@metaslash.com> Message-ID: <200207281517.g6SFHHS16631@pcp02138704pcs.reston01.va.comcast.net> [Barry] > > > A better fix, IMO, is to recognize that the `test' package has become > > > a full fledged standard lib package (a Good Thing, IMO), heed our own > > > admonitions not to do relative imports, and change the various places > > > in the test suite that "import test_support" (or equiv) to "import > > > test.test_support" (or equiv). [Guido] > > Good idea. [Neal] > Shouldn't this also be done for from XXX import YYY? > > grep test_support `find Lib -name '*.py'` | \ > egrep -v '(from test |test\.test_support)' | grep import Good catch! Looks like Barry hardly scratched the surface of this. I *thought* his checkin which claimed to fix this throughout Lib/test was a tad small. :-( --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun Jul 28 16:23:41 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 28 Jul 2002 11:23:41 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py,1.1,1.2 In-Reply-To: Your message of "Sun, 28 Jul 2002 11:17:17 EDT." 
<200207281517.g6SFHHS16631@pcp02138704pcs.reston01.va.comcast.net> References: <15676.16356.112688.518256@anthem.wooz.org> <200207272151.g6RLpoi26443@pcp02138704pcs.reston01.va.comcast.net> <3D43F9A1.C4D491A3@metaslash.com> <200207281517.g6SFHHS16631@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207281523.g6SFNfm16682@pcp02138704pcs.reston01.va.comcast.net> > [Neal] > > Shouldn't this also be done for from XXX import YYY? > > > > grep test_support `find Lib -name '*.py'` | \ > > egrep -v '(from test |test\.test_support)' | grep import [me] > Good catch! Looks like Barry hardly scratched the surface of this. > I *thought* his checkin which claimed to fix this throughout Lib/test > was a tad small. :-( Neal, Barry: on second thought, DON'T FIX THIS YET! I'd like to have a discussion with Barry about the motivation for this. I need to at least understand why Barry thinks he needs this, and reconcile this with my earlier position that relative imports were compulsory in this case. --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Sun Jul 28 16:49:27 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sun, 28 Jul 2002 17:49:27 +0200 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects stringobject.c,2.171,2.172 References: Message-ID: <010901c2364e$5ce6be10$ced241d5@hagrid> > SF patch #577031, remove PyArg_Parse() since it's deprecated > ! v = PyNumber_Float(v); > ! if (!v) > return -1; > v = PyNumber_Int(v); > ! if (!v) > return -1; umm. doesn't PyNumber_Float and PyNumber_Int convert its argument to a float/integer, if it's not already the right type? in earlier versions of Python, "%g" % "1.0" raised a TypeError. does it still do that with this patch in place? 
From neal@metaslash.com Sun Jul 28 17:13:12 2002 From: neal@metaslash.com (Neal Norwitz) Date: Sun, 28 Jul 2002 12:13:12 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects stringobject.c,2.171,2.172 References: <010901c2364e$5ce6be10$ced241d5@hagrid> Message-ID: <3D441818.83FD38F7@metaslash.com> Fredrik Lundh wrote: > > > SF patch #577031, remove PyArg_Parse() since it's deprecated > > > ! v = PyNumber_Float(v); > > ! if (!v) > > return -1; > > > v = PyNumber_Int(v); > > ! if (!v) > > return -1; > > umm. > > doesn't PyNumber_Float and PyNumber_Int convert its argument to > a float/integer, if it's not already the right type? Yes. > in earlier versions of Python, "%g" % "1.0" raised a TypeError. does > it still do that with this patch in place? No. :-( That wasn't an intentional change. The intent was to convert an int/long to a double in the case of '%g' et al and from a double to an int in the case of '%d'. What is the best way to fix this? If I call PyNumber_Check() before this code, the behaviour is the same as before. Neal From neal@metaslash.com Sun Jul 28 17:29:33 2002 From: neal@metaslash.com (Neal Norwitz) Date: Sun, 28 Jul 2002 12:29:33 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects stringobject.c,2.171,2.172 References: <010901c2364e$5ce6be10$ced241d5@hagrid> <3D441818.83FD38F7@metaslash.com> Message-ID: <3D441BED.43049678@metaslash.com> Neal Norwitz wrote: > > Fredrik Lundh wrote: > > > > > SF patch #577031, remove PyArg_Parse() since it's deprecated > > > > > ! v = PyNumber_Float(v); > > > ! if (!v) > > > return -1; > > > > > v = PyNumber_Int(v); > > > ! if (!v) > > > return -1; > > > > umm. > > > > doesn't PyNumber_Float and PyNumber_Int convert its argument to > > a float/integer, if it's not already the right type? > > Yes. > > > in earlier versions of Python, "%g" % "1.0" raised a TypeError. does > > it still do that with this patch in place? > > No. :-( That wasn't an intentional change. 
The intent was > to convert an int/long to a double in the case of '%g' et al and > from a double to an int in the case of '%d'. > > What is the best way to fix this? To answer my own question, it appears that I should use PyFloat_AsDouble() and PyInt_AsLong() and check for an error. I don't know why I didn't do this before. This restores the original behaviour. I'll check this in later. Let me know if I screwed up again. I'll also update the tests to check for the exception. Neal From guido@python.org Sun Jul 28 17:37:39 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 28 Jul 2002 12:37:39 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects stringobject.c,2.171,2.172 In-Reply-To: Your message of "Sun, 28 Jul 2002 12:13:12 EDT." <3D441818.83FD38F7@metaslash.com> References: <010901c2364e$5ce6be10$ced241d5@hagrid> <3D441818.83FD38F7@metaslash.com> Message-ID: <200207281637.g6SGbd816840@pcp02138704pcs.reston01.va.comcast.net> > Fredrik Lundh wrote: > > > > > SF patch #577031, remove PyArg_Parse() since it's deprecated > > > > > ! v = PyNumber_Float(v); > > > ! if (!v) > > > return -1; > > > > > v = PyNumber_Int(v); > > > ! if (!v) > > > return -1; > > > > umm. > > > > doesn't PyNumber_Float and PyNumber_Int convert its argument to > > a float/integer, if it's not already the right type? > > Yes. > > > in earlier versions of Python, "%g" % "1.0" raised a TypeError. does > > it still do that with this patch in place? > > No. :-( That wasn't an intentional change. The intent was > to convert an int/long to a double in the case of '%g' et al and > from a double to an int in the case of '%d'. > > What is the best way to fix this? If I call PyNumber_Check() > before this code, the behaviour is the same as before. Revert the change. I don't believe PyNumber_Check() is the right thing to use here at all. 
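
The Python-level behaviour at stake can be checked with a few lines. This is only a sketch against a current Python 3 interpreter, so the exact exception text may differ from 2.3-era CPython; try_format is a made-up helper name.

```python
def try_format(fmt, value):
    """Apply a %-format to one value, reporting a TypeError as text."""
    try:
        return fmt % value
    except TypeError as exc:
        return "TypeError: %s" % exc


results = {
    # Numbers are converted between int and float as intended ...
    "float from int": try_format("%g", 1),
    "int from float": try_format("%d", 1.9),
    # ... but a string is not silently parsed as a number.
    "float from str": try_format("%g", "1.0"),
}
```

The first two entries format successfully, while the third reports a TypeError, which is the pre-patch behaviour the thread wants restored.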
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun Jul 28 17:38:43 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 28 Jul 2002 12:38:43 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects stringobject.c,2.171,2.172 In-Reply-To: Your message of "Sun, 28 Jul 2002 12:29:33 EDT." <3D441BED.43049678@metaslash.com> References: <010901c2364e$5ce6be10$ced241d5@hagrid> <3D441818.83FD38F7@metaslash.com> <3D441BED.43049678@metaslash.com> Message-ID: <200207281638.g6SGch016860@pcp02138704pcs.reston01.va.comcast.net> > To answer my own question, it appears that I should use > PyFloat_AsDouble() and PyInt_AsLong() and check for an error. > I don't know why I didn't do this before. This restores the > original behaviour. Good! > I'll check this in later. Let me know if I screwed up again. > > I'll also update the tests to check for the exception. Great! --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Sun Jul 28 18:21:11 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 28 Jul 2002 13:21:11 -0400 Subject: [Python-Dev] RE: companies data for sorting comparisons In-Reply-To: <008b01c23611$0bc24460$ced241d5@hagrid> Message-ID: [Tim] >> A larger lesson: even if Python gets a stable sort and >> advertises stability (we don't have to guarantee it even if >> it's there) [/F] > if we guarantee it, all python implementors must provide one. Or a middle ground, akin to CPython's semi-reluctant guarantees of refcount semantics for "timely" finalization. A great many CPython users appear quite happy to rely on this despite that the language doesn't guarantee it. > how hard is it to implement a reasonably good stable sort from > scratch? A straightforward mergesort using a temp vector of size N is dead easy, and reasonably good (O(N log N) worst case). 
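
For reference, a straightforward stable mergesort of the kind mentioned above might look like the following. This is a textbook top-down sketch, not timsort and not code from the patch; it copies slices instead of using one preallocated temp vector, and the key parameter is an addition for the example.

```python
def merge_sort(items, key=lambda x: x):
    """Return a new, stably sorted list (top-down mergesort)."""
    items = list(items)
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid], key)
    right = merge_sort(items[mid:], key)
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Take from the right run only when it is strictly smaller;
        # on ties the left element goes first, which is what makes
        # the sort stable.
        if key(right[j]) < key(left[i]):
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    # At most one of these two tails is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


pairs = [("b", 1), ("a", 2), ("b", 3), ("a", 4)]
by_letter = merge_sort(pairs, key=lambda pair: pair[0])
```

Sorting on the letter alone leaves the numbers within each letter in their original order, which is the stability property under discussion.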
There aren't any other major N log N sorts that are naturally stable, nor even any I know of (and I know of a lot ) that can be made stable without materializing list indices (or a moral equivalent). Insertion sort is naturally stable, but is O(N**2) expected case, so is DOA. > I can think of lots of really stupid ways to do it on top of existing > sort code, which might be a reason to provide two different sort > methods: sort (fast) and stablesort (guaranteed, but maybe not > as fast as sort). in CPython, both names can map to timsort. I don't want to see two sort methods on the list object, for reasons explained before. You've always been able to *get* a stable sort in Python via materializing the list indices in a 2-tuple, as in Alex's "stable sort" DSU recipe: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52234 People overly concerned about portability can stick to that. > (shouldn't you be writing a paper on this, btw? I don't think there's anything truly new here, although the combination of gimmicks may be unique. timsort.txt is close enough to a paper anyway, but better in that it only tells you useful things; the McIlroy paper covers all the rest . > or start a sort blog ;-) That must be some sort of web thing, hence beyond my limited abilities. From tim.one@comcast.net Sun Jul 28 18:52:33 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 28 Jul 2002 13:52:33 -0400 Subject: [Python-Dev] RE: companies data for sorting comparisons In-Reply-To: Message-ID: Turns out there was one comparison per merge step I wasn't extracting maximum value from. Changing the code to suck all I can out of it doesn't make a measurable difference on sortperf results, except for a tiny improvement on ~sort on my box, but makes a difference on the Exchange case of Kevin's data. 
Here using

    values = [(x.get(fieldname), i, x) for i, x in enumerate(data)]

as the list to sort, and times are again in milliseconds:

Sorting on field 'Address' -- 6589 records
    via  sort:   41.24   41.39   41.41   41.42   86.71
    via msort:   42.90   43.01   43.07   43.15   43.75

Sorting on field 'Company' -- 6635 records
    via  sort:   40.24   40.34   40.42   40.43   42.58
    via msort:   30.42   30.45   30.58   30.66   30.66

Sorting on field 'Exchange' -- 6579 records
    via  sort:   59.64   59.70   59.71   59.72   59.81
    via msort:   27.06   27.11   27.19   27.29   27.54

Sorting on field 'NumberOfEmployees' -- 6531 records
    via  sort:   47.61   47.65   47.73   47.75   47.76
    via msort:   48.55   48.57   48.61   48.73   48.92

Sorting on field 'Phone' -- 6589 records
    via  sort:   48.00   48.03   48.32   48.32   48.39
    via msort:   49.60   49.64   49.68   49.79   49.85

Sorting on field 'Profile' -- 6635 records
    via  sort:   58.63   58.70   58.80   58.85   58.92
    via msort:    8.47    8.48    8.51    8.59    8.68

Sorting on field 'Symbol' -- 6635 records
    via  sort:   39.93   40.13   40.16   40.28   41.37
    via msort:    6.20    6.23    6.23    6.43    6.98

Sorting on field 'Web' -- 6632 records
    via  sort:   46.75   46.77   46.86   46.87   47.05
    via msort:   36.44   36.66   36.69   36.69   36.96

'Profile' is slower than the rest for samplesort because the strings it's
comparing are Yahoo URLs with a long common prefix -- the compares just take
longer in that case.  I'm not sure why 'Exchange' takes so long for
samplesort (it's a case with lots of duplicate primary keys, but the
distribution is highly skewed, not uniform as in ~sort).  In all cases now,
msort is a major-to-killer win, or a small (but real) loss.

I'll upload a new patch and new timsort.txt next.  Then I'm taking a week
off!  No, I wish it were for fun .
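
A minimal sketch of the decorated list used for these timings, i.e. the "materialize the list indices" trick that makes any sort behave stably. The function name `stable_sort_by` and the sample records are hypothetical, and the sketch assumes the field values are mutually comparable; it is an illustration, not the benchmark script itself.

```python
def stable_sort_by(records, fieldname):
    """Sort dict-like records on one field, keeping ties in input order.

    Decorate-sort-undecorate: each record is wrapped in a
    (key, original index, record) tuple.  Because the index is unique,
    equal keys are ordered by position and the records themselves are
    never compared.
    """
    decorated = [(rec.get(fieldname), i, rec) for i, rec in enumerate(records)]
    decorated.sort()
    return [rec for _key, _index, rec in decorated]


# Hypothetical sample data, loosely modelled on the fields named above.
companies = [
    {"Symbol": "B", "Exchange": "NYSE"},
    {"Symbol": "C", "Exchange": "AMEX"},
    {"Symbol": "A", "Exchange": "NYSE"},
]
by_exchange = stable_sort_by(companies, "Exchange")
```

The two NYSE records keep their original relative order (B before A), which is the stability property the thread is about.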
From Jack.Jansen@oratrix.com Sun Jul 28 22:03:30 2002 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Sun, 28 Jul 2002 23:03:30 +0200 Subject: [Python-Dev] python.org/switch/ In-Reply-To: <20020726223911.T70962-100000@onion.valueclick.com> Message-ID: <782E9B13-A26D-11D6-83B1-003065517236@oratrix.com> On zaterdag, juli 27, 2002, at 07:40 , Ask Bjoern Hansen wrote: > > As presented on the Perl Lightning talks here at OSCON: Switch > movies. > > You guys will dig Nathan's (nat.mov and nat.mpg). > > http://www.perl.org/tpc/2002/movies/switch/ They're all pretty good, but I think I liked David best, he actually seemed to mean what he said:-) -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From greg@cosc.canterbury.ac.nz Mon Jul 29 00:43:05 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 29 Jul 2002 11:43:05 +1200 (NZST) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <082b01c234b0$c33564e0$e000a8c0@thomasnotebook> Message-ID: <200207282343.g6SNh55G016683@kuku.cosc.canterbury.ac.nz> Thomas Heller : > This PEP proposes an extension to the buffer interface called the > 'safe buffer interface'. I don't understand the need for this. The C-level buffer interface is already safe as long as you use it properly -- which means using it to fetch the pointer each time it's needed. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From barry@zope.com Mon Jul 29 00:51:38 2002 From: barry@zope.com (Barry A. 
Warsaw)
Date: Sun, 28 Jul 2002 19:51:38 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py,1.1,1.2
References: <15676.16356.112688.518256@anthem.wooz.org> <15676.24360.88972.449273@anthem.wooz.org> <200207272156.g6RLuU826463@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15684.33674.169550.228083@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum writes:

    >> It's a bit uglier than that because since Lib/test gets
    >> magically added to sys.path during regrtest by virtue of
    >> running "python Lib/test/regrtest.py".

    GvR> Perhaps regrtest.py can specifically remove its own directory
    GvR> from sys.path?  (Please don't just remove sys.path[0] or '';
    GvR> look in sys.argv[0] and deduce from there.)

Good idea:

-------------------- snip snip --------------------
mydir = os.path.dirname(sys.argv[0])
sys.path.remove(mydir)
-------------------- snip snip --------------------

I also followed up to Guido privately, re: the motivation for this
change.  Also, Neal's right, I missed some of the relative imports of
test_support and I'm ready to commit those fixes once Guido gives the
go ahead.

-Barry

From xscottg@yahoo.com Mon Jul 29 00:57:12 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Sun, 28 Jul 2002 16:57:12 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <200207282343.g6SNh55G016683@kuku.cosc.canterbury.ac.nz>
Message-ID: <20020728235712.41025.qmail@web40112.mail.yahoo.com>

--- Greg Ewing wrote:
> Thomas Heller :
>
> > This PEP proposes an extension to the buffer interface called the
> > 'safe buffer interface'.
>
> I don't understand the need for this. The C-level buffer
> interface is already safe as long as you use it properly --
> which means using it to fetch the pointer each time it's
> needed.
>

This is not my PEP, but let me defend it anyway.
The need for this derives from wanting to do more than one thing at a time
in Python (multiple processors with multiple threads, asynchronous I/O, DMA
transfers, ???).  One thread grabs the pointer from the "safe buffer
interface" and then releases the GIL while it works on that pointer.  Now
another thread is free to acquire the GIL and run concurrently with the
first.  (The asynchronous I/O case applies even on single processor
machines...)

I believe you were the one to explain to me why an extension can't release
the GIL while it works with the PyBufferProcs acquired pointer.  This PEP
tries to allow the extension to do just that.

__________________________________________________
Do You Yahoo!?
Yahoo! Health - Feel better, live better
http://health.yahoo.com

From zhaoqiang@neusoft.com Mon Jul 29 01:15:41 2002
From: zhaoqiang@neusoft.com (zhaoq)
Date: Mon, 29 Jul 2002 08:15:41 +0800
Subject: [Python-Dev] Please remove me from the mailing list
References: <000a01c23453$fc4b04e0$3745fea9@ibm1499>
Message-ID: <010701c23695$633acc60$4a01010a@xpprofessional>

This is a multi-part message in MIME format.

--Boundary_(ID_yqfil81HVg0jUZY0v5uUNg)
Content-type: text/plain; charset=iso-8859-1
Content-transfer-encoding: 7BIT

Please remove me from the mailing list
zhaoqiang@neusoft.com
thanks

----- Original Message -----
From: Rick Farrer
To: Python-Dev@python.org
Sent: Friday, July 26, 2002 11:24 AM
Subject: [Python-Dev] Please remove me from the mailing list

Please remove me from the mailing list.
rf@avisionone.com
Thanks,
Rick

--Boundary_(ID_yqfil81HVg0jUZY0v5uUNg)
Content-type: text/html; charset=iso-8859-1
Content-transfer-encoding: 7BIT
--Boundary_(ID_yqfil81HVg0jUZY0v5uUNg)-- From xscottg@yahoo.com Mon Jul 29 01:29:57 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Sun, 28 Jul 2002 17:29:57 -0700 (PDT) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <003701c23636$b25b0a80$3da48490@neil> Message-ID: <20020729002957.74716.qmail@web40101.mail.yahoo.com> --- Neil Hodgson wrote: > > Would this mean that the explicit locking completely defines the > validity of the address or is the address valid until the 'view' buffer > object is garbage collected? I would like the gapped_buffer to be put > back into gapped mode as soon as possible and depending on the lifetime > of a view buffer object is not that robust in the face of alternate > Python implementations that use non-reference-counted GC implementations > (Jython / Python .Net). > If you're worried about exactly when the object is released, you could add a specific release() method to your object indicating that you don't intend to use it anymore. My point was that, with Thomas Heller's safe buffer protocol (or my bytes object), you would have a pointer that could be manipulated independently of the GIL, but that putting locking semantics into your gapped_buffer is something you could add on top without complicating the core. In other words, his PEP (or mine) allows you to do something you couldn't necessarily do previously, and it doesn't sound like there is anything you want to do that you won't be able to. > > By locking, I want to change state on the buffer from having a gap and > allowing resizes to having a static size and address which will remain > valid until an unlock. The lock and unlock are not treating the buffer as > a mutex (I'd call the operations 'acquire' and 'release' then) although > mutexes may be needed for safety in the lock and unlock implementations. 
> It is likely that the lock and unlock would be counted (it can be locked > twice and then won't be expandable until it is unlocked twice) and that > exceptions would be thrown for length changing operations while locked. > You could easily implement the a counting (recursive) mutex as described above, and it might be the case that throwing an exception on the length changing operations keeps the dead lock from occurring. I'm still a bit confused though. When thread A locks (acquires) the buffer, and thread B tries to do a resize and it generates an exception, what is thread B supposed to do next? I assume that the resize was due to something like the user typing somewhere in the buffer. From a user interface point of view, you can't just ignore their request to insert text. Would you just try the same operation again after catching the exception? How long would you wait? > > If you think my particular use is out of the scope of what you are > trying to achieve then that is fine. > It is definitely up to Thomas Heller to decide what he wants his scope to be, and I don't want to step on his toes at all. Especially since the reason for his PEP getting written is that I didn't want to add this stuff to mine. :-) I'm just trying to point out two things: 1) With his PEP, there is a way to get the behavior you desire with out adding the complexity to the core of Python. And with recursive/counting mutexes, the behavior you want is getting more complicated. The "safe buffer protocol" is likely to cater to a wide class of users. I could be wrong, but the "lockable gapped buffer protocol" probably appeals to a much smaller set. 2) Any time you go from one lock (mutex, GIL, semaphore) to multiple locks, you can introduce deadlock states. Without my understanding your design fully, your use case sounds to me like it either has the potential for deadlock, or the potential for polling. 
There are ways to avoid this of course, but then everyone has to follow a more complicated set of rules (for instance build a hierarchy describing the order of locks to acquire). Since Thomas's PEP doesn't introduce any new types of locks, it sidesteps these problems. Cheers, -Scott __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com From Rick Farrer" This is a multi-part message in MIME format. ------=_NextPart_000_0009_01C2366E.D543FBA0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable For the last time. Please remove me from your mailing = list!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ------=_NextPart_000_0009_01C2366E.D543FBA0 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
------=_NextPart_000_0009_01C2366E.D543FBA0-- From greg@cosc.canterbury.ac.nz Mon Jul 29 03:13:23 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 29 Jul 2002 14:13:23 +1200 (NZST) Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects stringobject.c,2.171,2.172 In-Reply-To: <3D441818.83FD38F7@metaslash.com> Message-ID: <200207290213.g6T2DN2U017001@kuku.cosc.canterbury.ac.nz> Neal Norwitz : > The intent was to convert an int/long to a double in the case of > '%g' et al and from a double to an int in the case of '%d'. Are you sure the latter part of that is a good idea? As a general principle, I don't think float->int conversions should be done automatically. What is the Python philosophy on that? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From neal@metaslash.com Mon Jul 29 03:31:39 2002 From: neal@metaslash.com (Neal Norwitz) Date: Sun, 28 Jul 2002 22:31:39 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects stringobject.c,2.171,2.172 References: <200207290213.g6T2DN2U017001@kuku.cosc.canterbury.ac.nz> Message-ID: <3D44A90B.421E97DA@metaslash.com> Greg Ewing wrote: > > Neal Norwitz : > > > The intent was to convert an int/long to a double in the case of > > '%g' et al and from a double to an int in the case of '%d'. > > Are you sure the latter part of that is a good idea? As a general > principle, I don't think float->int conversions should be done > automatically. What is the Python philosophy on that? 
This is consistent with versions back to 1.5.2:

  Python 1.5.2 (#1, Jul 5 2001, 03:02:19) [GCC 2.96 20000731 (Red Hat Linux 7.1 2 on linux-i386
  Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
  >>> '%d' % 1.8
  '1'

Neal

From guido@python.org Mon Jul 29 03:40:35 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 28 Jul 2002 22:40:35 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects stringobject.c,2.171,2.172
In-Reply-To: Your message of "Mon, 29 Jul 2002 14:13:23 +1200." <200207290213.g6T2DN2U017001@kuku.cosc.canterbury.ac.nz>
References: <200207290213.g6T2DN2U017001@kuku.cosc.canterbury.ac.nz>
Message-ID: <200207290240.g6T2eZH25272@pcp02138704pcs.reston01.va.comcast.net>

> > The intent was to convert an int/long to a double in the case of
> > '%g' et al and from a double to an int in the case of '%d'.
>
> Are you sure the latter part of that is a good idea? As a general
> principle, I don't think float->int conversions should be done
> automatically. What is the Python philosophy on that?

I fully agree, but unfortunately, in a dark past, I was given a patch
that did many good things, but as a side effect, made the PyArg_Parse*
family silently truncate floats to ints.  Two examples:

  >>> "%d" % 3.14
  '3'
  >>> a = []
  >>> a.insert(0.9, 42)
  >>> a
  [42]
  >>>

I find the second example more aggravating than the first.  This
touches upon a recent discussion, where one of the suggestions was to
use __index__ rather than __int__ in this case.  I think that's not
the right solution; perhaps instead, floats and float-like types
should support __truncate__ and __round__ to convert them to ints in
certain ways.  (Of course then we can argue about whether to round to
even, and what to do if the float is so large that its smallest unit
of precision is larger than one.)
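
A small sketch of the conversions being argued over, to make the difference between truncating and the two rounding conventions concrete. It assumes a current Python 3 interpreter rather than the 1.5.2 and 2.2 interpreters quoted in the thread, and the helper names (`truncate`, `round_half_up`, `round_half_even`) are hypothetical, chosen only for illustration.

```python
import math


def truncate(x):
    """Drop the fractional part (toward zero), which is what '%d' does to a float."""
    return math.trunc(x)


def round_half_up(x):
    """Round to the nearest int, sending exact halves toward +infinity."""
    return math.floor(x + 0.5)


def round_half_even(x):
    """Round to the nearest int, sending exact halves to the even neighbour.

    Assumption: Python 3, where one-argument round() returns an int and
    uses this convention.
    """
    return round(x)


formatted = "%d" % 1.8        # formatting a float with %d truncates it
as_index = truncate(0.9)      # the index a.insert(0.9, 42) effectively used
```

Comparing `truncate(2.5)`, `round_half_up(2.5)` and `round_half_even(2.5)` shows why "convert a float to an int" is not a single well-defined operation.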
--Guido van Rossum (home page: http://www.python.org/~guido/) From greg@cosc.canterbury.ac.nz Mon Jul 29 03:47:28 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 29 Jul 2002 14:47:28 +1200 (NZST) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <20020728235712.41025.qmail@web40112.mail.yahoo.com> Message-ID: <200207290247.g6T2lSHV017233@kuku.cosc.canterbury.ac.nz> Scott Gilbert : > The need for this derives from wanting to do more than one thing at a time > in Python (multiple processors with multiple threas, asynchronous I/O, DMA > transers, ???). In any situation like that, you should be using some form of locking on the object concerned. The Python buffer interface is not the right place to deal with these issues. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tim.one@comcast.net Mon Jul 29 03:55:45 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 28 Jul 2002 22:55:45 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects stringobject.c,2.171,2.172 In-Reply-To: <200207290213.g6T2DN2U017001@kuku.cosc.canterbury.ac.nz> Message-ID: [Neal Norwitz] > The intent was to convert an int/long to a double in the case of > '%g' et al and from a double to an int in the case of '%d'. [Greg Ewing] > Are you sure the latter part of that is a good idea? As a general > principle, I don't think float->int conversions should be done > automatically. What is the Python philosophy on that? The philosophy for format codes is looser than elsewhere, else, e.g., "%s" % object would raise TypeError whenever object was a number or list, etc. I've often used %d with floats when I want them rounded to int and don't want to bother remembering how to trick a float format into suppressing the decimal point. 
Unfortunately, that's not quite what %d does (it truncates). Whatever, %s is like invoking str(), %r like invoking repr(), %d like invoking long(), and %g/e/f like invoking float() (although these are variants of long() and float() that refuse string arguments -- that's the exception that makes the rule easy to remember ). From skip@pobox.com Mon Jul 29 04:07:06 2002 From: skip@pobox.com (Skip Montanaro) Date: Sun, 28 Jul 2002 22:07:06 -0500 Subject: [Python-Dev] Remove from mailing list In-Reply-To: <000c01c23698$bf2e32c0$3745fea9@ibm1499> References: <000c01c23698$bf2e32c0$3745fea9@ibm1499> Message-ID: <15684.45402.132334.108285@localhost.localdomain> Rick> For the last time. Please remove me from your mailing list! Try sending a note to python-dev-admin@python.org. Better yet, try using the interface Mailman provides for you: http://mail.python.org/mailman/listinfo/python-dev -- Skip Montanaro skip@pobox.com consulting: http://manatee.mojam.com/~skip/resume.html From aahz@pythoncraft.com Mon Jul 29 04:17:24 2002 From: aahz@pythoncraft.com (Aahz) Date: Sun, 28 Jul 2002 23:17:24 -0400 Subject: [Python-Dev] Floats as indexes In-Reply-To: <200207290240.g6T2eZH25272@pcp02138704pcs.reston01.va.comcast.net> References: <200207290213.g6T2DN2U017001@kuku.cosc.canterbury.ac.nz> <200207290240.g6T2eZH25272@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020729031724.GA20797@panix.com> On Sun, Jul 28, 2002, Guido van Rossum wrote: > > >>> "%d" % 3.14 > '3' > >>> a = [] > >>> a.insert(0.9, 42) > >>> a > [42] > >>> > > I find the second example more aggravating than the first. This > touches upon a recent discussion, where one of the suggestions was > to use __index__ rather than __int__ in this case. I think that's > not the right solution; perhaps instead, floats and float-like types > should support __truncate__ and __round__ to convert them to ints in > certain ways. 
(Of course then we can argue about whether to round to > even, and what to do if the float is so large that its smallest unit > of precision is larger than one.) Blech. I believe that floats and similar objects should never be implicitly converted to indexes. There are too many ways for silent errors to get propagated. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From xscottg@yahoo.com Mon Jul 29 04:23:03 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Sun, 28 Jul 2002 20:23:03 -0700 (PDT) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <200207290247.g6T2lSHV017233@kuku.cosc.canterbury.ac.nz> Message-ID: <20020729032303.28931.qmail@web40108.mail.yahoo.com> --- Greg Ewing wrote: > > > The need for this derives from wanting to do more than one thing at a > > time in Python (multiple processors with multiple threas, asynchronous > > I/O, DMA transers, ???). > > In any situation like that, you should be using some form > of locking on the object concerned. The Python buffer > interface is not the right place to deal with these > issues. > I humbly disagree with you, and I like his proposal. His PEP is simple and the locking business could lead to a mess if everyone involved is not very careful. However, I'll let him champion his PEP. I've got my own stuff to worry about, and this is part of why I didn't want to add new protocol to the PEP I've been working on. Cheers, -Scott __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com From martin@v.loewis.de Mon Jul 29 07:39:48 2002 From: martin@v.loewis.de (Martin v. 
Loewis)
Date: 29 Jul 2002 08:39:48 +0200
Subject: [Python-Dev] Please remove me from the mailing list
In-Reply-To: <010701c23695$633acc60$4a01010a@xpprofessional>
References: <000a01c23453$fc4b04e0$3745fea9@ibm1499> <010701c23695$633acc60$4a01010a@xpprofessional>
Message-ID:

zhaoq writes:

> Please remove me from the mailing list

You have subscribed yourself by deliberate action, so you need to
actively unsubscribe yourself as well.

What mailing list are you talking about, anyway?

Regards,
Martin

From ville.vainio@swisslog.com Mon Jul 29 09:10:58 2002
From: ville.vainio@swisslog.com (Ville Vainio)
Date: Mon, 29 Jul 2002 11:10:58 +0300
Subject: [Python-Dev] Re: Multiline string constants, include in the standard library?
References: <20020725194802.22949.82629.Mailman@mail.python.org> <3D40F62D.7000106@swisslog.com>
Message-ID: <3D44F892.6090401@swisslog.com>

François Pinard wrote:

>>> >>> def stripIndent( s ):
>>> ...     indent = len(s) - len(s.lstrip())
>>> ...     sLines = s.split('\n')
>>> ...     resultLines = [ line[indent:] for line in sLines ]
>>> ...     return ''.join( resultLines )
>>>
>
>>Something like this should really be available somewhere in the standard
>>library (string module [yeah, predeprecation, I know], string
>>
>In fact, I like my doc-strings and other triple-quoted strings flushed left.
>So, I can see them in the code exactly as they will appear on the screen.
>
Enabling one to strip the indentation wouldn't hurt this practice of
yours one bit (nobody forces you to use it). To my eyes left-flushing
the blocks disrupts the natural "flow" of the code, and breaks the
intuitive block structure of the program.

>If I used artificial margins in Python so my doc-strings appeared to be
>indented more than the surrounding, and wrote my code this way, it would
>appear artificially constricted on the left once printed.  It's not worth.
>
Could you explain what you mean by artificially constricted?
Of course only the amount of space in the left margin would be removed,
indentation would work exactly the same.

Which one looks better:

++++++++++++++++++++++++
def usage():
    if 1:
        print """\
        You should have done this
        and that
        """.stripindent()
+++++++++++++++++++++++++
def usage():
    if 1:
        print """\
You should have done this
and that
"""
++++++++++++++++++++++++++

When you are scanning code, the non-stripindent version of the 3-quoted
string jumps at your face as a "top-level" construct, even if it is only
associated with the usage() function.

>My opinion is that it is nice this way.  Don't touch the thing! :-)
>
Again, the change would not influence your code or practices one bit.

-- Ville

From mal@lemburg.com Mon Jul 29 10:02:43 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jul 2002 11:02:43 +0200
Subject: [Python-Dev] Re: Multiline string constants, include in the standard library?
References: <20020725194802.22949.82629.Mailman@mail.python.org> <3D40F62D.7000106@swisslog.com> <3D44F892.6090401@swisslog.com>
Message-ID: <3D4504B3.3030608@lemburg.com>

Ville Vainio wrote:
> Which one looks better:
> ++++++++++++++++++++++++
> def usage():
>     if 1:
>         print """\
>         You should have done this
>         and that
>         """.stripindent()
> +++++++++++++++++++++++++
> def usage():
>     if 1:
>         print """\
> You should have done this
> and that
> """
> ++++++++++++++++++++++++++
>
> When you are scanning code, the non-stripindent version of the 3-quoted
> string jumps at your face as a "top-level" construct, even if it is only
> associated with the usage() function.

I think everybody has their own way of formatting multi-line
strings and/or comments. There's no one-fits-all strategy.

So instead of trying to find a compromise, why don't you write up
a flexible helper function for the new textwrap module ?
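
A minimal sketch of the kind of helper being asked for here: it removes the whitespace margin shared by all non-blank lines, so an indented triple-quoted string prints flush left. The name `strip_indent` is hypothetical and this is not an existing standard-library function; it is one possible reading of the proposal, under the assumption that tabs are expanded to spaces first.

```python
def strip_indent(text):
    """Remove the left margin common to every non-blank line of text."""
    lines = text.expandtabs().split("\n")
    margin = None
    for line in lines:
        stripped = line.lstrip()
        if not stripped:
            continue  # blank lines do not constrain the margin
        indent = len(line) - len(stripped)
        margin = indent if margin is None else min(margin, indent)
    if margin is None:
        return text  # nothing but blank lines: leave the input alone
    # Whitespace-only lines collapse to empty; the rest lose the margin.
    return "\n".join([line[margin:] if line.strip() else "" for line in lines])


message = strip_indent("""\
        You should have done this
          and that
        """)
```

Relative indentation inside the string survives (here "and that" stays two spaces in), which is the behaviour the usage() example in the thread relies on.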
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From ville.vainio@swisslog.com Mon Jul 29 10:27:37 2002
From: ville.vainio@swisslog.com (Ville Vainio)
Date: Mon, 29 Jul 2002 12:27:37 +0300
Subject: [Python-Dev] Re: Multiline string constants, include in the standard library?
References: <20020725194802.22949.82629.Mailman@mail.python.org> <3D40F62D.7000106@swisslog.com> <3D44F892.6090401@swisslog.com> <3D4504B3.3030608@lemburg.com>
Message-ID: <3D450A89.7050400@swisslog.com>

M.-A. Lemburg wrote:

> I think everybody has their own way of formatting multi-line
> strings and/or comments. There's no one-fits-all strategy.

Yep, but having a standard solution available for one very sensible
strategy would be nice.

>
> So instead of trying to find a compromise, why don't you write up
> a flexible helper function for the new textwrap module ?

I don't think there is all that much implementation to do:
inspect.getdoc() already has an implementation that seems to do the
right thing, it's just that the stripping is embedded into the getdoc
function, instead of having it available as a separate function.
textwrap might be a good place to put it, considering that the string
module is going away - even if no actual wrapping takes place.

--------------------------------------------------
def getdoc(object):
    """Get the documentation string for an object.

    All tabs are expanded to spaces.  To clean up docstrings that are
    indented to line up with blocks of code, any whitespace that can be
    uniformly removed from the second line onwards is removed."""
    try:
        doc = object.__doc__
    except AttributeError:
        return None
    if not isinstance(doc, (str, unicode)):
        return None
    try:
        lines = string.split(string.expandtabs(doc), '\n')
    except UnicodeError:
        return None
    else:
        margin = None
        for line in lines[1:]:
            content = len(string.lstrip(line))
            if not content: continue
            indent = len(line) - content
            if margin is None: margin = indent
            else: margin = min(margin, indent)
        if margin is not None:
            for i in range(1, len(lines)):
                lines[i] = lines[i][margin:]
        return string.join(lines, '\n')
------------------------------------------

From mal@lemburg.com Mon Jul 29 10:44:37 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jul 2002 11:44:37 +0200
Subject: [Python-Dev] Re: Multiline string constants, include in the standard library?
References: <20020725194802.22949.82629.Mailman@mail.python.org> <3D40F62D.7000106@swisslog.com> <3D44F892.6090401@swisslog.com> <3D4504B3.3030608@lemburg.com> <3D450A89.7050400@swisslog.com>
Message-ID: <3D450E85.5090806@lemburg.com>

Ville Vainio wrote:
> M.-A. Lemburg wrote:
>
>> I think everybody has their own way of formatting multi-line
>> strings and/or comments. There's no one-fits-all strategy.
>
> Yep, but having a standard solution available for one very sensible
> strategy would be nice.
>
>> So instead of trying to find a compromise, why don't you write up
>> a flexible helper function for the new textwrap module ?
>
> I don't think there is all that much implementation to do:
> inspect.getdoc() already has an implementation that seems to do the
> right thing, it's just that the stripping is embedded into the getdoc
> function, instead of having it available as a separate function.
> textwrap might be a good place to put it, considering that the string
> module is going away - even if no actual wrapping takes place.
Oh, I think it is worthwhile applying some optional wrapping for overly long doc-strings as well. But there you go again: people simply don't match up when it comes to text formatting. It's all a matter of taste and style (e.g. in the US it is very common to indent the first line of a paragraph while in most of Europe is not). How about starting with a simple textwrap.dedent() API and then moving on towards the full monty textwrap.reformat() API with tons of options ?! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From ville.vainio@swisslog.com Mon Jul 29 11:22:54 2002 From: ville.vainio@swisslog.com (Ville Vainio) Date: Mon, 29 Jul 2002 13:22:54 +0300 Subject: [Python-Dev] Re: Multiline string constants, include in the standard library? References: <20020725194802.22949.82629.Mailman@mail.python.org> <3D40F62D.7000106@swisslog.com> <3D44F892.6090401@swisslog.com> <3D4504B3.3030608@lemburg.com> <3D450A89.7050400@swisslog.com> <3D450E85.5090806@lemburg.com> Message-ID: <3D45177E.6030503@swisslog.com> M.-A. Lemburg wrote: > How about starting with a simple textwrap.dedent() API and then > moving on towards the full monty textwrap.reformat() API with tons of > options ?! Fine with me - at least w/ dedent() everyone can agree with the right behaviour (except handling of the first line?), and it would be general enough to be useful for everybody (no need for options/customization) - hence the justification for a position in the std lib. I haven't had much use for intricate wrapping/reformatting yet, but I guess I will once it hits the std lib ;-). -- Ville From rwgk@yahoo.com Mon Jul 29 14:02:00 2002 From: rwgk@yahoo.com (Ralf W. 
Grosse-Kunstleve)
Date: Mon, 29 Jul 2002 06:02:00 -0700 (PDT)
Subject: [Python-Dev] pickling of large arrays
Message-ID: <20020729130200.73932.qmail@web20201.mail.yahoo.com>

We are using Boost.Python to expose reference-counted C++ container
types (similar to std::vector<>) to Python. E.g.:

    from arraytbx import shared
    d = shared.double(1000000)      # double array with a million elements
    c = shared.complex_double(100)  # std::complex array
    # and many more types, incl. several custom C++ types

We need a way to pickle these arrays. Since they can easily be
converted to tuples we could just define functions like:

    def __getstate__(self):
        return tuple(self)

However, since the arrays are potentially huge this could incur
a large overhead (e.g. a tuple of a million Python float).
Next idea:

    def __getstate__(self):
        return iter(self)

Unfortunately (but not unexpectedly) pickle is telling me:

    'can't pickle iterator objects'

Attached is a short Python script (tested with 2.2.1) with a prototype
implementation of a pickle helper ("piece_meal") for large arrays.
piece_meal's __getstate__ converts a block of a given size to a Python
list and returns a tuple with that list and a new piece_meal instance
which knows how to generate the next chunk. I.e. piece_meal instances
are created recursively until the input sequence is exhausted. The
corresponding __setstate__ puts the pieces back together again
(uncomment the print statement to see the pieces).

I am wondering if a similar mechanism could be used to enable pickling
of iterators, or maybe special "pickle_iterators", which would
immediately enable pickling of our large arrays or any other object
that can be iterated over (e.g. Numpy arrays which are currently
pickled as potentially huge strings).

Has this been discussed already? Are there better ideas?
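
For comparison, a sketch of how the "pickle from an iterator" idea can be expressed with the `__reduce__` hook in later Python versions, where the fourth element of the returned tuple may be an iterator of list items. This is an assumption about newer interpreters and not something available in the 2.2.1 used above; the class `ChunkedArray` is a hypothetical stand-in for the C++-backed arrays, not part of Boost.Python or arraytbx.

```python
import pickle


class ChunkedArray:
    """Hypothetical stand-in for a large, list-like array type."""

    def __init__(self, values=()):
        self.elems = list(values)

    # The unpickler feeds items back through extend() or append(),
    # so both are provided.
    def append(self, item):
        self.elems.append(item)

    def extend(self, items):
        self.elems.extend(items)

    def __reduce__(self):
        # (callable, args, state, listitems): the object is rebuilt as
        # ChunkedArray() and then refilled from the iterator, so no
        # intermediate tuple of all elements has to be built here.
        return (ChunkedArray, (), None, iter(self.elems))


original = ChunkedArray(range(11))
restored = pickle.loads(pickle.dumps(original))
```

As far as I can tell the pickler pulls items from the iterator in batches, which is close in spirit to the recursive piece_meal chunks, but that batching is an implementation detail rather than something the sketch controls.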
Ralf

import pickle

class piece_meal:

    block_size = 4

    def __init__(self, sequence, position):
        self.sequence = sequence
        self.position = position

    def __getstate__(self):
        next_position = self.position - piece_meal.block_size
        if (next_position <= 0):
            return (self.sequence[:self.position], 0)
        return (self.sequence[next_position:self.position],
                piece_meal(self.sequence, next_position))

    def __setstate__(self, state):
        #print "piece_meal:", state
        if (state[1] == 0):
            self.sequence = state[0]
        else:
            self.sequence = state[1].sequence + state[0]

class array:

    def __init__(self, n):
        self.elems = [i for i in xrange(n)]

    def __getstate__(self):
        return piece_meal(self.elems, len(self.elems))

    def __setstate__(self, state):
        self.elems = state.sequence

def exercise():
    for i in xrange(11):
        a = array(i)
        print a.elems
        s = pickle.dumps(a)
        b = pickle.loads(s)
        print b.elems

if (__name__ == "__main__"):
    exercise()

From nhodgson@bigpond.net.au Mon Jul 29 14:52:40 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Mon, 29 Jul 2002 23:52:40 +1000
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729002957.74716.qmail@web40101.mail.yahoo.com>
Message-ID: <00c601c23707$35819a20$3da48490@neil>

Scott Gilbert:

> You could easily implement the a counting (recursive) mutex as described
> above, and it might be the case that throwing an exception on the length
> changing operations keeps the dead lock from occurring. I'm still a bit
> confused though.

Not as confused as I am. I don't think deadlocks or threads are that relevant to me. The most likely situations in which I would use the buffer interface are to perform large I/O operations without copying or when performing asynchronous I/O to load or save documents while continuing to run styling or linting tasks.
I think it's likely that the pieces of code accessing the buffer will not be real threads, but instead be cooperating contexts within a single-threaded UI framework so using semaphores will not be possible.

> 1) With his PEP, there is a way to get the behavior you desire with out
> adding the complexity to the core of Python. And with recursive/counting
> mutexes, the behavior you want is getting more complicated.

I don't want counting mutexes. I'm not defining behaviour that needs them.

> The "safe
> buffer protocol" is likely to cater to a wide class of users. I could be
> wrong, but the "lockable gapped buffer protocol" probably appeals to a much
> smaller set.

It's not that a "lockable gapped buffer protocol" is needed. It is that the problem with the old buffer was that the lifetime of the pointer is not well defined. The proposal changes that by making the lifetime of the pointer be the same as the underlying object. This restricts the set of objects that can be buffers to statically sized objects. I'd prefer that dynamically resizable objects be able to be buffers.

> 2) Any time you go from one lock (mutex, GIL, semaphore) to multiple
> locks, you can introduce deadlock states.

My defined behaviour was "Upon receiving a lock call, it could collapse the gap and return a stable pointer to its contents and then revert to its normal behaviour on receiving an unlock". Where is a semaphore involved? Without a semaphore (or equivalent) there can be no deadlock.

Neil

From guido@python.org Mon Jul 29 15:19:00 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 10:19:00 -0400
Subject: [Python-Dev] Re: Multiline string constants, include in the standard library?
In-Reply-To: Your message of "Mon, 29 Jul 2002 12:27:37 +0300."
<3D450A89.7050400@swisslog.com> References: <20020725194802.22949.82629.Mailman@mail.python.org> <3D40F62D.7000106@swisslog.com> <3D44F892.6090401@swisslog.com> <3D4504B3.3030608@lemburg.com> <3D450A89.7050400@swisslog.com> Message-ID: <200207291419.g6TEJ0m26497@pcp02138704pcs.reston01.va.comcast.net> > > I think everybody has their own way of formatting multi-line > > strings and/or comments. There's no one-fits-all strategy. > > Yep, but having a standard solution available to a one, very sensible > strategy would be nice. Can you move this discussion to c.l.py please? --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Mon Jul 29 15:34:46 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Mon, 29 Jul 2002 16:34:46 +0200 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> Message-ID: <06f301c2370d$16941060$e000a8c0@thomasnotebook> [Scott] > > The "safe > > buffer protocol" is likely to cater to a wide class of users. I could be > > wrong, but the "lockable gapped buffer protocol" probably appeals to a > much > > smaller set. > [Neil] > Its not that a "lockable gapped buffer protocol" is needed. It is that > the problem with the old buffer was that the lifetime of the pointer is not > well defined. The proposal changes that by making the lifetime of the > pointer be the same as the underlying object. That's exactly what *I* need, ... > This restricts the set of > objects that can be buffers to statically sized objects. I'd prefer that > dynamically resizable objects be able to be buffers. > ..., but I understand Neil's requirements. Can they be fulfilled by adding some kind of UnlockObject() call to the 'safe buffer interface', which should mean 'I won't use the pointer received by getsaferead/writebufferproc any more'? 
Thomas

From guido@python.org Mon Jul 29 16:00:51 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 11:00:51 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Fri, 26 Jul 2002 16:28:50 +0200." <082b01c234b0$c33564e0$e000a8c0@thomasnotebook>
References: <082b01c234b0$c33564e0$e000a8c0@thomasnotebook>
Message-ID: <200207291500.g6TF0pM26852@pcp02138704pcs.reston01.va.comcast.net>

Thomas,

I like your PEP. Could you clean it up (changing 'large' into 'safe' etc.) and send it to Barry? Some comments:

> Backward Compatibility
>
> There are no backward compatibility problems.

That's a simplification of the truth -- you're adding two new fields to an existing struct. But the flag bit you add means that old and new versions of the struct can be distinguished.

> It may be a good idea to expose the following convenience functions:
>
> int PyObject_AsSafeReadBuffer(PyObject *obj,
>                               void **buffer,
>                               size_t *buffer_len);
>
> int PyObject_AsSafeWriteBuffer(PyObject *obj,
>                                void **buffer,
>                                size_t *buffer_len);
>
> These functions return 0 on success, set buffer to the memory
> location and buffer_len to the length of the memory block in
> bytes. On failure, they return -1 and set an exception.

Please make these a mandatory part of the proposal.

Please also try to summarize the discussion so far here. My personal opinion: locking seems the wrong approach, given the danger of deadlock; Scintilla can use the existing buffer protocol, assuming its buffer doesn't move as long as you don't release the GIL and don't make calls into Scintilla.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Mon Jul 29 16:09:22 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 11:09:22 -0400
Subject: [Python-Dev] pickling of large arrays
In-Reply-To: Your message of "Mon, 29 Jul 2002 06:02:00 PDT."
<20020729130200.73932.qmail@web20201.mail.yahoo.com> References: <20020729130200.73932.qmail@web20201.mail.yahoo.com> Message-ID: <200207291509.g6TF9MX26908@pcp02138704pcs.reston01.va.comcast.net> > We are using Boost.Python to expose reference-counted C++ container > types (similar to std::vector<>) to Python. E.g.: > > from arraytbx import shared > d = shared.double(1000000) # double array with a million elements > c = shared.complex_double(100) # std::complex array > # and many more types, incl. several custom C++ types > > We need a way to pickle these arrays. Since they can easily be > converted to tuples we could just define functions like: > > def __getstate__(self): > return tuple(self) > > However, since the arrays are potentially huge this could incur > a large overhead (e.g. a tuple of a million Python float). > Next idea: > > def __getstate__(self): > return iter(self) > > Unfortunately (but not unexpectedly) pickle is telling me: > 'can't pickle iterator objects' > > Attached is a short Python script (tested with 2.2.1) with a prototype > implementation of a pickle helper ("piece_meal") for large arrays. That's a neat trick, unfortunately it only helps when the pickle is being written directly to disk; when it is returned as a string, you still get the entire array in memory. > piece_meal's __getstate__ converts a block of a given size to a Python > list and returns a tuple with that list and a new piece_meal instance > which knows how to generate the next chunk. I.e. piece_meal instances > are created recursively until the input sequence is exhausted. The > corresponding __setstate__ puts the pieces back together again > (uncomment the print statement to see the pieces). > > I am wondering if a similar mechanism could be used to enable pickling > of iterators, or maybe special "pickle_iterators", which would > immediately enable pickling of our large arrays or any other object > that can be iterated over (e.g. 
Numpy arrays which are currently > pickled as potentially huge strings). Has this been discussed already? I think pickling iterators is the wrong idea. An iterator doesn't represent data, it represents a single pass over data. Iterators may represent infinite series. --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@pythoncraft.com Mon Jul 29 16:51:25 2002 From: aahz@pythoncraft.com (Aahz) Date: Mon, 29 Jul 2002 11:51:25 -0400 Subject: [Python-Dev] pickling of large arrays In-Reply-To: <20020729130200.73932.qmail@web20201.mail.yahoo.com> References: <20020729130200.73932.qmail@web20201.mail.yahoo.com> Message-ID: <20020729155125.GA5765@panix.com> On Mon, Jul 29, 2002, Ralf W. Grosse-Kunstleve wrote: > > We need a way to pickle these arrays. See PEP 296 and read the back discussion on python-dev in the archives. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From guido@python.org Mon Jul 29 17:05:49 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 29 Jul 2002 12:05:49 -0400 Subject: [Python-Dev] PEP 296 - The Buffer Problem In-Reply-To: Your message of "Fri, 26 Jul 2002 19:26:38 PDT." <20020727022638.86727.qmail@web40101.mail.yahoo.com> References: <20020727022638.86727.qmail@web40101.mail.yahoo.com> Message-ID: <200207291605.g6TG5o428945@pcp02138704pcs.reston01.va.comcast.net> > Even if I'm wrong about the need for this, at the very least, the > additional functionality can be added later. I really just want to push > through a simple, usable, bytes object for the time being. We can easily > add, we can't easily take away. Hi Scott, I've followed this discussion and it looks like the PEP is ready for another round of refinements based upon the discussion (e.g. to use size_t). Do you have time to do that? And then the next thing would be a prototype implementation. I like where this is going! 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From xscottg@yahoo.com Mon Jul 29 17:39:13 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Mon, 29 Jul 2002 09:39:13 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <200207291605.g6TG5o428945@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020729163913.46117.qmail@web40102.mail.yahoo.com>

--- Guido van Rossum wrote:
>
> I've followed this discussion and it looks like the PEP is ready for
> another round of refinements based upon the discussion (e.g. to use
> size_t). Do you have time to do that?
>
> And then the next thing would be a prototype implementation.
>
> I like where this is going!
>

Very cool. I'm glad to hear it.

I'll integrate the new changes to the text tonight and post the next version to python-dev and comp.lang.python tomorrow.

Implementation is in progress, but not far enough along that I can swag a done date yet. It shouldn't take too long, but like you indicated before, I may need some help on doing the pickling correctly.

Cheers,
-Scott

From mcherm@destiny.com Mon Jul 29 17:42:08 2002
From: mcherm@destiny.com (Michael Chermside)
Date: Mon, 29 Jul 2002 12:42:08 -0400
Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline string constants
Message-ID: <3D457060.4060505@destiny.com>

> So... What you (and others) think about just adding flag 'i' to string
> constants (that will strip indentation etc.)? This doesn't affect
> existing code, but it will be useful (at least for me ;-) Motivation
> was posted here by Michael Chermside, but I don't like his solutions.

Please understand that the motivation I posted was an attempt to describe YOUR possible motivation for desiring the change. I wouldn't like this feature, myself.
I was just trying to point out that it could all be achieved with somewhere between 1 character and 5 lines worth of code. The solution to this (so-called) "problem" simply does not belong in the language itself, despite the fact that you don't like my solutions. However, if you have a particular reason why you don't like these solutions, send me an email (don't CC the list), and I'll see if I can come up with a different solution you DO like. -- Michael Chermside From xscottg@yahoo.com Mon Jul 29 17:45:32 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Mon, 29 Jul 2002 09:45:32 -0700 (PDT) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <00c601c23707$35819a20$3da48490@neil> Message-ID: <20020729164532.48588.qmail@web40110.mail.yahoo.com> --- Neil Hodgson wrote: > Scott Gilbert: > > > You could easily implement the a counting (recursive) mutex as > > described above, and it might be the case that throwing an exception > > on the length changing operations keeps the dead lock from occurring. > > I'm still a bit confused though. > > Not as confused as I am. I don't think deadlocks or threads are that > relevant to me. The most likely situations in which I would use the > buffer interface is to perform large I/O operations without copying or > when performing asynchronous I/O to load or save documents while > continuing to run styling or linting tasks. I think its likely that the > pieces of code accessing the buffer will not be real threads, but instead > be cooperating contexts within a single-threaded UI framework so using > semaphores will not be possible. > What happens when you've locked the buffer and passed a pointer to the I/O system for an asynchronous operation, but before that operation has completed, your main program wants to resize the buffer due to a user generated event? 
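
A note for readers on current CPython: the resize-while-locked case asked about above can be tried directly with `bytearray` and `memoryview`, where taking a view plays the role of the "lock" and releasing it the "unlock". This is only a sketch of today's behaviour, not of either 2002 proposal, and the variable names are illustrative.

```python
buf = bytearray(b"document text")

view = memoryview(buf)       # "lock": export a pointer to the contents
try:
    buf.extend(b" more")     # resize attempted while the export is live
    resize_failed = False
except BufferError:
    resize_failed = True     # the object refuses to move its memory

view.release()               # "unlock": promise not to use the pointer again
buf.extend(b" more")         # now the resize goes through
```

The resize does not block or poll; it simply fails, and the caller has to decide whether to retry once the outstanding operation has finished.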
I had written responses/questions to other parts of your message, but I found that I was just asking the same question above over and over, so I've chopped them out. If you can explain this to me, and there aren't any problems with deadlock or polling, then I'll quit interfering and let you and Thomas decide if you really think the locking semantics are useful to a wide enough audience that it should be included in the core.

>
> I don't want counting mutexes. I'm not defining behavior that needs
> them.
>

You said you wanted the locks to keep a count, so that you could call acquire() multiple times and have the buffer not truly become unlocked until release() was called the same number of times. I'm willing to adopt any terminology you want for the purpose of this discussion. I think I understand the semantics of the counting operation, but I want to understand more what actually happens when the buffer is locked.

From thomas.heller@ion-tof.com Mon Jul 29 17:52:17 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Mon, 29 Jul 2002 18:52:17 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <082b01c234b0$c33564e0$e000a8c0@thomasnotebook> <200207291500.g6TF0pM26852@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <091701c23720$4c3c3310$e000a8c0@thomasnotebook>

From: "Guido van Rossum"
> Thomas,
>
> I like your PEP. Could you clean it up (changing 'large' into 'safe'
> etc.) and send it to Barry? Some comments:

Great. I have changed it per your requests, and also included Greg's and Neil's points.
Thomas

From xscottg@yahoo.com Mon Jul 29 17:54:19 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Mon, 29 Jul 2002 09:54:19 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <06f301c2370d$16941060$e000a8c0@thomasnotebook>
Message-ID: <20020729165419.31643.qmail@web40111.mail.yahoo.com>

--- Thomas Heller wrote:
>
> > This restricts the set of objects that can be buffers to statically
> > sized objects. I'd prefer that dynamically resizable objects be able to
> > be buffers.
> >
>
> ..., but I understand Neil's requirements.
>
> Can they be fulfilled by adding some kind of UnlockObject()
> call to the 'safe buffer interface', which should mean 'I won't
> use the pointer received by getsaferead/writebufferproc any more'?
>

I assume this means any call to getsafereadpointer()/getsafewritepointer() will increment the lock count. So the UnlockObject() calls will be mandatory. Either that, or you'll have an explicit LockObject() call as well. What behavior should happen when a resize is attempted while the lock count is positive?

From thomas.heller@ion-tof.com Mon Jul 29 18:03:30 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Mon, 29 Jul 2002 19:03:30 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729165419.31643.qmail@web40111.mail.yahoo.com>
Message-ID: <093b01c23721$dd908680$e000a8c0@thomasnotebook>

From: "Scott Gilbert"
>
> --- Thomas Heller wrote:
> >
> > > This restricts the set of objects that can be buffers to statically
> > > sized objects. I'd prefer that dynamically resizable objects be able to
> > > be buffers.
> > >
> >
> > ..., but I understand Neil's requirements.
> >
> > Can they be fulfilled by adding some kind of UnlockObject()
> > call to the 'safe buffer interface', which should mean 'I won't
> > use the pointer received by getsaferead/writebufferproc any more'?
> >
>
> I assume this means any call to getsafereadpointer()/getsafewritepointer()
> will increment the lock count. So the UnlockObject() calls will be
> mandatory. Either that, or you'll have an explicit LockObject() call as
> well. What behavior should happen when a resize is attempted while the
> lock count is positive?

This question is not difficult to answer ;-) The resize should fail. That's the only possibility. Whether this can be handled robustly enough by the object is another question.

Probably all this is too complicated to be solved by the safe buffer interface, and it should be left out?

Thomas

From guido@python.org Mon Jul 29 18:03:55 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 13:03:55 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Mon, 29 Jul 2002 09:54:19 PDT." <20020729165419.31643.qmail@web40111.mail.yahoo.com>
References: <20020729165419.31643.qmail@web40111.mail.yahoo.com>
Message-ID: <200207291703.g6TH3tk29997@pcp02138704pcs.reston01.va.comcast.net>

> --- Thomas Heller wrote:
> >
> > > This restricts the set of objects that can be buffers to statically
> > > sized objects. I'd prefer that dynamically resizable objects be able to
> > > be buffers.
> > >
> >
> > ..., but I understand Neil's requirements.
> >
> > Can they be fulfilled by adding some kind of UnlockObject()
> > call to the 'safe buffer interface', which should mean 'I won't
> > use the pointer received by getsaferead/writebufferproc any more'?
> >
>
> I assume this means any call to getsafereadpointer()/getsafewritepointer()
> will increment the lock count. So the UnlockObject() calls will be
> mandatory. Either that, or you'll have an explicit LockObject() call as
> well.
What behavior should happen when a resize is attempted while the
> lock count is positive?

I don't like where this is going. Let's not add locking to the buffer protocol. If an object's buffer isn't allocated for the object's life when the object is created, it should not support the "safe" version of the protocol (maybe a different name would be better), and users should not release the GIL while holding on to the pointer.

(Exactly which other API calls are safe while using the pointer is not clear; probably nothing that could possibly invoke the Python interpreter recursively, since that might release the GIL. This would generally mean that calls to Py_DECREF() are unsafe while holding on to a buffer pointer!)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From thomas.heller@ion-tof.com Mon Jul 29 18:08:11 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Mon, 29 Jul 2002 19:08:11 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729165419.31643.qmail@web40111.mail.yahoo.com> <200207291703.g6TH3tk29997@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <095701c23722$84e06770$e000a8c0@thomasnotebook>

From: "Guido van Rossum"
> If an object's buffer isn't allocated for the object's life
> when the object is created, it should not support the "safe" version
> of the protocol (maybe a different name would be better), and users
> should not release the GIL while holding on to the pointer.

'Persistent' buffer interface? Too long?

Thomas

From oren-py-d@hishome.net Mon Jul 29 18:08:24 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 29 Jul 2002 20:08:24 +0300
Subject: [Python-Dev] patch: try/finally in generators
Message-ID: <20020729200824.A5391@hishome.net>

http://www.python.org/sf/584626

This patch removes the limitation of not allowing yield in the try part of a try/finally.
The dealloc function of a generator checks if the generator is still alive and resumes it one last time from the return instruction at the end of the code, causing any try/finally blocks to be triggered. Any exceptions raised are treated just like exceptions in a __del__ finalizer (printed and ignored).

Oren

From guido@python.org Mon Jul 29 18:10:44 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 13:10:44 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Mon, 29 Jul 2002 19:08:11 +0200." <095701c23722$84e06770$e000a8c0@thomasnotebook>
References: <20020729165419.31643.qmail@web40111.mail.yahoo.com> <200207291703.g6TH3tk29997@pcp02138704pcs.reston01.va.comcast.net> <095701c23722$84e06770$e000a8c0@thomasnotebook>
Message-ID: <200207291710.g6THAin30057@pcp02138704pcs.reston01.va.comcast.net>

> > If an object's buffer isn't allocated for the object's life
> > when the object is created, it should not support the "safe" version
> > of the protocol (maybe a different name would be better), and users
> > should not release the GIL while holding on to the pointer.
>
> 'Persistent' buffer interface? Too long?

No, persistent typically refers to things that survive longer than a process. Maybe 'static' buffer interface would work.
--Guido van Rossum (home page: http://www.python.org/~guido/)

From thomas.heller@ion-tof.com Mon Jul 29 18:14:51 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Mon, 29 Jul 2002 19:14:51 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729165419.31643.qmail@web40111.mail.yahoo.com> <200207291703.g6TH3tk29997@pcp02138704pcs.reston01.va.comcast.net> <095701c23722$84e06770$e000a8c0@thomasnotebook> <200207291710.g6THAin30057@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <098e01c23723$73867590$e000a8c0@thomasnotebook>

> > > If an object's buffer isn't allocated for the object's life
> > > when the object is created, it should not support the "safe" version
> > > of the protocol (maybe a different name would be better), and users
> > > should not release the GIL while holding on to the pointer.
> >
> > 'Persistent' buffer interface? Too long?
>
> No, persistent typically refers to things that survive longer than a
> process. Maybe 'static' buffer interface would work.
>

Ahem, right. Maybe Barry can change it before committing this?

Thomas

From guido@python.org Mon Jul 29 18:34:01 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 13:34:01 -0400
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: Your message of "Mon, 29 Jul 2002 20:08:24 +0300." <20020729200824.A5391@hishome.net>
References: <20020729200824.A5391@hishome.net>
Message-ID: <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net>

> http://www.python.org/sf/584626
>
> This patch removes the limitation of not allowing yield in the try part
> of a try/finally. The dealloc function of a generator checks if the
> generator is still alive and resumes it one last time from the return
> instruction at the end of the code, causing any try/finally blocks to be
> triggered. Any exceptions raised are treated just like exceptions in a
> __del__ finalizer (printed and ignored).

I'm not sure I understand what it does.
The return instruction at the end of the code, if I take this literally, isn't enclosed in any try/finally blocks. So how can this have the desired effect?

Have you verified that Jython can implement these semantics too?

Do you *really* need this?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From DavidA@ActiveState.com Mon Jul 29 19:01:46 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Mon, 29 Jul 2002 11:01:46 -0700
Subject: [Python-Dev] python.org/switch/
References: <782E9B13-A26D-11D6-83B1-003065517236@oratrix.com>
Message-ID: <3D45830A.6090207@ActiveState.com>

Jack Jansen wrote:

> They're all pretty good, but I think I liked David best, he actually
> seemed to mean what he said:-)

I_do_,_it's_why_you_haven't_seen_me_much_around_these_parts_recently...

--david (those_who_saw_the_ad_may_understand_my_typing_oddities).

From xscottg@yahoo.com Mon Jul 29 19:13:38 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Mon, 29 Jul 2002 11:13:38 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <098e01c23723$73867590$e000a8c0@thomasnotebook>
Message-ID: <20020729181338.59568.qmail@web40107.mail.yahoo.com>

--- Thomas Heller and Guido wrote:
> > > > If an object's buffer isn't allocated for the object's life
> > > > when the object is created, it should not support the "safe"
> > > > version of the protocol (maybe a different name would be better),
> > > > and users should not release the GIL while holding on to the pointer.
> > >
> > > 'Persistent' buffer interface? Too long?
> >
> > No, persistent typically refers to things that survive longer than a
> > process. Maybe 'static' buffer interface would work.
> >

I'll just chime in with the name "Fixed" Buffer Interface. They aren't really static either, and fixed applies in at least two senses. :-)

From guido@python.org Mon Jul 29 19:24:41 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 14:24:41 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Mon, 29 Jul 2002 11:13:38 PDT." <20020729181338.59568.qmail@web40107.mail.yahoo.com>
References: <20020729181338.59568.qmail@web40107.mail.yahoo.com>
Message-ID: <200207291824.g6TIOfq30468@pcp02138704pcs.reston01.va.comcast.net>

> > > > 'Persistent' buffer interface? Too long?
> > >
> > > No, persistent typically refers to things that survive longer than a
> > > process. Maybe 'static' buffer interface would work.
>
> I'll just chime in with the name "Fixed" Buffer Interface. They aren't
> really static either, and fixed applies in at least two senses. :-)

Nice!

--Guido van Rossum (home page: http://www.python.org/~guido/)

From thomas.heller@ion-tof.com Mon Jul 29 19:36:56 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Mon, 29 Jul 2002 20:36:56 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729181338.59568.qmail@web40107.mail.yahoo.com>
Message-ID: <0a9e01c2372e$ea80fb60$e000a8c0@thomasnotebook>

From: "Scott Gilbert"
> --- Thomas Heller and Guido wrote:
> > > > > If an object's buffer isn't allocated for the object's life
> > > > > when the object is created, it should not support the "safe"
> > > > > version of the protocol (maybe a different name would be better),
> > > > > and users should not release the GIL while holding on to the pointer.
> > > >
> > > > 'Persistent' buffer interface? Too long?
> > >
> > > No, persistent typically refers to things that survive longer than a
> > > process. Maybe 'static' buffer interface would work.
> > >
>
> I'll just chime in with the name "Fixed" Buffer Interface. They aren't
> really static either, and fixed applies in at least two senses. :-)
>

Yup. I'll change it.
Thanks, Thomas From barry@zope.com Mon Jul 29 19:38:06 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 29 Jul 2002 14:38:06 -0400 Subject: [Python-Dev] PEP 1, PEP Purpose and Guidelines Message-ID: <15685.35726.678832.241665@anthem.wooz.org> It has been a while since I posted a copy of PEP 1 to the mailing lists and newsgroups. I've recently done some updating of a few sections, so in the interest of gaining wider community participation in the Python development process, I'm posting the latest revision of PEP 1 here. A version of the PEP is always available on-line at http://www.python.org/peps/pep-0001.html Enjoy, -Barry -------------------- snip snip -------------------- PEP: 1 Title: PEP Purpose and Guidelines Version: $Revision: 1.36 $ Last-Modified: $Date: 2002/07/29 18:34:59 $ Author: Barry A. Warsaw, Jeremy Hylton Status: Active Type: Informational Created: 13-Jun-2000 Post-History: 21-Mar-2001, 29-Jul-2002 What is a PEP? PEP stands for Python Enhancement Proposal. A PEP is a design document providing information to the Python community, or describing a new feature for Python. The PEP should provide a concise technical specification of the feature and a rationale for the feature. We intend PEPs to be the primary mechanisms for proposing new features, for collecting community input on an issue, and for documenting the design decisions that have gone into Python. The PEP author is responsible for building consensus within the community and documenting dissenting opinions. Because the PEPs are maintained as plain text files under CVS control, their revision history is the historical record of the feature proposal[1]. Kinds of PEPs There are two kinds of PEPs. A standards track PEP describes a new feature or implementation for Python. An informational PEP describes a Python design issue, or provides general guidelines or information to the Python community, but does not propose a new feature. 
    Informational PEPs do not necessarily represent a Python community consensus or recommendation, so users and implementors are free to ignore informational PEPs or follow their advice.


PEP Work Flow

    The PEP editor, Barry Warsaw, assigns numbers for each PEP and changes its status.

    The PEP process begins with a new idea for Python. It is highly recommended that a single PEP contain a single key proposal or new idea. The more focussed the PEP, the more successful it tends to be. The PEP editor reserves the right to reject PEP proposals if they appear too unfocussed or too broad. If in doubt, split your PEP into several well-focussed ones.

    Each PEP must have a champion -- someone who writes the PEP using the style and format described below, shepherds the discussions in the appropriate forums, and attempts to build community consensus around the idea. The PEP champion (a.k.a. Author) should first attempt to ascertain whether the idea is PEP-able. Small enhancements or patches often don't need a PEP and can be injected into the Python development work flow with a patch submission to the SourceForge patch manager[2] or feature request tracker[3].

    The PEP champion then emails the PEP editor with a proposed title and a rough, but fleshed out, draft of the PEP. This draft must be written in PEP style as described below.

    If the PEP editor approves, he will assign the PEP a number, label it as standards track or informational, give it status 'draft', and create and check-in the initial draft of the PEP. The PEP editor will not unreasonably deny a PEP. Reasons for denying PEP status include duplication of effort, being technically unsound, not providing proper motivation or addressing backwards compatibility, or not in keeping with the Python philosophy. The BDFL (Benevolent Dictator for Life, Guido van Rossum) can be consulted during the approval phase, and is the final arbitrator of the draft's PEP-ability.

If a pre-PEP is rejected, the author may elect to take the pre-PEP to the comp.lang.python newsgroup (a.k.a. python-list@python.org mailing list) to help flesh it out, gain feedback and consensus from the community at large, and improve the PEP for re-submission.

The author of the PEP is then responsible for posting the PEP to the community forums, and marshaling community support for it. As updates are necessary, the PEP author can check in new versions if they have CVS commit permissions, or can email new PEP versions to the PEP editor for committing.

Standards track PEPs consist of two parts, a design document and a reference implementation. The PEP should be reviewed and accepted before a reference implementation is begun, unless a reference implementation will aid people in studying the PEP. Standards Track PEPs must include an implementation - in the form of code, patch, or URL to same - before they can be considered Final.

PEP authors are responsible for collecting community feedback on a PEP before submitting it for review. A PEP that has not been discussed on python-list@python.org and/or python-dev@python.org will not be accepted. However, wherever possible, long open-ended discussions on public mailing lists should be avoided. Strategies to keep the discussions efficient include setting up a separate SIG mailing list for the topic, having the PEP author accept private comments in the early design phases, etc. PEP authors should use their discretion here.

Once the authors have completed a PEP, they must inform the PEP editor that it is ready for review. PEPs are reviewed by the BDFL and his chosen consultants, who may accept or reject a PEP or send it back to the author(s) for revision.

Once a PEP has been accepted, the reference implementation must be completed. When the reference implementation is complete and accepted by the BDFL, the status will be changed to `Final.'

A PEP can also be assigned status `Deferred.'
The PEP author or editor can assign the PEP this status when no progress is being made on the PEP. Once a PEP is deferred, the PEP editor can re-assign it to draft status.

A PEP can also be `Rejected'. Perhaps after all is said and done it was not a good idea. It is still important to have a record of this fact.

PEPs can also be replaced by a different PEP, rendering the original obsolete. This is intended for Informational PEPs, where version 2 of an API can replace version 1.

PEP work flow is as follows:

    Draft -> Accepted -> Final -> Replaced
      ^
      +----> Rejected
      v
    Deferred

Some informational PEPs may also have a status of `Active' if they are never meant to be completed. E.g. PEP 1.

What belongs in a successful PEP?

Each PEP should have the following parts:

1. Preamble -- RFC822 style headers containing meta-data about the PEP, including the PEP number, a short descriptive title (limited to a maximum of 44 characters), the names, and optionally the contact info for each author, etc.

2. Abstract -- a short (~200 word) description of the technical issue being addressed.

3. Copyright/public domain -- Each PEP must either be explicitly labelled as placed in the public domain (see this PEP as an example) or licensed under the Open Publication License[4].

4. Specification -- The technical specification should describe the syntax and semantics of any new language feature. The specification should be detailed enough to allow competing, interoperable implementations for any of the current Python platforms (CPython, JPython, Python .NET).

5. Motivation -- The motivation is critical for PEPs that want to change the Python language. It should clearly explain why the existing language specification is inadequate to address the problem that the PEP solves. PEP submissions without sufficient motivation may be rejected outright.

6. Rationale -- The rationale fleshes out the specification by describing what motivated the design and why particular design decisions were made.
It should describe alternate designs that were considered and related work, e.g. how the feature is supported in other languages. The rationale should provide evidence of consensus within the community and discuss important objections or concerns raised during discussion.

7. Backwards Compatibility -- All PEPs that introduce backwards incompatibilities must include a section describing these incompatibilities and their severity. The PEP must explain how the author proposes to deal with these incompatibilities. PEP submissions without a sufficient backwards compatibility treatise may be rejected outright.

8. Reference Implementation -- The reference implementation must be completed before any PEP is given status 'Final,' but it need not be completed before the PEP is accepted. It is better to finish the specification and rationale first and reach consensus on it before writing code. The final implementation must include test code and documentation appropriate for either the Python language reference or the standard library reference.

PEP Template

PEPs are written in plain ASCII text, and should adhere to a rigid style. There is a Python script that parses this style and converts the plain text PEP to HTML for viewing on the web[5]. PEP 9 contains a boilerplate[7] template you can use to get started writing your PEP.

Each PEP must begin with an RFC822 style header preamble. The headers must appear in the following order. Headers marked with `*' are optional and are described below. All other headers are required.

      PEP:
      Title:
      Version:
      Last-Modified:
      Author:
    * Discussions-To:
      Status:
      Type:
    * Requires:
      Created:
    * Python-Version:
      Post-History:
    * Replaces:
    * Replaced-By:

The Author: header lists the names and optionally, the email addresses of all the authors/owners of the PEP. The format of the author entry should be address@dom.ain (Random J. User) if the email address is included, and just Random J. User if the address is not given.
If there are multiple authors, each should be on a separate line following RFC 822 continuation line conventions. Note that personal email addresses in PEPs will be obscured as a defense against spam harvesters. Standards track PEPs must have a Python-Version: header which indicates the version of Python that the feature will be released with. Informational PEPs do not need a Python-Version: header. While a PEP is in private discussions (usually during the initial Draft phase), a Discussions-To: header will indicate the mailing list or URL where the PEP is being discussed. No Discussions-To: header is necessary if the PEP is being discussed privately with the author, or on the python-list or python-dev email mailing lists. Note that email addresses in the Discussions-To: header will not be obscured. Created: records the date that the PEP was assigned a number, while Post-History: is used to record the dates of when new versions of the PEP are posted to python-list and/or python-dev. Both headers should be in dd-mmm-yyyy format, e.g. 14-Aug-2001. PEPs may have a Requires: header, indicating the PEP numbers that this PEP depends on. PEPs may also have a Replaced-By: header indicating that a PEP has been rendered obsolete by a later document; the value is the number of the PEP that replaces the current document. The newer PEP must have a Replaces: header containing the number of the PEP that it rendered obsolete. PEP Formatting Requirements PEP headings must begin in column zero and the initial letter of each word must be capitalized as in book titles. Acronyms should be in all capitals. The body of each section must be indented 4 spaces. Code samples inside body sections should be indented a further 4 spaces, and other indentation can be used as required to make the text readable. You must use two blank lines between the last line of a section's body and the next section heading. 
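
As a rough illustration of the dd-mmm-yyyy convention given above for the Created: and Post-History: headers, here is a minimal sketch of a format check. The helper names (is_pep_date, split_post_history) are hypothetical and are not part of pep2html.py or any PEP tooling; the check assumes English three-letter month abbreviations, as in the PEP's own example, and does not verify the number of days in each month.

```python
import re

# English month abbreviations, matching the PEP's example "14-Aug-2001".
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# Two-digit day, three-letter month, four-digit year, separated by hyphens.
_DATE_RE = re.compile(r"^(\d{2})-([A-Za-z]{3})-(\d{4})$")


def is_pep_date(text):
    """Return True if text looks like a dd-mmm-yyyy date (hypothetical helper)."""
    match = _DATE_RE.match(text.strip())
    if match is None:
        return False
    day, month, _year = match.groups()
    # Only a coarse range check; days per month are not validated.
    return month in MONTHS and 1 <= int(day) <= 31


def split_post_history(value):
    """Split a comma-separated Post-History: value into individual dates."""
    return [part.strip() for part in value.split(",") if part.strip()]
```

Applied to this PEP's own preamble, split_post_history("21-Mar-2001, 29-Jul-2002") gives two entries, each of which passes is_pep_date.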
You must adhere to the Emacs convention of adding two spaces at the end of every sentence. You should fill your paragraphs to column 70, but under no circumstances should your lines extend past column 79. If your code samples spill over column 79, you should rewrite them. Tab characters must never appear in the document at all. A PEP should include the standard Emacs stanza included by example at the bottom of this PEP. A PEP must contain a Copyright section, and it is strongly recommended to put the PEP in the public domain. When referencing an external web page in the body of a PEP, you should include the title of the page in the text, with a footnote reference to the URL. Do not include the URL in the body text of the PEP. E.g. Refer to the Python Language web site [1] for more details. ... [1] http://www.python.org When referring to another PEP, include the PEP number in the body text, such as "PEP 1". The title may optionally appear. Add a footnote reference that includes the PEP's title and author. It may optionally include the explicit URL on a separate line, but only in the References section. Note that the pep2html.py script will calculate URLs automatically, e.g.: ... Refer to PEP 1 [7] for more information about PEP style ... References [7] PEP 1, PEP Purpose and Guidelines, Warsaw, Hylton http://www.python.org/peps/pep-0001.html If you decide to provide an explicit URL for a PEP, please use this as the URL template: http://www.python.org/peps/pep-xxxx.html PEP numbers in URLs must be padded with zeros from the left, so as to be exactly 4 characters wide, however PEP numbers in text are never padded. Reporting PEP Bugs, or Submitting PEP Updates How you report a bug, or submit a PEP update depends on several factors, such as the maturity of the PEP, the preferences of the PEP author, and the nature of your comments. For the early draft stages of the PEP, it's probably best to send your comments and changes directly to the PEP author. 
For more mature, or finished PEPs you may want to submit corrections to the SourceForge bug manager[6] or better yet, the SourceForge patch manager[2] so that your changes don't get lost. If the PEP author is a SF developer, assign the bug/patch to him, otherwise assign it to the PEP editor.

When in doubt about where to send your changes, please check first with the PEP author and/or PEP editor.

PEP authors who are also SF committers can update the PEPs themselves by using "cvs commit" to commit their changes. Remember to also push the formatted PEP text out to the web by doing the following:

    % python pep2html.py -i NUM

where NUM is the number of the PEP you want to push out. See

    % python pep2html.py --help

for details.

Transferring PEP Ownership

It occasionally becomes necessary to transfer ownership of PEPs to a new champion. In general, we'd like to retain the original author as a co-author of the transferred PEP, but that's really up to the original author. A good reason to transfer ownership is because the original author no longer has the time or interest in updating it or following through with the PEP process, or has fallen off the face of the 'net (i.e. is unreachable or not responding to email). A bad reason to transfer ownership is because you don't agree with the direction of the PEP. We try to build consensus around a PEP, but if that's not possible, you can always submit a competing PEP.

If you are interested in assuming ownership of a PEP, send a message asking to take over, addressed to both the original author and the PEP editor. If the original author doesn't respond to email in a timely manner, the PEP editor will make a unilateral decision (it's not like such decisions can be reversed. :).

References and Footnotes

[1] This historical record is available by the normal CVS commands for retrieving older revisions.
For those without direct access to the CVS tree, you can browse the current and past PEP revisions via the SourceForge web site at http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/python/nondist/peps/?cvsroot=python [2] http://sourceforge.net/tracker/?group_id=5470&atid=305470 [3] http://sourceforge.net/tracker/?atid=355470&group_id=5470&func=browse [4] http://www.opencontent.org/openpub/ [5] The script referred to here is pep2html.py, which lives in the same directory in the CVS tree as the PEPs themselves. Try "pep2html.py --help" for details. The URL for viewing PEPs on the web is http://www.python.org/peps/ [6] http://sourceforge.net/tracker/?group_id=5470&atid=305470 [7] PEP 9, Sample PEP Template http://www.python.org/peps/pep-0009.html Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: From oren-py-d@hishome.net Mon Jul 29 20:09:44 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 29 Jul 2002 22:09:44 +0300 Subject: [Python-Dev] patch: try/finally in generators In-Reply-To: <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jul 29, 2002 at 01:34:01PM -0400 References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020729220944.A6113@hishome.net> On Mon, Jul 29, 2002 at 01:34:01PM -0400, Guido van Rossum wrote: > > http://www.python.org/sf/584626 > > > > This patch removes the limitation of not allowing yield in the try part > > of a try/finally. The dealloc function of a generator checks if the > > generator is still alive and resumes it one last time from the return > > instruction at the end of the code, causing any try/finally blocks to be > > triggered. Any exceptions raised are treated just like exceptions in a > > __del__ finalizer (printed and ignored). > > I'm not sure I understand what it does. 
The return instruction at the > end of the code, if I take this literally, isn't enclosed in any > try/finally blocks. So how can this have the desired effect? They're on the block stack. The stack unwind does the rest. > Have you verified that Jython can implement these semantics too? I don't see why not. The trick of jumping to the end was just my way to avoid adding a flag or some magic value to signal to eval_frame that it needs to trigger the block stack unwind on ceval.c:2201. There must be many other ways to implement this. > Do you *really* need this? I'm a plumber. I make pipelines by chaining iterators and transformations. My favorite fittings are generator functions and closures so I rarely need to actually define a class. One of my generator functions needed to clean up some stuff so I naturally used a try/finally block. When the compiler complained I recalled that when I first read with excitement about generator functions there was a comment there about some arbitrary limitation of yield statements in try/finally blocks... Anyway, I ended up creating a temporary local object just so I could take advantage of its __del__ method for cleanup but I really didn't like it. After a quick look at ceval.c I realized that it would be easy to fix this by having the dealloc function simulate a return statement just after the yield that was never resumed. So I wrote a little patch to remove something that I consider a wart. Oren Teaser: coming soon on the dataflow library! transparent two-way interoperability between iterators and unix pipes! From guido@python.org Mon Jul 29 20:30:34 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 29 Jul 2002 15:30:34 -0400 Subject: [Python-Dev] HAVE_CONFIG_H Message-ID: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> I see no references to HAVE_CONFIG_H in the source code (except one #undef in readline.c), yet we #define it on the command line. Is that still necessary? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jul 29 20:40:01 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 29 Jul 2002 15:40:01 -0400 Subject: [Python-Dev] patch: try/finally in generators In-Reply-To: Your message of "Mon, 29 Jul 2002 22:09:44 +0300." <20020729220944.A6113@hishome.net> References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> <20020729220944.A6113@hishome.net> Message-ID: <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net> > > > http://www.python.org/sf/584626 > > > > > > This patch removes the limitation of not allowing yield in the > > > try part of a try/finally. The dealloc function of a generator > > > checks if the generator is still alive and resumes it one last > > > time from the return instruction at the end of the code, causing > > > any try/finally blocks to be triggered. Any exceptions raised > > > are treated just like exceptions in a __del__ finalizer (printed > > > and ignored). > > > > I'm not sure I understand what it does. The return instruction at > > the end of the code, if I take this literally, isn't enclosed in > > any try/finally blocks. So how can this have the desired effect? > > They're on the block stack. The stack unwind does the rest. OK. Your way to find the last return statement gives me the willies though. :-( > > Have you verified that Jython can implement these semantics too? > > I don't see why not. The trick of jumping to the end was just my way > to avoid adding a flag or some magic value to signal to eval_frame > that it needs to trigger the block stack unwind on ceval.c:2201. > There must be many other ways to implement this. Please go to the Jython developers and ask their opinion. Implementing yield in Java is a bit of a hack, and we've been careful to make it possible at all. I don't want to break it. 
Of course, since Jython has garbage collection, your finally clause may be executed later than you had expected it, or not at all! Are you sure you want this? I don't recall all the reasons why this restriction was added to the PEP, but I believe it wasn't just because we couldn't figure out how to implement it -- it also had to do with not being able to explain what exactly the semantics would be. > > Do you *really* need this? > > I'm a plumber. I make pipelines by chaining iterators and > transformations. My favorite fittings are generator functions and > closures so I rarely need to actually define a class. One of my > generator functions needed to clean up some stuff so I naturally > used a try/finally block. When the compiler complained I recalled > that when I first read with excitement about generator functions > there was a comment there about some arbitrary limitation of yield > statements in try/finally blocks... > > Anyway, I ended up creating a temporary local object just so I could > take advantage of its __del__ method for cleanup but I really didn't > like it. After a quick look at ceval.c I realized that it would be > easy to fix this by having the dealloc function simulate a return > statement just after the yield that was never resumed. So I wrote a > little patch to remove something that I consider a wart. There are a few other places that invoke Python code in a dealloc handler (__del__ invocations in classobject.c and typeobject.c). They do a more complicated dance with the reference count. Can you check that you are doing the right thing? I'd also like to get Neil Schemenauer's review of the code, since he knows best how generators work under the covers. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Mon Jul 29 20:59:06 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Mon, 29 Jul 2002 21:59:06 +0200 Subject: [Python-Dev] HAVE_CONFIG_H References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D459E8A.1050602@lemburg.com> Guido van Rossum wrote: > I see no references to HAVE_CONFIG_H in the source code (except one > #undef in readline.c), yet we #define it on the command line. Is that > still necessary? What about these ? ./Mac/mwerks/old/mwerks_nsgusi_config.h: -- define HAVE_CONFIG_H ./Mac/mwerks/old/mwerks_tk_config.h: -- define HAVE_CONFIG_H ./Mac/mwerks/old/mwerks_shgusi_config.h: -- define HAVE_CONFIG_H ./Modules/expat/xmlparse.c: -- #ifdef HAVE_CONFIG_H ./Modules/expat/xmltok.c: -- #ifdef HAVE_CONFIG_H ./Modules/expat/xmlrole.c: -- #ifdef HAVE_CONFIG_H -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Mon Jul 29 21:06:57 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 29 Jul 2002 16:06:57 -0400 Subject: [Python-Dev] HAVE_CONFIG_H In-Reply-To: Your message of "Mon, 29 Jul 2002 21:59:06 +0200." <3D459E8A.1050602@lemburg.com> References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> <3D459E8A.1050602@lemburg.com> Message-ID: <200207292006.g6TK6wq06015@pcp02138704pcs.reston01.va.comcast.net> > > I see no references to HAVE_CONFIG_H in the source code (except one > > #undef in readline.c), yet we #define it on the command line. Is that > > still necessary? > > What about these ? > ./Mac/mwerks/old/mwerks_nsgusi_config.h: > -- define HAVE_CONFIG_H > ./Mac/mwerks/old/mwerks_tk_config.h: > -- define HAVE_CONFIG_H > ./Mac/mwerks/old/mwerks_shgusi_config.h: > -- define HAVE_CONFIG_H I don't have a directory Mac/mwerks/old/. Maybe you created this yourself? 
> ./Modules/expat/xmlparse.c:
> -- #ifdef HAVE_CONFIG_H
> ./Modules/expat/xmltok.c:
> -- #ifdef HAVE_CONFIG_H
> ./Modules/expat/xmlrole.c:
> -- #ifdef HAVE_CONFIG_H

We don't pass HAVE_CONFIG_H to extension modules, only to the core (stuff built directly by the Makefile, not by setup.py). That's a good thing too, because these include <config.h>, not "pyconfig.h".

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Mon Jul 29 21:09:05 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 16:09:05 -0400
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: Your message of "Mon, 29 Jul 2002 20:08:24 +0300." <20020729200824.A5391@hishome.net>
References: <20020729200824.A5391@hishome.net>
Message-ID: <200207292009.g6TK95i06131@pcp02138704pcs.reston01.va.comcast.net>

> http://www.python.org/sf/584626
>
> This patch removes the limitation of not allowing yield in the try part
> of a try/finally. The dealloc function of a generator checks if the
> generator is still alive and resumes it one last time from the return
> instruction at the end of the code, causing any try/finally blocks to be
> triggered. Any exceptions raised are treated just like exceptions in a
> __del__ finalizer (printed and ignored).

Try building Python in debug mode, and then run the test suite.
I get a fatal error in test_generators (but not when that test is run in isolation): Fatal Python error: ../Python/ceval.c:2256 object at 0x40b05654 has negative ref count -1 --Guido van Rossum (home page: http://www.python.org/~guido/) From oren-py-d@hishome.net Mon Jul 29 21:14:26 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 29 Jul 2002 23:14:26 +0300 Subject: [Python-Dev] patch: try/finally in generators In-Reply-To: <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jul 29, 2002 at 03:40:01PM -0400 References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> <20020729220944.A6113@hishome.net> <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020729231426.A7209@hishome.net> On Mon, Jul 29, 2002 at 03:40:01PM -0400, Guido van Rossum wrote: > > > I'm not sure I understand what it does. The return instruction at > > > the end of the code, if I take this literally, isn't enclosed in > > > any try/finally blocks. So how can this have the desired effect? > > > > They're on the block stack. The stack unwind does the rest. > > OK. Your way to find the last return statement gives me the willies > though. :-( Yeah, I know. I'm not too proud of it but I was looking for instant gratification... > Of course, since Jython has garbage collection, your finally clause > may be executed later than you had expected it, or not at all! Are > you sure you want this? The same question applies to the __del__ method of any local variables inside the suspended generator. I tend to rely on the reference counting semantics of CPython in much of my code and I don't feel bad about it. > There are a few other places that invoke Python code in a dealloc > handler (__del__ invocations in classobject.c and typeobject.c). They > do a more complicated dance with the reference count. Can you check > that you are doing the right thing? 

The __del__ method gets a reference to the object so it needs to be revived. Generators are much simpler because the generator function does not have any reference to the generator object.

Oren

From nas@python.ca Mon Jul 29 21:25:15 2002
From: nas@python.ca (Neil Schemenauer)
Date: Mon, 29 Jul 2002 13:25:15 -0700
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jul 29, 2002 at 03:40:01PM -0400
References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> <20020729220944.A6113@hishome.net> <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020729132515.A31926@glacier.arctrix.com>

Guido van Rossum wrote:
> I'd also like to get Neil Schemenauer's review of the code, since he
> knows best how generators work under the covers.

I'm pretty sure it can be made to work (at least for CPython). The proposed patch is not correct since it doesn't handle "finally" code that creates a new reference to the generator. Also, setting the instruction pointer to the return statement is really ugly, IMO. There could be valid code out there that does not end with LOAD_CONST+RETURN.

Those are minor details though. We need to decide if we really want this. For example, what happens if 'yield' is inside the finally block? With the proposed patch:

    >>> def f():
    ...     try:
    ...         assert 0
    ...     finally:
    ...         return 1
    ...
    >>> f()
    1
    >>> def g():
    ...     try:
    ...         assert 0
    ...     finally:
    ...         yield 1
    ...
    >>> list(g())
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "<stdin>", line 3, in g
    AssertionError

Maybe some people would expect [1] in the second case.

  Neil

From guido@python.org Mon Jul 29 21:21:07 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 16:21:07 -0400
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: Your message of "Mon, 29 Jul 2002 23:14:26 +0300."
<20020729231426.A7209@hishome.net> References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> <20020729220944.A6113@hishome.net> <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net> <20020729231426.A7209@hishome.net> Message-ID: <200207292021.g6TKL7u06204@pcp02138704pcs.reston01.va.comcast.net> > Yeah, I know. I'm not too proud of it but I was looking for instant > gratification... The search for instant gratification probably ties a lot of the Python community together... > > Of course, since Jython has garbage collection, your finally clause > > may be executed later than you had expected it, or not at all! Are > > you sure you want this? > > The same question applies to the __del__ method of any local variables > inside the suspended generator. I tend to rely on the reference counting > semantics of CPython in much of my code and I don't feel bad about it. But __del__ is in essence asynchronous. On the other hand, try/finally is traditionally completely synchronous. Adding a case where a finally clause can execute asynchronously (or not at all, if there is a global ref or cyclical garbage keeping the generator alive) sounds like a breach of promise almost. > > There are a few other places that invoke Python code in a dealloc > > handler (__del__ invocations in classobject.c and typeobject.c). They > > do a more complicated dance with the reference count. Can you check > > that you are doing the right thing? > > The __del__ method gets a reference to the object so it needs to be > revived. Generators are much simpler because the generator function does > not have any reference to the generator object. But you still have to be careful with how you incref/decref -- see my fatal error report in debug mode. 

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Mon Jul 29 21:30:36 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 16:30:36 -0400
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: Your message of "Mon, 29 Jul 2002 13:25:15 PDT." <20020729132515.A31926@glacier.arctrix.com>
References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> <20020729220944.A6113@hishome.net> <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net> <20020729132515.A31926@glacier.arctrix.com>
Message-ID: <200207292030.g6TKUaW06234@pcp02138704pcs.reston01.va.comcast.net>

> I'm pretty sure it can be made to work (at least for CPython). The
> proposed patch is not correct since it doesn't handle "finally" code
> that creates a new reference to the generator.

As Oren pointed out, how can you create a reference to the generator when its reference count was 0? There can't be a global referencing it, and (unlike __del__) you aren't getting a pointer to yourself.

> Also, setting the instruction pointer to the return statement is
> really ugly, IMO.

Agreed. ;-)

> There could be valid code out there that does not end with
> LOAD_CONST+RETURN.

The current code generator always generates that as the final instruction. But someone might add an optimizer that takes that out if it is provably unreachable...

> Those are minor details though. We need to decide if we really want
> this. For example, what happens if 'yield' is inside the finally block?
> With the proposed patch:
>
> >>> def f():
> ...     try:
> ...         assert 0
> ...     finally:
> ...         return 1
> ...
> >>> f()
> 1
> >>> def g():
> ...     try:
> ...         assert 0
> ...     finally:
> ...         yield 1
> ...
> >>> list(g())
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "<stdin>", line 3, in g
> AssertionError
>
> Maybe some people would expect [1] in the second case.

The latter is not new; that example has no yield in the try clause. If you'd used a for loop or next() calls, you'd have noticed the yield got executed normally, but the following next() call raises AssertionError.

But this example behaves strangely:

>>> def f():
...     try:
...         yield 1
...         assert 0
...     finally:
...         yield 2
...
>>> a = f()
>>> a.next()
1
>>> del a
>>>

What happens at the yield here?!?! If I put prints before and after it, the finally clause is entered, but not exited. Bizarre!!!

--Guido van Rossum (home page: http://www.python.org/~guido/)

From nas@python.ca Mon Jul 29 21:41:12 2002
From: nas@python.ca (Neil Schemenauer)
Date: Mon, 29 Jul 2002 13:41:12 -0700
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: <20020729132515.A31926@glacier.arctrix.com>; from nas@python.ca on Mon, Jul 29, 2002 at 01:25:15PM -0700
References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> <20020729220944.A6113@hishome.net> <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net> <20020729132515.A31926@glacier.arctrix.com>
Message-ID: <20020729134112.B31926@glacier.arctrix.com>

I wrote:
> The proposed patch is not correct since it doesn't handle "finally"
> code that creates a new reference to the generator.

It looks like that's not actually a problem since you can't get a hold of a reference to the generator. However, here's another bit of nastiness:

    $ cat > bad.py
    import sys
    import gc

    def g():
        global gen
        self = gen
        try:
            yield 1
        finally:
            gen = self

    gen = g()
    gen.next()
    del gen
    gc.collect()
    print gen
    $ ./python bad.py
    Segmentation fault (core dumped)

Basically, the GC has to be taught that generators can have finalizers and it may not be safe to collect them. If we allow try/finally in generators then they can cause uncollectible garbage. It's not a show stopper but something else to take into consideration.
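
For readers trying the cases above on a later interpreter: this is only a sketch, and it assumes semantics that postdate this thread (generators with a close() method and yield permitted inside try/finally, as in Python 2.5 and later), written in Python 3 syntax. It does not describe the behaviour of the patch under discussion. The names reader, yields_in_finally, and log are illustrative.

```python
def reader(log):
    """Generator with a yield inside the try part of a try/finally."""
    try:
        yield 1
        yield 2
    finally:
        # Runs when the generator finishes or is closed while suspended.
        log.append("cleanup")


def yields_in_finally():
    """Mirrors the f() example above: a yield inside the finally clause."""
    try:
        yield 1
    finally:
        yield 2


log = []
gen = reader(log)
first = next(gen)   # suspended at the first yield, inside the try block
gen.close()         # raises GeneratorExit at that yield; finally runs now

bad = yields_in_finally()
next(bad)
try:
    bad.close()     # the finally clause yields instead of exiting
    close_error = None
except RuntimeError as exc:
    close_error = exc   # close() reports that GeneratorExit was ignored
```

Under those assumed semantics, an explicit close() makes the cleanup point deterministic, while a generator that yields again during close is reported as an error rather than silently abandoned in its finally clause.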
Neil From guido@python.org Mon Jul 29 21:38:12 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 29 Jul 2002 16:38:12 -0400 Subject: [Python-Dev] patch: try/finally in generators In-Reply-To: Your message of "Mon, 29 Jul 2002 13:41:12 PDT." <20020729134112.B31926@glacier.arctrix.com> References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> <20020729220944.A6113@hishome.net> <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net> <20020729132515.A31926@glacier.arctrix.com> <20020729134112.B31926@glacier.arctrix.com> Message-ID: <200207292038.g6TKcC806273@pcp02138704pcs.reston01.va.comcast.net> > Basically, the GC has to be taught that generators can have finalizers > and it may not be safe to collect them. If we allow try/finally in > generators then they can cause uncollectible garbage. It's not a show > stopper but something else to take into consideration. I leave this in your capable hands. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Mon Jul 29 22:42:50 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 29 Jul 2002 17:42:50 -0400 Subject: [Python-Dev] patch: try/finally in generators In-Reply-To: <20020729134112.B31926@glacier.arctrix.com> Message-ID: [Neil Schemenauer] > ... > Basically, the GC has to be taught that generators can have finalizers > and it may not be safe to collect them. If we allow try/finally in > generators Note that we already allow try/finally in generators. The only prohibition is against having a yield stmt in the try clause of a try/finally construct (YINTCOATFC). > then they can cause uncollectible garbage. It's not a show > stopper but something else to take into consideration. I'm concerned about semantic clarity. A "finally" block is supposed to get executed upon leaving its associated "try" block. 
A yield stmt doesn't leave the try block in that sense, so there's no justification for executing the finally block unless the generator is resumed, and the try block is exited "for real" via some other means (a return, an exception, or falling off the end of the try block). We could have allowed YINTCOATFC under those rules with clarity, but it would have been a great surprise then that the finally clause may never get executed at all. Better to outlaw it than that (or, as the PEP says, that would be "too much a violation of finally's purpose to bear"). Making up new control flow out of thin air upon destructing a generator ("OK, let's pretend that the generator was actually resumed in that case, and also pretend that a return statement immediately followed the yield") is plainly a hack; and because it's still possible then that the finally clause may never get executed at all (because it's possible to create an uncollectible generator), it's too much a violation of finally's purpose to bear even so. When I've needed resource-cleanup in a generator, I've made the generator a method of a class, and put the resources in instance variables. Then they're easy to clean up at will (even via a __del__ method, if need be; but the uncertainty about when and whether __del__ methods get called is already well-known, and I don't want to extend that fuzziness to 'finally' clauses too -- we left those reliable against anything short of a system crash, and IMO it's important to keep them that bulletproof). From guido@python.org Mon Jul 29 23:01:03 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 29 Jul 2002 18:01:03 -0400 Subject: [Python-Dev] patch: try/finally in generators In-Reply-To: Your message of "Mon, 29 Jul 2002 17:42:50 EDT." References: Message-ID: <200207292201.g6TM14G06652@pcp02138704pcs.reston01.va.comcast.net> [Tim] > Note that we already allow try/finally in generators. 
The only > prohibition is against having a yield stmt in the try clause of a > try/finally construct (YINTCOATFC). > [Neil] > > then they can cause uncollectible garbage. It's not a show > > stopper but something else to take into consideration. > > I'm concerned about semantic clarity. A "finally" block is supposed > to get executed upon leaving its associated "try" block. A yield > stmt doesn't leave the try block in that sense, so there's no > justification for executing the finally block unless the generator > is resumed, and the try block is exited "for real" via some other > means (a return, an exception, or falling off the end of the try > block). We could have allowed YINTCOATFC under those rules with > clarity, but it would have been a great surprise then that the > finally clause may never get executed at all. Better to outlaw it > than that (or, as the PEP says, that would be "too much a violation > of finally's purpose to bear"). > > Making up new control flow out of thin air upon destructing a > generator ("OK, let's pretend that the generator was actually > resumed in that case, and also pretend that a return statement > immediately followed the yield") is plainly a hack; and because it's > still possible then that the finally clause may never get executed > at all (because it's possible to create an uncollectible generator), > it's too much a violation of finally's purpose to bear even so. > > When I've needed resource-cleanup in a generator, I've made the > generator a method of a class, and put the resources in instance > variables. Then they're easy to clean up at will (even via a > __del__ method, if need be; but the uncertainty about when and > whether __del__ methods get called is already well-known, and I > don't want to extend that fuzziness to 'finally' clauses too -- we > left those reliable against anything short of a system crash, and > IMO it's important to keep them that bulletproof). 
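The cleanup pattern Tim describes above (a generator written as a method of a class, with its resources held in instance variables) might be sketched roughly as follows. This is only an illustration in present-day Python syntax; the `LineReader` class, its `lines()` generator and its `close()` method are hypothetical names, and a plain list stands in for a real resource such as an open file.

```python
class LineReader:
    """Hypothetical example: owns a resource and exposes a generator method."""

    def __init__(self, lines):
        # Stand-in for a real resource such as an open file or socket.
        self.resource = list(lines)
        self.closed = False

    def lines(self):
        # No try/finally around the yield: cleanup does not depend on
        # the generator ever being resumed or collected.
        for line in self.resource:
            yield line

    def close(self):
        # Explicit cleanup, callable at any time by whoever holds the instance.
        self.resource = None
        self.closed = True


reader = LineReader(["a", "b", "c"])
gen = reader.lines()
first = next(gen)   # consume one item, then abandon the generator
reader.close()      # the instance releases its resource on request
```

The point of the design is that cleanup is driven by the object's owner rather than by whatever happens to the suspended generator frame.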
I hope that Oren will withdraw his patch based upon this explanation.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin@v.loewis.de Mon Jul 29 23:46:03 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 30 Jul 2002 00:46:03 +0200
Subject: [Python-Dev] pickling of large arrays
In-Reply-To: <20020729130200.73932.qmail@web20201.mail.yahoo.com>
References: <20020729130200.73932.qmail@web20201.mail.yahoo.com>
Message-ID:

"Ralf W. Grosse-Kunstleve" writes:

> We are using Boost.Python to expose reference-counted C++ container
> types (similar to std::vector<>) to Python. E.g.:
>
> from arraytbx import shared
> d = shared.double(1000000) # double array with a million elements
> c = shared.complex_double(100) # std::complex array
> # and many more types, incl. several custom C++ types

I recommend to implement pickling differently, e.g. by returning a byte string with the underlying memory representation. If producing a duplicate is still not acceptable, I recommend to inherit from the Pickler class.

Regards,
Martin

From tim.one@comcast.net Mon Jul 29 23:49:01 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 29 Jul 2002 18:49:01 -0400
Subject: [Python-Dev] test_imaplib failing elsewhere?
Message-ID:

On Windows:

    > python ../lib/test/test_imaplib.py
    incorrect result when converting (2033, 5, 18, 3, 33, 20, 2, 138, 0)
    incorrect result when converting '"18-May-2033 13:33:20 +1000"'
    >

IOW, it tries two things, and fails on both. Beefing up its

    if t1 <> t2:
        print 'incorrect result when converting', `t`

by adding

        print ' t1 was', `t1`
        print ' t2 was', `t2`

yields

    incorrect result when converting (2033, 5, 18, 3, 33, 20, 2, 138, 0)
     t1 was '"18-May-2033 03:33:20 -0500"'
     t2 was '"18-May-2033 04:33:20 -0400"'
    incorrect result when converting '"18-May-2033 13:33:20 +1000"'
     t1 was '"18-May-2033 13:33:20 +1000"'
     t2 was '"17-May-2033 23:33:20 -0400"'

I'm not sure when it started failing, but within the last week ...
OK, rev 1.3 of test_imaplib.py worked here, and rev 1.4 broke it, checked in 2-3 days ago.

From pinard@iro.umontreal.ca Tue Jul 30 00:05:56 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 29 Jul 2002 19:05:56 -0400
Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines
In-Reply-To: <15685.35726.678832.241665@anthem.wooz.org>
References: <15685.35726.678832.241665@anthem.wooz.org>
Message-ID:

[Barry A. Warsaw]

> It has been a while since I posted a copy of PEP 1 to the mailing
> lists and newsgroups.

Thanks for giving me this opportunity. There is a tiny detail that bothers me:

> The format of the author entry should be
>     address@dom.ain (Random J. User)
> if the email address is included, and just
>     Random J. User
> if the address is not given.

This makes me jump fifteen years behind (or so, I do not remember times), at the time of the great push so the Internet prefers:

    Random J. User <address@dom.ain>

It is more reasonable to always give the real name, optionally followed by an email, than to consider that the real name is a mere comment for the email address. Oh, I know some hackers who praise themselves as login names or dream having positronic brains :-), but most of us are humans before anything else!

Could the PEP be reformulated, at least, for leaving the choice open?

--
François Pinard http://www.iro.umontreal.ca/~pinard

From martin@v.loewis.de Mon Jul 29 23:52:57 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 30 Jul 2002 00:52:57 +0200
Subject: [Python-Dev] HAVE_CONFIG_H
In-Reply-To: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net>
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net>
Message-ID:

Guido van Rossum writes:

> I see no references to HAVE_CONFIG_H in the source code (except one
> #undef in readline.c), yet we #define it on the command line. Is that
> still necessary?
It's autoconf tradition to use that; it would replace DEFS to either many -D options, or -DHAVE_CONFIG_H (if AC_CONFIG_HEADER appears). I don't think we need this, and it can safely be removed.

Regards,
Martin

From martin@v.loewis.de Tue Jul 30 00:22:44 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 30 Jul 2002 01:22:44 +0200
Subject: [Python-Dev] test_imaplib failing elsewhere?
In-Reply-To:
References:
Message-ID:

Tim Peters writes:

> On Windows:
>
> > python ../lib/test/test_imaplib.py
> incorrect result when converting (2033, 5, 18, 3, 33, 20, 2, 138, 0)
> incorrect result when converting '"18-May-2033 13:33:20 +1000"'
> >
>
> IOW, it tries two things, and fails on both.

It fails on Linux and Solaris as well.

Regards,
Martin

From pinard@iro.umontreal.ca Tue Jul 30 00:30:30 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 29 Jul 2002 19:30:30 -0400
Subject: [Python-Dev] Re: HAVE_CONFIG_H
In-Reply-To:
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net>
Message-ID:

[Martin v. Loewis]

> Guido van Rossum writes:
> > I see no references to HAVE_CONFIG_H in the source code (except one
> > #undef in readline.c), yet we #define it on the command line. Is that
> > still necessary?
> It's autoconf tradition to use that; it would replace DEFS to either
> many -D options, or -DHAVE_CONFIG_H (if AC_CONFIG_HEADER appears).
> I don't think we need this, and it can safely be removed.

The many `-D' options which appear when `AC_CONFIG_HEADER' is not used are rather inelegant, they create a lot, really a lot of clumsiness in `make' output. The idea, but you surely know it, was to regroup all auto-configured definitions into a single header file, and limit the `-D' to the sole `HAVE_CONFIG_H', or almost. While the:

    #if HAVE_CONFIG_H
    # include <config.h>
    #endif

idiom, for some widely used sources, was to cope with `AC_CONFIG_HEADER' being defined in some projects, and not in others.
There is no need to include `config.h', nor to create it, if all `#define's have been already done through a litany of `-D' options. -- François Pinard http://www.iro.umontreal.ca/~pinard From nhodgson@bigpond.net.au Tue Jul 30 00:37:18 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Tue, 30 Jul 2002 09:37:18 +1000 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook> Message-ID: <029801c23758$e13594b0$3da48490@neil> Thomas Heller: > ..., but I understand Neil's requirements. > > Can they be fulfilled by adding some kind of UnlockObject() > call to the 'safe buffer interface', which should mean 'I won't > use the pointer received by getsaferead/writebufferproc any more'? Yes, that is exactly what I want. Neil From nhodgson@bigpond.net.au Tue Jul 30 00:50:43 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Tue, 30 Jul 2002 09:50:43 +1000 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020729164532.48588.qmail@web40110.mail.yahoo.com> Message-ID: <02a201c2375a$c1299f70$3da48490@neil> Scott Gilbert: > What happens when you've locked the buffer and passed a pointer to the I/O > system for an asynchronous operation, but before that operation has > completed, your main program wants to resize the buffer due to a user > generated event? That is up to the application or class designer. There are three reasonable responses I see: throw an exception, buffer the user event, or ignore the user event. The only thing guaranteed by providing the safe buffer interface is that the pointer will remain valid. > > I don't want counting mutexes. I'm not defining behavior that needs > > them. > > > > You said you wanted the locks to keep a count. So that you could call > acquire() multiple times and have the buffer not truly become unlocked > until release() was called the same amount of times. 
I'm willing to adopt > any terminology you want for the purpose of this discussion. I think I > understand the semantics or the counting operation, but I want to > understand more what actually happens when the buffer is locked. When the buffer is locked, it returns a pointer and promises that the pointer will remain valid until the buffer is unlocked. The buffer interface could be defined either to allow multiple (counted) locks or to fail further lock attempts. Counted locks would be applicable in more circumstances but require more implementation. I would prefer counted but it is not that important as a counting layer can be implemented over a single lock interface if needed. Neil From nhodgson@bigpond.net.au Tue Jul 30 01:02:53 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Tue, 30 Jul 2002 10:02:53 +1000 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020729165419.31643.qmail@web40111.mail.yahoo.com> Message-ID: <02c001c2375c$74037de0$3da48490@neil> Scott Gilbert: > I assume this means any call to getsafereadpointer()/getsafewritepointer() > will increment the lock count. So the UnlockObject() calls will be > mandatory. The UnlockObject call will be needed if you do want to permit resizing (again). It will not be needed for statically sized objects, including all the types that are included in the PEP currently, or where you have an object that will no longer need to be resizable. For example: you construct a sound buffer, fill it with noise, then lock it so that a pointer to its data can be given to the asynch sound playing function. If you don't need to write to the sound buffer again, it doesn't need to be unlocked. > Either that, or you'll have an explicit LockObject() call as > well. What behavior should happen when a resise is attempted while the > lock count is positive? The most common response will be some form of failure, probably throwing an exception. 
Other responses, such as buffering the resize, may be sensible in particular circumstances. Neil From greg@cosc.canterbury.ac.nz Tue Jul 30 01:21:42 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 30 Jul 2002 12:21:42 +1200 (NZST) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <00c601c23707$35819a20$3da48490@neil> Message-ID: <200207300021.g6U0LgOG018189@kuku.cosc.canterbury.ac.nz> > This restricts the set of objects that can be buffers to statically > sized objects. I'd prefer that dynamically resizable objects be able > to be buffers. That's what bothers me about the proposal -- I suspect that this restriction will turn out to be too restrictive to make it useful. But maybe locking could be built into the safe-buffer protocol? Resizable objects wanting to support the safe buffer protocol would be required to maintain a lock count which is incremented on each getsafebufferptr call. There would also have to be a releasesafebufferptr call to decrement the lock count. As long as the lock count is nonzero, attempting to resize the object would raise an exception. That way, resizable objects could be used as asynchronous I/O buffers as long as you didn't try to resize them while actually doing I/O. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Tue Jul 30 02:12:19 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 30 Jul 2002 13:12:19 +1200 (NZST) Subject: [Python-Dev] Generator cleanup idea (patch: try/finally in generators) In-Reply-To: Message-ID: <200207300112.g6U1CJoO018210@kuku.cosc.canterbury.ac.nz> > but it would have been a great surprise then that the finally clause > may never get executed at all. 
Better to outlaw it than that (or, as
> the PEP says, that would be "too much a violation of finally's purpose
> to bear").

I don't think you'd really be breaking any promises. After all, if someone wrote

    def asdf():
        try:
            something_that_never_returns()
        finally:
            ...

they wouldn't have much ground for complaint that the finally never got executed. The case we're talking about seems much the same situation.

> When I've needed resource-cleanup in a generator, I've made the generator a
> method of a class, and put the resources in instance variables. Then
> they're easy to clean up at will (even via a __del__ method, if need
> be;

I take it you usually provide a method for explicit cleanup. How about giving generator-iterators one, then, called maybe close() or abort(). The effect would be to raise an appropriate exception at the point of the yield, triggering any except or finally blocks.

This method could even be added to the general iterator protocol (implementing it would be optional). It would then provide a standard name for people to use for cleanup methods in their own iterator classes.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From greg@cosc.canterbury.ac.nz Tue Jul 30 02:25:44 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 30 Jul 2002 13:25:44 +1200 (NZST)
Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines
In-Reply-To:
Message-ID: <200207300125.g6U1PiGC018255@kuku.cosc.canterbury.ac.nz>

pinard@iro.umontreal.ca:

> It is more reasonable to always give the real name, optionally
> followed by an email, than to consider that the real name is a mere
> comment for the email address.

Not necessarily -- it depends on your point of view. I've always thought of the "To:" line as an address, not a salutation.
In other words, an instruction to the email system as to where to send the message, not the name of the recipient. Putting a person's name in there at all seems to me a sop to computer-illiterate wimps who go all wobbly at the knees when they see anything as esoteric-looking as an email address. :-) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Tue Jul 30 02:42:38 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 30 Jul 2002 13:42:38 +1200 (NZST) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <200207291703.g6TH3tk29997@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207300142.g6U1gcSZ018273@kuku.cosc.canterbury.ac.nz> Guido: > I don't like where this is going. Let's not add locking to the buffer > protocol. Do you still object to it even in the form I proposed in my last message? (I.e. no separate "lock" call, locking is implicit in the getxxxbuffer calls.) It does make the protocol slightly more complicated to use (must remember to make a release call when you're finished with the pointer) but it seems like a good tradeoff to me for the flexibility gained. Note that there can't be any problems with deadlock, since no blocking is involved. Maybe "locking" is even the wrong term -- it's more a form of reference counting. > probably nothing that could possibly invoke the Python interpreter > recursively, since that might release the GIL. This would generally > mean that calls to Py_DECREF() are unsafe while holding on to a buffer > pointer! That could be fixed by incrementing the Python refcount as long as a pointer is held. That could be done even without the rest of my locking proposal. 
Of course, if you do that you need a matching release call, so you might as well implement the locking while you're at it. Mind you, if a release call is necessary, whoever holds the pointer must also hold a reference to the Python object, so that they can make the release call. So incrementing the Python refcount might not be necessary after all! Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From pinard@iro.umontreal.ca Tue Jul 30 02:46:34 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 29 Jul 2002 21:46:34 -0400 Subject: [Python-Dev] Re: Priority queue (binary heap) python code In-Reply-To: <20020721193057.A1891@arizona.localdomain> References: <20020624213318.A5740@arizona.localdomain> <200207200606.g6K66Um28510@pcp02138704pcs.reston01.va.comcast.net> <20020721193057.A1891@arizona.localdomain> Message-ID: [Guido van Rossum] > [...] I admire the compactness of his code. I believe that this would make > a good addition to the standard library, as a friend of the bisect module. > [...] The only change I would make would be to make heap[0] the lowest > value rather than the highest. I propose to call it heapq.py. [Kevin O'Connor] > Looks good to me. In case you going forward with `heapq', and glancing through my notes, I see that "Courageous" implemented a priority queue algorithm as a C extension, and discussed it on python-list on 2000-05-29. I'm not really expecting that you aim something else than a pure Python version, and I'm not pushing nor pulling for it, as I do not have an opinion. In any case, I'll keep these messages a few more days: just ask, and I'll send you a copy of what I saved at the time. P.S. 
- I'm quickly loosing interests in these bits of C code meant for speed, as if I ever need C speed, the wonderful Pyrex tool (from Greg Ewing) gives it to me while allowing the algorithm to be expressed in a language close to Python. I even wonder if Pyrex could not be a proper avenue for the development of some parts of the Python distribution itself. -- François Pinard http://www.iro.umontreal.ca/~pinard From pinard@iro.umontreal.ca Tue Jul 30 03:33:14 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 29 Jul 2002 22:33:14 -0400 Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines In-Reply-To: <200207300125.g6U1PiGC018255@kuku.cosc.canterbury.ac.nz> References: <200207300125.g6U1PiGC018255@kuku.cosc.canterbury.ac.nz> Message-ID: [Greg Ewing] > pinard@iro.umontreal.ca: > > It is more reasonable to always give the real name, optionally > > followed by an email, that to consider that the real name is a mere > > comment for the email address. > Not necessarily -- it depends on your point of view. An email address may change over time, but one's name do not change often. In a lifetime of maintenance, I saw email addresses of a lot of correspondents fluctuate more or less over time. Only two or three persons asked me to correct their name after they got it legalistically modified. The contact point for a PEP is really a given human, whatever his/her email address may currently be. The modern Internet usage is to write the name first, and the email address after, between angular brackets. So, I'm suggesting that the PEP documents the popular, modern usage. > I've always thought of the "To:" line as an address, not a salutation. It is dual. The human reads the civil name, the machine reads the email address. Many MUA's have limited space for the message summaries, and they favour the civil name over the email address in the listings. 
-- François Pinard http://www.iro.umontreal.ca/~pinard From sholden@holdenweb.com Tue Jul 30 04:43:23 2002 From: sholden@holdenweb.com (Steve Holden) Date: Mon, 29 Jul 2002 23:43:23 -0400 Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines References: <15685.35726.678832.241665@anthem.wooz.org> Message-ID: <00eb01c2377b$41dd4340$6300000a@holdenweb.com> ----- Original Message ----- From: "François Pinard" To: "Barry A. Warsaw" Cc: ; Sent: Monday, July 29, 2002 7:05 PM Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines > [Barry A. Warsaw] > > > It has been a while since I posted a copy of PEP 1 to the mailing > > lists and newsgroups. > > Thanks for giving me this opportunity. There is a tiny detail that > bothers me: > > > The format of the author entry should be > > address@dom.ain (Random J. User) > > if the email address is included, and just > > Random J. User > > if the address is not given. > > This makes me jump fifteen years behind (or so, I do not remember times), > at the time of the great push so the Internet prefers: > > Random J. User > > It is more reasonable to always give the real name, optionally followed by > an email, that to consider that the real name is a mere comment for the > email address. Oh, I know some hackers who praise themselves as login > names or dream having positronic brains :-), but most of us are humans > before anything else! > > Could the PEP be reformulated, at least, for leaving the choice opened? > Should we instead say that any acceptable RFC822 address would be an acceptable alternative for a simple name? If so you'd get naiive mail users complaining that they couldn't reach "@python.org:sholden@holdenweb.com" (for example). 
I don't really see why the address format has to agree with any particular other format: if you're going to use it in a program then there's no reason why you shouldn't mangle it into whatever form you (or your possibly-crippled software) requires :-) The major benefit of the present situation is that it's well-defined. I don't feel additional alternatived would be helpful here, especially when the existing format is RFC822-compliant. though-i-admit-i'm-not-up-to-speed-on-rfc2822-ly y'rs - steve ----------------------------------------------------------------------- Steve Holden http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/pwp/ ----------------------------------------------------------------------- From sholden@holdenweb.com Tue Jul 30 04:51:22 2002 From: sholden@holdenweb.com (Steve Holden) Date: Mon, 29 Jul 2002 23:51:22 -0400 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020729165419.31643.qmail@web40111.mail.yahoo.com> <200207291703.g6TH3tk29997@pcp02138704pcs.reston01.va.comcast.net> <095701c23722$84e06770$e000a8c0@thomasnotebook> <200207291710.g6THAin30057@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <013e01c2377c$5f61c2f0$6300000a@holdenweb.com> ----- Original Message ----- From: "Guido van Rossum" To: "Thomas Heller" Cc: "Scott Gilbert" ; "Neil Hodgson" ; Sent: Monday, July 29, 2002 1:10 PM Subject: Re: [Python-Dev] pre-PEP: The Safe Buffer Interface > > > If an object's buffer isn't allocated for the object's life > > > when the object is created, it should not support the "safe" version > > > of the protocol (maybe a different name would be better), and users > > > should not release the GIL while using on to the pointer. > > > > 'Persistent' buffer interface? Too long? > > No, persistent typically refers to things that survive longer than a > process. Maybe 'static' buffer interface would work. > "cautious"? 
regards ----------------------------------------------------------------------- Steve Holden http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/pwp/ ----------------------------------------------------------------------- From just@letterror.com Tue Jul 30 06:55:20 2002 From: just@letterror.com (Just van Rossum) Date: Tue, 30 Jul 2002 07:55:20 +0200 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules posixmodule.c,2.247,2.248 In-Reply-To: Message-ID: nnorwitz@users.sourceforge.net wrote: > Update of /cvsroot/python/python/dist/src/Modules > In directory usw-pr-cvs1:/tmp/cvs-serv31715/Modules > > Modified Files: > posixmodule.c > Log Message: > Use PyArg_ParseTuple() instead of PyArg_Parse() which is deprecated > > Index: posixmodule.c > =================================================================== [ ... ] > ! else if (!PyArg_Parse(arg, "(ll)", &atime, &mtime)) { [ ... ] > ! else if (!PyArg_ParseTuple(arg, "ll", &atime, &mtime)) { [ ... ] Probably no biggie here, but I'd like to point out that there is a significant difference between the two calls: the former will allow any sequence for 'arg', but the latter insists on a tuple. For that reason I always use PyArg_Parse() to parse coordinate pairs and the like: it greatly enhanced the usability in those cases. Examples of this usage can be found in the Mac subtree. Just From xscottg@yahoo.com Tue Jul 30 07:10:16 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Mon, 29 Jul 2002 23:10:16 -0700 (PDT) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <02a201c2375a$c1299f70$3da48490@neil> Message-ID: <20020730061016.32588.qmail@web40103.mail.yahoo.com> --- Neil Hodgson wrote: > Scott Gilbert: > > > What happens when you've locked the buffer and passed a pointer to the > > I/O system for an asynchronous operation, but before that operation has > > completed, your main program wants to resize the buffer due to a user > > generated event? 
> > That is up to the application or class designer. There are three > reasonable responses I see: throw an exception, buffer the user event, or > ignore the user event. The only thing guaranteed by providing the safe > buffer interface is that the pointer will remain valid. > The guarantee about the pointer remaining valid while the acquire_count is positive is clear. I'm concerned about what the other thread (the one that wants to resize it) is going to do while the lock count is positive. You've listed three possibilities, but let's narrow it down to the strategy that you intend to use in Scintilla (a real use case). I believe all three strategies lead to something undesirable (be it polling, deadlock, a confused user, or ???), but I don't want to exhaustively scrutinize all possibilities until we come up with one good example that you intend to use (it would bore you to read them, and me to type them). So what exactly would you do in Scintilla? (Or pick another good use case if you prefer.) > > The buffer interface could be defined either to allow multiple > (counted) locks or to fail further lock attempts. Counted locks would be > applicable in more circumstances but require more implementation. I would > prefer counted but it is not that important as a counting layer can be > implemented over a single lock interface if needed. > A single lock interface can be implemented over an object without any locking. Have the lockable object return simple "fixed buffer objects" with a limited lifespan.
From xscottg@yahoo.com Tue Jul 30 07:10:26 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Mon, 29 Jul 2002 23:10:26 -0700 (PDT) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <200207300142.g6U1gcSZ018273@kuku.cosc.canterbury.ac.nz> Message-ID: <20020730061026.33569.qmail@web40106.mail.yahoo.com> --- Greg Ewing wrote: > Guido: > > > I don't like where this is going. Let's not add locking to the buffer > > protocol. > > Do you still object to it even in the form I proposed in > my last message? (I.e. no separate "lock" call, locking > is implicit in the getxxxbuffer calls.) > > It does make the protocol slightly more complicated to > use (must remember to make a release call when you're > finished with the pointer) but it seems like a good > tradeoff to me for the flexibility gained. > I realize this wasn't addressed to me, and that I said I would butt out when you were in favor of canning the proposal altogether, but I won't let that get in the way. :-) We haven't seen a semi-thorough use case where the locking behavior is beneficial yet. While I appreciate and agree with the intent of trying to get a more flexible object, I think there is at least one of several problems buried down a little further than you and Neil are looking. I'm concerned that this is very much like the segment count features of the current PyBufferProcs. It was apparently designed for more generality, and while no one uses it, everyone has to check that the segment count is one or raise an exception. If there is no realizable benefit to the acquire/release semantics of the new interface, then this is just extra burden too. Let's find a realizable benefit before we muck up Thomas's good simple proposal with this stuff.
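A minimal sketch of the counted acquire/release idea under discussion, written as a toy Python model rather than the proposed C interface. The class name CountedBuffer and its methods are hypothetical; only the name acquire_count comes from the thread. It assumes the first of the three responses Neil lists (raise an exception) when a resize is attempted while the count is positive.

```python
class CountedBuffer:
    """Toy model (hypothetical, not a real Python API) of a resizable
    buffer whose storage must not move while acquire_count is positive."""

    def __init__(self, data=b""):
        self._data = bytearray(data)
        self.acquire_count = 0

    def acquire(self):
        # Stand-in for handing out the raw pointer: the caller is promised
        # that the storage stays put until the matching release().
        self.acquire_count += 1
        return self._data

    def release(self):
        if self.acquire_count == 0:
            raise RuntimeError("release() without a matching acquire()")
        self.acquire_count -= 1

    def extend(self, more):
        # Assumed policy: refuse to resize while anyone holds the buffer.
        if self.acquire_count > 0:
            raise BufferError("buffer is acquired; cannot resize")
        self._data.extend(more)


buf = CountedBuffer(b"abc")
storage = buf.acquire()          # e.g. an extension starting async I/O
try:
    buf.extend(b"def")           # resize attempted while still acquired
    outcome = "resized"
except BufferError:
    outcome = "refused"
buf.release()                    # the holder is done with the pointer
buf.extend(b"def")               # the same call now goes through
```

Whether the first extend() call succeeds depends only on whether the holder has already called release(), which is the timing dependence being questioned here.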
In the current Python core, I can think of the following objects that would need a retrofit to this new interface (there may be more): string unicode mmap array The string, unicode, and mmap objects do not resize or reallocate by design. So for them the extra acquire/release requirements are burden with no benefit. The array object does resize (via the extend method among others). So lets say that an array object gets passed to an extension that locks the buffer and grabs the pointer. The extension releases the GIL so that another thread can work on the array object. Another thread comes in and wants to do a resize (via the extend method). (We don't need to introduce threads for this since the asynchronous I/O case is just the same.) If extend() is called while thread 1 has the array locked, it can: A) raise an exception or return an error B) block until the lock count returns to zero C) ??? .) .) Case A is troublesome because depending on thread scheduling/disk performance, you will or won't get the exception. So you've got a weird race condition where an operation might have been valid if it had only executed a split second later, but due to misfortune it raised an exception. I think this non-determinism is ugly at the very least. However since it's recoverable, you could try again (polling), or ignore the request completely (odd behavior). I think this is what both you and Neil are proposing, and I don't see how this is terribly useful. While I don't think B is the strategy anyone is proposing, it means you have two blocking objects in effect (the GIL and whatever the array uses to implement blocking). If we're not extremely careful, we can get deadlock here. I'm still looking for any good examples that fall into cases C and beyond. Neil offered a third example that might fit. He says that he could buffer the user event that led to the resize operation. If that is his strategy, I'd like to see it explained further. 
It sounds like taking the event and not processing it until the asynchronous I/O operation has completed. At which point I wonder what using asynchronous I/O achieved since the resize operation had to wait synchronously for the I/O to complete. This also sounds suspiciously like blocking the resize thread, but I won't argue that point. From Jack.Jansen@cwi.nl Tue Jul 30 10:07:56 2002 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Tue, 30 Jul 2002 11:07:56 +0200 Subject: [Python-Dev] HAVE_CONFIG_H In-Reply-To: <3D459E8A.1050602@lemburg.com> Message-ID: On Monday, July 29, 2002, at 09:59 , M.-A. Lemburg wrote: > Guido van Rossum wrote: >> I see no references to HAVE_CONFIG_H in the source code (except one >> #undef in readline.c), yet we #define it on the command line. Is that >> still necessary? > > What about these ? > > ./Mac/mwerks/old/mwerks_nsgusi_config.h: > -- define HAVE_CONFIG_H [...] They're turds, they can go. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From mwh@python.net Tue Jul 30 10:33:31 2002 From: mwh@python.net (Michael Hudson) Date: 30 Jul 2002 10:33:31 +0100 Subject: [Python-Dev] patch: try/finally in generators In-Reply-To: Guido van Rossum's message of "Mon, 29 Jul 2002 16:30:36 -0400" References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> <20020729220944.A6113@hishome.net> <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net> <20020729132515.A31926@glacier.arctrix.com> <200207292030.g6TKUaW06234@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <2m1y9llfv8.fsf@starship.python.net> Guido van Rossum writes: > > There could be valid code out there that does not end with > > LOAD_CONST+RETURN.
> > The current code generator always generates that as the final > instruction. But someone might add an optimizer that takes that out > if it is provably unreachable... The bytecodehacks has one of them :) It would probably scream and run away if presented with a generator, but that's just a matter of bitrot. Cheers, M. -- All obscurity will buy you is time enough to contract venereal diseases. -- Tim Peters, python-dev From nhodgson@bigpond.net.au Tue Jul 30 10:48:44 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Tue, 30 Jul 2002 19:48:44 +1000 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020730061016.32588.qmail@web40103.mail.yahoo.com> Message-ID: <005d01c237ae$4b2f6670$3da48490@neil> Scott Gilbert: > You've listed three possibilities, but lets narrow it down to the strategy > that you intend to use in Scintilla (a real use case). I believe all three > strategies lead to something undesirable (be it polling, deadlock, a > confused user, or ???), but I don't want to exhaustively scrutinize all > possibilities until we come up with one good example that you intend to use > (it would bore you to read them, and me to type them). > > So what exactly would you do in Scintilla? (Or pick another good use case > if you prefer.) I'd prefer to ignore the input. Unfortunately users prefer a higher degree of friendliness :-( Since Scintilla is a component within a user interface, it shares this responsibility with the container application with the application being the main determinant. If I was writing a Windows-specific application that used Scintilla, and I wanted to use Asynchronous I/O then my preferred technique would be to change the message processing loop to leave the UI input messages in the queue until the I/O had completed. Once the I/O had completed then the message loop would change back to processing all messages which would allow the banked up input to come through. 
If I was feeling ambitious I may try to process some UI messages, possible detecting pressing Escape to abort a file load if it turned out the read was taking too long. > A single lock interface can be implemented over an object without any > locking. Have the lockable object return simple "fixed buffer objects" > with a limited lifespan. This returns to the possibility of indeterminate lifespan as mentioned earlier in the thread. > At which point I wonder what using asynchronous I/O achieved since the > resize operation had to wait synchronously for the I/O to complete. This > also sounds suspiciously like blocking the resize thread, but I won't argue > that point. There may be other tasks that the application can perform while waiting for the I/O to complete, such as displaying, styling or line-wrapping whatever text has already arrived (assuming that there are some facilities for discovering this) or performing similar tasks for other windows. Neil From smurf@noris.de Tue Jul 30 11:24:05 2002 From: smurf@noris.de (Matthias Urlichs) Date: Tue, 30 Jul 2002 12:24:05 +0200 Subject: [Python-Dev] Generator cleanup idea (patch: try/finally in generators) Message-ID: Greg: > I take it you usually provide a method for explicit cleanup. > How about giving generator-iterators one, then, called > maybe close() or abort(). The effect would be to raise > an appropriate exception at the point of the yield, > triggering any except or finally blocks. Objects already have a perfectly valid cleanup method -- "__del__". If your code is so complicated that it needs a try/yield/finally, it would make much more sense to convert the thing to an iterator object. It probably would make the code a whole lot more understandable, too. (It did happen with mine.) Stated another way: functions which yield stuff are special. If that specialness gets buried in nested try/except/finally/whatever constructs, things tend to get messy. 
Better make that messiness explicit by packaging the code in an object with well-defined methods. This is actually easy to do because of the existence of iterators, because this code

    def some_iter(foo):
        prepare(foo)

        try:
            for i in foo:
                yield something(i)
        finally:
            cleanup(foo)

painlessly transmutes to this:

    class some_iter(object):
        def __init__(self, foo):
            prepare(foo)

            self.foo = foo
            self.it = foo.__iter__()

        def next(self):
            i = self.it.next()
            return something(i)

        def __del__(self):
            cleanup(self.foo)

Personally I think the latter version is more readable because the important thing, i.e. how the next element is obtained, is clearly separated from the rest of the code (and one level dedented, compared to the first version). -- Matthias Urlichs From mwh@python.net Tue Jul 30 11:27:11 2002 From: mwh@python.net (Michael Hudson) Date: 30 Jul 2002 11:27:11 +0100 Subject: [Python-Dev] seeing off SET_LINENO Message-ID: <2mvg6xjytc.fsf@starship.python.net> I've submitted a(nother) patch to sf that removes SET_LINENO: http://www.python.org/sf/587993 It supports tracing by digging around in the co_lnotab[*] to see when execution moves onto a different line. I think it's more or less sound but any changes to the interpreter main loop are going to be subtle, so I have a few points to raise here. In no particular order:

1) this is a change I'd like to see anyway: the use of f->f_lasti in the main loop is confusing. let's just set it at the start of opcode dispatch and leave it the hell alone. there's actually what is probably a very old bug in the implementation of SET_LINENO. It does more or less this:

    f->f_lasti = INSTR_OFFSET();
    /* call the trace function */

It should do this:

    f->f_lasti = INSTR_OFFSET() - 3;
    /* call the trace function */

The field is called f_LASTi, after all...

2) As I say in the patch, I will buy anyone a beer who can explain (without using LLTRACE or reading a lot of dis.py output) why we don't call the trace function on POP_TOP opcodes.
3) The patch changes behaviour -- for the better! You're now rather less likely to get the trace function called several times per line.
4) The patch installs a descriptor for f_lineno so that there is no incompatibility for Python code. The question is what to do with the f_lineno field in the C struct? Remove it? That would (probably) mean bumping PY_API_VERSION. Leave it in? Then its contents would usually be meaningless (keeping it up to date would rather defeat the point of this patch).
5) We've already bumped the MAGIC for 2.3a0, so we probably don't need to do that again.
6) Someone should teach dis.py how to find line breaks from the co_lnotab. I can do this, but not right now....
7) The changes tickle what may be a very old bug in freeze: http://www.python.org/sf/588452
8) I haven't measured the performance impact of the changes to code that is tracing or code that isn't. There's a possible optimization mentioned in the patch for traced code. For not traced code it MAY be worthwhile putting the tracing support code in a static function somewhere so there's less code to jump over in the main loop (for i-caches and such).
9) This patch stops LLTRACE telling you when execution moves onto a different line. This could be restored, but a) I expect I'm the only person to have used LLTRACE recently (debugging this patch). b) This will cause obfuscation, so I'd prefer to do it last.
Comments welcome! Cheers, M. [*] I've cheated with my sigmonster: -- 34. The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information. -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html From barry@python.org Tue Jul 30 13:14:05 2002 From: barry@python.org (Barry A.
Warsaw) Date: Tue, 30 Jul 2002 08:14:05 -0400 Subject: [Python-Dev] seeing off SET_LINENO References: <2mvg6xjytc.fsf@starship.python.net> Message-ID: <15686.33549.262832.740505@anthem.wooz.org> >>>>> "MH" == Michael Hudson writes: MH> 3) The patch changes behaviour -- for the better! You're now MH> rather less likely to get the trace function called several MH> times per line. Does this change affect debugging? Have you tested how this change might interact with e.g. hotshot? -Barry From neal@metaslash.com Tue Jul 30 13:19:20 2002 From: neal@metaslash.com (Neal Norwitz) Date: Tue, 30 Jul 2002 08:19:20 -0400 Subject: [Python-Dev] PyArg_ParseTuple vs. PyArg_Parse References: Message-ID: <3D468448.45C22891@metaslash.com> Just van Rossum wrote: > > nnorwitz@users.sourceforge.net wrote: > > > Use PyArg_ParseTuple() instead of PyArg_Parse() which is deprecated > > > > Index: posixmodule.c > > =================================================================== > [ ... ] > > ! else if (!PyArg_Parse(arg, "(ll)", &atime, &mtime)) { > [ ... ] > > ! else if (!PyArg_ParseTuple(arg, "ll", &atime, &mtime)) { > [ ... ] > > Probably no biggie here, but I'd like to point out that there is a significant > difference between the two calls: the former will allow any sequence for 'arg', > but the latter insists on a tuple. For that reason I always use PyArg_Parse() to > parse coordinate pairs and the like: it greatly enhanced the usability in those > cases. Examples of this usage can be found in the Mac subtree. I'll back out this change. But this raises the question should PyArg_Parse() be deprecated or should just METH_OLDARGS be deprecated? 
Neal From mwh@python.net Tue Jul 30 13:31:53 2002 From: mwh@python.net (Michael Hudson) Date: 30 Jul 2002 13:31:53 +0100 Subject: [Python-Dev] seeing off SET_LINENO In-Reply-To: barry@python.org's message of "Tue, 30 Jul 2002 08:14:05 -0400" References: <2mvg6xjytc.fsf@starship.python.net> <15686.33549.262832.740505@anthem.wooz.org> Message-ID: <2md6t5ieh2.fsf@starship.python.net> barry@python.org (Barry A. Warsaw) writes: > >>>>> "MH" == Michael Hudson writes: > > MH> 3) The patch changes behaviour -- for the better! You're now > MH> rather less likely to get the trace function called several > MH> times per line. > > Does this change affect debugging? Hmm, I hadn't actually dared to run pdb with my patch... have now, and it seems OK. There is a difference: The bytecode for, say,

    def f():
        print 1

begins with two SET_LINENO's. One is for the line containing "def f():", one is for "print 1". My patch means the debugger doesn't stop on the "def f():" line -- unsurprisingly, given that no execution ever takes place on that line. It would be possible to force a call to the trace function on entry to the function. In fact, there's a commented out block for this in my patch. Another approach would presumably be for pdb to stop on 'call' trace events as well as 'line' ones. I don't really understand, or use all that often, pdb. Also, you currently stop twice on the first line of a for loop, but only once with my patch. There are probably other situations of excessive SET_LINENO emission. I know Skip (think it was him) killed a couple last week. Bug compatibility is possible here too, but I don't see the advantage. > Have you tested how this change might interact with e.g. hotshot? test_hotshot was very important to me as evidence I was making progress! It currently fails due to the not-calling-trace-on-def-line issue, but as I said, I think this is a *good* thing... Cheers, M. -- The ability to quote is a serviceable substitute for wit. -- W.
Somerset Maugham From mal@lemburg.com Tue Jul 30 13:42:19 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 30 Jul 2002 14:42:19 +0200 Subject: [Python-Dev] seeing off SET_LINENO References: <2mvg6xjytc.fsf@starship.python.net> <15686.33549.262832.740505@anthem.wooz.org> <2md6t5ieh2.fsf@starship.python.net> Message-ID: <3D4689AB.2020107@lemburg.com> Michael Hudson wrote: > barry@python.org (Barry A. Warsaw) writes: > > >>>>>>>"MH" == Michael Hudson writes: >>>>>> >> MH> 3) The patch changes behaviour -- for the better! You're now >> MH> rather less likely to get the trace function called several >> MH> times per line. >> >>Does this change affect debugging? > > > Hmm, I hadn't actually dared to run pdb with my patch... have now, and > it seems OK. > > There is a difference: > > The bytecode for, say, > > def f(): > print 1 > > begins with two SET_LINENO's. One is for the line containing "def > f():", one is for "print 1". My patch means the debugger doesn't stop > on the "def f():" line -- unsurprisingly, given that no execution ever > takes place on that line. This might be used in debugging application to setup some environment *before* diving into the function itself. Note that many C debuggers stop at the declare line of a function as well (because they execute stack setup code), so a sudden change in this would probably confuse users of todays Python IDEs. > It would be possible to force a call to the trace function on entry to > the function. In fact, there's a commented out block for this in my > patch. Another approach would presuambly be for pdb to stop on 'call' > trace events as well as 'line' ones. I don't really understand, or > use all that often, pdb. > > Also, you currently stop twice on the first line of a for loop, but > only once with my patch. There are probably other situations of > excessive SET_LINENO emission. I know Skip (think it was him) killed > a couple last week. 
Bug compatibility is possible here too, but I > don't see the advantage. > > >>Have you tested how this change might interact with e.g. hotshot? > > > test_hotshot was very important to me as evidence I was making > progress! > > It currently fails due to the not-calling-trace-on-def-line issue, but > as I said, I think this is a *good* thing... Have you also tested this with the commonly used Python IDEs out there ? E.g. IDLE, IDLE-fork, PythonWorks, WingIDE, Emacs, BlackAdder, BOA Constructor, etc. etc. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mwh@python.net Tue Jul 30 13:58:10 2002 From: mwh@python.net (Michael Hudson) Date: 30 Jul 2002 13:58:10 +0100 Subject: [Python-Dev] seeing off SET_LINENO In-Reply-To: "M.-A. Lemburg"'s message of "Tue, 30 Jul 2002 14:42:19 +0200" References: <2mvg6xjytc.fsf@starship.python.net> <15686.33549.262832.740505@anthem.wooz.org> <2md6t5ieh2.fsf@starship.python.net> <3D4689AB.2020107@lemburg.com> Message-ID: <2mado9id99.fsf@starship.python.net> "M.-A. Lemburg" writes: > > begins with two SET_LINENO's. One is for the line containing "def > > f():", one is for "print 1". My patch means the debugger doesn't stop > > on the "def f():" line -- unsurprisingly, given that no execution ever > > takes place on that line. > > This might be used in debugging application to setup some > environment *before* diving into the function itself. So do that when you get the 'call' trace function call! That's what it's there for. > Note that many C debuggers stop at the declare line of > a function as well (because they execute stack setup code), > so a sudden change in this would probably confuse users of > todays Python IDEs. However, sudden changes here are *very* likely to confuse, I agree. 
Perhaps bug-compatibility is something to aim for. [...] > >>Have you tested how this change might interact with e.g. hotshot? > > > > > > test_hotshot was very important to me as evidence I was making > > progress! > > > > It currently fails due to the not-calling-trace-on-def-line issue, but > > as I said, I think this is a *good* thing... > > Have you also tested this with the commonly used Python IDEs > out there ? E.g. IDLE, IDLE-fork, PythonWorks, WingIDE, Emacs, > BlackAdder, BOA Constructor, etc. etc. No. Don't think it's relevant to IDLE (at least, I can't see any calls to settrace in there that aren't commented out). Python-mode's pdbtrack should just carry on working. Don't have easy access to the others. I'd be amazed if other IDEs were severely adversely affected. Anyway, isn't this what alphas are for? I have no problem emailing a relevant person for each of the above IDEs and pointing out that this change may affect them. Cheers, M. -- If a train station is a place where a train stops, what's a workstation? -- unknown (to me, at least) From barry@python.org Tue Jul 30 16:16:05 2002 From: barry@python.org (Barry A. Warsaw) Date: Tue, 30 Jul 2002 11:16:05 -0400 Subject: [Python-Dev] seeing off SET_LINENO References: <2mvg6xjytc.fsf@starship.python.net> <15686.33549.262832.740505@anthem.wooz.org> <2md6t5ieh2.fsf@starship.python.net> Message-ID: <15686.44469.22988.913649@anthem.wooz.org> >>>>> "MH" == Michael Hudson writes: MH> Hmm, I hadn't actually dared to run pdb with my patch... have MH> now, and it seems OK. Cool. MH> There is a difference: MH> The bytecode for, say, | def f(): | print 1 MH> begins with two SET_LINENO's. One is for the line containing MH> "def f():", one is for "print 1". My patch means the debugger MH> doesn't stop on the "def f():" line -- unsurprisingly, given MH> that no execution ever takes place on that line. MH> It would be possible to force a call to the trace function on MH> entry to the function.
In fact, there's a commented out block MH> for this in my patch. Another approach would presuambly be MH> for pdb to stop on 'call' trace events as well as 'line' ones. MH> I don't really understand, or use all that often, pdb. I can't decide whether it would be good to stop on the def or not. Not doing so makes pdb act more like gdb, which also only stops on the first executable line, so maybe that's a good thing. MH> Also, you currently stop twice on the first line of a for MH> loop, but only once with my patch. That /is/ a good thing! >> Have you tested how this change might interact with >> e.g. hotshot? MH> test_hotshot was very important to me as evidence I was making MH> progress! :) MH> It currently fails due to the not-calling-trace-on-def-line MH> issue, but as I said, I think this is a *good* thing... So maybe we need two different behaviors depending on whether we're debugging or profiling. That might get a bit kludgy if we're using the same trace mechanism for both, but I'm sure it's tractable. -Barry From guido@python.org Tue Jul 30 16:26:23 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 30 Jul 2002 11:26:23 -0400 Subject: [Python-Dev] PyArg_ParseTuple vs. PyArg_Parse In-Reply-To: Your message of "Tue, 30 Jul 2002 08:19:20 EDT." <3D468448.45C22891@metaslash.com> References: <3D468448.45C22891@metaslash.com> Message-ID: <200207301526.g6UFQNZ09835@odiug.zope.com> > I'll back out this change. But this raises the question should > PyArg_Parse() be deprecated or should just METH_OLDARGS be deprecated? Only METH_OLDARGS. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@python.org Tue Jul 30 16:27:46 2002 From: barry@python.org (Barry A. 
Warsaw) Date: Tue, 30 Jul 2002 11:27:46 -0400 Subject: [Python-Dev] seeing off SET_LINENO References: <2mvg6xjytc.fsf@starship.python.net> <15686.33549.262832.740505@anthem.wooz.org> <2md6t5ieh2.fsf@starship.python.net> <3D4689AB.2020107@lemburg.com> <2mado9id99.fsf@starship.python.net> Message-ID: <15686.45170.12110.403625@anthem.wooz.org> >>>>> "MH" == Michael Hudson writes: MH> Python-mode's pdbtrack should just carry on working. Yup, because it is basically just looking for the pdb prompt, so it shouldn't care. -Barry From guido@python.org Tue Jul 30 16:32:24 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 30 Jul 2002 11:32:24 -0400 Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines In-Reply-To: Your message of "Mon, 29 Jul 2002 23:43:23 EDT." <00eb01c2377b$41dd4340$6300000a@holdenweb.com> References: <15685.35726.678832.241665@anthem.wooz.org> <00eb01c2377b$41dd4340$6300000a@holdenweb.com> Message-ID: <200207301532.g6UFWOt09871@odiug.zope.com> > > This makes me jump fifteen years behind (or so, I do not remember times), > > at the time of the great push so the Internet prefers: > > > > Random J. User > > > > It is more reasonable to always give the real name, optionally followed by > > an email, that to consider that the real name is a mere comment for the > > email address. Oh, I know some hackers who praise themselves as login > > names or dream having positronic brains :-), but most of us are humans > > before anything else! > > > > Could the PEP be reformulated, at least, for leaving the choice opened? Yes. The rule will be Name first, Email second. We won't convert all 200 existing PEPs to that format yet, but if someone with commit privileges wants to volunteer, be our guest. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Tue Jul 30 16:36:13 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Tue, 30 Jul 2002 11:36:13 -0400 Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines References: <15685.35726.678832.241665@anthem.wooz.org> Message-ID: <15686.45677.421287.717866@anthem.wooz.org> >>>>> "FP" == François Pinard writes: >> It has been a while since I posted a copy of PEP 1 to the >> mailing lists and newsgroups. FP> Thanks for giving me this opportunity. There is a tiny detail FP> that bothers me: >> The format of the author entry should be address@dom.ain >> (Random J. User) if the email address is included, and just >> Random J. User if the address is not given. FP> This makes me jump fifteen years behind (or so, I do not FP> remember times), at the time of the great push so the Internet FP> prefers: FP> Random J. User FP> It is more reasonable to always give the real name, optionally FP> followed by an email, that to consider that the real name is a FP> mere comment for the email address. This is a good point. Originally we thought it was more important to be able to contact the author, but there are quite a few reasons to revise this intention. As pointed out, email addresses change. Also, experience has shown that most of the discussions about PEPs are conducted on the public forums (mailing lists / newsgroups), so that's a fine way to contact the people working on the PEP. And of course, we allow the PEP authors to obfuscate or omit their email addresses altogether. FP> Could the PEP be reformulated, at least, for leaving the FP> choice opened? I'd rather have one preferred way of writing the header, so I'm going to change PEP 1 to mandate "Random J. User " with the email address optional. However, I'm going to let the old style remain for historical purposes since I don't think it's worth changing the existing PEPs.
Thanks, -Barry From guido@python.org Tue Jul 30 16:37:36 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 30 Jul 2002 11:37:36 -0400 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: Your message of "Tue, 30 Jul 2002 13:42:38 +1200." <200207300142.g6U1gcSZ018273@kuku.cosc.canterbury.ac.nz> References: <200207300142.g6U1gcSZ018273@kuku.cosc.canterbury.ac.nz> Message-ID: <200207301537.g6UFbad09910@odiug.zope.com> > > I don't like where this is going. Let's not add locking to the buffer > > protocol. > > Do you still object to it even in the form I proposed in > my last message? (I.e. no separate "lock" call, locking > is implicit in the getxxxbuffer calls.) Yes, I still object. Having to make a call to release a resource with a function call is extremely error-prone, as we've seen with reference counting. There are too many cases where some early exit from a piece of code doesn't make the release call. > It does make the protocol slightly more complicated to > use (must remember to make a release call when you're > finished with the pointer) but it seems like a good > tradeoff to me for the flexibility gained. I'm not sure I see the use case. The main data types for which I expect this will be used would be strings and the new 'bytes' type, and both have fixed buffers that never move. > > probably nothing that could possibly invoke the Python interpreter > > recursively, since that might release the GIL. This would generally > > mean that calls to Py_DECREF() are unsafe while holding on to a buffer > > pointer! > > That could be fixed by incrementing the Python refcount as > long as a pointer is held. That could be done even without > the rest of my locking proposal. Of course, if you do that you > need a matching release call, so you might as well implement > the locking while you're at it. I think you misunderstand what I wrote. 
A Py_DECREF() for an *unrelated* object can invoke Python code (if it ends up deleting a class instance with a __del__ method). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jul 30 16:39:30 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 30 Jul 2002 11:39:30 -0400 Subject: [Python-Dev] Re: HAVE_CONFIG_H In-Reply-To: Your message of "Mon, 29 Jul 2002 19:30:30 EDT." References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200207301539.g6UFdUS09930@odiug.zope.com> > > > I see no references to HAVE_CONFIG_H in the source code (except one > > > #undef in readline.c), yet we #define it on the command line. Is that > > > still necessary? > > > It's autoconf tradition to use that; it would replace DEFS to either > > many -D options, or -DHAVE_CONFIG_H (if AC_CONFIG_HEADER appears). > > > I don't think we need this, and it can safely be removed. > > The many `-D' options which appear when `AC_CONFIG_HEADER' is not used > are rather inelegant, they create a lot, really a lot of clumsiness in > `make' output. The idea, but you surely know it, was to regroup all > auto-configured definitions into a single header file, and limit the `-D' > to the sole `HAVE_CONFIG_H', or almost. While the: > > #if HAVE_CONFIG_H > # include > #endif > > idiom, for some widely used sources, was to cope with `AC_CONFIG_HEADER' > being defined in some projects, and not in others. There is no need to > include `config.h', nor to create it, if all `#define's have been already > done through a litany of `-D' options. Since we don't use this idiom, we can safely remove the -DHAVE_CONFIG_H (if we can find where it is set).
--Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Tue Jul 30 17:09:40 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Tue, 30 Jul 2002 18:09:40 +0200 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020730061016.32588.qmail@web40103.mail.yahoo.com> <005d01c237ae$4b2f6670$3da48490@neil> Message-ID: <025a01c237e3$82eb7c90$e000a8c0@thomasnotebook> [Scott] > > A single lock interface can be implemented over an object without any > > locking. Have the lockable object return simple "fixed buffer objects" > > with a limited lifespan. > [Neil] > This returns to the possibility of indeterminate lifespan as mentioned > earlier in the thread. > Can't you do something like this (maybe this is what Scott has in mind):

    static void _unlock(void *ptr, MyObject *self)
    {
        /* do whatever needed to unlock the object */
        self->locked--;
        Py_DECREF(self);
    }

    static PyObject*
    MyObject_GetBuffer(MyObject *self)
    {
        /* Do whatever needed to lock the object */
        self->locked++;
        Py_INCREF(self);
        return PyCObject_FromVoidPtrAndDesc(self->ptr, self, _unlock);
    }

In plain text: Provide a method which returns a 'view' into your object's buffer after locking the object. The view holds a reference to the object; the object is unlocked and decref'd when the view is destroyed. In practice something better than a PyCObject will be used, and this one can even implement the 'fixed buffer' interface. Thomas From guido@python.org Tue Jul 30 17:22:11 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 30 Jul 2002 12:22:11 -0400 Subject: [Python-Dev] Re: HAVE_CONFIG_H In-Reply-To: Your message of "Tue, 30 Jul 2002 11:39:30 EDT."
<200207301539.g6UFdUS09930@odiug.zope.com> References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> <200207301539.g6UFdUS09930@odiug.zope.com> Message-ID: <200207301622.g6UGMBl17143@odiug.zope.com> > Since we don't use this idiom, we can safely remove the > -DHAVE_CONFIG_H (if we can find where it is set). I looked. It's generated by AC_OUTPUT. I don't think I can get rid of it. So never mind. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jul 30 17:39:00 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 30 Jul 2002 12:39:00 -0400 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: Your message of "Tue, 30 Jul 2002 09:37:18 +1000." <029801c23758$e13594b0$3da48490@neil> References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook> <029801c23758$e13594b0$3da48490@neil> Message-ID: <200207301639.g6UGd1S17363@odiug.zope.com> > > ..., but I understand Neil's requirements. > > > > Can they be fulfilled by adding some kind of UnlockObject() > > call to the 'safe buffer interface', which should mean 'I won't > > use the pointer received by getsaferead/writebufferproc any more'? > > Yes, that is exactly what I want. I guess I still don't understand Neil's requirements. What can't be done with the existing buffer interface (which requires you to hold the GIL while using the pointer)? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From oren-py-d@hishome.net Tue Jul 30 17:39:27 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 30 Jul 2002 12:39:27 -0400 Subject: [Python-Dev] Generator cleanup idea (patch: try/finally in generators) In-Reply-To: References: Message-ID: <20020730163927.GA63620@hishome.net> On Tue, Jul 30, 2002 at 12:24:05PM +0200, Matthias Urlichs wrote: > def some_iter(foo): > prepare(foo) > > try: > for i in foo: > yield something(i) > finally: > cleanup(foo) > > painlessly transmutes to this: > > class some_iter(object): > def __init__(self, foo): > prepare(foo) > > self.foo = foo > self.it = foo.__iter__() > > def next(self): > i = self.it.next() > return something(i) > > def __del__(self): > cleanup(self.foo) Bad example. Generators are useful precisely because some types of code are quite painful to change to this form. Anyway, it appears that generators can create reference loops if someone was perverted enough to keep a reference to the generator inside the generator. It doesn't seem to be worth the effort of making generators into GC objects just for this. Oren From pinard@iro.umontreal.ca Tue Jul 30 17:44:06 2002 From: pinard@iro.umontreal.ca (François Pinard) Date: 30 Jul 2002 12:44:06 -0400 Subject: [Python-Dev] Re: HAVE_CONFIG_H In-Reply-To: <200207301539.g6UFdUS09930@odiug.zope.com> References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> <200207301539.g6UFdUS09930@odiug.zope.com> Message-ID: [Guido van Rossum] > Since we don't use this idiom, we can safely remove the > -DHAVE_CONFIG_H (if we can find where it is set). I guess you will have to override some `m4' macro within `configure.in', or related machinery. If things did not change too much, this probably means diving into `acgeneral.m4', to find out how and where this is best done.
-- François Pinard http://www.iro.umontreal.ca/~pinard From nas@python.ca Tue Jul 30 17:56:58 2002 From: nas@python.ca (Neil Schemenauer) Date: Tue, 30 Jul 2002 09:56:58 -0700 Subject: [Python-Dev] Generator cleanup idea (patch: try/finally in generators) In-Reply-To: <20020730163927.GA63620@hishome.net>; from oren-py-d@hishome.net on Tue, Jul 30, 2002 at 12:39:27PM -0400 References: <20020730163927.GA63620@hishome.net> Message-ID: <20020730095658.A3196@glacier.arctrix.com> Oren Tirosh wrote: > It doesn't seem to be worth the effort of making generators > into GC objects just for this. What do you mean. They are already GC objects. Neil From thomas.heller@ion-tof.com Tue Jul 30 17:51:41 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Tue, 30 Jul 2002 18:51:41 +0200 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook> <029801c23758$e13594b0$3da48490@neil> <200207301639.g6UGd1S17363@odiug.zope.com> Message-ID: <03a101c237e9$60fb3a20$e000a8c0@thomasnotebook> From: "Guido van Rossum" > > > ..., but I understand Neil's requirements. > > > > > > Can they be fulfilled by adding some kind of UnlockObject() > > > call to the 'safe buffer interface', which should mean 'I won't > > > use the pointer received by getsaferead/writebufferproc any more'? > > > > Yes, that is exactly what I want. > > I guess I still don't understand Neil's requirements. What can't be > done with the existing buffer interface (which requires you to hold > the GIL while using the pointer)? Processing in Python :-(. 
Thomas From pinard@iro.umontreal.ca Tue Jul 30 17:53:38 2002 From: pinard@iro.umontreal.ca (François Pinard) Date: 30 Jul 2002 12:53:38 -0400 Subject: [Python-Dev] Re: HAVE_CONFIG_H In-Reply-To: <200207301622.g6UGMBl17143@odiug.zope.com> References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> <200207301539.g6UFdUS09930@odiug.zope.com> <200207301622.g6UGMBl17143@odiug.zope.com> Message-ID: [Guido van Rossum] > > Since we don't use this idiom, we can safely remove the > > -DHAVE_CONFIG_H (if we can find where it is set). > I looked. It's generated by AC_OUTPUT. I don't think I can get rid > of it. So never mind. :-) Maybe AC_OUTPUT, or macros called by AC_OUTPUT, can be overridden. If this is not easy to do, you might want to discuss the matter with Akim, Cc:ed. Maybe he could tear down AC_OUTPUT in parts so the overriding gets easier? I know my friend Akim as a good, helpful and nice fellow! Don't fear him! :-) -- François Pinard http://www.iro.umontreal.ca/~pinard From thomas.heller@ion-tof.com Tue Jul 30 18:37:19 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Tue, 30 Jul 2002 19:37:19 +0200 Subject: [Python-Dev] PEP 298 - the Fixed Buffer Interface Message-ID: <04da01c237ef$c103ac30$e000a8c0@thomasnotebook> Here is PEP 298 - the Fixed Buffer Interface, posted to get feedback from the Python community. Enjoy! Thomas PS: I'm going on a 2-week vacation at the end of this week, so don't hold your breath on replies from me if you post after, let's say, Thursday. ----- PEP: 298 Title: The Fixed Buffer Interface Version: $Revision: 1.3 $ Last-Modified: $Date: 2002/07/30 16:52:53 $ Author: Thomas Heller Status: Draft Type: Standards Track Created: 26-Jul-2002 Python-Version: 2.3 Post-History: Abstract This PEP proposes an extension to the buffer interface called the 'fixed buffer interface'.
The fixed buffer interface fixes the flaws of the 'old' buffer interface as defined in Python versions up to and including 2.2, see [1]: The lifetime of the retrieved pointer is clearly defined. The buffer size is returned as a 'size_t' data type, which allows access to large buffers on platforms where sizeof(int) != sizeof(void *). Specification The fixed buffer interface exposes new functions which return the size and the pointer to the internal memory block of any python object which chooses to implement this interface. The size and pointer returned must be valid as long as the object is alive (has a positive reference count). So, only objects which never reallocate or resize the memory block are allowed to implement this interface. The fixed buffer interface omits the memory segment model which is present in the old buffer interface - only a single memory block can be exposed. Implementation Define a new flag in Include/object.h: /* PyBufferProcs contains bf_getfixedreadbuffer and bf_getfixedwritebuffer */ #define Py_TPFLAGS_HAVE_GETFIXEDBUFFER (1L<<15) This flag would be included in Py_TPFLAGS_DEFAULT: #define Py_TPFLAGS_DEFAULT ( \ .... Py_TPFLAGS_HAVE_GETFIXEDBUFFER | \ .... 0) Extend the PyBufferProcs structure by new fields in Include/object.h: typedef size_t (*getfixedreadbufferproc)(PyObject *, void **); typedef size_t (*getfixedwritebufferproc)(PyObject *, void **); typedef struct { getreadbufferproc bf_getreadbuffer; getwritebufferproc bf_getwritebuffer; getsegcountproc bf_getsegcount; getcharbufferproc bf_getcharbuffer; /* fixed buffer interface functions */ getfixedreadbufferproc bf_getfixedreadbufferproc; getfixedwritebufferproc bf_getfixedwritebufferproc; } PyBufferProcs; The new fields are present if the Py_TPFLAGS_HAVE_GETFIXEDBUFFER flag is set in the object's type. The Py_TPFLAGS_HAVE_GETFIXEDBUFFER flag implies the Py_TPFLAGS_HAVE_GETCHARBUFFER flag. 
The getfixedreadbufferproc and getfixedwritebufferproc functions return the size in bytes of the memory block on success, and fill in the passed void * pointer on success. If these functions fail - either because an error occurs or no memory block is exposed - they must set the void * pointer to NULL and raise an exception. The return value is undefined in these cases and should not be used. Usually the getfixedwritebufferproc and getfixedreadbufferproc functions aren't called directly, they are called through convenience functions declared in Include/abstract.h: int PyObject_AsFixedReadBuffer(PyObject *obj, void **buffer, size_t *buffer_len); int PyObject_AsFixedWriteBuffer(PyObject *obj, void **buffer, size_t *buffer_len); These functions return 0 on success, set buffer to the memory location and buffer_len to the length of the memory block in bytes. On failure, or if the fixed buffer interface is not implemented by obj, they return -1 and set an exception. Backward Compatibility The size of the PyBufferProcs structure changes if this proposal is implemented, but the type's tp_flags slot can be used to determine if the additional fields are present. Reference Implementation Will be uploaded to the SourceForge patch manager by the author. Additional Notes/Comments Python strings, Unicode strings, mmap objects, and maybe other types would expose the fixed buffer interface, but the array type would *not*, because its memory block may be reallocated during its lifetime. Community Feedback Greg Ewing doubts the fixed buffer interface is needed at all, he thinks the normal buffer interface could be used if the pointer is (re)fetched each time it's used. This seems to be dangerous, because even innocent looking calls to the Python API like Py_DECREF() may trigger execution of arbitrary Python code. 
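The hazard described in the paragraph above can be seen from the Python side. What follows is only a minimal sketch, assuming CPython's reference counting and a current Python 3 interpreter; the class name Resizer and the use of a bytearray as the resizable object are hypothetical stand-ins, not part of the PEP. Dropping the last reference to a seemingly unrelated object runs its __del__ method, which resizes a buffer that C code might still hold a raw pointer into.

```python
class Resizer:
    """Hypothetical helper: its finalizer grows a buffer it was handed."""

    def __init__(self, buf):
        self.buf = buf

    def __del__(self):
        # Arbitrary Python code runs here when the last reference goes away.
        self.buf.extend(b"x" * 1024)


buf = bytearray(b"abc")       # stands in for an object with a movable buffer
helper = Resizer(buf)         # an "unrelated" object from the caller's view
size_before = len(buf)

del helper                    # CPython: refcount reaches zero, __del__ runs now
size_after = len(buf)
```

After the del statement the buffer has grown, so any pointer fetched before it could be stale; this is the situation the fixed buffer interface rules out by only admitting objects that never reallocate.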
Neil Hodgson wants to expose pointers to memory blocks with limited lifetime: do some kind of lock operation on the object, retrieve the pointer, use it, and unlock the object again. While the author sees the need for this, it cannot be addressed by this proposal. Being required to call a function once the pointer received from the getfixedbufferprocs is no longer in use seems too error prone. Credits Scott Gilbert came up with the name 'fixed buffer interface'. References [1] The buffer interface http://mail.python.org/pipermail/python-dev/2000-October/009974.html [2] The Buffer Problem http://www.python.org/peps/pep-0296.html Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: From martin@v.loewis.de Tue Jul 30 18:55:59 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 30 Jul 2002 19:55:59 +0200 Subject: [Python-Dev] Re: HAVE_CONFIG_H In-Reply-To: <200207301622.g6UGMBl17143@odiug.zope.com> References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> <200207301539.g6UFdUS09930@odiug.zope.com> <200207301622.g6UGMBl17143@odiug.zope.com> Message-ID: Guido van Rossum writes: > I looked. It's generated by AC_OUTPUT. I don't think I can get rid > of it. So never mind. :-) Just remove the @DEFS@ from Makefile.pre.in.
Regards, Martin From oren-py-d@hishome.net Tue Jul 30 19:13:08 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 30 Jul 2002 21:13:08 +0300 Subject: [Python-Dev] Generator cleanup idea (patch: try/finally in generators) In-Reply-To: <20020730095658.A3196@glacier.arctrix.com>; from nas-dated-1028480220.f9673d@python.ca on Tue, Jul 30, 2002 at 09:56:58AM -0700 References: <20020730163927.GA63620@hishome.net> <20020730095658.A3196@glacier.arctrix.com> Message-ID: <20020730211308.A27690@hishome.net> On Tue, Jul 30, 2002 at 09:56:58AM -0700, Neil Schemenauer wrote: > Oren Tirosh wrote: > > It doesn't seem to be worth the effort of making generators > > into GC objects just for this. > > What do you mean. They are already GC objects. Ooops. Oren From guido@python.org Tue Jul 30 19:57:00 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 30 Jul 2002 14:57:00 -0400 Subject: [Python-Dev] Re: HAVE_CONFIG_H In-Reply-To: Your message of "Tue, 30 Jul 2002 12:44:06 EDT." References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> <200207301539.g6UFdUS09930@odiug.zope.com> Message-ID: <200207301857.g6UIv0G17893@odiug.zope.com> > > Since we don't use this idiom, we can safely remove the > > -DHAVE_CONFIG_H (if we can find where it is set). > > I guess you will have to override some `m4' macro within `configure.in', or > related machinery. If things did not change too much, this probably means > diving into `acgeneral.m4', to find out how and where this is best done. I haven't the guts. Would you mind sending a patch? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jul 30 19:59:06 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 30 Jul 2002 14:59:06 -0400 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: Your message of "Tue, 30 Jul 2002 18:51:41 +0200." 
<03a101c237e9$60fb3a20$e000a8c0@thomasnotebook> References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook> <029801c23758$e13594b0$3da48490@neil> <200207301639.g6UGd1S17363@odiug.zope.com> <03a101c237e9$60fb3a20$e000a8c0@thomasnotebook> Message-ID: <200207301859.g6UIx6117906@odiug.zope.com> > From: "Guido van Rossum" > > > > ..., but I understand Neil's requirements. > > > > > > > > Can they be fulfilled by adding some kind of UnlockObject() > > > > call to the 'safe buffer interface', which should mean 'I won't > > > > use the pointer received by getsaferead/writebufferproc any more'? > > > > > > Yes, that is exactly what I want. > > > > I guess I still don't understand Neil's requirements. What can't be > > done with the existing buffer interface (which requires you to hold > > the GIL while using the pointer)? > > Processing in Python :-(. Can you work out an example? I don't understand what you can do in Python, apart from passing it to something else that takes the buffer API or converting the data to a string or a bytes buffer. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jul 30 20:06:47 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 30 Jul 2002 15:06:47 -0400 Subject: [Python-Dev] Re: HAVE_CONFIG_H In-Reply-To: Your message of "Tue, 30 Jul 2002 14:57:00 EDT." <200207301857.g6UIv0G17893@odiug.zope.com> References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> <200207301539.g6UFdUS09930@odiug.zope.com> <200207301857.g6UIv0G17893@odiug.zope.com> Message-ID: <200207301906.g6UJ6l619069@odiug.zope.com> > > > Since we don't use this idiom, we can safely remove the > > > -DHAVE_CONFIG_H (if we can find where it is set). > > > > I guess you will have to override some `m4' macro within `configure.in', or > > related machinery. 
If things did not change too much, this probably means > > diving into `acgeneral.m4', to find out how and where this is best done. > > I haven't the guts. Would you mind sending a patch? Never mind. Getting rid of DEFS from Makefile.pre.in did the trick. --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Tue Jul 30 20:22:53 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Tue, 30 Jul 2002 21:22:53 +0200 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook> <029801c23758$e13594b0$3da48490@neil> <200207301639.g6UGd1S17363@odiug.zope.com> <03a101c237e9$60fb3a20$e000a8c0@thomasnotebook> <200207301859.g6UIx6117906@odiug.zope.com> Message-ID: <063301c237fe$80506b10$e000a8c0@thomasnotebook> [Guido] > > > I guess I still don't understand Neil's requirements. What can't be > > > done with the existing buffer interface (which requires you to hold > > > the GIL while using the pointer)? > > > > Processing in Python :-(. > > Can you work out an example? Not sure, maybe Neil could do it better. However, you yourself pointed out to Greg that it may be unsafe to even call Py_DECREF() on an unrelated object. > I don't understand what you can do in > Python, apart from passing it to something else that takes the buffer > API or converting the data to a string or a bytes buffer. Or pack it into a buffer *object* and hand it to arbitrary Python code. That's what we have now. What does 'hold the GIL' mean in this context? No other thread can execute: we have complete control over what we do. But what are we *allowed* to do? Thomas From guido@python.org Tue Jul 30 20:37:37 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 30 Jul 2002 15:37:37 -0400 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: Your message of "Tue, 30 Jul 2002 21:22:53 +0200." 
<063301c237fe$80506b10$e000a8c0@thomasnotebook> References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook> <029801c23758$e13594b0$3da48490@neil> <200207301639.g6UGd1S17363@odiug.zope.com> <03a101c237e9$60fb3a20$e000a8c0@thomasnotebook> <200207301859.g6UIx6117906@odiug.zope.com> <063301c237fe$80506b10$e000a8c0@thomasnotebook> Message-ID: <200207301937.g6UJbb220763@odiug.zope.com> > > > > I guess I still don't understand Neil's requirements. What can't be > > > > done with the existing buffer interface (which requires you to hold > > > > the GIL while using the pointer)? > > > > > > Processing in Python :-(. > > > > Can you work out an example? > Not sure, maybe Neil could do it better. > > However, you yourself pointed out to Greg that it may be unsafe > to even call Py_DECREF() on an unrelated object. The safe rule is that you should grab the pointer and then do some I/O on it and nothing else. > > I don't understand what you can do in > > Python, apart from passing it to something else that takes the buffer > > API or converting the data to a string or a bytes buffer. > > Or pack it into a buffer *object* and hand it to arbitrary > Python code. That's what we have now. Since the object you're packing already supports the buffer API, I don't see the point of packing it in a buffer object. > What does 'hold the GIL' mean in this context? > No other thread can execute: we have complete control > over what we do. But what are we *allowed* to do? When accessing a movable buffer, the safest rule is no Python API calls. There's a less restrictive safe rule, but it's messy because the end goal is "don't do anything that could conceivably end up in the Python interpreter main loop (ceval.c)" and there's no easy rule for that -- anything that uses Py_DECREF can end up doing that. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jul 30 20:46:41 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 30 Jul 2002 15:46:41 -0400 Subject: [Python-Dev] PEP 298 - the Fixed Buffer Interface In-Reply-To: Your message of "Tue, 30 Jul 2002 19:37:19 +0200." <04da01c237ef$c103ac30$e000a8c0@thomasnotebook> References: <04da01c237ef$c103ac30$e000a8c0@thomasnotebook> Message-ID: <200207301946.g6UJkf520799@odiug.zope.com> > Here is PEP 298 - the Fixed Buffer Interface, posted to > get feedback from the Python community. > Enjoy! +1 from me (but you already knew that). > Thomas > > PS: I'll going to a 2 weeks vacation at the end of this week, > so don't hold your breath on replies from me if you post > after, let's say, thursday. > > ----- > PEP: 298 > Title: The Fixed Buffer Interface > Version: $Revision: 1.3 $ > Last-Modified: $Date: 2002/07/30 16:52:53 $ > Author: Thomas Heller > Status: Draft > Type: Standards Track > Created: 26-Jul-2002 > Python-Version: 2.3 > Post-History: > > > Abstract > > This PEP proposes an extension to the buffer interface called the > 'fixed buffer interface'. > > The fixed buffer interface fixes the flaws of the 'old' buffer > interface as defined in Python versions up to and including 2.2, > see [1]: (I keep reading this backwards, thinking that the following two items list the flaws in [1]. :-) > The lifetime of the retrieved pointer is clearly defined. > > The buffer size is returned as a 'size_t' data type, which > allows access to large buffers on platforms where sizeof(int) > != sizeof(void *). This second sounds like a change we could also make to the "old" buffer interface, if we introduce another flag bit that's *not* part of the default flags. > Specification > > The fixed buffer interface exposes new functions which return the > size and the pointer to the internal memory block of any python > object which chooses to implement this interface. 
> > The size and pointer returned must be valid as long as the object > is alive (has a positive reference count). So, only objects which > never reallocate or resize the memory block are allowed to > implement this interface. > > The fixed buffer interface omits the memory segment model which is > present in the old buffer interface - only a single memory block > can be exposed. > > > Implementation > > Define a new flag in Include/object.h: > > /* PyBufferProcs contains bf_getfixedreadbuffer > and bf_getfixedwritebuffer */ > #define Py_TPFLAGS_HAVE_GETFIXEDBUFFER (1L<<15) > > > This flag would be included in Py_TPFLAGS_DEFAULT: > > #define Py_TPFLAGS_DEFAULT ( \ > .... > Py_TPFLAGS_HAVE_GETFIXEDBUFFER | \ > .... > 0) > > > Extend the PyBufferProcs structure by new fields in > Include/object.h: > > typedef size_t (*getfixedreadbufferproc)(PyObject *, void **); > typedef size_t (*getfixedwritebufferproc)(PyObject *, void **); > > typedef struct { > getreadbufferproc bf_getreadbuffer; > getwritebufferproc bf_getwritebuffer; > getsegcountproc bf_getsegcount; > getcharbufferproc bf_getcharbuffer; > /* fixed buffer interface functions */ > getfixedreadbufferproc bf_getfixedreadbufferproc; > getfixedwritebufferproc bf_getfixedwritebufferproc; > } PyBufferProcs; > > > The new fields are present if the Py_TPFLAGS_HAVE_GETFIXEDBUFFER > flag is set in the object's type. > > The Py_TPFLAGS_HAVE_GETFIXEDBUFFER flag implies the > Py_TPFLAGS_HAVE_GETCHARBUFFER flag. > > The getfixedreadbufferproc and getfixedwritebufferproc functions > return the size in bytes of the memory block on success, and fill > in the passed void * pointer on success. If these functions fail > - either because an error occurs or no memory block is exposed - > they must set the void * pointer to NULL and raise an exception. > The return value is undefined in these cases and should not be > used. 
> > Usually the getfixedwritebufferproc and getfixedreadbufferproc > functions aren't called directly, they are called through > convenience functions declared in Include/abstract.h: > > int PyObject_AsFixedReadBuffer(PyObject *obj, > void **buffer, > size_t *buffer_len); > > int PyObject_AsFixedWriteBuffer(PyObject *obj, > void **buffer, > size_t *buffer_len); > > These functions return 0 on success, set buffer to the memory > location and buffer_len to the length of the memory block in > bytes. On failure, or if the fixed buffer interface is not > implemented by obj, they return -1 and set an exception. > > > Backward Compatibility > > The size of the PyBufferProcs structure changes if this proposal > is implemented, but the type's tp_flags slot can be used to > determine if the additional fields are present. > > > Reference Implementation > > Will be uploaded to the SourceForge patch manager by the author. I'm holding my breath now... > > Additional Notes/Comments > > Python strings, Unicode strings, mmap objects, and maybe other > types would expose the fixed buffer interface, but the array type > would *not*, because its memory block may be reallocated during > its lifetime. > > > Community Feedback > > Greg Ewing doubts the fixed buffer interface is needed at all, he > thinks the normal buffer interface could be used if the pointer is > (re)fetched each time it's used. This seems to be dangerous, > because even innocent looking calls to the Python API like > Py_DECREF() may trigger execution of arbitrary Python code. > > Neil Hodgson wants to expose pointers to memory blocks with > limited lifetime: do some kind of lock operation on the object, > retrieve the pointer, use it, and unlock the object again. While > the author sees the need for this, it cannot be addressed by this > proposal. Beeing required to call a function after not using the x > pointer received by the getfixedbufferprocs any more seems too > error prone. 
> > > Credits > > Scott Gilbert came up with the name 'fixed buffer interface'. > > > References > > [1] The buffer interface > http://mail.python.org/pipermail/python-dev/2000-October/009974.html > > [2] The Buffer Problem > http://www.python.org/peps/pep-0296.html > > > Copyright > > This document has been placed in the public domain. > > > > Local Variables: > mode: indented-text > indent-tabs-mode: nil > sentence-end-double-space: t > fill-column: 70 > End: --Guido van Rossum (home page: http://www.python.org/~guido/) From oren-py-d@hishome.net Tue Jul 30 21:15:11 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 30 Jul 2002 23:15:11 +0300 Subject: [Python-Dev] Valgrinding Python Message-ID: <20020730231511.A28762@hishome.net> I ran some tests with Julian Seward's amazing Valgrind memory debugger. Python is remarkably clean. Much cleaner than any other program of non-trivial size that I tested. Objects/obmalloc.c: The ADDRESS_IN_RANGE macro makes references to uninitialized memory. This produced tons of warnings so I ran the rest of the tests without pymalloc. The following tests produced invalid accesses inside the external library: test_anydbm.py test_bsddb.py test_dbm.py test_gdbm.py test_curses.py test_pwd.py test_socket_ssl.py I also got some invalid accesses in Modules/arraymodule.c:array_ass_subscr while running test_array and in Objects/Listobject.c:list_ass_subscript running test_types. For some reason I couldn't reproduce them later. Oren From jacobs@penguin.theopalgroup.com Tue Jul 30 21:21:36 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Tue, 30 Jul 2002 16:21:36 -0400 (EDT) Subject: [Python-Dev] Valgrinding Python In-Reply-To: <20020730231511.A28762@hishome.net> Message-ID: On Tue, 30 Jul 2002, Oren Tirosh wrote: > I ran some tests with Julian Seward's amazing Valgrind memory debugger. > Python is remarkably clean. Much cleaner than any other program of > non-trivial size that I tested. 
I've been using Python with valgrind too, and with great success. I've caught several non-trivial problems in some of our extension modules, though only a few very picky things in the Python core. Valgrind has options to attach gdb to running processes when problems occur. Combine this with gdb patched to produce mixed C/Python tracebacks, and you get an awesome memory debugger. -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From nhodgson@bigpond.net.au Tue Jul 30 21:55:39 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Wed, 31 Jul 2002 06:55:39 +1000 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook> <029801c23758$e13594b0$3da48490@neil> <200207301639.g6UGd1S17363@odiug.zope.com> <03a101c237e9$60fb3a20$e000a8c0@thomasnotebook> <200207301859.g6UIx6117906@odiug.zope.com> <063301c237fe$80506b10$e000a8c0@thomasnotebook> Message-ID: <004e01c2380b$762ef5e0$3da48490@neil> Thomas Heller (Guido, Thomas, Guido): > [Guido] > > > > I guess I still don't understand Neil's requirements. What can't be > > > > done with the existing buffer interface (which requires you to hold > > > > the GIL while using the pointer)? > > > > > > Processing in Python :-(. > > > > Can you work out an example? > Not sure, maybe Neil could do it better. I see this interface as a bridge between objects offering generic buffer oriented facilities (asynch or low level I/O for example) and objects that want to make it possible to use these facilities on their data (text buffers, multimedia buffers, numeric arrays) by yielding a pointer to their otherwise internal data. The bridging code between the two objects is unrestricted Python code that may cause memory to be moved around.
Neil From guido@python.org Tue Jul 30 22:13:00 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 30 Jul 2002 17:13:00 -0400 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: Your message of "Wed, 31 Jul 2002 06:55:39 +1000." <004e01c2380b$762ef5e0$3da48490@neil> References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook> <029801c23758$e13594b0$3da48490@neil> <200207301639.g6UGd1S17363@odiug.zope.com> <03a101c237e9$60fb3a20$e000a8c0@thomasnotebook> <200207301859.g6UIx6117906@odiug.zope.com> <063301c237fe$80506b10$e000a8c0@thomasnotebook> <004e01c2380b$762ef5e0$3da48490@neil> Message-ID: <200207302113.g6ULD0N21213@odiug.zope.com> > I see this interface as a bridge between objects offering generic buffer > oriented facilities (asynch or low level I/O for example) and objects that > want to make it possible to use these facilities on their data (text > buffers, multimedia buffers, numeric arrays) by yielding a pointer to their > otherwise internal data. > > The bridging code between the two objects is unrestricted Python code > that may cause memory to be moved around. If the buffer is relatively small, copying the data an extra time shouldn't be a problem, and you can use the old API. If the buffer is huge, you probably shouldn't want to move the buffer around in memory anyway. So I don't think your case for needing a lockable interface is very strong. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Tue Jul 30 22:56:35 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 30 Jul 2002 17:56:35 -0400 Subject: [Python-Dev] Generator cleanup idea (patch: try/finally in generators) In-Reply-To: <200207300112.g6U1CJoO018210@kuku.cosc.canterbury.ac.nz> Message-ID: [Greg Ewing] > I don't think you'd really be breaking any promises.
> After all, if someone wrote > > def asdf(): > try: > something_that_never_returns() > finally: > ... > > they wouldn't have much ground for complaint that the > finally never got executed. The case we're talking about > seems much the same situation. Not to me -- you can't write something_that_never_returns() in Python unless the program runs forever, you crash the system, you get the thread stuck in deadlock or permanent starvation, or you're anti-social by calling os._exit() (sys.exit() is fine: it raises SystemExit, and pending finally blocks get run then). All of those are highly exceptional use cases; everyone else is guaranteed their finally block will eventually run. > I take it you usually provide a method for explicit cleanup. Yup. > How about giving generator-iterators one, then, called > maybe close() or abort(). The effect would be to raise > an appropriate exception at the point of the yield, > triggering any except or finally blocks. As before, I'm already happy; sharing state via instance variables is all "the solution" I've felt a need for. If consensus is that something needs to be done here anyway, I'd rather think of generators more as threads of control than as lumps of data with attributes. From that view, I think it would be easier to make a coherent case that generators should support a termination protocol involving raising SystemExit. But then that should apply to all thread-like objects too, and there's no way now for one thread to raise SystemExit in another (but it's arguable that there should be). > This method could even be added to the general iterator > protocol (implementing it would be optional). It would > then provide a standard name for people to use for > cleanup methods in their own iterator classes. Generalizing from zero examples ? 
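The close() idea quoted above (raise an exception at the point of the yield so that pending finally blocks run) can be sketched as follows. This is only an illustration under stated assumptions: it relies on a later Python (3.x here) in which generators do have a close() method that raises GeneratorExit at the suspended yield, which was not available when this thread was written; the name some_iter is borrowed from the earlier example and the log list is a hypothetical stand-in for prepare()/cleanup().

```python
def some_iter(items, log):
    """Generator whose cleanup lives in a finally block (illustrative only)."""
    log.append("prepare")
    try:
        for item in items:
            yield item * 2
    finally:
        # Runs on normal exhaustion and also when close() is called early.
        log.append("cleanup")


log = []
gen = some_iter([1, 2, 3], log)
first = next(gen)   # runs the body up to the first yield
gen.close()         # raises GeneratorExit at the suspended yield
```

Here the consumer abandons the generator after one item, yet the cleanup step still runs deterministically because close() was called explicitly rather than left to finalization.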
From tim.one@comcast.net Tue Jul 30 23:53:04 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 30 Jul 2002 18:53:04 -0400 Subject: [Python-Dev] Valgrinding Python In-Reply-To: <20020730231511.A28762@hishome.net> Message-ID: [Oren Tirosh] > I ran some tests with Julian Seward's amazing Valgrind memory debugger. > Python is remarkably clean. Much cleaner than any other program of > non-trivial size that I tested. It's been thru Purify and Insure++, off and on, several times, and we enjoyed many wasted hours squashing spurious complaints from those. > Objects/obmalloc.c: > > The ADDRESS_IN_RANGE macro makes references to uninitialized memory. > > This produced tons of warnings so I ran the rest of the tests without > pymalloc. Ouch. That's not going to change, so it may be worth learning how to write a Valgrind suppression file. ADDRESS_IN_RANGE determines whether an address was passed out by pymalloc. It does this by (a) reading an index from an address computed *from* the claimant address; then (b) using that to index into its own data structures, which record the range of addresses pymalloc controls; then (c) comparing the claimant address to that range. Part #a can easily end up reading uninitialized memory, but pymalloc doesn't care (a junk value found there can't fool it). This is needed to determine whether to hand off an address to the platform free() or realloc(), and in such cases part #a may well read up any kind of trash. > The following tests produced invalid accesses inside the external > library: > > test_anydbm.py > test_bsddb.py > test_dbm.py > test_gdbm.py > test_curses.py > test_pwd.py > test_socket_ssl.py Figures. > I also got some invalid accesses in > Modules/arraymodule.c:array_ass_subscr > while running test_array and in Objects/Listobject.c:list_ass_subscript > running test_types. For some reason I couldn't reproduce them later. Another memory-debugging tool, another chance to debug a memory-debugging tool.
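The three steps (a), (b), (c) described for ADDRESS_IN_RANGE can be modelled in a few lines of Python. This is a toy model, not the code in Objects/obmalloc.c: the pool and arena sizes, the arena base addresses, and the dict standing in for raw memory are all assumptions chosen for illustration.

```python
POOL_SIZE = 4096           # assumed pool size; pools are aligned to this
ARENA_SIZE = 256 * 1024    # assumed arena size

# Base addresses of the arenas the allocator "controls" (made-up values).
arenas = [0x100000, 0x200000]

def address_in_range(addr, memory):
    """Return True if addr lies inside an arena recorded in `arenas`.

    `memory` maps pool base addresses to whatever integer is stored in the
    pool header there; a missing key models uninitialized memory.
    """
    # (a) read an index from an address computed from the claimant address
    pool_base = addr & ~(POOL_SIZE - 1)
    index = memory.get(pool_base, 0xDEADBEEF)   # junk if never initialized
    # (b) use it to index the allocator's own table, after a bounds check
    if not (0 <= index < len(arenas)):
        return False
    # (c) compare the claimant address to the range that table entry records
    return 0 <= addr - arenas[index] < ARENA_SIZE

# Pool headers written by the allocator itself.
memory = {0x100000: 0, 0x101000: 0, 0x200000: 1}

owned = address_in_range(0x101010, memory)            # inside arena 0
foreign_junk = address_in_range(0x900123, memory)     # header is pure junk
memory[0x900000] = 1                                  # junk that looks like a valid index
foreign_plausible = address_in_range(0x900123, memory)

print(owned, foreign_junk, foreign_plausible)
```

Even when the junk value happens to be a valid index, step (c) rejects the address, which is the sense in which a junk value "can't fool it".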
From neal@metaslash.com Wed Jul 31 00:15:34 2002 From: neal@metaslash.com (Neal Norwitz) Date: Tue, 30 Jul 2002 19:15:34 -0400 Subject: [Python-Dev] Valgrinding Python References: Message-ID: <3D471E16.A01B14D8@metaslash.com> Tim Peters wrote: > > [Oren Tirosh] > > > I also got some invalid accesses in > > Modules/arraymodule.c:array_ass_subscr > > while running test_array and in Objects/Listobject.c:list_ass_subscript > > running test_types. For some reason I couldn't reproduce them later. > > Another memory-debugging tool, another chance to debug a memory-debugging > tool. Naw, cvs update can explain this one. :-) Michael Hudson fixed this (extended slice problem) based on a bug report I submitted. I ran valgrind on RedHat 7.2. I also had problems w/pymalloc originally so I disabled it. I may try again. There's something I found very interesting, though. I run purify on a sparc w/gcc 2.95.3 (maybe 3.0.x too, I can't remember). The problems with pymalloc and some of the dbm problems were also reported by purify. I've reviewed the code and can't find any problems. But different tools on different architectures with somewhat different compilers report similar errors. Neal From greg@cosc.canterbury.ac.nz Wed Jul 31 00:34:29 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 31 Jul 2002 11:34:29 +1200 (NZST) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <20020730061026.33569.qmail@web40106.mail.yahoo.com> Message-ID: <200207302334.g6UNYTZ7018964@kuku.cosc.canterbury.ac.nz> Scott Gilbert : > We haven't seen a semi-thorough use case where the locking behavior is > beneficial yet. ... If there is no realizable benefit to the > acquire/release semantics of the new interface, then this is just extra > burden too. The proposer of the original safe-buffer interface claimed to have a use case where the existing buffer interface is not safe enough, involving asynchronous I/O.
I've been basing my comments on the assumption that he does actually have a need for it. The original proposal was restricted to non-resizable objects. I suggested a small extension which would remove this restriction, at what seems to me quite a small cost. It may turn out that the restriction is easily lived with. On the other hand, we might decide later that it's a nuisance. What worries me is if we design a restricted safe-buffer interface now, and start using it, and later decide that we want an unrestricted safe-buffer interface, we'll then have two different safe-buffer interfaces around, with lots of code that will only accept non-resizable objects for no reason other than that it's using the old interface. So I think it's worth putting in some thought and getting it as right as we can from the beginning. > I'm concerned that this is very much like the segment count features > of the current PyBufferProcs. It was apparently designed for more > generality, and while no one uses it, everyone has to check that the > segment count is one or raise an exception. It's not as bad as that! My version of the proposal would impose *no* burden on implementations that did not require locking, for the following reasons: 1) Locking is an optional task performed by the getxxxbuffer routines. Objects which do not require locking just don't do it. 2) For objects not requiring locking, the releasebuffer operation is a no-op. Such an object can simply not implement this routine, and the type machinery can fill it in with a stub. It does place one extra burden on users of the interface, namely calling the release routine. But I believe that this could even be beneficial, in a way. The user is going to have to think about the lifetime of the pointer, and be sure to keep a reference to the underlying Python object as long as the pointer is needed. Having to keep it around so that you can call the release routine on it would help to bring this into sharp focus. 
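The acquire/release idea in points 1) and 2) can be sketched in Python roughly as follows. This is a toy model only: the class names, the acquire()/release() method names, and the choice of RuntimeError are all hypothetical, and a bytearray stands in for the raw pointer that the C-level routines would hand out.

```python
class FixedBuffer:
    """Hypothetical object that never moves its memory: no locking needed."""

    def __init__(self, data):
        self._data = bytes(data)

    def acquire(self):
        return self._data        # stands in for the raw pointer

    def release(self):
        pass                     # no-op, like a stub filled in by the type machinery


class ResizableBuffer:
    """Hypothetical resizable object that counts outstanding pointer users."""

    def __init__(self, data=b""):
        self._data = bytearray(data)
        self._locks = 0

    def acquire(self):
        self._locks += 1
        return self._data        # stands in for the raw pointer

    def release(self):
        if self._locks == 0:
            raise RuntimeError("release() without matching acquire()")
        self._locks -= 1

    def extend(self, more):
        # Refuse to move the memory while someone may hold a pointer to it.
        if self._locks:
            raise RuntimeError("buffer is locked; cannot resize")
        self._data.extend(more)


buf = ResizableBuffer(b"abc")
ptr = buf.acquire()
try:
    buf.extend(b"def")           # attempted resize while the pointer is live
    outcome = "resized"
except RuntimeError:
    outcome = "refused"
buf.release()
buf.extend(b"def")               # fine once the pointer has been given back

print(outcome, bytes(buf._data))
```

The caller's code is the same for both classes: acquire, use, release. Only the resizable object does any bookkeeping.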
> The extension releases the GIL so that another > thread can work on the array object. Hey, whoa right there! If you have two threads accessing this array object simultaneously, you should be using a mutex or semaphore or something to coordinate them. As I pointed out before, thread synchronisation is outside the scope of my proposal. The only purpose of the locking, in my proposal, is to ensure that an exception occurs instead of a crash if the programmer screws up and tries to resize an object whose internals are being messed with. It's up to the programmer to do whatever is necessary to ensure that he doesn't do that. > If extend() is called while thread 1 has the array locked, it can: > > A) raise an exception or return an error Yes. (Raise an exception.) > Case A is troublesome because depending on thread scheduling/disk > performance, you will or won't get the exception. As I said before, you should be synchronising your threads somehow *before* they operate on the object! If you don't, you deserve whatever you get. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Wed Jul 31 01:03:55 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 31 Jul 2002 12:03:55 +1200 (NZST) Subject: [Python-Dev] seeing off SET_LINENO In-Reply-To: <2md6t5ieh2.fsf@starship.python.net> Message-ID: <200207310003.g6V03tjm018993@kuku.cosc.canterbury.ac.nz> Michael Hudson : > My patch means the debugger doesn't stop > on the "def f():" line -- unsurprisingly, given that no execution ever > takes place on that line. If there is no code there, there shouldn't be any need to stop there, should there?
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Wed Jul 31 01:12:55 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 31 Jul 2002 12:12:55 +1200 (NZST) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <200207301537.g6UFbad09910@odiug.zope.com> Message-ID: <200207310012.g6V0Ctj5019001@kuku.cosc.canterbury.ac.nz> > I think you misunderstand what I wrote. A Py_DECREF() for an > *unrelated* object can invoke Python code (if it ends up deleting a > class instance with a __del__ method). I don't see why that's a problem. If the unrelated object's __del__ ends up messing with the object in question, that's an issue for the programmer to sort out. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tim.one@comcast.net Wed Jul 31 01:09:59 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 30 Jul 2002 20:09:59 -0400 Subject: [Python-Dev] Valgrinding Python In-Reply-To: <3D471E16.A01B14D8@metaslash.com> Message-ID: [Neal Norwitz] > ... > I also had problems w/pymalloc originally so I disabled it. > I may try again. There's something I found very interesting, though. > > I run purify on a sparc w/gcc 2.95.3 (maybe 3.0.x too, > I can't remember). The problems with pymalloc and some of the dbm > problems were also reported by purify. I've reviewed the code > and can't find any problems. But different tools on different > architectures with somewhat different compilers report similar errors.
pymalloc does read uninitialized memory, and routinely, as explained in the msg you're replying to. If that occurs outside code generated for the ADDRESS_IN_RANGE macro, though, it may be a real problem (inside code generated by that macro, reading uninitialized memory is-- curiously enough! --necessary for proper operation). From greg@cosc.canterbury.ac.nz Wed Jul 31 01:14:56 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 31 Jul 2002 12:14:56 +1200 (NZST) Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines In-Reply-To: <15686.45677.421287.717866@anthem.wooz.org> Message-ID: <200207310014.g6V0EuUS019007@kuku.cosc.canterbury.ac.nz> > Originally we thought it was more important to > be able to contact the author, but there are quite a few reasons to > revise this intention. As pointed out, email addresses change. Also, > experience has shown that most of the discussions about PEPs are > conducted on the public forums (mailing lists / newsgroups), so that's > a fine way to contact the people working on the PEP. And of course, > we allow the PEP authors to obfuscate or omit their email addresses > altogether. Why not have *two* fields in the PEP, one for the real name, and the other for an email address? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From barry@python.org Wed Jul 31 01:49:06 2002 From: barry@python.org (Barry A. 
Warsaw) Date: Tue, 30 Jul 2002 20:49:06 -0400 Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines References: <15686.45677.421287.717866@anthem.wooz.org> <200207310014.g6V0EuUS019007@kuku.cosc.canterbury.ac.nz> Message-ID: <15687.13314.271722.779762@anthem.wooz.org> >>>>> "GE" == Greg Ewing writes: GE> Why not have *two* fields in the PEP, one for the real GE> name, and the other for an email address? I dunno, that seems like overkill. -Barry From greg@cosc.canterbury.ac.nz Wed Jul 31 02:44:23 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 31 Jul 2002 13:44:23 +1200 (NZST) Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines In-Reply-To: <15687.13314.271722.779762@anthem.wooz.org> Message-ID: <200207310144.g6V1iNgZ019135@kuku.cosc.canterbury.ac.nz> Barry: > GE> Why not have *two* fields in the PEP, one for the real > GE> name, and the other for an email address? > > I dunno, that seems like overkill. It would certainly put an end to this argument, though! Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From nhodgson@bigpond.net.au Wed Jul 31 03:15:28 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Wed, 31 Jul 2002 12:15:28 +1000 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook> <029801c23758$e13594b0$3da48490@neil> <200207301639.g6UGd1S17363@odiug.zope.com> <03a101c237e9$60fb3a20$e000a8c0@thomasnotebook> <200207301859.g6UIx6117906@odiug.zope.com> <063301c237fe$80506b10$e000a8c0@thomasnotebook> <004e01c2380b$762ef5e0$3da48490@neil> <200207302113.g6ULD0N21213@odiug.zope.com> Message-ID: <039701c23838$26dbfab0$3da48490@neil> Guido van Rossum: > If the buffer is relatively small, copying the data an extra time > shouldn't be a problem, and you can use the old API. > > If the buffer is huge, you probably shouldn't want to move the buffer > around in memory anyway, Even large (or huge) buffers may need extension (inserting text in Scintilla, adding a frame to a movie), leading to a reallocation and thus a move. Neil From nhodgson@bigpond.net.au Wed Jul 31 03:01:25 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Wed, 31 Jul 2002 12:01:25 +1000 Subject: [Python-Dev] PEP 298 - the Fixed Buffer Interface References: <04da01c237ef$c103ac30$e000a8c0@thomasnotebook> Message-ID: <039301c23838$24a21040$3da48490@neil> Thomas Heller: > Abstract > > This PEP proposes an extension to the buffer interface called the > 'fixed buffer interface'. I'd like to see the purpose of the interface defined here rather than rely upon a reference to an email which talks about two buffer entities, the API and the object. 
Reading the email produces a purpose that could be used here: [the Buffer API is] intended to allow efficient binary I/O from and (in some cases) to large objects that have a relatively well-understood underlying memory representation Neil From nhodgson@bigpond.net.au Wed Jul 31 03:12:31 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Wed, 31 Jul 2002 12:12:31 +1000 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020730061016.32588.qmail@web40103.mail.yahoo.com> <005d01c237ae$4b2f6670$3da48490@neil> <025a01c237e3$82eb7c90$e000a8c0@thomasnotebook> Message-ID: <039401c23838$25ad8cd0$3da48490@neil> Thomas Heller: > In plain text: > Provide a method which returns a 'view' into your object's > buffer after locking the object. The view holds a reference > to object, the objects is unlocked and decref'd when the > view is destroyed. Yes, this handles the situation. However I see some problems here: 1 Explicit resource release, such as closing files, is easier to understand and debug than implicit ref-count exhaustion. 2 On platforms such as .NET and the JVM, the view object will live for an indeterminate time, prohibiting resizes until the VM decides to garbage collect. While the JVM can not return pointers, and so may seem to not be a candidate for this interface, it can return array references. 3 More complex implementation requiring a secondary view object. Neil From neal@metaslash.com Wed Jul 31 03:19:08 2002 From: neal@metaslash.com (Neal Norwitz) Date: Tue, 30 Jul 2002 22:19:08 -0400 Subject: [Python-Dev] Valgrinding Python References: Message-ID: <3D47491C.B0E9E165@metaslash.com> Tim Peters wrote: > pymalloc does read uninitialized memory, and routinely, as explained in the > msg you're replying to. If that occurs outside code generated for the > ADDRESS_IN_RANGE macro, though, it may be a real problem (inside code > generated by that macro, reading uninitialized memory is-- curiously > enough! --necessary for proper operation). 
This is good news. I changed ADDRESS_IN_RANGE to a function, then suppressed it. There were no other uninitialized memory reads. Valgrind does report a bunch of problems with pthreads, but these are likely valgrind's fault. There are some complaints about memory leaks, but these appear to occur only when spawning/threading. The leaks are small and short lived. Neal From barry@python.org Wed Jul 31 03:22:12 2002 From: barry@python.org (Barry A. Warsaw) Date: Tue, 30 Jul 2002 22:22:12 -0400 Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines References: <15687.13314.271722.779762@anthem.wooz.org> <200207310144.g6V1iNgZ019135@kuku.cosc.canterbury.ac.nz> Message-ID: <15687.18900.871205.963521@anthem.wooz.org> >>>>> "GE" == Greg Ewing writes: GE> Barry: >> GE> Why not have *two* fields in the PEP, one for the real GE> >> name, and the other for an email address? >> I dunno, that seems like overkill. GE> It would certainly put an end to this argument, though! What argument? :) -Barry From aahz@pythoncraft.com Wed Jul 31 04:36:39 2002 From: aahz@pythoncraft.com (Aahz) Date: Tue, 30 Jul 2002 23:36:39 -0400 Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines In-Reply-To: <15687.18900.871205.963521@anthem.wooz.org> References: <15687.13314.271722.779762@anthem.wooz.org> <200207310144.g6V1iNgZ019135@kuku.cosc.canterbury.ac.nz> <15687.18900.871205.963521@anthem.wooz.org> Message-ID: <20020731033639.GB14993@panix.com> On Tue, Jul 30, 2002, Barry A. Warsaw wrote: > > >>>>> "GE" == Greg Ewing writes: > > GE> Barry: > > >> GE> Why not have *two* fields in the PEP, one for the real GE> > >> name, and the other for an email address? > >> I dunno, that seems like overkill. > > GE> It would certainly put an end to this argument, though! > > What argument? :) You blithering idiot, you ought to be smacked with a fish.
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From greg@cosc.canterbury.ac.nz Wed Jul 31 05:18:35 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 31 Jul 2002 16:18:35 +1200 (NZST) Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines In-Reply-To: <20020731033639.GB14993@panix.com> Message-ID: <200207310418.g6V4IZVf019187@kuku.cosc.canterbury.ac.nz> Aahz : > > GE> It would certainly put an end to this argument, though! > > > > What argument? :) > > You blithering idiot, you ought to be smacked with a fish. No, that's abuse. Arguments are next door... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From mhammond@skippinet.com.au Wed Jul 31 06:28:44 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 31 Jul 2002 15:28:44 +1000 Subject: [Python-Dev] seeing off SET_LINENO In-Reply-To: <200207310003.g6V03tjm018993@kuku.cosc.canterbury.ac.nz> Message-ID: > Michael Hudson : > > > My patch means the debugger doesn't stop > > on the "def f():" line -- unsurprisingly, given that no execution ever > > takes place on that line. [Greg] > If there is no code there, there shouldn't be any > need to stop there, should there? [Barry in a different message] > I can't decide whether it would be good to stop on the def or not. > Not doing so makes pdb act more like gdb, which also only stops on the > first executable line, so maybe that's a good thing. IMO, the Python debugger "interface" should include function entry. The debugger UI (in this case pdb, but any other debugger) may choose not to break there, but the debugger itself may be able to implement some useful things by having the hook. Mark. 
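For reference, a function-entry hook of the kind discussed above can be observed with sys.settrace(), whose trace function is handed a 'call' event when a function is entered. The sketch below is minimal and the names tracer, f and entered are made up for illustration; it says nothing about how pdb chooses to present the event.

```python
import sys

entered = []

def tracer(frame, event, arg):
    # 'call' is reported on function entry, before any line of the body runs.
    if event == "call":
        entered.append(frame.f_code.co_name)
    return None   # returning None means: no per-line tracing inside this frame

def f():
    return 42

sys.settrace(tracer)
try:
    result = f()
finally:
    sys.settrace(None)   # always switch tracing off again

print(entered, result)
```

A debugger front end is free to ignore the event, but the hook is there for tools that want to act on function entry.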
From xscottg@yahoo.com Wed Jul 31 07:29:50 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Tue, 30 Jul 2002 23:29:50 -0700 (PDT) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <005d01c237ae$4b2f6670$3da48490@neil> Message-ID: <20020731062950.59376.qmail@web40105.mail.yahoo.com> --- Neil Hodgson wrote: > > Since Scintilla is a component within a user interface, it shares this > responsibility with the container application with the application being > the main determinant. If I was writing a Windows-specific application that > used Scintilla, and I wanted to use Asynchronous I/O then my preferred > technique would be to change the message processing loop to leave the UI > input messages in the queue until the I/O had completed. > Once the I/O had completed then the message loop would change back to > processing all messages which would allow the banked up input to come > through. > Cool. This is what I was looking for. It's a tad complicated, but it makes a bit of sense. Is there anything in here that can't be done if you only had the simple (no locking) version of the fixed buffer interface? > > > A single lock interface can be implemented over an object without any > > locking. Have the lockable object return simple "fixed buffer objects" > > with a limited lifespan. > > This returns to the possibility of indeterminate lifespan as mentioned > earlier in the thread. > Not if you add an explicit release() method. Just like the file object has an explicit close() method. Your object with the locking smarts could just return "snapshot" views with an explicit release() method on them. > > > At which point I wonder what using asynchronous I/O achieved since the > > resize operation had to wait synchronously for the I/O to complete. > > This also sounds suspiciously like blocking the resize thread, but I > > won't argue that point. 
> > There may be other tasks that the application can perform while > waiting for the I/O to complete, such as displaying, styling or line- > wrapping whatever text has already arrived (assuming that there are some > facilities for discovering this) or performing similar tasks for other > windows. > All good points. Thank you for indulging me. Sorry to be such a PITA. Cheers, -Scott __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com From xscottg@yahoo.com Wed Jul 31 07:29:59 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Tue, 30 Jul 2002 23:29:59 -0700 (PDT) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <025a01c237e3$82eb7c90$e000a8c0@thomasnotebook> Message-ID: <20020731062959.59382.qmail@web40105.mail.yahoo.com> --- Thomas Heller wrote: > > In plain text: > Provide a method which returns a 'view' into your object's > buffer after locking the object. The view holds a reference > to object, the objects is unlocked and decref'd when the > view is destroyed. > Exactly. This is just like putting an explicit close() on the file object. Cheers, -Scott __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com From xscottg@yahoo.com Wed Jul 31 07:30:58 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Tue, 30 Jul 2002 23:30:58 -0700 (PDT) Subject: [Python-Dev] PEP 298 - the Fixed Buffer Interface In-Reply-To: <039301c23838$24a21040$3da48490@neil> Message-ID: <20020731063058.76595.qmail@web40103.mail.yahoo.com> --- Neil Hodgson wrote: > > I'd like to see the purpose of the interface defined here rather than > rely upon a reference to an email which talks about two buffer entities, > the API and the object. 
Reading the email produces a purpose that could > be used here: > > [the Buffer API is] intended to allow efficient > binary I/O from and (in some cases) to large objects that have a > relatively well-understood underlying memory representation > It's not just for I/O. In addition to I/O, I intend to use it for numerical calculations that can be run independently of the GIL. Cheers, -Scott __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com From xscottg@yahoo.com Wed Jul 31 07:30:55 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Tue, 30 Jul 2002 23:30:55 -0700 (PDT) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <039401c23838$25ad8cd0$3da48490@neil> Message-ID: <20020731063055.74410.qmail@web40110.mail.yahoo.com> --- Neil Hodgson wrote: > Thomas Heller: > > > In plain text: > > Provide a method which returns a 'view' into your object's > > buffer after locking the object. The view holds a reference > > to object, the objects is unlocked and decref'd when the > > view is destroyed. > > Yes, this handles the situation. However I see some problems here: > 1 Explicit resource release, such as closing files, is easier to > understand and debug than implicit ref-count exhaustion. > So add an explicit release() method to your object. Just because it supports the "Fixed Buffer API" doesn't mean you can't add other methods to it. > > 2 On platforms such as .NET and the JVM, the view object will live for an > indeterminate time, prohibiting resizes until the VM decides to garbage > collect. While the JVM can not return pointers, and so may seem to not be > a candidate for this interface, it can return array references. > This is solved with the explicit release() method above. Just like files solve this problem with an explicit close() method. > > 3 More complex implementation requiring a secondary view object. 
> It's also a more complex problem that you're trying to solve. Putting the complexity on the common, simple, cases may not be appropriate when the complex cases are few and far between. Cheers, -Scott __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com From xscottg@yahoo.com Wed Jul 31 07:31:13 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Tue, 30 Jul 2002 23:31:13 -0700 (PDT) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <200207302334.g6UNYTZ7018964@kuku.cosc.canterbury.ac.nz> Message-ID: <20020731063113.74481.qmail@web40110.mail.yahoo.com> --- Greg Ewing wrote: > Scott Gilbert : > > > We haven't seen a semi-thorough use case where the locking behavior is > > beneficial yet. ... If there is no realizable benefit to the > > acquire/release semantics of the new interface, then this is just extra > > burden too. > > The proposer of the original safe-buffer interface claimed to have a > use case where the existing buffer interface is not safe enough, > involving asynchronous I/O. I've been basing my comments on the > assumption that he does actually have a need for it. > I believe Thomas Heller's needs were met without making locking part of the interface, but that he was willing to bend to please you and Neil. His original proposal did not include any notion of locking. Nor does his current since Guido has taken a stand on this issue. > > So I think it's worth putting in some thought and getting it as > right as we can from the beginning. > Absolutely. I just wanted to make sure that there is at least one sensible use case before adding the complexity. Moreover, if the sensible use cases for locking are few and far between, then I'm still inclined to leave it out since you can add the locking semantics at a different level. It looks like Neil has sufficiently defined an example where it's useful. 
His use case is a bit complicated though, and I think he could get every bit of that functionality by putting the locking in a smarter object tailored for his application, and working with temporary "snapshot" objects with an explicit release() method. What if Neil decides he needs Reader/Writer locks? This is completely justifiable too, since multiple threads can read an object without interfering, but only one should be writing it. We shouldn't arbitrarily add complexity for the exceptional cases. > > > I'm concerned that this is very much like the segment count features > > of the current PyBufferProcs. It was apparently designed for more > > generality, and while no one uses it, everyone has to check that the > > segment count is one or raise an exception. > > It's not as bad as that! My version of the proposal would impose *no* > burden on implementations that did not require locking, for the > following reasons: > Your use of the word *no* is different than mine. :-) I could similarly claim that the segment count puts no burden on implementations that don't need it. > > 1) Locking is an optional task performed by the getxxxbuffer > routines. Objects which do not require locking just don't > do it. > > 2) For objects not requiring locking, the releasebuffer > operation is a no-op. Such an object can simply not > implement this routine, and the type machinery can fill > it in with a stub. > I believe it will be a no-op in enough places that extension writers will do it wrong without even knowing. > > > The extension releases the GIL so that another > > thread can work on the array object. > > Hey, whoa right there! If you have two threads accessing this array > object simultaneously, you should be using a mutex or semaphore or > something to coordinate them. As I pointed out before, thread > synchronisation is outside the scope of my proposal. > This is exactly Neil's use case. He's got two threads reading it simultaneously.
One thread (not really a thread, but the asynchronous I/O operation) is writing to disk, and the other thread is keeping the user interface updated. There is no problem until the user tries to enter text (which forces a resize) before the asynchronous I/O is complete. Neil has a solution for this, but I think it's less than typical. > > The only purpose of the locking, in my proposal, is to ensure that an > exception occurs instead of a crash if the programmer screws up and > tries to resize an object whose internals are being messed with. It's > up to the programmer to do whatever is necessary to ensure that he > doesn't do that. > > > If extend() is called while thread 1 has the array locked, it can: > > > > A) raise an exception or return an error > > Yes. (Raise an exception.) > Which exception? Would you introduce a standard exception that should be raised when the user tries to do an operation that currently isn't allowed because the buffer is locked? Truthfully, now that Neil has given his explanation, I'm beginning to bend on this a bit. You're right in that it's not that much burden (however, it's more than *no* burden :-), and someone might find it useful. I still think it's going to be pretty uncommon, and I still believe the locking can be added on top of the simpler interface as needed. Cheers, -Scott __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com From mhammond@skippinet.com.au Wed Jul 31 07:43:50 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 31 Jul 2002 16:43:50 +1000 Subject: [Python-Dev] Get fame and fortune from mindless editing Message-ID: An offer too good to refuse ;) We recently deprecated the DL_EXPORT and DL_IMPORT macros, replacing them with purpose oriented macros. In an effort to cleanup the source, it would be good to remove all such macros from the Python source tree. I have already made a start on this, and only mindless editing remains. 
What needs to be done is: * Modules/*.c - all 'DL_EXPORT(void)' references (which are all module init functions) are to be replaced with 'PyMODINIT_FUNC' - note no parens, and note no return type is specified. Eg, the following patch would be most suitable : Index: timemodule.c ... @@ -621,5 +621,5 @@ -DL_EXPORT(void) +PyMODINIT_FUNC inittime(void) { * Include/*.h - all public declarations need to be changed. All 'DL_IMPORT(type)' references, *including* any leading 'extern' declaration, should be changed to either PyAPI_FUNC (for functions) or PyAPI_DATA (for data) For example, the following 3 lines (from various .h files): extern DL_IMPORT(PyTypeObject) PyUnicode_Type; extern DL_IMPORT(PyObject*) PyUnicode_FromUnicode(...); DL_IMPORT(void) PySys_SetArgv(int, char **); would be changed to: PyAPI_DATA(PyTypeObject) PyUnicode_Type; PyAPI_FUNC(PyObject*) PyUnicode_FromUnicode(...); PyAPI_FUNC(void) PySys_SetArgv(int, char **); Note all 'extern' declarations were removed, and PyUnicode_Type is data (and declared as such) while the other 2 are functions. This is all mindless editing, suitable for a day when the brain doesn't quite seem to be firing! The fame comes from getting your name splashed all over the CVS logs. The fortune... well, not all valuable things can be measured in dollars . Thanks, Mark. From kalle@lysator.liu.se Wed Jul 31 09:19:57 2002 From: kalle@lysator.liu.se (Kalle Svensson) Date: Wed, 31 Jul 2002 10:19:57 +0200 Subject: [Python-Dev] Get fame and fortune from mindless editing In-Reply-To: References: Message-ID: <20020731081957.GB1161@i92.ryd.student.liu.se> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 [Mark Hammond] > An offer too good to refuse ;) Right. http://python.org/sf/588982 Since this is my first post here, I'll introduce myself. I'm a first year student in computer engineering at Linköping University, Sweden. I've been lurking here for a few months. My primary Python interest at the moment is the Snake Farm project.
Otherwise, I like Unix, free software and all that usual stuff. Peace, Kalle - -- Kalle Svensson, http://www.juckapan.org/~kalle/ Student, root and saint in the Church of Emacs. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.7 (GNU/Linux) Comment: Processed by Mailcrypt 3.5.6 iD8DBQE9R52GdNeA1787sd0RAjVwAJ9/c4y8Tq0lqf6tUfgGeaD2DZIV3QCfQAvh tBwRn/mmh52sFncmo3shxhg= =6Z3v -----END PGP SIGNATURE----- From mwh@python.net Wed Jul 31 09:22:22 2002 From: mwh@python.net (Michael Hudson) Date: 31 Jul 2002 09:22:22 +0100 Subject: [Python-Dev] seeing off SET_LINENO In-Reply-To: "Mark Hammond"'s message of "Wed, 31 Jul 2002 15:28:44 +1000" References: Message-ID: <2mu1mgfgsh.fsf@starship.python.net> "Mark Hammond" writes: > > Michael Hudson : > > > > > My patch means the debugger doesn't stop > > > on the "def f():" line -- unsurprisingly, given that no execution ever > > > takes place on that line. > > [Greg] > > If there is no code there, there shouldn't be any > > need to stop there, should there? > > [Barry in a different message] > > I can't decide whether it would be good to stop on the def or not. > > Not doing so makes pdb act more like gdb, which also only stops on the > > first executable line, so maybe that's a good thing. > > IMO, the Python debugger "interface" should include function entry. There goes the time machine: it does. I just think everyone ignores 'call' messages because they're a bit redundant today (because of the matter under discussion). > The debugger UI (in this case pdb, but any other debugger) may > choose not to break there, but the debugger itself may be able to > implement some useful things by having the hook. bdb.Bdb.user_call(), I believe. Cheers, M. -- One of the great skills in using any language is knowing what not to use, what not to say. ... There's that simplicity thing again. 
-- Ron Jeffries From akim@epita.fr Wed Jul 31 10:11:11 2002 From: akim@epita.fr (Akim Demaille) Date: 31 Jul 2002 11:11:11 +0200 Subject: [Python-Dev] Re: HAVE_CONFIG_H In-Reply-To: References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> <200207301539.g6UFdUS09930@odiug.zope.com> <200207301622.g6UGMBl17143@odiug.zope.com> Message-ID: >>>>> "François" == François Pinard writes: François> [Guido van Rossum] Hi Guido, Hi Francois ! >> Since we don't use this idiom, we can safely remove the >> -DHAVE_CONFIG_H (if we can find where it is set). >> I looked. It's generated by AC_OUTPUT. I don't think I can get >> rid of it. So never mind. :-) François> Maybe AC_OUTPUT, or macros called by AC_OUTPUT, can be François> overridden. If this is not easy to do, you might want to François> discuss the matter with Akim, Cc:ed. Maybe he could tear François> down AC_OUTPUT in parts so the overriding gets easier? François> I know my friend Akim as good, helping and nice fellow! François> Don't fear him! :-) I'm not sure I completely understand the question here: if HAVE_CONFIG_H is specified, it means config.h is created. So if you use a config.h, why does it matter not to define HAVE_CONFIG_H? From barry@python.org Wed Jul 31 13:15:49 2002 From: barry@python.org (Barry A. Warsaw) Date: Wed, 31 Jul 2002 08:15:49 -0400 Subject: [Python-Dev] seeing off SET_LINENO References: <200207310003.g6V03tjm018993@kuku.cosc.canterbury.ac.nz> Message-ID: <15687.54517.580299.350054@anthem.wooz.org> >>>>> "MH" == Mark Hammond writes: MH> [Barry in a different message] >> I can't decide whether it would be good to stop on the def or >> not. Not doing so makes pdb act more like gdb, which also only >> stops on the first executable line, so maybe that's a good >> thing. MH> IMO, the Python debugger "interface" should include function MH> entry.
The debugger UI (in this case pdb, but any other MH> debugger) may choose not to break there, but the debugger MH> itself may be able to implement some useful things by having MH> the hook. Good point. -Barry From thomas.heller@ion-tof.com Wed Jul 31 13:32:25 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 31 Jul 2002 14:32:25 +0200 Subject: [Python-Dev] PEP 298 - the Fixed Buffer Interface References: <04da01c237ef$c103ac30$e000a8c0@thomasnotebook> Message-ID: <0ac201c2388e$53c0f020$e000a8c0@thomasnotebook> > Additional Notes/Comments > > Python strings, Unicode strings, mmap objects, and maybe other > types would expose the fixed buffer interface, but the array type > would *not*, because its memory block may be reallocated during > its lifetime. > Unfortunately it's impossible to implement the fixed buffer interface on mmap objects - the memory mapped file can be closed at any time. This would leave the pointers unusable. It seems this is another use case for locking - if we want it. Thomas From pinard@iro.umontreal.ca Wed Jul 31 13:41:02 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 31 Jul 2002 08:41:02 -0400 Subject: [Python-Dev] Re: HAVE_CONFIG_H In-Reply-To: References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> <200207301539.g6UFdUS09930@odiug.zope.com> <200207301622.g6UGMBl17143@odiug.zope.com> Message-ID: [Akim Demaille] > I'm not sure I completely understand the question here: if HAVE_CONFIG_H > is specified, it means config.h is created. So if you use a config.h, > why does it matter not to define HAVE_CONFIG_H? Hi, Akim. I hope life is still good to you! :-) In the beginnings of Autoconf, the `config.h' file did not exist. David MacKenzie added it as a way to reduce the `make' output clutter. Nowadays, I suspect almost all packages of at least moderate size uses it. 
Our traditional `lib/' modules have to work in many packages, whether `config.h' has been created or not, this being decided on a per package basis, and that is why there is a conditional inclusion of `config.h' in each of these `lib/' modules. He took a good while before we got stabilised on the exact stanza of this inclusion (I especially remember the massive unilateral changes by Roland McGrath introducing the BROKEN_BROKET define, or something like that, and all the doing it later took to clean this out.) Python (the distribution, which is what is in question here) does not use any of our `lib/' things, it is not going to use them, and it is not going to provide new such modules, so the distribution includes `config.h' everywhere, by permanent choice, without any need to use `HAVE_CONFIG_H' to decide if that inclusion is needed or not. So, even `-DHAVE_CONFIG_H' is useless `make' clutter in this case, and that's why the Python packagers wanted to get rid of it. In fact, in practice `-DHAVE_CONFIG_H' is only needed for packages using those common `lib/' modules, but many packages do not. Now that Autoconf is used with projects who have a life outside GNU, this is less necessary. Guido found, and got me to remember, that `@DEFS@' is the culprit: people just do not have to use it in their hand-crafted Makefiles, which is the case for Python. For away-from-GNU packages using Automake, some Automake option might exist so `@DEFS@' does not get generated? The only goal here is to get a cleaner `make' output. -- François Pinard http://www.iro.umontreal.ca/~pinard From skip@pobox.com Wed Jul 31 14:39:31 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 31 Jul 2002 08:39:31 -0500 Subject: [Python-Dev] Get fame and fortune from mindless editing In-Reply-To: References: Message-ID: <15687.59539.842887.296794@localhost.localdomain> Mark> I have already made a start on this, and only mindless editing Mark> remains. "mindless editing" ==> sed script or Emacs macros... 
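A rough sketch of what such a script could look like, written in Python rather than sed. It follows the recipe Mark posted for Modules/*.c and Include/*.h; the function names and the function-versus-data heuristic are assumptions made for illustration, not anything from the actual patch, so the output would still want a human eye before checkin.

```python
import re

# DL_EXPORT(void) on a module init function becomes PyMODINIT_FUNC (no parens).
_EXPORT_RE = re.compile(r"\bDL_EXPORT\(void\)")

# Optional leading 'extern', then DL_IMPORT(type), then the rest of the line.
_IMPORT_RE = re.compile(r"^(\s*)(?:extern\s+)?DL_IMPORT\(([^)]*)\)(.*)$")


def convert_module_line(line):
    """Rewrite one line (without its newline) of a Modules/*.c file."""
    return _EXPORT_RE.sub("PyMODINIT_FUNC", line)


def convert_header_line(line):
    """Rewrite one line (without its newline) of an Include/*.h file.

    Heuristic (an assumption, not part of the posted recipe): a declaration
    whose remainder contains '(' is treated as a function, anything else as
    data.  Function-pointer variables would be misclassified.
    """
    match = _IMPORT_RE.match(line)
    if match is None:
        return line  # nothing to convert on this line
    indent, ctype, rest = match.groups()
    macro = "PyAPI_FUNC" if "(" in rest else "PyAPI_DATA"
    return "%s%s(%s)%s" % (indent, macro, ctype, rest)
```

Running every line of a file through the matching function and reading the resulting diff is probably the safest way to use something like this.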
;-) Skip From skip@pobox.com Wed Jul 31 14:52:52 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 31 Jul 2002 08:52:52 -0500 Subject: [Python-Dev] Get fame and fortune from mindless editing In-Reply-To: References: Message-ID: <15687.60340.880139.545471@localhost.localdomain> Mark> We recently deprecated the DL_EXPORT and DL_IMPORT macros, Mark> replacing them with purpose oriented macros. In an effort to Mark> cleanup the source, it would be good to remove all such macros Mark> from the Python source tree. I modified the Modules/*.c and Includes/*.h files. Is there a patch/bug number I should attach the context diffs to for review? Skip From skip@pobox.com Wed Jul 31 14:59:09 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 31 Jul 2002 08:59:09 -0500 Subject: [Python-Dev] Get fame and fortune from mindless editing In-Reply-To: References: Message-ID: <15687.60717.533154.63118@localhost.localdomain> What about the references to DL_IMPORT/DL_EXPORT in Includes/Python.h and the two #ifndef DL_EXPORT lines in Modules/{cPickle.c,cStringIO.c}? Skip From guido@python.org Wed Jul 31 15:18:27 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 31 Jul 2002 10:18:27 -0400 Subject: [Python-Dev] Re: HAVE_CONFIG_H In-Reply-To: Your message of "Wed, 31 Jul 2002 11:11:11 +0200." References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> <200207301539.g6UFdUS09930@odiug.zope.com> <200207301622.g6UGMBl17143@odiug.zope.com> Message-ID: <200207311418.g6VEIRW32518@odiug.zope.com> > I'm not sure I completely understand the question here: if > HAVE_CONFIG_H is specified, it means config.h is created. So if you > use a config.h, why does it matter not to define HAVE_CONFIG_H? It's just clutter on the command line that we don't need. But never mind, I found a way to lose it already. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jul 31 15:36:37 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 31 Jul 2002 10:36:37 -0400 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: Your message of "Tue, 30 Jul 2002 23:29:50 PDT." <20020731062950.59376.qmail@web40105.mail.yahoo.com> References: <20020731062950.59376.qmail@web40105.mail.yahoo.com> Message-ID: <200207311436.g6VEabH32668@odiug.zope.com> Based on the example of mmap (which can be closed at any time) I agree that the fixed buffer interface needs to have "get" and "release" methods (please pick better names). Maybe Thomas can update PEP 298. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Wed Jul 31 16:16:20 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 31 Jul 2002 10:16:20 -0500 Subject: [Python-Dev] imaplib test failure Message-ID: <15687.65348.589402.540281@localhost.localdomain> Anyone else seeing this? I doubt it's related to the DL_EXPORT/DL_IMPORT changes I was just testing, and my local copy of Lib/imaplib.py matches what's in CVS. 
Skip test test_imaplib produced unexpected output: ********************************************************************** *** lines 2-3 of actual output doesn't appear in expected output after line 1: + incorrect result when converting (2033, 5, 18, 3, 33, 20, 2, 138, 0) + incorrect result when converting '"18-May-2033 13:33:20 +1000"' ********************************************************************** From jeremy@alum.mit.edu Wed Jul 31 15:56:40 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Wed, 31 Jul 2002 10:56:40 -0400 Subject: [Python-Dev] Get fame and fortune from mindless editing In-Reply-To: References: Message-ID: <15687.64168.403730.225372@slothrop.zope.com> >>>>> "MH" == Mark Hammond writes: MH> An offer too good to refuse ;) We recently deprecated the MH> DL_EXPORT and DL_IMPORT macros, replacing them with purpose MH> oriented macros. In an effort to cleanup the source, it would MH> be good to remove all such macros from the Python source tree. Would it make any sense to backport the new macros to the 2.2 branch? It might ease the life of extension writers who want their code to work with either version. The practical problem, however, is that their code would only work with a too-be-released 2.2.2. Jeremy From barry@python.org Wed Jul 31 16:23:29 2002 From: barry@python.org (Barry A. Warsaw) Date: Wed, 31 Jul 2002 11:23:29 -0400 Subject: [Python-Dev] imaplib test failure References: <15687.65348.589402.540281@localhost.localdomain> Message-ID: <15688.241.352958.223156@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: SM> Anyone else seeing this? I doubt it's related to the SM> DL_EXPORT/DL_IMPORT changes I was just testing, and my local SM> copy of Lib/imaplib.py matches what's in CVS. Yes, everyone is: http://mail.python.org/pipermail/python-dev/2002-July/027056.html but no one's stepped up to the plate yet, including pierslauder <1.4 wink>. 
-Barry From jeremy@alum.mit.edu Wed Jul 31 16:25:37 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Wed, 31 Jul 2002 11:25:37 -0400 Subject: [Python-Dev] Get fame and fortune from mindless editing In-Reply-To: <200207311525.g6VFPRf00831@odiug.zope.com> References: <15687.64168.403730.225372@slothrop.zope.com> <200207311525.g6VFPRf00831@odiug.zope.com> Message-ID: <15688.369.568227.177521@slothrop.zope.com> >>>>> "GvR" == Guido van Rossum writes: MH> An offer too good to refuse ;) We recently deprecated the MH> DL_EXPORT and DL_IMPORT macros, replacing them with purpose MH> oriented macros. In an effort to cleanup the source, it would MH> be good to remove all such macros from the Python source tree. >> >> Would it make any sense to backport the new macros to the 2.2 >> branch? It might ease the life of extension writers who want >> their code to work with either version. The practical problem, >> however, is that their code would only work with a >> too-be-released 2.2.2. GvR> Maybe both the old and the new macros could be supported by GvR> 2.2.2? Yes. That's my suggestion. Jeremy From xscottg@yahoo.com Wed Jul 31 16:28:32 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Wed, 31 Jul 2002 08:28:32 -0700 (PDT) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <200207311436.g6VEabH32668@odiug.zope.com> Message-ID: <20020731152832.99003.qmail@web40106.mail.yahoo.com> --- Guido van Rossum wrote: > > Based on the example of mmap (which can be closed at any time) I > agree that the fixed buffer interface needs to have "get" > and "release" methods (please pick better names). Maybe Thomas can > update PEP 298. > Wow, the tides have turned. Fair enough. I think Neil put forth the names "acquire" and "release". 
So how about typedef struct { getreadbufferproc bf_getreadbuffer; getwritebufferproc bf_getwritebuffer; getsegcountproc bf_getsegcount; getcharbufferproc bf_getcharbuffer; /* fixed buffer interface functions */ acquirereadbufferproc bf_acquirereadbuffer; acquirewritebufferproc bf_acquirewritebuffer; releasebufferproc bf_releasebuffer; } PyBufferProcs; Whatever the actual names, should there be a bf_releasereadbuffer and bf_releasewritebuffer? Or just the one bf_releasebuffer? Could also just have one acquire function that indicates whether it is read-write or read-only via a return parameter. Is write-only ever useful? Cheers, -Scott __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com From guido@python.org Wed Jul 31 16:25:27 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 31 Jul 2002 11:25:27 -0400 Subject: [Python-Dev] Get fame and fortune from mindless editing In-Reply-To: Your message of "Wed, 31 Jul 2002 10:56:40 EDT." <15687.64168.403730.225372@slothrop.zope.com> References: <15687.64168.403730.225372@slothrop.zope.com> Message-ID: <200207311525.g6VFPRf00831@odiug.zope.com> > MH> An offer too good to refuse ;) We recently deprecated the > MH> DL_EXPORT and DL_IMPORT macros, replacing them with purpose > MH> oriented macros. In an effort to cleanup the source, it would > MH> be good to remove all such macros from the Python source tree. > > Would it make any sense to backport the new macros to the 2.2 branch? > It might ease the life of extension writers who want their code to > work with either version. The practical problem, however, is that > their code would only work with a too-be-released 2.2.2. Maybe both the old and the new macros could be supported by 2.2.2? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jul 31 16:37:07 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 31 Jul 2002 11:37:07 -0400 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: Your message of "Wed, 31 Jul 2002 08:28:32 PDT." <20020731152832.99003.qmail@web40106.mail.yahoo.com> References: <20020731152832.99003.qmail@web40106.mail.yahoo.com> Message-ID: <200207311537.g6VFb7r01081@odiug.zope.com> > I think Neil put forth the names "acquire" and "release". So how about > > typedef struct { > getreadbufferproc bf_getreadbuffer; > getwritebufferproc bf_getwritebuffer; > getsegcountproc bf_getsegcount; > getcharbufferproc bf_getcharbuffer; > /* fixed buffer interface functions */ > acquirereadbufferproc bf_acquirereadbuffer; > acquirewritebufferproc bf_acquirewritebuffer; > releasebufferproc bf_releasebuffer; > } PyBufferProcs; > > Whatever the actual names, should there be a bf_releasereadbuffer and > bf_releasewritebuffer? Or just the one bf_releasebuffer? Just the one. > Could also just have one acquire function that indicates whether it > is read-write or read-only via a return parameter. That loses the (weak) symmetry with the existing API. > Is write-only ever useful? No, write implies read. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jul 31 16:47:46 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 31 Jul 2002 11:47:46 -0400 Subject: [Python-Dev] What to do about the Wiki? Message-ID: <200207311547.g6VFlk601129@odiug.zope.com> I don't know what to do about the Moinmoin Wiki on python.org. Lots of useful information was recently moved to the Wiki, like the editors list and Andrew Kuchling's bookstore. But the Wiki brought the website down twice this weekend, by growing without bounds. To prevent this from happening again, we've disabled the Wiki, but that's not a solution. 
Juergen Hermann, Moinmoin's author, said he fixed a few things, but also said that Moinmoin is essentially vulnerable to "recursive wget" (e.g. someone trying to suck up the entire Wiki by following links). Apparently this is what brought the site down this weekend -- if I understand correctly, an in-memory log was growing too fast. There are a lot of links in the Wiki, e.g. for each Wiki page there's the page itself, the edit form, the history, various other actions, etc. I believe that Juergen has fixed the log-growing problem. Should we enable the Wiki again and hope for the best? --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Wed Jul 31 16:49:20 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 31 Jul 2002 17:49:20 +0200 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020731152832.99003.qmail@web40106.mail.yahoo.com> <200207311537.g6VFb7r01081@odiug.zope.com> Message-ID: <0cd301c238a9$d5e3a690$e000a8c0@thomasnotebook> > > Could also just have one acquire function that indicates whether it > > is read-write or read-only via a return parameter. > > That loses the (weak) symmetry with the existing API. > There's nothing a client expecting a read/write pointer could do with a read only pointer IMO. > > Is write-only ever useful? > > No, write implies read. Should it be named getfixedreadwritebuffer then? Thomas From guido@python.org Wed Jul 31 16:54:41 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 31 Jul 2002 11:54:41 -0400 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: Your message of "Wed, 31 Jul 2002 17:49:20 +0200." 
<0cd301c238a9$d5e3a690$e000a8c0@thomasnotebook> References: <20020731152832.99003.qmail@web40106.mail.yahoo.com> <200207311537.g6VFb7r01081@odiug.zope.com> <0cd301c238a9$d5e3a690$e000a8c0@thomasnotebook> Message-ID: <200207311554.g6VFsfO01268@odiug.zope.com> > > > Could also just have one acquire function that indicates whether it > > > is read-write or read-only via a return parameter. > > > > That loses the (weak) symmetry with the existing API. > > There's nothing a client expecting a read/write pointer could > do with a read only pointer IMO. So we agree that it's a bad idea to have one function. :-) > > > Is write-only ever useful? > > > > No, write implies read. > > Should it be named getfixedreadwritebuffer then? No, the existing API also uses getwritebuffer implying read/write. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Wed Jul 31 16:57:08 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 31 Jul 2002 10:57:08 -0500 Subject: [Python-Dev] Get fame and fortune from mindless editing In-Reply-To: References: Message-ID: <15688.2260.68645.786641@localhost.localdomain> Mark> * Modules/*.c - all 'DL_EXPORT(void)' references ... Mark> * Include/*.h - all public declarations need to be changed ... Context diff of these changes are attached to http://python.org/sf/566100 Regression tests pass on my Linux box. See my note for a couple caveats. Skip From thomas.heller@ion-tof.com Wed Jul 31 16:58:05 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 31 Jul 2002 17:58:05 +0200 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface References: <20020731062950.59376.qmail@web40105.mail.yahoo.com> <200207311436.g6VEabH32668@odiug.zope.com> Message-ID: <0d2b01c238ab$0e892ff0$e000a8c0@thomasnotebook> From: "Guido van Rossum" > Based on the example of mmap (which can be closed at any time) I > agree that the fixed buffer interface needs to have "get" > and "release" methods (please pick better names). 
Maybe Thomas can > update PEP 298. The consequence: mmap objects need a 'buffer lock counter', and cannot be closed while the count is >0. Which exception is raised then? Or do you have something different in mind? The lock counter would not be needed for strings and unicode... Thomas From guido@python.org Wed Jul 31 17:06:13 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 31 Jul 2002 12:06:13 -0400 Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: Your message of "Wed, 31 Jul 2002 17:58:05 +0200." <0d2b01c238ab$0e892ff0$e000a8c0@thomasnotebook> References: <20020731062950.59376.qmail@web40105.mail.yahoo.com> <200207311436.g6VEabH32668@odiug.zope.com> <0d2b01c238ab$0e892ff0$e000a8c0@thomasnotebook> Message-ID: <200207311606.g6VG6Ds01363@odiug.zope.com> > The consequence: mmap objects need a 'buffer lock counter', > and cannot be closed while the count is >0. Which exception > is raised then? Pick one -- mmap.error (== EnvironmentError) seems fine to me. Alternately, close() could set a "please close me" flag which causes the mmap file to be closed when the last release is called. Of course, the acquire method should raise an exception when it's already closed. > Or do you have something different in mind? > The lock counter would not be needed for strings and unicode... And the array module could have one. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Wed Jul 31 17:09:13 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 31 Jul 2002 11:09:13 -0500 Subject: [Python-Dev] Re: What to do about the Wiki? In-Reply-To: <200207311547.g6VFlk601129@odiug.zope.com> References: <200207311547.g6VFlk601129@odiug.zope.com> Message-ID: <15688.2985.118330.48738@localhost.localdomain> Guido> Juergen Hermann, Moinmoin's author, said he fixed a few things, Guido> but also said that Moinmoin is essentially vulnerable to Guido> "recursive wget" (e.g.
someone trying to suck up the entire Wiki Guido> by following links). Apparently this is what brought the site Guido> down this weekend -- if I understand correctly, an in-memory log Guido> was growing too fast. I'm a bit confused by these statements. MoinMoin is a CGI script. I don't understand where "recursive wget" and "in-memory log" would come into play. I recently fired up two Wikis on the Mojam server. I never see any long-running process which would suggest there's an in-memory log which could grow without bound. The MoinMoin package does generate HTTP redirects, but while they might coax wget into firing off another request, it should be handled by a separate MoinMoin process on the server side. You should see the load grow significantly as the requests pour in, but shouldn't see any one MoinMoin process gobbling up all sorts of resources. Jürgen, can you elaborate on these themes a little more? Guido> I believe that Juergen has fixed the log-growing problem. Should Guido> we enable the Wiki again and hope for the best? With an XS4ALL person at the ready? Perhaps someone can keep a window open on creosote running something like while true ; do ps auxww | egrep python | sort -r -n -k 5,5 | head -1 sleep 15 done I'm running out for the next few hours. I'll be happy to run the while loop when I return. Skip From webmaster@python.org Wed Jul 31 17:21:47 2002 From: webmaster@python.org (webmaster@python.org) Date: Wed, 31 Jul 2002 12:21:47 -0400 Subject: [Python-Dev] Re: What to do about the Wiki? References: <200207311547.g6VFlk601129@odiug.zope.com> <15688.2985.118330.48738@localhost.localdomain> Message-ID: <15688.3739.1719.207581@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: Guido> I believe that Juergen has fixed the log-growing problem. Guido> Should we enable the Wiki again and hope for the best? I just did, by twiddling the +x bits on moinmoin SM> With an XS4ALL person at the ready?
Perhaps someone can keep SM> a window open on creosote running something like | while true ; do | ps auxww | egrep python | sort -r -n -k 5,5 | head -1 | sleep 15 | done SM> I'm running out for the next few hours. I'll be happy to run SM> the while loop when I return. I'm doing this now, but even hitting the wiki it doesn't show up. I'm just going to run top for a while, but it's a fairly old version of top. :/ -Barry From guido@python.org Wed Jul 31 17:16:56 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 31 Jul 2002 12:16:56 -0400 Subject: [Python-Dev] Re: What to do about the Wiki? In-Reply-To: Your message of "Wed, 31 Jul 2002 11:09:13 CDT." <15688.2985.118330.48738@localhost.localdomain> References: <200207311547.g6VFlk601129@odiug.zope.com> <15688.2985.118330.48738@localhost.localdomain> Message-ID: <200207311616.g6VGGuF01886@odiug.zope.com> > Guido> Juergen Hermann, Moinmoin's author, said he fixed a few things, > Guido> but also said that Moinmoin is essentially vulnerable to > Guido> "recursive wget" (e.g. someone trying to suck up the entire Wiki > Guido> by following links). Apparently this is what brought the site > Guido> down this weekend -- if I understand correctly, an in-memory log > Guido> was growing too fast. > > I'm a bit confused by these statements. MoinMoin is a CGI script. I don't > understand where "recursive wget" and "in-memory log" would come into play. > I recently fired up two Wikis on the Mojam server. I never see any > long-running process which would suggest there's an in-memory log which > could grow without bound. The MoinMoin package does generate HTTP > redirects, but while they might coax wget into firing off another request, > it should be handled by a separate MoinMoin process on the server side. You > should see the load grow significantly as the requests pour in, but > shouldn't see any one MoinMoin process gobbling up all sorts of resources. > Jürgen, can you elaborate on these themes a little more? 
Juergen seems offline or too busy to respond. Here's what he wrote on the matter. I guess he's reading the entire log into memory and updating it there. | Subject: [Pydotorg] wiki | From: Juergen Hermann | To: "pydotorg@python.org" | Date: Mon, 29 Jul 2002 20:32:31 +0200 | Hi! | | I looked into the wiki, and two things killed us: | | a) apart from google hits, some $!&%$""$% did a recursive wget. And the | wiki spans a rather wide uri space... | | b) the event log grows much faster than I'm used to, thus some | "simple" algorithms don't hold for this size. | | | Solutions: | | a) I just updated the wiki software, the current cvs contains a | robot/wget filter that forbids any access except to "view page" URIs | (i.e. we remain open to google, but no more open than absolutely | needed). If need be, we can forbid access altogether, or only allow | google. | | b) I'll install a cron job that rotates the logs, to keep them short. | | I shortened the logs manually for now. So if you all agree, we could | activate the wiki again. | | | Ciao, Jürgen Reading this again, I think we should give it a try again. > Guido> I believe that Juergen has fixed the log-growing problem. Should > Guido> we enable the Wiki again and hope for the best? > > With an XS4ALL person at the ready? Perhaps someone can keep a window open > on creosote running something like > > while true ; do > ps auxww | egrep python | sort -r -n -k 5,5 | head -1 > sleep 15 > done > > I'm running out for the next few hours. I'll be happy to run the while loop > when I return. We'll watch it here. I know who to write to have it rebooted. 
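For anyone who would rather watch from Python than from the shell, the loop quoted above boils down to something like the following sketch. It only parses text in the usual `ps auxww` layout (VSZ in the fifth column); the function name is made up here, and nothing in it is specific to creosote or MoinMoin.

```python
def largest_python_process(ps_output):
    """Return the line for the python process with the largest VSZ, or None.

    Roughly equivalent to:
        ps auxww | egrep python | sort -r -n -k 5,5 | head -1
    """
    best_line = None
    best_vsz = -1
    for line in ps_output.splitlines():
        if "python" not in line:
            continue
        fields = line.split()
        # Field 5 (index 4) of `ps aux` output is VSZ, the virtual memory size.
        if len(fields) < 5 or not fields[4].isdigit():
            continue
        vsz = int(fields[4])
        if vsz > best_vsz:
            best_line, best_vsz = line, vsz
    return best_line
```

Calling it on fresh `ps auxww` output every 15 seconds would reproduce the shell loop; a process whose VSZ keeps climbing between samples is the one to worry about.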
--Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Wed Jul 31 17:43:40 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 31 Jul 2002 12:43:40 -0400 Subject: [Python-Dev] imaplib test failure In-Reply-To: <15688.241.352958.223156@anthem.wooz.org> Message-ID: > Yes, everyone is: > > http://mail.python.org/pipermail/python-dev/2002-July/027056.html > > but no one's stepped up to the plate yet, including pierslauder <1.4 > wink>. I just reverted test_imaplib to rev 1.3, the last version that worked here. From mal@lemburg.com Wed Jul 31 18:02:51 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 31 Jul 2002 19:02:51 +0200 Subject: [Python-Dev] Re: What to do about the Wiki? References: <200207311547.g6VFlk601129@odiug.zope.com> <15688.2985.118330.48738@localhost.localdomain> <200207311616.g6VGGuF01886@odiug.zope.com> Message-ID: <3D48183B.7070306@lemburg.com> Guido van Rossum wrote: >> Guido> Juergen Hermann, Moinmoin's author, said he fixed a few things, >> Guido> but also said that Moinmoin is essentially vulnerable to >> Guido> "recursive wget" (e.g. someone trying to suck up the entire Wiki >> Guido> by following links). Apparently this is what brought the site >> Guido> down this weekend -- if I understand correctly, an in-memory log >> Guido> was growing too fast. >> >>I'm a bit confused by these statements. MoinMoin is a CGI script. I don't >>understand where "recursive wget" and "in-memory log" would come into play. >>I recently fired up two Wikis on the Mojam server. I never see any >>long-running process which would suggest there's an in-memory log which >>could grow without bound.
The MoinMoin package does generate HTTP >>redirects, but while they might coax wget into firing off another request, >>it should be handled by a separate MoinMoin process on the server side. You >>should see the load grow significantly as the requests pour in, but >>shouldn't see any one MoinMoin process gobbling up all sorts of resources. >>Jürgen, can you elaborate on these themes a little more? > > > Juergen seems offline or too busy to respond. Here's what he wrote on > the matter. I guess he's reading the entire log into memory and > updating it there. Jürgen is talking about the file event.log which MoinMoin writes. This is not read into memory. New events are simply appended to the file. Now since the Wiki has recursive links such as the "LikePages" links on all pages and history links like the per page info screen, a recursive wget is likely to run for quite a while (even more because the URL level doesn't change much and thus probably doesn't trigger any depth restrictions on wget-like crawlers) and generate lots of events... What was the cause of the break down ? A full disk or a process claiming all resources ? > | Subject: [Pydotorg] wiki > | From: Juergen Hermann > | To: "pydotorg@python.org" > | Date: Mon, 29 Jul 2002 20:32:31 +0200 > | Hi! > | > | I looked into the wiki, and two things killed us: > | > | a) apart from google hits, some $!&%$""$% did a recursive wget. And the > | wiki spans a rather wide uri space... > | > | b) the event log grows much faster than I'm used to, thus some > | "simple" algorithms don't hold for this size. > | > | > | Solutions: > | > | a) I just updated the wiki software, the current cvs contains a > | robot/wget filter that forbids any access except to "view page" URIs > | (i.e. we remain open to google, but no more open than absolutely > | needed). If need be, we can forbid access altogether, or only allow > | google.
> | > | b) I'll install a cron job that rotates the logs, to keep them short. > | > | I shortened the logs manually for now. So if you all agree, we could > | activate the wiki again. > | > | > | Ciao, Jürgen > > Reading this again, I think we should give it a try again. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim.one@comcast.net Wed Jul 31 18:07:46 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 31 Jul 2002 13:07:46 -0400 Subject: [Pydotorg] Re: [Python-Dev] Re: What to do about the Wiki? In-Reply-To: <3D48183B.7070306@lemburg.com> Message-ID: [M.-A. Lemburg] > What was the cause of the break down ? A full disk or a process > claiming all resources ? Thomas Wouters told me the process grew so large that it ran out of swapfile space. swapping-rumors-ly y'rs - tim From tim.one@comcast.net Wed Jul 31 18:16:20 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 31 Jul 2002 13:16:20 -0400 Subject: [Python-Dev] Valgrinding Python In-Reply-To: <3D47491C.B0E9E165@metaslash.com> Message-ID: [Neal Norwitz] > This is good news. I changed ADDRESS_IN_RANGE to a function, > then suppressed it. There were no other uninitialized memory reads. Cool! In if (ADDRESS_IN_RANGE(p, pool->arenaindex)) { it's actually only the pool->arenaindex subexpression that may read uninitialized memory; the ADDRESS_IN_RANGE macro itself doesn't do anything "bad". > Valgrind does report a bunch of problems with pthreads, but > these are likely valgrind's fault. There are some complaints > about memory leaks, but these seem to appear only to occur > when spawning/threading. The leaks are small and short lived. A novel definition for "leak" .
From guido@python.org Wed Jul 31 18:24:12 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 31 Jul 2002 13:24:12 -0400 Subject: [Python-Dev] Re: What to do about the Wiki? In-Reply-To: Your message of "Wed, 31 Jul 2002 19:02:51 +0200." <3D48183B.7070306@lemburg.com> References: <200207311547.g6VFlk601129@odiug.zope.com> <15688.2985.118330.48738@localhost.localdomain> <200207311616.g6VGGuF01886@odiug.zope.com> <3D48183B.7070306@lemburg.com> Message-ID: <200207311724.g6VHOCZ02434@odiug.zope.com> > > Juergen seems offline or too busy to respond. Here's what he wrote on > > the matter. I guess he's reading the entire log into memory and > > updating it there. > > Jürgen is talking about the file event.log which MoinMoin writes. > This is not read into memory. New events are simply appended to > the file. > > Now since the Wiki has recursive links such as the "LikePages" > links on all pages and history links like the per page > info screen, a recursive wget is likely to run for quite a > while (even more because the URL level doesn't change much > and thus probably doesn't trigger any depth restrictions on wget- > like crawlers) and generate lots of events... > > What was the cause of the break down ? A full disk or a process > claiming all resources ? A process running out of memory, AFAIK. I just ran a recursive wget on the Wiki, and it completed without bringing the site down, downloading about 1000 files (several views for each Wiki page). I didn't see the Wiki appear in the "top" display. So either Juergen fixed the problem (as he said he did) or there was a different cause. I do wish Juergen responded to his mail. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Wed Jul 31 18:26:12 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 31 Jul 2002 13:26:12 -0400 Subject: [Python-Dev] Now test_socket fails Message-ID: What's socket.socket() supposed to do without any arguments? 
Can't work on Windows, because socket.py has if (sys.platform.lower().startswith("win") or (hasattr(os, 'uname') and os.uname()[0] == "BeOS") or sys.platform=="riscos"): _realsocketcall = _socket.socket def socket(family, type, proto=0): return _socketobject(_realsocketcall(family, type, proto)) C:\Code\python\PCbuild>python ../lib/test/test_socket.py Testing for mission critical constants. ... ok Testing default timeout. ... ERROR Testing getservbyname(). ... ok Testing getsockopt(). ... ok Testing hostname resolution mechanisms. ... ok Making sure getnameinfo doesn't crash the interpreter. ... ok testNtoH (__main__.GeneralModuleTests) ... ok Testing reference count for getnameinfo. ... ok testing send() after close() with timeout. ... ok Testing setsockopt(). ... ok Testing getsockname(). ... ok Testing that socket module exceptions. ... ok Testing fromfd(). ... ok Testing receive in chunks over TCP. ... ok Testing recvfrom() in chunks over TCP. ... ok Testing large receive over TCP. ... ok Testing large recvfrom() over TCP. ... ok Testing sendall() with a 2048 byte string over TCP. ... ok Testing shutdown(). ... ok Testing recvfrom() over UDP. ... ok Testing sendto() and Recv() over UDP. ... ok Testing non-blocking accept. ... ok Testing non-blocking connect. ... ok Testing non-blocking recv. ... ok Testing whether set blocking works. ... ok Performing file readline test. ... ok Performing small file read test. ... ok Performing unbuffered file read test. ... ok ====================================================================== ERROR: Testing default timeout. 
---------------------------------------------------------------------- Traceback (most recent call last): File "../lib/test/test_socket.py", line 273, in testDefaultTimeout s = socket.socket() TypeError: socket() takes at least 2 arguments (0 given) ---------------------------------------------------------------------- Ran 28 tests in 3.190s FAILED (errors=1) Traceback (most recent call last): File "../lib/test/test_socket.py", line 559, in ? test_main() File "../lib/test/test_socket.py", line 556, in test_main test_support.run_suite(suite) File "C:\CODE\PYTHON\lib\test\test_support.py", line 188, in run_suite raise TestFailed(err) test.test_support.TestFailed: Traceback (most recent call last): File "../lib/test/test_socket.py", line 273, in testDefaultTimeout s = socket.socket() TypeError: socket() takes at least 2 arguments (0 given) From tim.one@comcast.net Wed Jul 31 18:33:03 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 31 Jul 2002 13:33:03 -0400 Subject: [Python-Dev] Now test_socket fails In-Reply-To: Message-ID: [me] > What's socket.socket() supposed to do without any arguments? > Can't work on Windows, because socket.py has ... Nevermind; I changed socket.py so this works as intended. From mgilfix@eecs.tufts.edu Wed Jul 31 18:37:11 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Wed, 31 Jul 2002 13:37:11 -0400 Subject: [Python-Dev] Now test_socket fails In-Reply-To: ; from tim.one@comcast.net on Wed, Jul 31, 2002 at 01:26:12PM -0400 References: Message-ID: <20020731133711.H26901@eecs.tufts.edu> I'm pretty sure that qualifies as a bug. The problem exists on linux as well (as a fresh cvs update has shown). In general though, the socket call should always take the two arguments. It seems at one point that the 2.3 version of the socket module accepted erroneously just a socket() call, while 2.2 does not. It seems Guido added these lines to integrate default timeout testing. 
If someone with write priveleges can just fix that to read: s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) that should fix the problem. -- Mike On Wed, Jul 31 @ 13:26, Tim Peters wrote: > What's socket.socket() supposed to do without any arguments? Can't work on > Windows, because socket.py has > > if (sys.platform.lower().startswith("win") > or (hasattr(os, 'uname') and os.uname()[0] == "BeOS") > or sys.platform=="riscos"): > > _realsocketcall = _socket.socket > > def socket(family, type, proto=0): > return _socketobject(_realsocketcall(family, type, proto)) > > > C:\Code\python\PCbuild>python ../lib/test/test_socket.py > Testing for mission critical constants. ... ok > Testing default timeout. ... ERROR > Testing getservbyname(). ... ok > Testing getsockopt(). ... ok > Testing hostname resolution mechanisms. ... ok > Making sure getnameinfo doesn't crash the interpreter. ... ok > testNtoH (__main__.GeneralModuleTests) ... ok > Testing reference count for getnameinfo. ... ok > testing send() after close() with timeout. ... ok > Testing setsockopt(). ... ok > Testing getsockname(). ... ok > Testing that socket module exceptions. ... ok > Testing fromfd(). ... ok > Testing receive in chunks over TCP. ... ok > Testing recvfrom() in chunks over TCP. ... ok > Testing large receive over TCP. ... ok > Testing large recvfrom() over TCP. ... ok > Testing sendall() with a 2048 byte string over TCP. ... ok > Testing shutdown(). ... ok > Testing recvfrom() over UDP. ... ok > Testing sendto() and Recv() over UDP. ... ok > Testing non-blocking accept. ... ok > Testing non-blocking connect. ... ok > Testing non-blocking recv. ... ok > Testing whether set blocking works. ... ok > Performing file readline test. ... ok > Performing small file read test. ... ok > Performing unbuffered file read test. ... ok > > ====================================================================== > ERROR: Testing default timeout. 
> ---------------------------------------------------------------------- > Traceback (most recent call last): > File "../lib/test/test_socket.py", line 273, in testDefaultTimeout > s = socket.socket() > TypeError: socket() takes at least 2 arguments (0 given) > > ---------------------------------------------------------------------- > Ran 28 tests in 3.190s > > FAILED (errors=1) > Traceback (most recent call last): > File "../lib/test/test_socket.py", line 559, in ? > test_main() > File "../lib/test/test_socket.py", line 556, in test_main > test_support.run_suite(suite) > File "C:\CODE\PYTHON\lib\test\test_support.py", line 188, in run_suite > raise TestFailed(err) > test.test_support.TestFailed: Traceback (most recent call last): > File "../lib/test/test_socket.py", line 273, in testDefaultTimeout > s = socket.socket() > TypeError: socket() takes at least 2 arguments (0 given) > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev `-> (tim.one) -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From mgilfix@eecs.tufts.edu Wed Jul 31 18:38:12 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Wed, 31 Jul 2002 13:38:12 -0400 Subject: [Python-Dev] Now test_socket fails In-Reply-To: ; from tim.one@comcast.net on Wed, Jul 31, 2002 at 01:33:03PM -0400 References: Message-ID: <20020731133812.I26901@eecs.tufts.edu> Er, I'm not sure that was such a good idea. This doesn't work on linux and shouldn't. It never worked that way in 2.2 I'm not sure what happened to make it work in 2.3. Was prior to my adding the timeout socket changes. -- Mike On Wed, Jul 31 @ 13:33, Tim Peters wrote: > [me] > > What's socket.socket() supposed to do without any arguments? > > Can't work on Windows, because socket.py has ... > > Nevermind; I changed socket.py so this works as intended. 
> > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev `-> (tim.one) -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From guido@python.org Wed Jul 31 18:40:34 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 31 Jul 2002 13:40:34 -0400 Subject: [Python-Dev] Now test_socket fails In-Reply-To: Your message of "Wed, 31 Jul 2002 13:26:12 EDT." References: Message-ID: <200207311740.g6VHeYS02538@odiug.zope.com> > What's socket.socket() supposed to do without any arguments? Can't work on > Windows, because socket.py has > > if (sys.platform.lower().startswith("win") > or (hasattr(os, 'uname') and os.uname()[0] == "BeOS") > or sys.platform=="riscos"): > > _realsocketcall = _socket.socket > > def socket(family, type, proto=0): > return _socketobject(_realsocketcall(family, type, proto)) Oops. It's supposed to default to AF_INET, SOCK_STREAM now. Can you test this patch and check it in if it works? *** socket.py 18 Jul 2002 17:08:34 -0000 1.22 --- socket.py 31 Jul 2002 17:35:25 -0000 *************** *** 62,68 **** _realsocketcall = _socket.socket ! def socket(family, type, proto=0): return _socketobject(_realsocketcall(family, type, proto)) if SSL_EXISTS: --- 62,68 ---- _realsocketcall = _socket.socket ! def socket(family=AF_INET, type=SOCK_STREAM, proto=0): return _socketobject(_realsocketcall(family, type, proto)) if SSL_EXISTS: (There's another change we should really make -- instead of a socket function, there should be a class socket whose constructor does the work. That's necessary so that isinstance(s, socket.socket) works on Windows; this currently works on Unix but not on Windows. But I don't have time for that now; the above patch should do what you need.) 
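The behaviour the patch above aims for can be checked with a short sketch. It assumes a current Python 3 interpreter, where socket.socket is already a class and its constructor defaults to AF_INET and SOCK_STREAM; the helper name describe_default_socket is hypothetical and not part of the socket module.

```python
import socket


def describe_default_socket():
    """Create a socket with no arguments and report its family and type.

    Hypothetical helper for illustration only.
    """
    s = socket.socket()  # no arguments: family and type fall back to defaults
    try:
        return s.family, s.type
    finally:
        s.close()  # always release the descriptor


family, kind = describe_default_socket()
print(family == socket.AF_INET, kind == socket.SOCK_STREAM)
```

Because socket.socket is a class in such an interpreter, isinstance(s, socket.socket) also holds for the object returned by the constructor, which is the property the parenthetical note above asks for.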
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jul 31 18:45:14 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 31 Jul 2002 13:45:14 -0400 Subject: [Python-Dev] Now test_socket fails In-Reply-To: Your message of "Wed, 31 Jul 2002 13:37:11 EDT." <20020731133711.H26901@eecs.tufts.edu> References: <20020731133711.H26901@eecs.tufts.edu> Message-ID: <200207311745.g6VHjEC02589@odiug.zope.com> > It seems at one point that the 2.3 version of the socket module accepted > erroneously just a socket() call, while 2.2 does not. I added this intentionally. I am tired of typing (AF_INET, SOCK_STREAM) where those are the 99% case. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Wed Jul 31 18:44:50 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 31 Jul 2002 13:44:50 -0400 Subject: [Python-Dev] Now test_socket fails In-Reply-To: <20020731133711.H26901@eecs.tufts.edu> Message-ID: [Michael Gilfix] > I'm pretty sure that qualifies as a bug. The problem exists on linux > as well (as a fresh cvs update has shown). In general though, the > socket call should always take the two arguments. > > It seems at one point that the 2.3 version of the socket module > accepted erroneously just a socket() call, while 2.2 does not. It seems > Guido added these lines to integrate default timeout testing. If someone > with write priveleges can just fix that to read: > > s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) > > that should fix the problem. I'll leave this to you and Guido. The test works fine on Windows now. The docstring for _socket.socket claims that all arguments are optional. 
The code matches the docs: sock_initobj(PyObject *self, PyObject *args, PyObject *kwds) { PySocketSockObject *s = (PySocketSockObject *)self; SOCKET_T fd; int family = AF_INET, type = SOCK_STREAM, proto = 0; static char *keywords[] = {"family", "type", "proto", 0}; ALL ARGS ARE OPTIONAL HERE if (!PyArg_ParseTupleAndKeywords(args, kwds, "|iii:socket", keywords, &family, &type, &proto)) return -1; Py_BEGIN_ALLOW_THREADS fd = socket(family, type, proto); Py_END_ALLOW_THREADS From guido@python.org Wed Jul 31 18:47:34 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 31 Jul 2002 13:47:34 -0400 Subject: [Python-Dev] Now test_socket fails In-Reply-To: Your message of "Wed, 31 Jul 2002 13:38:12 EDT." <20020731133812.I26901@eecs.tufts.edu> References: <20020731133812.I26901@eecs.tufts.edu> Message-ID: <200207311747.g6VHlYr02626@odiug.zope.com> > Er, I'm not sure that was such a good idea. This doesn't work on > linux and shouldn't. It never worked that way in 2.2 I'm not sure what > happened to make it work in 2.3. Was prior to my adding the timeout > socket changes. What do you mean it doesn't work on Linux? --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Wed Jul 31 18:52:06 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 31 Jul 2002 13:52:06 -0400 Subject: [Python-Dev] Now test_socket fails In-Reply-To: <200207311740.g6VHeYS02538@odiug.zope.com> Message-ID: [Guido] > (There's another change we should really make -- instead of a socket > function, there should be a class socket whose constructor does the > work. That's necessary so that isinstance(s, socket.socket) works on > Windows; this currently works on Unix but not on Windows. 
http://www.python.org/sf/589262 From mgilfix@eecs.tufts.edu Wed Jul 31 18:57:06 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Wed, 31 Jul 2002 13:57:06 -0400 Subject: [Python-Dev] Now test_socket fails In-Reply-To: <200207311745.g6VHjEC02589@odiug.zope.com>; from guido@python.org on Wed, Jul 31, 2002 at 01:45:14PM -0400 References: <20020731133711.H26901@eecs.tufts.edu> <200207311745.g6VHjEC02589@odiug.zope.com> Message-ID: <20020731135705.J26901@eecs.tufts.edu> Sounds fair. Found it in the docs so I'm happy. On Wed, Jul 31 @ 13:45, Guido van Rossum wrote: > > It seems at one point that the 2.3 version of the socket module accepted > > erroneously just a socket() call, while 2.2 does not. > > I added this intentionally. I am tired of typing > (AF_INET, SOCK_STREAM) where those are the 99% case. -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From mal@lemburg.com Wed Jul 31 18:56:49 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 31 Jul 2002 19:56:49 +0200 Subject: [Python-Dev] Re: What to do about the Wiki? References: <200207311547.g6VFlk601129@odiug.zope.com> <15688.2985.118330.48738@localhost.localdomain> <200207311616.g6VGGuF01886@odiug.zope.com> <3D48183B.7070306@lemburg.com> <200207311724.g6VHOCZ02434@odiug.zope.com> Message-ID: <3D4824E1.1090304@lemburg.com> Guido van Rossum wrote: >>>Juergen seems offline or too busy to respond. Here's what he wrote on >>>the matter. I guess he's reading the entire log into memory and >>>updating it there. >> >>Jürgen is talking about the file event.log which MoinMoin writes. >>This is not read into memory. New events are simply appended to >>the file.
>> >>Now since the Wiki has recursive links such as the "LikePages" >>links on all pages and history links like the per page >>info screen, a recursive wget is likely to run for quite a >>while (even more because the URL level doesn't change much >>and thus probably doesn't trigger any depth restrictions on wget- >>like crawlers) and generate lots of events... >> >>What was the cause of the break down ? A full disk or a process >>claiming all resources ? > > > A process running out of memory, AFAIK. In that case, wouldn't it be better to impose a memoryuse limit on the user which Apache uses for dealing with CGI scripts ? That wouldn't solve any specific Wiki related problem, but prevents the server from going offline because of memory problems. > I just ran a recursive wget on the Wiki, and it completed without > bringing the site down, downloading about 1000 files (several views > for each Wiki page). I didn't see the Wiki appear in the "top" > display. > > So either Juergen fixed the problem (as he said he did) or there was a > different cause. > > I do wish Juergen responded to his mail. It's vacation time in Germany, so he may well be offline for a while. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mgilfix@eecs.tufts.edu Wed Jul 31 18:58:03 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Wed, 31 Jul 2002 13:58:03 -0400 Subject: [Python-Dev] Now test_socket fails In-Reply-To: <200207311747.g6VHlYr02626@odiug.zope.com>; from guido@python.org on Wed, Jul 31, 2002 at 01:47:34PM -0400 References: <20020731133812.I26901@eecs.tufts.edu> <200207311747.g6VHlYr02626@odiug.zope.com> Message-ID: <20020731135803.K26901@eecs.tufts.edu> On Wed, Jul 31 @ 13:47, Guido van Rossum wrote: > > Er, I'm not sure that was such a good idea. This doesn't work on > > linux and shouldn't. It never worked that way in 2.2 I'm not sure what > > happened to make it work in 2.3. Was prior to my adding the timeout > > socket changes. > > What do you mean it doesn't work on Linux? My fault. It works. I, uh, didn't set my path correctly :) -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From mgilfix@eecs.tufts.edu Wed Jul 31 19:00:58 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Wed, 31 Jul 2002 14:00:58 -0400 Subject: [Python-Dev] Now test_socket fails In-Reply-To: <200207311740.g6VHeYS02538@odiug.zope.com>; from guido@python.org on Wed, Jul 31, 2002 at 01:40:34PM -0400 References: <200207311740.g6VHeYS02538@odiug.zope.com> Message-ID: <20020731140057.L26901@eecs.tufts.edu> Would a little trick like this do? class socket: pass class unix_socket(socket): pass class windows_socket(socket): # Old windows stuff And then just do the namespace shuffling that's kinda already done in socket.py. -- Mike On Wed, Jul 31 @ 13:40, Guido van Rossum wrote: > (There's another change we should really make -- instead of a socket > function, there should be a class socket whose constructor does the > work. That's necessary so that isinstance(s, socket.socket) works on > Windows; this currently works on Unix but not on Windows. 
But I don't > have time for that now; the above patch should do what you need.) -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From mal@lemburg.com Wed Jul 31 19:04:36 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 31 Jul 2002 20:04:36 +0200 Subject: [Python-Dev] Re: What to do about the Wiki? References: <200207311547.g6VFlk601129@odiug.zope.com> <15688.2985.118330.48738@localhost.localdomain> <200207311616.g6VGGuF01886@odiug.zope.com> <3D48183B.7070306@lemburg.com> <200207311724.g6VHOCZ02434@odiug.zope.com> <3D4824E1.1090304@lemburg.com> Message-ID: <3D4826B4.4060606@lemburg.com> M.-A. Lemburg wrote: > Guido van Rossum wrote: >>> What was the cause of the break down ? A full disk or a process >>> claiming all resources ? >> A process running out of memory, AFAIK. > > > In that case, wouldn't it be better to impose a memoryuse limit > on the user which Apache uses for dealing with CGI > scripts ? That wouldn't solve any specific Wiki related > problem, but prevents the server from going offline because > of memory problems. Here's how Apache can be configured for this (without having to fiddle with the Apache user account): http://httpd.apache.org/docs/mod/core.html#rlimitmem -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From thomas.heller@ion-tof.com Wed Jul 31 19:53:23 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 31 Jul 2002 20:53:23 +0200 Subject: [Python-Dev] PEP 298 - the Fixed Buffer Interface References: <04da01c237ef$c103ac30$e000a8c0@thomasnotebook> <200207301946.g6UJkf520799@odiug.zope.com> Message-ID: <0fe601c238c3$8bab1b20$e000a8c0@thomasnotebook> I've changed PEP 298 to incorporate the latest changes. 
Barry has not yet run pep2html (and I don't want to bother him too much with this), also I don't know if it makes sense to post it again in its full length. So here is the link to view it online in text format: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/nondist/peps/pep-0298.txt?rev=1.4 and this is the checkin message: ----- The model exposed by the fixed buffer interface was changed: Retrieving a buffer from an object puts this in a locked state, and a releasebuffer function must be called to unlock the object again. Added releasefixedbuffer function slot, and renamed the get...fixedbuffer functions to acquire...fixedbuffer functions. Renamed the flag from Py_TPFLAG_HAVE_GETFIXEDBUFFER to Py_TPFLAG_HAVE_FIXEDBUFFER. (Is the 'fixed buffer' name still useful, or should we use 'static buffer' instead?) Added posting date (was posted to c.l.p and python-dev). ----- Thomas From skip@pobox.com Wed Jul 31 22:06:26 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 31 Jul 2002 16:06:26 -0500 Subject: [Python-Dev] Re: What to do about the Wiki? In-Reply-To: <15688.3739.1719.207581@anthem.wooz.org> References: <200207311547.g6VFlk601129@odiug.zope.com> <15688.2985.118330.48738@localhost.localdomain> <15688.3739.1719.207581@anthem.wooz.org> Message-ID: <15688.20818.999604.113193@localhost.localdomain> BAW> I'm doing this now, but even hitting the wiki it doesn't show up. This is good. ;-) Skip From greg@cosc.canterbury.ac.nz Wed Jul 31 23:31:37 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 01 Aug 2002 10:31:37 +1200 (NZST) Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface In-Reply-To: <20020731063113.74481.qmail@web40110.mail.yahoo.com> Message-ID: <200207312231.g6VMVbt2019712@kuku.cosc.canterbury.ac.nz> > Moreover, if the sensible use cases for locking are few and far > between, then I'm still inclined to leave it out since you can add the > locking semantics at a different level. Are you sure about that? 
Without the locking, only non-resizable objects would be able to implement the protocol. So any higher level locking would have to be implemented on top of the old, non-safe version. Then you'd have to make sure that all parts of your application accessed the object through the extra layer. The "safe" part would be lost. > Your use of the word *no* is different than mine. :-) I could > similarly claim that the segment count puts no burden on > implementations that don't need it. I think I may have been replying to something other than what was said. But what I said is still true -- it imposes no extra burden on *implementers* of the interface which don't use the extra feature. I acknowledge that it complicates things slightly for *users* of the interface, but not as much as the seg count stuff does (there's no need for any testing or exception raising). > I believe it will be a no-op in enough places that extension writers > will do it wrong without even knowing. Well, there's not much that can be done about extension writers who fail to read the documentation, or wilfully ignore it. > Which exception? Would you introduce a standard exception that should > be raised when the user tries to do an operation that currently isn't > allowed because the buffer is locked? Maybe. It doesn't matter. The important thing is that the interpreter does not crash. > I still believe the locking can be added on top of the simpler > interface as needed. But it can't, since as I pointed out above, resizable objects won't be able to provide the simpler interface! Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From ark@research.att.com Wed Jul 31 23:35:21 2002 From: ark@research.att.com (Andrew Koenig) Date: Wed, 31 Jul 2002 18:35:21 -0400 (EDT) Subject: [Python-Dev] split('') revisited Message-ID: <200207312235.g6VMZL218546@europa.research.att.com> Back in February, there was a thread in comp.lang.python (and, I think, also on Python-Dev) that asked whether the following behavior: >>> 'abcde'.split('') Traceback (most recent call last): File "", line 1, in ? ValueError: empty separator was a bug or a feature. The prevailing opinion at the time seemed to be that there was not a sensible, unique way of defining this operation, so rejecting it was a feature. That answer didn't bother me particularly at the time, but since then I have learned a new fact (or perhaps an old fact that I didn't notice at the time) that has changed my mind: Section 4.2.4 of the library reference says that the 'split' method of a regular expression object is defined as Identical to the split() function, using the compiled pattern. This claim does not appear to be correct: >>> import re >>> re.compile('').split('abcde') ['abcde'] This result differs from the result of using the string split method. In other words, the documentation doesn't match the actual behavior, so the status quo is broken. It seems to me that there are four reasonable courses of action: 1) Do nothing -- the problem is too trivial to worry about. 2) Change string split (and its documentation) to match regexp split. 3) Change regexp split (and its documentation) to match string split. 
4) Change both string split and regexp split to do something else :-) My first impulse was to argue that (4) is right, and that the behavior should be as follows >>> 'abcde'.split('') ['a', 'b', 'c', 'd', 'e'] >>> import re >>> re.compile('').split('abcde') ['a', 'b', 'c', 'd', 'e'] When this discussion came up last time, I think there was an objection that s.split('') was ambiguous: What argument is there in favor of 'abcde'.split('') being ['a', 'b', 'c', 'd', 'e'] instead of, say, ['', 'a', 'b', 'c', 'd', 'e', ''] or, for that matter, ['', 'a', '', 'b', '', 'c', '', 'd', '', 'e', '']? I made the counterargument that one could disambiguate by adding the rule that no element of the result could be equal to the delimiter. Therefore, if s is a string, s.split('') cannot contain any empty strings. However, looking at the behavior of regular expression splitting more closely, I become more confused. Can someone explain the following behavior to me? >>> re.compile('a|(x?)').split('abracadabra') ['', None, 'br', None, 'c', None, 'd', None, 'br', None, '']
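The semantics proposed under option (4) can be sketched in a few lines. This is only an illustration, not a patch: split_chars is a hypothetical helper, not part of str or re, and it assumes the disambiguating rule stated above, that no element of the result may equal the delimiter.

```python
def split_chars(s, sep):
    """Split s on sep, treating an empty separator as 'between every character'.

    Hypothetical helper illustrating proposal (4); not an existing API.
    """
    if sep == "":
        # No element of the result may equal the (empty) delimiter,
        # so the only consistent answer is the list of characters.
        return list(s)
    return s.split(sep)


print(split_chars("abcde", ""))   # ['a', 'b', 'c', 'd', 'e']
print(split_chars("a,b,c", ","))  # ['a', 'b', 'c']

# The built-in string method itself still rejects an empty separator.
try:
    "abcde".split("")
except ValueError as exc:
    print("ValueError:", exc)
```

Under this rule an empty input string yields an empty list, since there are no characters to return. How a regular expression that can match the empty string should behave is a separate question, and its answer may differ between interpreter versions.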